What is a Categorical Distribution?


A categorical distribution is a discrete probability distribution that describes the probability that a random variable will take on a value that belongs to one of K categories, where each category has a probability associated with it.

For a distribution to be classified as a categorical distribution, it must meet the following criteria:

  • The categories are discrete.
  • There are two or more potential categories.
  • The probability that the random variable takes on a value in each category must be between 0 and 1.
  • The probabilities for all categories must sum to 1.
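
The criteria above can be checked mechanically. The following is a minimal sketch, not a standard library routine: the function name `is_categorical` and the tolerance `tol` are assumptions made for illustration, and the "discrete categories" criterion is taken as given because the probabilities arrive as a finite list.

```python
import math

def is_categorical(probs, tol=1e-9):
    """Return True if `probs` is a valid categorical probability vector.

    `is_categorical` and `tol` are hypothetical names used for illustration.
    """
    # Criterion: there are two or more categories.
    if len(probs) < 2:
        return False
    # Criterion: each probability lies between 0 and 1.
    if any(p < 0 or p > 1 for p in probs):
        return False
    # Criterion: the probabilities sum to 1 (within floating-point tolerance).
    return math.isclose(sum(probs), 1.0, abs_tol=tol)

print(is_categorical([1/6] * 6))        # six equally likely categories
print(is_categorical([0.5, 0.3, 0.3]))  # probabilities sum to 1.1
```

The tolerance is there because floating-point sums such as six copies of 1/6 may not equal 1 exactly.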

A familiar example of a categorical distribution is the distribution of outcomes from rolling a fair six-sided die. There are K = 6 potential outcomes and the probability of each outcome is 1/6:

Example of categorical distribution

This distribution satisfies all of the criteria to be classified as a categorical distribution:

  • The categories are discrete (the random variable can only take on the values 1, 2, 3, 4, 5, or 6).
  • There are two or more potential categories.
  • The probability of each category is between 0 and 1.
  • The probabilities add up to 1: 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1.
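
As a rough illustration of the die example, the sketch below simulates rolls with Python's standard `random.choices` and compares the observed frequencies with 1/6. The seed and the number of rolls are arbitrary choices made here, not values from the text.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
faces = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6

# Each draw is a single trial from the categorical distribution over the faces.
rolls = rng.choices(faces, weights=probs, k=60_000)
counts = Counter(rolls)

# Observed relative frequency of each face; each should be close to 1/6.
for face in faces:
    print(face, round(counts[face] / len(rolls), 3))
```

With this many rolls, each observed frequency should land near 0.167, though the exact values depend on the seed.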

Rule of Thumb:

If you can count the number of outcomes, then you are working with a discrete random variable, e.g. counting the number of times a coin lands on heads.

But if you can measure the outcome, you are working with a continuous random variable, e.g. measuring height, weight, time, etc.

Other Examples of Categorical Distributions

There are plenty of categorical distributions in the real world, including:

Example 1: Flipping a Coin.

When we flip a coin there are 2 potential discrete outcomes, the probability of each outcome is between 0 and 1, and the sum of the probabilities is equal to 1:

Categorical distribution example

Example 2: Selecting Marbles from an Urn.

Suppose an urn contains 5 red marbles, 3 green marbles, and 2 purple marbles. If we randomly select one marble from the urn, there are 3 potential discrete outcomes, the probability of each outcome is between 0 and 1, and the sum of the probabilities is equal to 1:

Categorical distribution probabilities
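
The urn probabilities follow directly from the marble counts. A short sketch using exact fractions (the variable names are illustrative only):

```python
from fractions import Fraction

# Marble counts from the urn example.
marbles = {"red": 5, "green": 3, "purple": 2}
total = sum(marbles.values())

# Probability of each color = number of marbles of that color / total marbles.
probs = {color: Fraction(count, total) for color, count in marbles.items()}

for color, p in probs.items():
    print(color, p)  # red 1/2, green 3/10, purple 1/5

print(sum(probs.values()))  # 1
```

Using `Fraction` rather than floats keeps the arithmetic exact, so the probabilities sum to exactly 1.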

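Example 3: Selecting a Card from a Deck.

If we randomly select one card from a standard 52-card deck, there are 52 potential discrete outcomes, the probability of each outcome (1/52) is between 0 and 1, and the sum of the probabilities is equal to 1.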

Relation to Other Distributions

For a distribution to be classified as a categorical distribution, it must have K ≥ 2 potential outcomes and n = 1 trial.

Using this terminology, a categorical distribution is closely related to the following distributions:

Bernoulli distribution: K = 2 outcomes, n = 1 trial

Binomial distribution: K = 2 outcomes, n ≥ 1 trials

Multinomial distribution: K ≥ 2 outcomes, n ≥ 1 trials
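
One way to see how these four distributions relate is to treat each as "n independent categorical trials over K categories, counted per category". The sketch below takes that view. The helper `sample_counts`, the probability values, and the seed are assumptions made for illustration, not part of any library.

```python
import random
from collections import Counter

def sample_counts(probs, n, rng):
    """Run n independent categorical trials and count outcomes per category.

    `sample_counts` is a hypothetical helper written for this illustration.
    """
    categories = list(range(len(probs)))
    draws = rng.choices(categories, weights=probs, k=n)
    tally = Counter(draws)
    return [tally[c] for c in categories]

rng = random.Random(42)  # arbitrary seed for repeatability

bernoulli = sample_counts([0.3, 0.7], n=1, rng=rng)          # K = 2, n = 1
binomial = sample_counts([0.3, 0.7], n=10, rng=rng)          # K = 2, n = 10
categorical = sample_counts([0.5, 0.3, 0.2], n=1, rng=rng)   # K = 3, n = 1
multinomial = sample_counts([0.5, 0.3, 0.2], n=10, rng=rng)  # K = 3, n = 10

print(bernoulli, binomial, categorical, multinomial)
```

Each result is a list of per-category counts that sums to n. With n = 1 the list has a single 1 marking the chosen category. The binomial is usually reported as just the count for one of the two categories, which is one entry of the list here.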
