What is a joint probability distribution?


A two-way frequency table is a table that displays the frequencies (or “counts”) for two categorical variables.

For example, the following two-way table shows the results of a survey that asked 100 people which sport they liked best: baseball, basketball, or football.

The rows display the gender of the respondent and the columns show which sport they chose:

In this example, there are two variables: Sports and Gender.

A joint probability distribution simply describes the probability that a given individual takes on two specific values for the variables.

The word “joint” comes from the fact that we’re interested in the probability of two things happening at once.

For example, out of the 100 total individuals there were 13 who were male and chose baseball as their favorite sport.

Thus, we would say the joint probability that a given individual is male and chooses baseball as their favorite sport is 13/100 = 0.13 or 13%.

Written in mathematical notation:

P(Gender = Male, Sport = Baseball) = 13/100 = 0.13.

We can use this process to calculate the entire joint probability distribution:

  • P(Gender = Male, Sport = Baseball) = 13/100 = 0.13
  • P(Gender = Male, Sport = Basketball) = 15/100 = 0.15
  • P(Gender = Male, Sport = Football) = 20/100 = 0.20
  • P(Gender = Female, Sport = Baseball) = 23/100 = 0.23
  • P(Gender = Female, Sport = Basketball) = 16/100 = 0.16
  • P(Gender = Female, Sport = Football) = 13/100 = 0.13

Notice that the sum of the probabilities is equal to 1, or 100%.

Why Use a Joint Probability Distribution?

Joint probability distributions are useful because we often collect data for two variables (like Sports and Gender) and we’re interested in answering questions related to both variables.

For example, we might want to understand how likely it is that a given individual in a is male and prefers baseball as their favorite sport.

A joint probability distribution can help us answer these questions.

Use the following examples as practice for gaining a better understanding of joint probability distributions.

Example 1

The following two-way table shows the results of a survey that asked 238 people which movie genre they liked best:

Marginal distribution example with two-way table

Question: What is the probability that a given individual is female and prefers Drama as their favorite movie genre?

Answer: P(Gender = Female, Genre = Drama) = 58/238 = 0.24424.4%

Example 2

The following two-way table shows the exam scores of 64 students in a class based on how many hours they spent studying:

Marginal distribution example

Question: What is the probability that a given individual studies for 2 hours and receives a score between 91 and 100?

Answer: P(Study = 2 hours, Score = 91-100) = 3/64 = 0.047 = 4.7%

x