What is Sampling Variability?


Often in statistics we’re interested in answering questions like:

  • What is the mean household income in a certain state?
  • What is the mean weight of a certain species of turtle?
  • What is the mean attendance at college football games?

In each scenario, we are interested in answering some question about a population, which represents every possible individual element that we’re interested in measuring.

However, rather than collecting data on every individual in a population, we typically collect data on a sample, which is a portion of the total population.

For example, we might want to know the mean weight of a certain species of turtle that has a total population of 800 turtles.

Since it would take too long to locate and weigh every single turtle in the population, we instead collect a random sample of 30 turtles and weigh them:

[Figure: Sample mean example]

We could then use the mean weight of this sample of turtles to estimate the mean weight of all turtles in the population.

Sampling variability refers to the fact that the sample mean will vary from one sample to the next.

For example, in one random sample of 30 turtles the sample mean may turn out to be 350 pounds. In another random sample, the sample mean may be 345 pounds. In yet another sample, the sample mean may be 355 pounds.

There is variability among the sample means.
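
The idea can be illustrated with a small simulation. This is only a sketch: the population below is synthetic, and its center (350 pounds) and spread (12 pounds) are assumptions chosen to match the numbers used in this article, not real turtle measurements.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population of 800 turtle weights (pounds).
# The center of 350 and spread of 12 are assumed values, not measured data.
population = [random.gauss(350, 12) for _ in range(800)]

# Draw three separate random samples of 30 turtles (without replacement)
# and record the mean weight of each sample.
sample_means = [
    statistics.mean(random.sample(population, 30))
    for _ in range(3)
]

for i, sample_mean in enumerate(sample_means, start=1):
    print(f"Sample {i}: mean weight = {sample_mean:.1f} pounds")
```

Each run of the loop draws a different set of 30 turtles, so the three printed means differ from one another even though they all come from the same population.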

How to Measure Sampling Variability

In practice, we only collect one sample to estimate a population parameter. For example, we will only collect one sample of 30 sea turtles to estimate the mean weight for the entire population of turtles.

This means we’ll only calculate one sample mean (x̄) and use it to estimate the population mean (μ).

Sample Mean = x̄

But we know that the sample mean will vary from one sample to the next. To account for this variability, we can use the following formula to estimate the standard deviation of the sample mean, often called the standard error of the mean:

Standard Deviation of Sample Mean = s / √n

where:

  • s: The sample standard deviation
  • n: The sample size

For example, suppose we collect a sample of 30 sea turtles and find that the sample mean weight is 350 pounds and the sample standard deviation is 12 pounds. Based on these numbers, we would calculate:

Sample Mean = 350 pounds

Standard Deviation of Sample Mean = 12 / √30 = 2.19 pounds

This means that our best estimate for the true population mean weight of all turtles is 350 pounds, but that we should expect the mean from one sample to the next to vary with a standard deviation of about 2.19 pounds.
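
The calculation above can be reproduced in a few lines of Python. The function name `standard_error` is a hypothetical helper introduced here for illustration; it simply divides the sample standard deviation by the square root of the sample size.

```python
import math

def standard_error(s, n):
    """Estimate the standard deviation of the sample mean.

    s: sample standard deviation
    n: sample size
    """
    return s / math.sqrt(n)

# Values from the example: s = 12 pounds, n = 30 turtles.
se_30 = standard_error(12, 30)
print(f"Standard Deviation of Sample Mean = {se_30:.2f} pounds")
```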

One interesting property of the standard deviation of the sample mean is that it naturally becomes smaller as we use larger and larger sample sizes.

For example, suppose we collect a sample of 100 sea turtles and find that the sample mean weight is 350 pounds and the sample standard deviation is 12 pounds. The standard deviation of the sample mean would then be calculated as:

Standard Deviation of Sample Mean = 12 / √100 = 1.2 pounds

Our best estimate for the population mean would still be 350 pounds, but we can expect the mean from one sample of 100 sea turtles to the next sample of 100 sea turtles to vary with a standard deviation of just 1.2 pounds.

In other words, there is less variability among sample means when the sample sizes are larger.
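
A simulation can make this concrete. The sketch below assumes turtle weights follow a normal distribution with mean 350 pounds and standard deviation 12 pounds (an assumption made purely for illustration), draws many samples of each size, and compares the observed spread of the sample means with the value given by s / √n. The helper name `sd_of_sample_means` and the choice of 2,000 simulated samples are likewise illustrative.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable

def sd_of_sample_means(n, trials=2000, mu=350.0, sigma=12.0):
    """Simulate `trials` samples of size n and return the
    standard deviation of their sample means."""
    means = []
    for _ in range(trials):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

for n in (30, 100):
    simulated = sd_of_sample_means(n)
    formula = 12 / math.sqrt(n)
    print(f"n = {n}: simulated = {simulated:.2f}, formula = {formula:.2f}")
```

The simulated values should land close to the formula values, and the spread for n = 100 should be noticeably smaller than the spread for n = 30.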
