What are Confidence Intervals?

A confidence interval is a range of values that can be used to estimate a population parameter with a certain degree of certainty. It is calculated using a sample statistic and its associated standard error, and provides an interval of values within which the true population parameter is likely to fall. Confidence intervals provide a measure of the uncertainty associated with an estimate, and are typically expressed as a range of values at a given confidence level, such as 95% or 99%.


Often in statistics we’re interested in measuring population parameters, which are numbers that describe some characteristic of an entire population.

Two of the most common population parameters are:

1. Population mean: the mean value of some variable in a population (e.g. the mean height of males in the U.S.)

2. Population proportion: the proportion of some variable in a population (e.g. the proportion of residents in a county who support a certain law)

Although we’re interested in measuring these parameters, it’s usually too costly and time-consuming to actually go around and collect data on every individual in a population in order to calculate the population parameter.

Instead, we typically take a random sample from the overall population and use data from the sample to estimate the population parameter.

For example, suppose we want to estimate the mean weight of a certain species of turtle in Florida. Since there are thousands of turtles in Florida, it would be extremely time-consuming and costly to go around and weigh each individual turtle.

Instead, we might take a random sample of 50 turtles and use the mean weight of the turtles in this sample to estimate the true population mean:

[Figure: a sample drawn from a population]

The problem is that the mean weight of turtles in the sample is not guaranteed to exactly match the mean weight of turtles in the whole population. For example, we might just happen to pick a sample full of low-weight turtles or perhaps a sample full of heavy turtles.

In order to capture this uncertainty we can create a confidence interval. A confidence interval is a range of values that is likely to contain a population parameter with a certain level of confidence. It is calculated using the following general formula:

Confidence Interval = (point estimate)  +/-  (critical value)*(standard error)

This formula creates an interval with a lower bound and an upper bound, which likely contains a population parameter with a certain level of confidence.

Confidence Interval  = [lower bound, upper bound]

For example, the formula to calculate a confidence interval for a population mean is as follows:

Confidence Interval = x  +/-  z*(s/√n)

where:

  • x: sample mean
  • z: the chosen z-value
  • s: sample standard deviation
  • n: sample size

The z-value you use depends on the confidence level you choose. The following table shows the z-values that correspond to common confidence levels:

Confidence Level    z-value
0.90                1.645
0.95                1.96
0.99                2.58
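The z-values in the table can also be computed rather than looked up. The following is a minimal sketch using Python's standard library; the function name `z_critical` is an illustrative choice and does not come from the original text.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Two-sided z critical value for a confidence level such as 0.95.

    Hypothetical helper: the name and signature are illustrative assumptions.
    """
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Split the leftover probability evenly between the two tails.
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.2f} -> {z_critical(level):.3f}")
# 0.90 -> 1.645
# 0.95 -> 1.960
# 0.99 -> 2.576
```

The value for the 0.99 level prints as 2.576, which the table rounds to 2.58.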

For example, suppose we collect a random sample of turtles with the following information:

  • Sample size n = 25
  • Sample mean weight x = 300
  • Sample standard deviation s = 18.5

Here is how to calculate the 90% confidence interval for the true population mean weight:

90% Confidence Interval: 300 +/-  1.645*(18.5/√25) = [293.91, 306.09]
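The same arithmetic can be reproduced in a few lines of Python. This is a sketch under the assumptions of the formula above (a z-based interval); the function name `mean_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, confidence_level):
    """z-based confidence interval for a population mean (illustrative helper)."""
    # Two-sided critical value, e.g. about 1.645 for a 90% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Margin of error = (critical value) * (standard error of the mean).
    margin = z * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Turtle example from the text: n = 25, mean = 300, s = 18.5, 90% level.
lower, upper = mean_confidence_interval(300, 18.5, 25, 0.90)
print(f"[{lower:.2f}, {upper:.2f}]")  # [293.91, 306.09]
```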

We interpret this confidence interval as follows:

We are 90% confident that the interval [293.91, 306.09] contains the true population mean weight of turtles. More precisely, if we repeated this sampling procedure many times, about 90% of the intervals built this way would contain the true population mean.

Another way of saying the same thing is that only about 10% of such intervals would miss the true population mean. For this sample, values greater than 306.09 pounds or less than 293.91 pounds fall outside the range we consider plausible for the true population mean weight of turtles.
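One way to build intuition for what the 90% refers to is a small simulation: draw many samples from a population whose mean is known, build an interval from each, and count how often the interval contains that mean. The sketch below assumes a normally distributed population with mean 300 and standard deviation 18.5; these population values and the function name `coverage_rate` are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def coverage_rate(true_mean, true_sd, n, confidence_level, trials, seed=0):
    """Fraction of simulated z-based intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one sample from the assumed normal population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)
        if centre - margin <= true_mean <= centre + margin:
            hits += 1
    return hits / trials


rate = coverage_rate(true_mean=300, true_sd=18.5, n=25,
                     confidence_level=0.90, trials=2000)
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land near 0.90. With a sample as small as 25 it tends to come out slightly lower, because the z-value is paired with a standard deviation that is itself estimated from the sample.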

It’s worth noting that there are two numbers that can affect the size of a confidence interval:

1. The sample size: The larger the sample size, the narrower the confidence interval.

2. The confidence level: The larger the confidence level, the wider the confidence interval.
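Both effects follow directly from the formula, since the width of the interval is 2*z*(s/√n). A short sketch, reusing the turtle example's standard deviation of 18.5 and a hypothetical helper named `interval_width`:

```python
import math
from statistics import NormalDist


def interval_width(sample_sd, n, confidence_level):
    """Total width (upper bound minus lower bound) of a z-based interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sample_sd / math.sqrt(n)


# Larger samples shrink the interval (quadrupling n halves the width).
for n in (25, 100, 400):
    print(f"n = {n:3d}, 90% level: width = {interval_width(18.5, n, 0.90):.2f}")

# Higher confidence levels widen the interval.
for level in (0.90, 0.95, 0.99):
    print(f"n = 25, {level:.2f} level: width = {interval_width(18.5, 25, level):.2f}")
```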

Types of Confidence Intervals

There are many types of confidence intervals. Here are the most commonly used ones:

Confidence Interval for a Mean

A confidence interval for a mean is a range of values that is likely to contain a population mean with a certain level of confidence. The formula to calculate this interval is:

Confidence Interval = x  +/-  z*(s/√n)

where:

  • x: sample mean
  • z: the chosen z-value
  • s: sample standard deviation
  • n: sample size


Confidence Interval for the Difference Between Means

A confidence interval (C.I.) for a difference between means is a range of values that is likely to contain the true difference between two population means with a certain level of confidence. The formula to calculate this interval is:

Confidence interval = (x1 - x2) +/- t*√((sp^2/n1) + (sp^2/n2))

where:

  • x1, x2: sample 1 mean, sample 2 mean
  • t: the t-critical value based on the confidence level and (n1+n2-2) degrees of freedom
  • sp^2: pooled variance
  • n1, n2: sample 1 size, sample 2 size

The pooled variance and the t-critical value are found as follows:

  • The pooled variance is calculated as: sp^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2), where s1 and s2 are the two sample standard deviations
  • The t-critical value can be found using a t-distribution table or statistical software
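The pooled-variance calculation is easy to get wrong by hand, so a short sketch may help. Python's standard library has no t-distribution, so the t-critical value is passed in as an argument here. The sample figures and the function names `pooled_variance` and `mean_difference_interval` are made up for illustration; the value 2.048 is the usual table value for a 95% confidence level with 28 degrees of freedom.

```python
import math


def pooled_variance(s1, n1, s2, n2):
    """Pooled variance of two samples with standard deviations s1 and s2."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def mean_difference_interval(x1, s1, n1, x2, s2, n2, t_critical):
    """Pooled-variance interval for the difference between two means.

    t_critical must be looked up by the caller for the chosen confidence
    level and n1 + n2 - 2 degrees of freedom.
    """
    sp2 = pooled_variance(s1, n1, s2, n2)
    margin = t_critical * math.sqrt(sp2 / n1 + sp2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin


# Hypothetical data: two samples of 15 turtles each.
lower, upper = mean_difference_interval(
    x1=310, s1=18.5, n1=15,
    x2=300, s2=16.4, n2=15,
    t_critical=2.048,  # 95% level, 28 degrees of freedom, from a t-table
)
print(f"[{lower:.2f}, {upper:.2f}]")
```

An interval that includes 0, as this one does, means the data are compatible with no difference between the two population means at the chosen confidence level.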


Confidence Interval for a Proportion

A confidence interval for a proportion is a range of values that is likely to contain a population proportion with a certain level of confidence. The formula to calculate this interval is:

Confidence Interval = p  +/-  z*√(p(1-p)/n)

where:

  • p: sample proportion
  • z: the chosen z-value
  • n: sample size
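As a sketch of how this formula might be applied, suppose 56 of 100 surveyed residents support a certain law. These figures are invented for illustration, and `proportion_confidence_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence_level):
    """z-based confidence interval for a population proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # The standard error of a sample proportion is sqrt(p(1-p)/n).
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


lower, upper = proportion_confidence_interval(successes=56, n=100,
                                              confidence_level=0.95)
print(f"[{lower:.3f}, {upper:.3f}]")  # [0.463, 0.657]
```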


Confidence Interval for the Difference in Proportions

A confidence interval for the difference in proportions is a range of values that is likely to contain the true difference between two population proportions with a certain level of confidence. The formula to calculate this interval is:

Confidence interval = (p1 - p2)  +/-  z*√(p1(1-p1)/n1 + p2(1-p2)/n2)

where:

  • p1, p2: sample 1 proportion, sample 2 proportion
  • z: the z-critical value based on the confidence level
  • n1, n2: sample 1 size, sample 2 size
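A minimal sketch of this calculation follows. The two sample proportions (0.62 and 0.46, each from 100 residents) are invented for illustration, and `proportion_difference_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist


def proportion_difference_interval(p1, n1, p2, n2, confidence_level):
    """z-based confidence interval for the difference p1 - p2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Each sample contributes its own variance term, divided by its own size.
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * standard_error, diff + z * standard_error


lower, upper = proportion_difference_interval(0.62, 100, 0.46, 100, 0.95)
print(f"[{lower:.3f}, {upper:.3f}]")  # [0.024, 0.296]
```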

