Single Sample T-Test

The Single Sample T-Test is a statistical analysis method used to determine if the mean of a single sample is significantly different from a known or hypothesized population mean. The test compares the sample mean to the population mean and calculates the probability (p-value) of obtaining a difference at least as large as the one observed if the true mean were equal to the hypothesized value. It is commonly used in research studies to assess the effectiveness of a treatment or intervention by comparing the mean of a sample to a known or expected value, and it helps you judge whether a result is statistically significant before making decisions based on the data.


What is a Single Sample T-Test?

The Single Sample T-Test is a statistical test used to determine if a single group is significantly different from a known or hypothesized population value on your variable of interest. Your variable of interest should be continuous and normally distributed and you should have enough data (more than 5 values).

Figure: a Single Sample T-Test compares the mean of a bell-shaped, normal distribution (left) with a population mean (right).

The Single Sample T-Test is also called a One-Sample T-Test, Single Sample Student T-Test, or One-Sample Test of Means.
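
As a sketch of what the test computes, the statistic is usually written as follows. The symbols are notation introduced here for illustration and do not appear elsewhere in this article: the sample mean is \bar{x}, the sample standard deviation is s, the sample size is n, and the known or hypothesized population mean is \mu_0.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \mathrm{df} = n - 1
```

When the population mean really is \mu_0 and the assumptions below hold, t follows a Student's t distribution with n - 1 degrees of freedom, which is where the p-value comes from.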


Assumptions for a Single Sample T-Test

Every statistical method has assumptions. Assumptions are properties that your data must satisfy in order for the results of the statistical method to be accurate.

The assumptions for the Single Sample T-Test include:

  1. Continuous
  2. Normally Distributed
  3. Random Sample
  4. Enough Data

Let’s dive into each one of these separately.

Continuous

The variable that you care about (and want to see if it is different between your group and the population) must be continuous. Continuous means that the variable can take on any reasonable value.

Some good examples of continuous variables include age, weight, height, test scores, survey scores, yearly salary, etc.

If the variable that you care about is a proportion (for example, 48% of your sample voted vs. 56% of the population) and you have more than 5 in each group, then you should use the One-Proportion Z-Test. If your variable of interest is a proportion and you have fewer than 5 in a group, you should use the Exact Test of Goodness of Fit.

Normally Distributed Variable of Interest

The variable that you care about must be spread out in a normal way. In statistics, this is called being normally distributed (aka it must look like a bell curve when you graph the data). Only use a single sample t-test with your data if the variable you care about is normally distributed.

Figure: a normal distribution (top) is bell shaped with most of the data in the middle; a skewed distribution (bottom) leans left or right with most of the data on one edge.

If your variable is not normally distributed, you should use the Single-Sample Wilcoxon Signed-Rank Test instead.
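
One rough way to screen for the lopsided shape described above is to compute the sample skewness. The sketch below uses only the Python standard library. The function name and the example data are hypothetical, and skewness is only an informal screen, not a formal normality test.

```python
import statistics

def sample_skewness(values):
    """Moment-based sample skewness: about 0 for symmetric data,
    positive for a long right tail, negative for a long left tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(sample_skewness(symmetric))     # close to 0
print(sample_skewness(right_skewed))  # clearly positive
```

A value far from zero suggests the data lean to one side. With small samples the estimate is noisy, so it is worth looking at a histogram as well.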

Random Sample

The data points for your group must have come from a simple random sample. This means that if you wanted to see if drinking sugary soda makes you gain weight, you would need to randomly select a group of soda drinkers for your soda drinker group, and then you would compare that to a known population weight for non-sugary-soda drinkers.

The key here is that the data points for your group were randomly selected. This is important because if your group is not randomly determined then your analysis will be incorrect. In statistical terms this is called bias, or a tendency to have incorrect results because of bad data.

If you do not have a random sample, the conclusions you can draw from your results are very limited. You should try to get a simple random sample.

If you have paired samples (2 measurements from the same group of subjects) then you should use a Paired Samples T-Test instead. If you want to compare 2 groups of subjects instead of a single group with a population mean, then you should use an Independent Samples T-Test instead.

Enough Data

The sample size (or data set size) should be greater than 5 in your group. Some people argue for more than 15 or even 30, but more than 5 is probably sufficient.

It also depends on the expected size of the difference between your group and the population value. If you expect a large difference, then you can get away with a smaller sample size. If you expect a small difference, then you likely need a larger sample (30+).

Figure: the sample size needed to detect an effect with a single sample t-test. For a small effect size, 199 participants are needed; for a medium effect size, 34 participants are needed; and for a large effect size, 15 participants are needed.
*Sample size calculation was conducted in G*Power with a power of 0.80, a significance level (alpha) of 0.05, and 0.20, 0.50, and 0.80 used as the effect size values for small, medium, and large Cohen’s d effect sizes respectively.
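
The participant counts above can be roughly reproduced with a normal approximation plus a small-sample correction. This is a sketch under stated assumptions, not G*Power's exact calculation. The function name is hypothetical, the test is assumed to be two-sided, and the z²/2 correction term is a commonly used adjustment for estimating the standard deviation from the sample.

```python
import math
from statistics import NormalDist

def approx_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate n for a two-sided single sample t-test.

    Uses the normal approximation ((z_alpha + z_power) / d)^2 and adds
    z_alpha^2 / 2 as a small-sample correction (an assumption of this sketch).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_power) / effect_size) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)

# Cohen's d values for small, medium, and large effects.
for label, d in [("small", 0.20), ("medium", 0.50), ("large", 0.80)]:
    print(label, d, approx_sample_size(d))
```

With the default power of 0.80 and alpha of 0.05, this approximation gives 199, 34, and 15 participants, matching the figures above. It may differ by a participant or two from an exact calculation for other settings.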

If your sample size is greater than 30 (and you know the average and standard deviation or spread of the population values), you should run a Single Sample Z-Test instead.


When to use a Single Sample T-Test?

You should use a Single Sample T-Test in the following scenario:

  1. You want to know if one group is different from a known or hypothesized population value on your variable of interest
  2. Your variable of interest is continuous
  3. You have one group
  4. Your variable of interest is normally distributed

Let’s clarify these to help you know when to use a Single Sample T-Test.

Difference

You are looking for a statistical test to see whether a single group is significantly different from a population value on your variable of interest. This is a difference question. Other types of analyses include examining the relationship between two variables (correlation) or predicting one variable using another variable (prediction).

Continuous Data

Your variable of interest must be continuous. Continuous means that your variable of interest can basically take on any value, such as heart rate, height, weight, number of ice cream bars you can eat in 1 minute, etc.

Types of data that are NOT continuous include ordered data (such as finishing place in a race, best business rankings, etc.), categorical data (gender, eye color, race, etc.), or binary data (purchased the product or not, has the disease or not, etc.).

One Group

A Single Sample T-Test can only be used to compare a single group with a known population value on your variable of interest.

If you have three or more groups, you should use a One-Way ANOVA analysis instead. If you have two groups to compare, you should use an Independent Samples T-Test instead.

Normally Distributed

Normally distributed was covered earlier and means that your variable of interest should look like a bell curve when you graph it as a histogram.

If you get a group of students to take a pre-test and the same students to take a post-test, you have two different variables for the same group of students, which would be paired data, in which case you would need to use a Paired Samples T-Test instead.


Single Sample T-Test Example

Group 1: Received the experimental medical treatment.
Population Value: On average in the population, it takes 12 days to recover from the disease.
Variable of interest: Time to recover from the disease in days.

In this example, group 1 is our treatment group because they received the experimental medical treatment. The population value is essentially our control group because they did not receive the treatment.

The null hypothesis, which is statistical lingo for what would happen if the treatment does nothing, is that group 1 and our population will recover from the disease in about the same number of days, on average. We are trying to determine if receiving the experimental medical treatment will shorten the number of days it takes for patients to recover from the disease.

As we run the experiment, we track how long it takes for each patient to fully recover from the disease. In order to use a Single Sample T-Test on our data, our variable of interest has to be normally distributed (bell curve shaped). In this case, recovery time in days is normally distributed for our treatment group.

After the experiment is over, we compare our treatment group to the population value on our variable of interest (days to fully recover) using a Single Sample T-Test. When we run the analysis, we get a t-statistic and a p-value. The t-statistic is a measure of how different our group is from the population value on our recovery variable of interest. A p-value is the chance of seeing results at least as extreme as ours, assuming the treatment actually doesn’t do anything. A p-value less than or equal to 0.05 is conventionally treated as statistically significant, meaning a difference this large would be unlikely if chance alone were at work.
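
The analysis described above can be sketched in Python using only the standard library. The recovery times below are made-up numbers for illustration, not data from a real study, and the function names are hypothetical. The p-value is found by numerically integrating the t distribution, which is one simple approach and not the only one.

```python
import math
import statistics

def t_cdf(x, df, steps=10_000):
    """CDF of Student's t distribution, by Simpson's rule on [0, |x|].

    Relies on the symmetry of the distribution around 0. Intended for
    small samples; math.gamma overflows for very large df.
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def one_sample_t_test(sample, population_mean):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    t = (mean - population_mean) / (sd / math.sqrt(n))
    df = n - 1
    p_two_sided = 2 * (1 - t_cdf(abs(t), df))
    return t, df, p_two_sided

# Hypothetical recovery times (days) for 10 treated patients.
recovery_days = [10, 11, 9, 12, 10, 8, 11, 10, 9, 10]
t, df, p = one_sample_t_test(recovery_days, population_mean=12)
print(f"t = {t:.3f}, df = {df}, p = {p:.5f}")
```

A negative t-statistic here means the treatment group recovered faster than the population value of 12 days. Because the question is specifically whether the treatment shortens recovery, a one-sided p-value (the lower tail, t_cdf(t, df)) could also be reported; the two-sided version is shown as the more conservative default.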
