One-Proportion Z-Test

The One-Proportion Z-Test is a statistical method used to determine whether the proportion of a specific characteristic in a population differs significantly from a hypothesized value. The test relies on the sample proportion being approximately normally distributed when the sample is large enough, and it uses a Z statistic to evaluate the significance of the observed difference. It is commonly used in research, such as medical trials or surveys, to compare an observed proportion with a reference value. The results can help you draw conclusions about the population being studied.


What is the One-Proportion Z-Test?

The One-Proportion Z-Test is a statistical test used to determine whether the proportion of one category of a single qualitative variable differs significantly from an expected or known population proportion. To use it, you should have one variable with only two options, and you should have around 10 or more observations in each cell. See more below.

The One-Proportion Z-Test is also called the One Sample Proportion Test.
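
As a minimal sketch of the comparison described above, the Z statistic divides the gap between the observed proportion and the expected proportion by the standard error under the expected proportion. The function name `one_proportion_z` and the counts used here are hypothetical and chosen only for illustration.

```python
import math

def one_proportion_z(successes, n, p0):
    """Z statistic for an observed proportion against a hypothesized proportion p0."""
    p_hat = successes / n                           # observed sample proportion
    standard_error = math.sqrt(p0 * (1 - p0) / n)   # standard error if p0 were true
    return (p_hat - p0) / standard_error

# Hypothetical data: 60 "yes" answers out of 100, tested against p0 = 0.5
z = one_proportion_z(60, 100, 0.5)
print(round(z, 3))  # about 2.0
```

A Z statistic far from zero means the sample proportion is far from the expected proportion relative to the variation expected from sampling alone.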


Assumptions for the One-Proportion Z-Test

Every statistical method has assumptions. Assumptions are properties that your data must satisfy for the results of the method to be accurate.

The assumptions for the One-Proportion Z-Test include:

  1. Random Sample
  2. Independence
  3. Large Population
  4. Mutually Exclusive

Let’s dive into what each of these means.

Random Sample

The data points in your analysis must come from a simple random sample. For example, if you want to know whether your sample of people has a different Male/Female ratio than the population, the people in your sample should be randomly selected from that population. This matters because a sample that was not randomly selected may not represent the population, and the test results can then be misleading. In statistical terms this is called bias: a systematic tendency toward incorrect results because of how the data were collected.

Independence

Each of your observations (data points) should be independent. This means that each value of your variables doesn’t “depend” on any of the others. For example, this assumption is usually violated when there are multiple data points over time from the same unit of observation (e.g. subject/customer/store), because the data points from the same unit of observation are likely to be related or affect one another.

Large Population

The population of interest (where you pulled your sample) should be at least 10 times larger than your sample. So if you randomly sampled American men and women, your sample would need to stay below tens of millions of people (which probably won’t be a problem).
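
The 10-times rule can be checked with a one-line comparison. This is only a sketch: the helper name `population_is_large_enough` and the population figure are hypothetical placeholders, not real census values.

```python
def population_is_large_enough(sample_size, population_size, ratio=10):
    """Return True when the population is at least `ratio` times the sample size."""
    return population_size >= ratio * sample_size

# Hypothetical numbers: a sample of 1,000 people from a population of 300 million
print(population_is_large_enough(1_000, 300_000_000))  # True

# Hypothetical numbers: a sample of 500 from a population of only 2,000
print(population_is_large_enough(500, 2_000))  # False
```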

Mutually Exclusive

No subject or participant should be counted in both categories. Each row in your data should belong to exactly one group.


When to use the One-Proportion Z-Test?

You should use the One-Proportion Z-Test in the following scenario:

  1. You want to know whether a proportion differs from an expected value
  2. Your variable of interest is proportional or categorical
  3. You have only two options
  4. You have more than 10 in each cell

Let’s clarify these to help you know when to use the One-Proportion Z-Test.

Difference

You are looking for a statistical test to see whether a proportion in your sample differs from an expected or known population proportion. Other types of analyses include testing for a relationship between two variables or predicting one variable using another variable (prediction).

Proportional or Categorical

For this test, your variable of interest must be proportional or categorical. A categorical variable is a variable that contains categories without a natural order. Examples of categorical variables are eye color, city of residence, type of dog, etc. Proportional variables are derived from categorical variables, for instance: the number of people that converted on two different versions of your website (10% vs 15%), percentages, the number of people who voted vs people who did not vote, the proportion of plants that died vs survived an experimental treatment, etc.

If you have a continuous variable that you want to compare to an expected population, you may want to use a Single Sample Z-Test.

Two Options

Your categorical variable should have only two possible options. Some examples of variables like this are made a purchase (yes/no), color (if just black/white), recovered from disease (yes/no).

If you have more than two options and more than 10 in each cell, you should consider using the Chi-Square Goodness of Fit Test.

More than 10 in each Cell

The rule-of-thumb we recommend is to use this test when you have around 10 or more observations in each cell. “Cell” in this case refers simply to the count of values in each group. For example, if I have a list of survey responses with 5 “yes” and 1 “no”, there are 5 and 1 value(s) per cell, respectively.

If you have fewer than 10 in a cell, we recommend using the Binomial Test. And if you have more than 10 in every cell and more than 1000 total observations, we recommend using the G-Test of Goodness of Fit.
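
The rules of thumb above for a two-option variable can be sketched as a small helper. The function name `suggest_test` is hypothetical, and treating exactly 10 in a cell as acceptable is an assumption, since the guideline is only approximate.

```python
def suggest_test(yes_count, no_count):
    """Suggest a test for a two-option variable using the rules of thumb above."""
    smallest_cell = min(yes_count, no_count)
    total = yes_count + no_count
    if smallest_cell < 10:          # assumption: a cell of exactly 10 is acceptable
        return "Binomial Test"
    if total > 1000:
        return "G-Test of Goodness of Fit"
    return "One-Proportion Z-Test"

print(suggest_test(5, 1))      # Binomial Test (the survey example above)
print(suggest_test(40, 60))    # One-Proportion Z-Test
print(suggest_test(600, 700))  # G-Test of Goodness of Fit
```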


One-Proportion Z-Test Example

Variable: Gender (male/female)

In this example, we are interested in investigating whether the gender split in our sample of subjects differs significantly from a known population proportion of 50-50. The null hypothesis is that the proportion of females (and therefore of males) in the population is 50%. Because we have a random sample, our data points are independent, our groups are mutually exclusive, and we are drawing our sample from a large enough population, we can proceed with the One-Proportion Z-Test.

The analysis will result in a Z statistic and a p-value. The p-value represents the probability of seeing a result at least as extreme as ours if there were an actual 50-50 split in the population. A p-value less than or equal to 0.05 is conventionally treated as statistically significant, meaning the observed difference would be unlikely to arise from chance alone if the split really were 50-50.
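
A minimal sketch of this example using only the Python standard library is shown below. The counts (60 females and 40 males) are hypothetical, since the example gives no data, and the function name `one_proportion_z_test` is likewise an illustration rather than a library function. The two-sided p-value is obtained from the standard normal distribution via `math.erfc`.

```python
import math

def one_proportion_z_test(successes, n, p0=0.5):
    """Return (z, two-sided p-value) for a one-proportion Z-test against p0."""
    p_hat = successes / n                           # observed proportion
    standard_error = math.sqrt(p0 * (1 - p0) / n)   # standard error under the null
    z = (p_hat - p0) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical sample: 60 females and 40 males, tested against a 50-50 split
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, p = {p_value:.4f}")  # z = 2.00, p = 0.0455
```

With these hypothetical counts the p-value falls just below 0.05, so the sample's gender split would be judged significantly different from 50-50 at that threshold.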
