Split Plot ANOVA is a statistical method used to analyze data with two independent variables, one of which is a within-subject variable and the other a between-subject variable. It is commonly used in experimental designs where each subject is measured at every level of the within-subject variable (for example, at several time points) but belongs to only one level of the between-subject variable (for example, one treatment group). It allows for the analysis of the main effect of each variable as well as their interaction effect, providing a more comprehensive understanding of the data. This method is particularly useful in agricultural and industrial research, where it can be used to compare the effects of different treatments on the same subjects over time.

## What is a Split Plot ANOVA?

The **Split Plot ANOVA** is a statistical test used to determine if 2 or more repeated measures from 2 or more groups are significantly different from each other on your variable of interest. Your variable of interest should be continuous, be normally distributed, and have a similar spread across your groups. You should have repeated measures from the same units of observation (e.g. subject, store, location) and you should have enough data (more than 5 values in each group).

*The Split Plot ANOVA is also sometimes called a Mixed Design ANOVA, Mixed Design Analysis of Variance, Two-Way Repeated Measures ANOVA (special case), or Mixed-Model Design ANOVA.*

## Assumptions for the Split Plot ANOVA

Every statistical method has assumptions. Assumptions are properties that your data must satisfy in order for the results of the method to be accurate.

The assumptions for the Split Plot ANOVA include:

- Continuous
- Normally Distributed
- Random Sample
- Enough Data
- Sphericity
- Similar Spread across Groups

Let’s dive into each one of these separately.

**Continuous**

The variable that you care about (and want to see if it is different across the 3+ groups) must be continuous. Continuous means that the variable can take on any value within a range, including fractional values.

Some good examples of continuous variables include age, weight, height, test scores, survey scores, yearly salary, etc.

**Normally Distributed**

The variable that you care about must be spread out in a normal way. In statistics, this is called being normally distributed (aka it must look like a bell curve when you graph the data). Only use a Split Plot ANOVA with your data if the variable you care about is normally distributed.

*If your variable is not normally distributed, you should use the Friedman Test instead.*

**Random Sample**

The data points for each group in your analysis must have come from a simple random sample. This means that if you wanted to see if drinking sugary soda makes you gain weight, you would need to randomly select a group of soda drinkers for your soda drinker group.

The key here is that the data points for each group were randomly selected. This is important because if your groups were not randomly selected, your results may not reflect the population you care about. In statistical terms this is called bias: a systematic tendency toward incorrect results caused by how the data were collected.

*If you do not have a random sample, the conclusions you can draw from your results are very limited. You should try to get a simple random sample.*

*If you have independent samples (3 measurements from different, unrelated groups) then you should use a One-Way ANOVA instead.*

**Enough Data**

The sample size (or data set size) should be greater than 5 in each group. Some people argue for more, but more than 5 is probably sufficient.

The sample size also depends on the expected size of the difference across groups. If you expect a large difference across groups, then you can get away with a smaller sample size. If you expect a small difference across groups, then you likely need a larger sample.

**Sphericity**

In statistics this refers to the idea that the variances of the differences between each possible pair of repeated measures are the same within each group. For instance, if there are 3 repeated measures, then within each level of the grouping variable, the variance of (time 1 – time 2) should be the same as the variance of (time 1 – time 3) and the variance of (time 2 – time 3). This assumption can be tested in most statistical software.
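To make the idea concrete, here is a minimal Python sketch that computes the variance of each pairwise difference for a single group. The function name `difference_variances` and the cholesterol values are made up for illustration. This is only an informal look at the numbers; a formal check would normally rely on a dedicated procedure such as Mauchly's test in a statistics package.

```python
from itertools import combinations
from statistics import variance

def difference_variances(subjects):
    """Return {(time_i, time_j): sample variance of (time_i - time_j)}.

    `subjects` is a list with one inner list per subject, holding that
    subject's repeated measurements in time order (hypothetical layout).
    """
    n_times = len(subjects[0])
    return {
        (i + 1, j + 1): variance([s[i] - s[j] for s in subjects])
        for i, j in combinations(range(n_times), 2)
    }

# Made-up cholesterol readings: 3 subjects measured at 3 time points.
cardio = [
    [220, 210, 195],
    [240, 228, 215],
    [205, 200, 188],
]

diff_vars = difference_variances(cardio)
for pair, var in diff_vars.items():
    print(f"variance of (time {pair[0]} - time {pair[1]}): {var:.2f}")
```

If the printed variances are very different from one another, sphericity is in doubt. With only three subjects, as in this toy data, the estimates are far too noisy to draw any conclusion.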

**Similar Spread across Groups**

In statistics this is called homogeneity of variance, or making sure the variable of interest is spread similarly between the two or more non-repeated measures groups.
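As a rough illustration, the spread of each group can be compared by computing the sample variance per group and the ratio of the largest to the smallest. This is a sketch with made-up data and hypothetical variable names, not a formal test. A formal test of homogeneity of variance, such as Levene's test, is available in most statistical software.

```python
from statistics import variance

# Made-up cholesterol readings at a single time point, one list per group.
month1 = {
    "cardio": [220, 240, 205],
    "weights": [225, 238, 210],
}

# Sample variance of the variable of interest within each group.
group_variances = {group: variance(values) for group, values in month1.items()}

# Ratio of the largest to the smallest variance; values near 1 mean similar spread.
variance_ratio = max(group_variances.values()) / min(group_variances.values())

for group, var in group_variances.items():
    print(f"{group}: variance = {var:.1f}")
print(f"largest / smallest variance = {variance_ratio:.2f}")
```

A ratio close to 1 suggests the groups have a similar spread, while a ratio far above 1 is a warning sign worth following up with a formal test.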

## When to use a Split Plot ANOVA?

You should use a Split Plot ANOVA in the following scenario:

- You want to know if many groups are **different** on your variable of interest
- Your variable of interest is **continuous**
- You have **3 or more groups**
- You have **related samples**
- You have a **normal variable of interest**
- You have **two or more grouping variables**

Let’s clarify these to help you know when to use a Split Plot ANOVA.

**Difference**

You are looking for a statistical test to see whether three or more groups are significantly different on your variable of interest. This is a difference question. Other types of analyses include examining the relationship between two variables (correlation) or predicting one variable using another variable (prediction).

**Continuous Data**

Your variable of interest must be continuous. Continuous means that your variable of interest can basically take on any value within a range, such as heart rate, height, weight, etc.

Types of data that are NOT continuous include ordered data (such as finishing place in a race, best business rankings, etc.), categorical data (gender, eye color, race, etc.), or binary data (purchased the product or not, has the disease or not, etc.).

**Three or more Groups**

A Split Plot ANOVA can be used to compare three or more related groups on your variable of interest. See below for an explanation of what “related” groups means.

*If you have only two groups, you should use a Paired Samples T-Test analysis instead.*

**Related Samples**

Related samples means that you have repeated measures from the same units of observation. For example, if you have a group of men undergoing a treatment and you measure their cholesterol levels at 3 time points, then you have 3 groups of related data.

*If you have 3 or more independent groups, you should use a One-Way ANOVA instead.*

**Normal Variable of Interest**

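Normality was discussed earlier on this page and simply means your plotted data is bell shaped with most of the data in the middle. If you would like to formally check whether your data are consistent with a normal distribution, you can use the Kolmogorov-Smirnov test or the Shapiro-Wilk test. Note that these tests can only find evidence against normality; they cannot prove that your data is normal.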
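For example, the Shapiro-Wilk test can be run in Python with SciPy's `scipy.stats.shapiro`. The sketch below assumes SciPy is installed and uses made-up cholesterol readings; the 0.05 cutoff is a common convention rather than a fixed rule.

```python
from scipy import stats

# Made-up cholesterol readings (for illustration only).
values = [220, 240, 205, 225, 238, 210, 214, 231, 199, 227, 219, 208]

# shapiro returns the W statistic and the p-value of the test.
w_stat, p_value = stats.shapiro(values)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")

# A small p-value suggests the data deviate from a normal distribution.
# A large p-value only means no evidence against normality was found.
deviates_from_normal = p_value < 0.05
print("Evidence against normality:", deviates_from_normal)
```

With small samples the test has little power to detect non-normality, so it is worth looking at a histogram or Q-Q plot as well.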

**Two or More Grouping Variables**

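You should use the Split Plot ANOVA when you have two grouping variables: one that separates different subjects into groups (between-subjects) and one that distinguishes the repeated measurements taken on each subject (within-subjects). For instance, if we have recovery data for both a treatment and control group at 3 or more points in time, then treatment/control is the between-subjects grouping variable, time is the within-subjects grouping variable, and a Split Plot ANOVA is a suitable analysis.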

## Split Plot ANOVA Example

**Group 1**: Cardio-based exercise program

**Group 2**: Weights-based exercise program

**Repeated Measures**: Data were collected at month 1, 2 and 3

**Variable of interest**: Cholesterol levels

In this example we have three related groups (the three points in time) measured in two independent groups (the two exercise programs), with a continuous variable of interest, so we know to perform a Split Plot ANOVA. After confirming that our variable of interest is normal and our data meet the assumptions of this test, we proceed with the analysis.

The null hypothesis, which is statistical lingo for what would happen if the exercise programs do nothing, is that none of the groups or time points will have different cholesterol levels, on average. We are trying to determine whether our data provide enough evidence to reject it.

After the experiment is over, we compare the two groups over time on our variable of interest (cholesterol levels) using a Split Plot ANOVA. When we run the analysis, we get several F-statistics and p-values, one pair for each effect. Each F-statistic compares the variation explained by an effect to the unexplained variation, so a larger F indicates stronger evidence for that effect. The p-value is the probability of seeing results at least as extreme as ours if there were no real effect. By convention, a p-value less than or equal to 0.05 is called statistically significant, meaning the difference is unlikely to be due to chance alone.

Typically, the results of this analysis are broken down into a few parts. First, there is a “within subjects” effect, which addresses the question: did cholesterol levels change over time, averaged across both groups? Second, there is a “between subjects” effect that asks: were cholesterol levels different between the two exercise program groups? We’re most interested in the “interaction” effect, which asks: did the cholesterol levels change differently over time in the two different groups?

If the p-value of this interaction effect is small, then we have evidence that the exercise programs had different effects over time on cholesterol levels. Further investigation is required to determine which group had a larger effect and at which time points.
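To show where the three F-statistics come from, here is a minimal Python sketch of the sum-of-squares decomposition for a balanced design (the same number of subjects in every group, with no missing values). The function name `mixed_anova`, the dictionary layout, and all cholesterol values are made up for illustration; in practice you would normally rely on a statistics package, which also reports p-values and sphericity corrections.

```python
from statistics import mean

def mixed_anova(data):
    """F-statistics for a balanced split plot (mixed design) ANOVA.

    `data` maps each group name to a list of subjects; each subject is a
    list of repeated measurements in time order (hypothetical layout).
    """
    groups = list(data)
    a = len(groups)               # number of between-subjects groups
    n = len(data[groups[0]])      # subjects per group (assumed equal)
    b = len(data[groups[0]][0])   # repeated measurements per subject

    values = [y for g in groups for subj in data[g] for y in subj]
    grand = mean(values)
    group_mean = {g: mean(y for subj in data[g] for y in subj) for g in groups}
    time_mean = [mean(subj[j] for g in groups for subj in data[g]) for j in range(b)]
    cell_mean = {g: [mean(subj[j] for subj in data[g]) for j in range(b)] for g in groups}

    ss = {
        # Between subjects: group effect and its error term (subjects within groups).
        "group": n * b * sum((group_mean[g] - grand) ** 2 for g in groups),
        "subjects": b * sum((mean(subj) - group_mean[g]) ** 2
                            for g in groups for subj in data[g]),
        # Within subjects: time, group x time interaction, and their error term.
        "time": a * n * sum((m - grand) ** 2 for m in time_mean),
        "interaction": n * sum((cell_mean[g][j] - group_mean[g] - time_mean[j] + grand) ** 2
                               for g in groups for j in range(b)),
        "residual": sum((subj[j] - cell_mean[g][j] - mean(subj) + group_mean[g]) ** 2
                        for g in groups for subj in data[g] for j in range(b)),
        "total": sum((y - grand) ** 2 for y in values),
    }
    df = {
        "group": a - 1,
        "subjects": a * (n - 1),
        "time": b - 1,
        "interaction": (a - 1) * (b - 1),
        "residual": a * (n - 1) * (b - 1),
    }
    ms = {name: ss[name] / df[name] for name in df}
    f_stats = {
        "group": ms["group"] / ms["subjects"],            # between subjects effect
        "time": ms["time"] / ms["residual"],              # within subjects effect
        "interaction": ms["interaction"] / ms["residual"],
    }
    return ss, df, f_stats

# Made-up cholesterol readings: 3 subjects per program, months 1 to 3.
cholesterol = {
    "cardio": [[220, 210, 195], [240, 228, 215], [205, 200, 188]],
    "weights": [[225, 222, 218], [238, 236, 230], [210, 207, 206]],
}

ss, df, f_stats = mixed_anova(cholesterol)
for effect, f_value in f_stats.items():
    print(f"{effect}: F = {f_value:.2f}")
```

Each F-statistic is then compared against an F distribution with the matching degrees of freedom to obtain its p-value (for instance with `scipy.stats.f.sf`, if SciPy is available). Note that the group effect uses the variation between subjects as its error term, while the time and interaction effects use the within-subject residual.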