How can I easily identify outliers in Excel?

You can identify outliers in Excel with a few standard statistical methods. The two most common are the interquartile range (IQR) rule, which can be computed with the QUARTILE function, and z-scores, which can be computed from the mean and standard deviation (for example with AVERAGE and STDEV, or with STANDARDIZE). Excel has no dedicated IQR or z-score function, so both are built from these pieces. Visuals such as box plots and scatter plots can also help. Flagging unusual or extreme values before an analysis lets you decide how to handle them, since they can strongly influence the results.

Easily Find Outliers in Excel


An outlier is an observation that lies abnormally far away from other values in a dataset.

Outliers can be problematic because they can affect the results of an analysis.

We will use the following dataset in Excel to illustrate two methods for finding outliers:

Method 1: Use the interquartile range

The interquartile range (IQR) is the difference between the 75th percentile (Q3) and the 25th percentile (Q1) in a dataset. It measures the spread of the middle 50% of values.

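We can define an observation to be an outlier if it lies more than 1.5 times the interquartile range above the third quartile (Q3) or more than 1.5 times the interquartile range below the first quartile (Q1). In other words, a value is an outlier if it is less than Q1 - 1.5 × IQR or greater than Q3 + 1.5 × IQR.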

The following image shows how to calculate the interquartile range in Excel:

Next, we can use the formula mentioned above to assign a “1” to any value that is an outlier in the dataset:

We see that only one value – 164 – turns out to be an outlier in this dataset.
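As a cross-check outside Excel, the IQR rule above can be sketched in a few lines of Python. This is a minimal sketch, not the worksheet from the article: the `sample` values and the helper name `iqr_outlier_flags` are hypothetical, and it assumes the quartiles are computed the way Excel's QUARTILE.INC does, which `method="inclusive"` should reproduce.

```python
from statistics import quantiles


def iqr_outlier_flags(values, k=1.5):
    """Return (flags, lower_fence, upper_fence) using the k * IQR rule."""
    # Assumption: "inclusive" interpolation matches Excel's QUARTILE.INC.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr  # values below this fence are outliers
    upper = q3 + k * iqr  # values above this fence are outliers
    flags = [1 if (v < lower or v > upper) else 0 for v in values]
    return flags, lower, upper


# Hypothetical sample, NOT the dataset used in the article.
sample = [10, 12, 13, 15, 16, 18, 19, 21, 22, 60]
flags, lower, upper = iqr_outlier_flags(sample)
print(lower, upper, flags)
```

For this sample Q1 is 13.5 and Q3 is 20.5, so the fences fall at 3.0 and 31.0, and only the value 60 is flagged with a 1.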

Method 2: Use z-scores

A z-score tells you how many standard deviations a given value is from the mean. We use the following formula to calculate a z-score:

z = (X - μ) / σ

where:

  • X is a single raw data value
  • μ is the population mean
  • σ is the population standard deviation

We can define an observation to be an outlier if it has a z-score less than -3 or greater than 3.

The following image shows how to calculate the mean and standard deviation for a dataset in Excel:

We can then use the mean and standard deviation to find the z-score for each individual value in the dataset:

We can then assign a “1” to any value that has a z-score less than -3 or greater than 3:

Using this method, we see that there are no outliers in the dataset.

Note: Sometimes a z-score of 2.5 is used instead of 3. In this case, the individual value of 164 would be considered an outlier since it has a z-score greater than 2.5.

When using the z-score method, use your best judgement to choose the cutoff beyond which you consider a value to be an outlier.
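The z-score steps above can be sketched in Python to see how the choice of cutoff changes the result. This is a rough illustration under stated assumptions: the `sample` values and the helper `z_score_flags` are hypothetical, and it uses the sample standard deviation (comparable to Excel's STDEV.S); swap in `statistics.pstdev` if your worksheet uses the population version (STDEV.P).

```python
from statistics import mean, stdev


def z_score_flags(values, threshold=3.0):
    """Return (z_scores, flags), flagging values with |z| > threshold."""
    mu = mean(values)
    # Assumption: sample standard deviation, as with Excel's STDEV.S.
    sigma = stdev(values)
    z_scores = [(v - mu) / sigma for v in values]
    flags = [1 if abs(z) > threshold else 0 for z in z_scores]
    return z_scores, flags


# Hypothetical sample, NOT the dataset used in the article.
sample = [10, 12, 13, 15, 16, 18, 19, 21, 22, 60]
z_scores, flags_3 = z_score_flags(sample, threshold=3.0)
_, flags_2_5 = z_score_flags(sample, threshold=2.5)
print(round(z_scores[-1], 2), flags_3, flags_2_5)
```

In this sample the value 60 has a z-score of about 2.74, so it is not flagged with a cutoff of 3 but is flagged with a cutoff of 2.5.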

How to Handle Outliers

If an outlier is present in your data, you have a few options:

1. Make sure the outlier is not the result of a data entry error.

Sometimes an individual simply enters the wrong data value when recording data. If an outlier is present, first verify that the value was entered correctly and that it wasn’t an error.

2. Remove the outlier.

If the value is a true outlier, you may choose to remove it if it will have a significant impact on your overall analysis. Just make sure to mention in your final report or analysis that you removed an outlier.

3. Assign a new value to the outlier.

If the outlier is the result of a data entry error and the correct value cannot be recovered, you may decide to assign a new value to it, such as the mean or median of the dataset.
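One possible way to carry out this replacement is sketched below in Python. It is an illustration under assumptions, not a prescribed method: the helper `replace_flagged`, the `sample` values, and the choice of the median of the non-flagged values as the substitute are all hypothetical.

```python
from statistics import median


def replace_flagged(values, flags):
    """Replace each flagged value with the median of the unflagged values."""
    # Assumption: the substitute is computed without the flagged values,
    # so the outlier does not influence its own replacement.
    kept = [v for v, f in zip(values, flags) if not f]
    fill = median(kept)
    return [fill if f else v for v, f in zip(values, flags)]


# Hypothetical sample and flags, NOT the dataset used in the article.
sample = [10, 12, 13, 15, 16, 18, 19, 21, 22, 60]
flags = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
cleaned = replace_flagged(sample, flags)
print(cleaned)
```

Here the flagged value 60 is replaced by 16, the median of the remaining nine values. As with removal, any replacement should be noted in the final report.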
