What are the advantages and disadvantages of using the median in statistics?

The median is the middle value of an ordered set of data. It is often used instead of the mean when extreme values would skew the result. Its main advantages are that it is resistant to outliers and that it describes the center of a heavily skewed dataset better than the mean does. Its main disadvantage is that it depends only on the middle of the ordered data rather than on the magnitude of every value, so some information is lost. It is also less sensitive than the mean to changes in the data, which can make shifts or trends harder to detect. The choice between the median and the mean therefore depends on the nature of the data and the research objectives.

Advantages & Disadvantages of Using Median in Statistics


The median represents the middle value of a dataset.

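It is calculated by arranging all of the observations in a dataset from smallest to largest and then identifying the middle value. If the dataset has an even number of observations, the median is the average of the two middle values.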
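The calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name median_by_sorting and the two small datasets are made up for the example, and the even-length case uses the usual convention of averaging the two middle values.

```python
import statistics


def median_by_sorting(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average the two middle values (common convention)
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [7, 1, 5]        # hypothetical data, sorted: 1, 5, 7
even_data = [7, 1, 5, 3]    # hypothetical data, sorted: 1, 3, 5, 7

print(median_by_sorting(odd_data))    # 5
print(median_by_sorting(even_data))   # 4.0
print(statistics.median(even_data))   # 4.0, standard-library equivalent
```

In practice the standard library's statistics.median does the same job; the hand-written version is only there to make the sorting step explicit.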

There are two main advantages of using the median to describe the center of a dataset:

Advantage #1: The median is not affected by outliers. Since the median only finds the middle value of a dataset, it isn’t affected by extremely small or large values on either end of a dataset.

Advantage #2: The median is a good measure of center for skewed datasets. When a dataset is skewed to the left or right, the median still does a good job of identifying the center value of the dataset, unlike the mean, which is heavily affected by skewed distributions.

However, there are two potential disadvantages of using the median to summarize a dataset:

Disadvantage #1: The median does not use all of the observations in a dataset in its calculation. In statistics, we usually say it’s a good thing if we can use all of the observations in a dataset because then we are using all of the available information from our data. However, the median does not consider the information from extremely small or large values in a dataset.

Disadvantage #2: The median cannot be used to find the sum of all observations in the dataset. If we know the mean and the total sample size of a dataset, we can find the sum of all values in the dataset. However, we cannot do the same with the median.

The following examples illustrate these advantages and disadvantages in practice.

Example 1: The Advantages of Using the Median

Suppose we have a distribution of salaries that is right skewed and we decide to calculate both the mean and median salary:

The mean suggests that the typical individual earns about $47,000 per year, while the median puts the figure at only about $32,000 per year, which is much more representative of what most individuals in the dataset earn.

In this example, the mean is affected by the higher values on the right tail of the distribution while the median is not.
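The effect can be reproduced with a small sketch. The nine salaries below are made up for illustration (they are not the dataset behind the figures quoted above) and were chosen so that the mean comes out to $47,000 and the median to $32,000.

```python
import statistics

# Hypothetical right-skewed salaries: eight modest values and one very large one.
salaries = [28_000, 30_000, 31_000, 32_000, 32_000, 33_000, 35_000, 40_000, 162_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
print(f"mean:   ${mean_salary:,.0f}")
print(f"median: ${median_salary:,.0f}")

# Drop the single largest salary and recompute both measures.
trimmed = sorted(salaries)[:-1]
print(f"mean without top earner:   ${statistics.mean(trimmed):,.0f}")
print(f"median without top earner: ${statistics.median(trimmed):,.0f}")
```

With these assumed values, removing one observation moves the mean from $47,000 to $32,625, while the median stays at $32,000.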

Or suppose we have another distribution that contains information about the square footage of houses on a certain street and we decide to calculate both the mean and median of the dataset:

The mean is influenced by a couple of extremely large houses, which pull it well above the median.

Example 2: The Disadvantages of Using the Median

Recall the first potential disadvantage of the median:

Disadvantage #1: The median does not use all of the observations in a dataset in its calculation.

For example, suppose we have the following dataset that shows the distribution of exam scores for students in a class:

Scores: 68, 70, 71, 75, 78, 82, 83, 83, 85, 90, 91, 91, 92

The median exam score is 83.

Now suppose we have the same dataset but the lowest three exam scores are much lower:

Scores: 22, 35, 38, 75, 78, 82, 83, 83, 85, 90, 91, 91, 92

The median exam score in this distribution is still 83.

This is why we say the median does not use all of the available information in a dataset: it is a measure of position, so it depends only on the value in the middle of the ordered data and not on how small or large the other values are.
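A short check with Python's standard statistics module, using the two score lists above, makes the contrast with the mean concrete. The variable names are chosen here for illustration.

```python
import statistics

original = [68, 70, 71, 75, 78, 82, 83, 83, 85, 90, 91, 91, 92]
lowered = [22, 35, 38, 75, 78, 82, 83, 83, 85, 90, 91, 91, 92]

for label, scores in (("original", original), ("lowered", lowered)):
    median_score = statistics.median(scores)
    mean_score = round(statistics.mean(scores), 2)
    print(f"{label}: median = {median_score}, mean = {mean_score}")
```

The median is 83 for both lists, while the mean drops from about 81.46 to about 72.69 once the three lowest scores are reduced.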

Now recall the second potential disadvantage of the median:

Disadvantage #2: The median cannot be used to find the sum of all observations in the dataset.

Suppose we have the following dataset that contains information about the total sales made by 11 different employees during a particular quarter:

Sales: 12, 12, 15, 19, 22, 24, 28, 30, 32, 35, 38

We know the median value is 24 and we know that there are 11 total employees. However, we can’t use this information to find the total sum of sales for all the employees.

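By contrast, if we knew the mean and the number of employees, we could multiply the two to recover the total. For instance, a mean of 24 across 11 employees would imply total sales of 24 * 11 = 264. (For the values listed above, the actual mean is 267 / 11 ≈ 24.27, which recovers the actual total of 267.)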
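The same point can be checked in code. The sketch below uses the actual mean of the listed sales values (about 24.27) rather than a round figure of 24, and adds a second, made-up dataset (other_sales, purely hypothetical) with the same size and the same median to show that the median and the sample size together do not determine the total.

```python
import statistics

sales = [12, 12, 15, 19, 22, 24, 28, 30, 32, 35, 38]
n = len(sales)

median_sales = statistics.median(sales)
mean_sales = statistics.mean(sales)

# mean * n recovers the total (rounded to absorb floating-point error).
total_from_mean = round(mean_sales * n)
print(median_sales, round(mean_sales, 2), total_from_mean, sum(sales))

# Hypothetical second dataset: same size and same median, very different total.
other_sales = [1, 1, 1, 1, 1, 24, 100, 100, 100, 100, 100]
print(statistics.median(other_sales), sum(other_sales))
```

Both lists have 11 values and a median of 24, yet their totals are 267 and 529, so no formula based on the median and the sample size alone could return the right sum for both.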

Note: Depending on the distribution of your data and the problem you’re trying to solve, the mean or median could turn out to be the preferred metric to use.
