What is the definition of measures of dispersion and what are some examples?

Measures of dispersion, also called measures of variability or spread, are statistics that describe how widely the values in a data set are scattered. They express numerically how far the data points lie from one another or from a central value such as the mean.

Examples of measures of dispersion include range, variance, and standard deviation. Range is the difference between the highest and lowest values in a data set. Variance is the average of the squared differences from the mean. Standard deviation is the square root of the variance; it is expressed in the same units as the data and indicates roughly how far a typical data point lies from the mean. Other measures of dispersion include interquartile range, mean absolute deviation, and coefficient of variation.

Overall, measures of dispersion are important in analyzing and interpreting data, as they provide a more comprehensive understanding of the data set and its distribution. They help in identifying outliers, comparing the spread of different data sets, and making informed decisions based on the variability of the data.

Measures of Dispersion: Definition & Examples


When we analyze a dataset, we often care about two things:

1. Where the “center” value is located. We often measure the “center” using the mean and median.

2. How “spread out” the values are. We measure “spread” using range, interquartile range, variance, and standard deviation.

Range

The range is the difference between the largest and smallest values in a dataset.

Suppose we have this dataset of final math exam scores for 20 students:

58, 66, 71, 73, 74, 77, 78, 82, 84, 85, 88, 88, 88, 90, 90, 92, 92, 94, 96, 98


The largest value is 98. The smallest value is 58. Thus, the range is 98 – 58 = 40.
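
As a quick check, the range calculation can be sketched in Python. This is a minimal illustration rather than part of the original walkthrough; the helper name value_range is made up for this example, and the scores are the 20 exam scores listed in this article.

```python
# The 20 exam scores used throughout this article.
scores = [58, 66, 71, 73, 74, 77, 78, 82, 84, 85,
          88, 88, 88, 90, 90, 92, 92, 94, 96, 98]


def value_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print(value_range(scores))  # 98 - 58 = 40
```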

Interquartile Range

The interquartile range is the difference between the third quartile (Q3) and the first quartile (Q1) of a dataset, that is, Q3 – Q1.

Quartiles are values that split up a dataset into four equal parts. Here is how to find the interquartile range of the same dataset of exam scores:

1. Arrange the values from smallest to largest.

58, 66, 71, 73, 74, 77, 78, 82, 84, 85, 88, 88, 88, 90, 90, 92, 92, 94, 96, 98

2. Find the median. In this case, it is the average of the middle two values: (85 + 88) / 2 = 86.5.

58, 66, 71, 73, 74, 77, 78, 82, 84, 85 (MEDIAN) 88, 88, 88, 90, 90, 92, 92, 94, 96, 98

3. The median splits the dataset into two halves. The median of the lower half is the lower quartile (Q1) and the median of the upper half is the upper quartile (Q3).

Lower half: 58, 66, 71, 73, 74, 77, 78, 82, 84, 85
Upper half: 88, 88, 88, 90, 90, 92, 92, 94, 96, 98

In this case, Q1 is the average of the middle two values in the lower half of the dataset, (74 + 77) / 2 = 75.5, and Q3 is the average of the middle two values in the upper half of the dataset, (90 + 92) / 2 = 91.

Thus, the interquartile range is 91 – 75.5 = 15.5.
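
The three steps above can be sketched in Python as follows. The function name quartiles_by_halves is hypothetical, and the handling of an odd number of values (leaving the middle value out of both halves) is an assumption, since the example here only covers an even count. Library routines such as statistics.quantiles use interpolation conventions that can give slightly different quartiles.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q3, IQR) using the median-of-halves method.

    Assumes at least two values. For an odd count, the middle value
    is left out of both halves (one common convention).
    """
    ordered = sorted(values)             # step 1: arrange smallest to largest
    half = len(ordered) // 2             # step 2: the median splits the data here
    lower = ordered[:half]               # values below the median
    upper = ordered[len(ordered) - half:]  # values above the median
    q1 = median(lower)                   # step 3: median of each half
    q3 = median(upper)
    return q1, q3, q3 - q1


scores = [58, 66, 71, 73, 74, 77, 78, 82, 84, 85,
          88, 88, 88, 90, 90, 92, 92, 94, 96, 98]

q1, q3, iqr = quartiles_by_halves(scores)
print(q1, q3, iqr)  # 75.5 91.0 15.5
```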

Interquartile Range vs. Range

The interquartile range is more resistant to outliers than the range, which can make it a better metric for measuring “spread.”

For example, suppose we have the following dataset with incomes for ten people:

[Table: Comparing the range to the interquartile range]

The range is $2,468,000, but the interquartile range is $34,000, which is a much better indication of how spread out the incomes actually are.

In this case, the outlier income of person J causes the range to be extremely large and makes it a poor indicator of “spread” for these incomes.
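
To see the same effect numerically, here is a small Python sketch. The income figures are hypothetical values chosen for illustration and are not the ten incomes from the table in this article, so the results will not match the $2,468,000 and $34,000 figures. The helper names are also made up for this example.

```python
from statistics import median


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def iqr_by_halves(values):
    """Interquartile range using the median-of-halves method."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[len(ordered) - half:])
    return q3 - q1


# Hypothetical incomes for ten people; the last one is an extreme outlier.
with_outlier = [30_000, 35_000, 40_000, 45_000, 50_000,
                55_000, 60_000, 65_000, 70_000, 2_000_000]

# The same list with the outlier swapped for a typical income.
without_outlier = with_outlier[:-1] + [75_000]

print(value_range(without_outlier), value_range(with_outlier))
print(iqr_by_halves(without_outlier), iqr_by_halves(with_outlier))
```

With these made-up numbers, the single extreme income inflates the range enormously, while the interquartile range stays the same because it depends only on the middle of the data.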

Variance

The variance is a common way to measure how spread out data values are.

The formula to find the variance of a population (denoted as σ²) is:

σ² = Σ (xᵢ – μ)² / N

where μ is the population mean, xᵢ is the ith element from the population, N is the population size, and Σ is just a fancy symbol that means “sum.”

Usually we work with samples, not populations. The formula to find the variance of a sample (denoted as s²) is:

s² = Σ (xᵢ – x̄)² / (n – 1)

where x̄ is the sample mean, xᵢ is the ith element from the sample, and n is the sample size.
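
Both variance formulas can be written out directly in Python. This is a minimal sketch: the function names are made up for this example, and the exam scores from earlier in the article are reused purely for illustration. The standard library functions statistics.pvariance and statistics.variance compute the same two quantities and are printed alongside as a cross-check.

```python
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


scores = [58, 66, 71, 73, 74, 77, 78, 82, 84, 85,
          88, 88, 88, 90, 90, 92, 92, 94, 96, 98]

print(round(population_variance(scores), 2))  # approximately 107.76
print(round(sample_variance(scores), 2))      # approximately 113.43

# Cross-check against the standard library.
print(statistics.pvariance(scores), statistics.variance(scores))
```

The sample variance is always a little larger than the population variance for the same values, because it divides by n – 1 instead of N.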

Standard Deviation

The standard deviation is the square root of the variance. It’s the most common way to measure how “spread out” data values are.

The formula to find the standard deviation of a population (denoted as σ) is:

σ = √( Σ (xᵢ – μ)² / N )

And the formula to find the standard deviation of a sample (denoted as s) is:

s = √( Σ (xᵢ – x̄)² / (n – 1) )
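
As a rough illustration, both standard deviations can be computed in Python by taking the square root of the corresponding variance. The function names below are made up for this sketch, and the exam scores from earlier in the article are reused as example data. The standard library equivalents are statistics.pstdev and statistics.stdev.

```python
import math
import statistics


def population_std(values):
    """Square root of the population variance (divide by N)."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def sample_std(values):
    """Square root of the sample variance (divide by n - 1)."""
    x_bar = sum(values) / len(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))


scores = [58, 66, 71, 73, 74, 77, 78, 82, 84, 85,
          88, 88, 88, 90, 90, 92, 92, 94, 96, 98]

print(round(population_std(scores), 2))  # approximately 10.38
print(round(sample_std(scores), 2))      # approximately 10.65

# Cross-check against the standard library.
print(statistics.pstdev(scores), statistics.stdev(scores))
```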
