What is Somers’ D and can you provide a definition and example?

Somers’ D is a statistical measure of the strength and direction of the association between two ordinal variables. It is calculated from the numbers of concordant and discordant pairs of observations and ranges from -1 to 1, where -1 indicates a perfect negative relationship, 0 indicates no relationship, and 1 indicates a perfect positive relationship. For example, Somers’ D can be used to assess whether a person’s age group is associated with their level of education, with values further from 0 indicating a stronger relationship between the two variables.

What is Somers’ D? (Definition & Example)


Somers’ D, short for Somers’ Delta, is a measure of the strength and direction of the association between an ordinal dependent variable and an ordinal independent variable.

An ordinal variable is one in which the values have a natural order (e.g. “bad”, “neutral”, “good”).

The value for Somers’ D ranges between -1 and 1 where:

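  • -1: Indicates that all pairs of observations disagree (every pair is discordant)
  • 0: Indicates no association between the two variables
  • 1: Indicates that all pairs of observations agree (every pair is concordant)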

Somers’ D is used in practice in many nonparametric statistical methods.

Somers’ D: Definition

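Given two variables, X and Y, we can say:

  • Two observations (xi, yi) and (xj, yj) are concordant if the ranks of both elements agree: either xi > xj and yi > yj, or xi < xj and yi < yj.
  • Two observations (xi, yi) and (xj, yj) are discordant if the ranks of both elements disagree: either xi > xj and yi < yj, or xi < xj and yi > yj.
  • Two observations (xi, yi) and (xj, yj) are tied if xi = xj or yi = yj.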
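As a rough illustration of these definitions, a pair can be classified by the sign of the product of the two differences: positive when both elements move in the same direction, negative when they move in opposite directions, and zero when either element is tied. The helper name classify_pair is hypothetical and only meant as a sketch in base R.

```r
# Hypothetical helper: classify the pair (xi, yi) and (xj, yj)
classify_pair <- function(xi, yi, xj, yj) {
  # Product of signs: > 0 same direction, < 0 opposite direction, 0 if either is tied
  s <- sign(xi - xj) * sign(yi - yj)
  if (s > 0) "concordant" else if (s < 0) "discordant" else "tied"
}

classify_pair(1, 2, 3, 3)  # x and y both increase
classify_pair(2, 3, 3, 2)  # x increases while y decreases
classify_pair(1, 2, 2, 2)  # the y values are equal
```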

We can then calculate Somers’ D using the following formula:

Somers’ D = (NC – ND) / (NC + ND + NT)

where:

  • NC: The number of concordant pairs
  • ND: The number of discordant pairs
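  • NT: The number of pairs tied on the dependent variable only (pairs that are also tied on the independent variable are not counted)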

The resulting value will always be between -1 and 1.
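To make the formula concrete, here is a minimal sketch that applies it to pair counts that have already been tallied. The function name somers_d_from_counts and the counts passed to it are made up for illustration.

```r
# Hypothetical helper: Somers' D from pair counts
#   nc = concordant pairs, nd = discordant pairs,
#   nt = pairs tied on the dependent variable only
somers_d_from_counts <- function(nc, nd, nt) {
  (nc - nd) / (nc + nd + nt)
}

somers_d_from_counts(nc = 6, nd = 2, nt = 2)  # (6 - 2) / (6 + 2 + 2) = 0.4
somers_d_from_counts(nc = 5, nd = 0, nt = 0)  # every pair concordant: 1
somers_d_from_counts(nc = 0, nd = 5, nt = 0)  # every pair discordant: -1
```

Because the numerator can never exceed the denominator in absolute value, the result stays within -1 and 1.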

Somers’ D: Example in R

Suppose a grocery store would like to assess the relationship between the following two ordinal variables:

  • The overall niceness of the cashier (ranked from 1 to 3)
  • The overall satisfaction of the customer’s experience (also ranked from 1 to 3)

They collect the following ratings from a sample of 10 customers:

Customer   Niceness   Satisfaction
1          1          2
2          1          2
3          1          1
4          2          2
5          2          3
6          2          2
7          3          2
8          3          3
9          3          3
10         3          3

To quantify the relationship between the two variables, we can calculate Somers’ D using the following code in R:

#enter data
nice <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
satisfaction <- c(2, 2, 1, 2, 3, 2, 2, 3, 3, 3)

#load DescTools package
library(DescTools)

#calculate Somers' D
SomersDelta(nice, satisfaction)

[1] 0.6896552

Somers’ D turns out to be 0.6896552.
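As a sanity check, the same value can be reproduced in base R by counting pairs directly. This is only a sketch of the pair-counting logic, not how the package computes it, and it assumes ties are counted on nice (the first argument in the call above).

```r
# Same ratings as in the example above
nice <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
satisfaction <- c(2, 2, 1, 2, 3, 2, 2, 3, 3, 3)

# All 45 pairs of customers (i < j), one pair per column
idx <- combn(length(nice), 2)
d_nice <- sign(nice[idx[1, ]] - nice[idx[2, ]])
d_sat  <- sign(satisfaction[idx[1, ]] - satisfaction[idx[2, ]])

NC <- sum(d_nice * d_sat > 0)        # concordant pairs: 21
ND <- sum(d_nice * d_sat < 0)        # discordant pairs: 1
NT <- sum(d_nice == 0 & d_sat != 0)  # pairs tied on nice only: 7

d <- (NC - ND) / (NC + ND + NT)      # (21 - 1) / (21 + 1 + 7) = 20/29
d
```

Counting ties on satisfaction instead gives 11 tied pairs and a value of 20/33, or about 0.606. Somers’ D is asymmetric, so the result depends on which variable is treated as dependent; the DescTools documentation for SomersDelta describes how that direction is chosen.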

Since this value is fairly close to 1, there is a fairly strong positive relationship between the two variables.

This makes sense intuitively: Customers who rate the cashiers as nicer also tend to rate their overall satisfaction higher.
