What is Cohen’s Kappa statistic and can you provide an example?

Cohen’s Kappa statistic is a measure of agreement between two raters or evaluators who classify the same set of items, and it accounts for the possibility that some agreement occurs purely by chance. It is commonly used in inter-rater reliability studies to determine how well two raters agree. For example, if two doctors independently classify the severity of patients’ illnesses into categories (e.g., mild, moderate, severe), Cohen’s Kappa statistic can be used to determine the level of agreement between their classifications beyond what chance alone would produce. A higher kappa value indicates stronger agreement between the raters, while a lower value suggests weaker agreement. This statistic is useful in various fields such as psychology, medicine, and education to assess the consistency and reliability of data interpretation.

Cohen’s Kappa Statistic: Definition & Example


Cohen’s Kappa Statistic is used to measure the level of agreement between two raters or judges who each classify items into mutually exclusive categories.

The formula for Cohen’s kappa is calculated as:

k = (po – pe) / (1 – pe)

where:

  • po: Relative observed agreement among raters
  • pe: Hypothetical probability of chance agreement

Rather than just calculating the percentage of items that the raters agree on, Cohen’s Kappa attempts to account for the fact that the raters may happen to agree on some items purely by chance.
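To make the formula concrete, here is a minimal Python sketch that computes kappa from a square confusion matrix of rating counts (the function name cohens_kappa and the use of NumPy are our own choices for illustration, not part of the formula itself):

```python
import numpy as np

def cohens_kappa(confusion):
    """Compute Cohen's Kappa from a k x k confusion matrix of rating counts.

    Rows correspond to Rater 1's categories and columns to Rater 2's.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()

    # po: proportion of items on which the two raters agree (the diagonal)
    po = np.trace(confusion) / total

    # pe: expected chance agreement, summed over all categories
    row_marginals = confusion.sum(axis=1) / total   # Rater 1's category proportions
    col_marginals = confusion.sum(axis=0) / total   # Rater 2's category proportions
    pe = np.sum(row_marginals * col_marginals)

    return (po - pe) / (1 - pe)
```

For instance, cohens_kappa([[25, 10], [15, 20]]) returns about 0.2857, matching the worked example later in this article.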

How to Interpret Cohen’s Kappa

Cohen’s Kappa ranges from −1 to 1. A value of 1 indicates perfect agreement between the two raters, a value of 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.

The following table summarizes how to interpret different values for Cohen’s Kappa:

  • Less than 0: No agreement (worse than chance)
  • 0.01 – 0.20: Slight agreement
  • 0.21 – 0.40: Fair agreement
  • 0.41 – 0.60: Moderate agreement
  • 0.61 – 0.80: Substantial agreement
  • 0.81 – 1.00: Almost perfect agreement

These cutoffs follow the commonly used Landis and Koch guidelines.
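If you want to apply these cutoffs in code, the table can be expressed as a small helper function (a sketch; the function name interpret_kappa and the exact thresholds simply mirror the guideline above):

```python
def interpret_kappa(kappa):
    """Map a kappa value to the qualitative labels in the table above."""
    if kappa <= 0:
        return "No agreement"
    elif kappa <= 0.20:
        return "Slight agreement"
    elif kappa <= 0.40:
        return "Fair agreement"
    elif kappa <= 0.60:
        return "Moderate agreement"
    elif kappa <= 0.80:
        return "Substantial agreement"
    else:
        return "Almost perfect agreement"

print(interpret_kappa(0.2857))  # Fair agreement
```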

The following step-by-step example shows how to calculate Cohen’s Kappa by hand.

Calculating Cohen’s Kappa: Step-by-Step Example

Suppose two museum curators are asked to rate 70 paintings on whether they’re good enough to be hung in a new exhibit.

The following 2×2 table shows the results of the ratings:

                   Rater 2: Yes    Rater 2: No    Row Total
  Rater 1: Yes          25              10            35
  Rater 1: No           15              20            35
  Column Total          40              30            70

Step 1: Calculate relative agreement (po) between raters.

First, we’ll calculate the relative agreement between the raters. This is simply the proportion of all ratings on which the raters both said “Yes” or both said “No” (see the short sketch after this list).

  • po = (Both said Yes + Both said No) / (Total Ratings)
  • po = (25 + 20) / (70) = 0.6429
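A quick sanity check of this step in Python, using the counts from the 2×2 table (the variable names are illustrative only):

```python
both_yes, both_no, total = 25, 20, 70

# Relative observed agreement: the share of paintings the raters agreed on
po = (both_yes + both_no) / total
print(round(po, 4))  # 0.6429
```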

Step 2: Calculate the hypothetical probability of chance agreement (pe) between raters.

Next, we’ll calculate the probability that the raters could have agreed purely by chance.

This is calculated by multiplying the proportion of times that Rater 1 said “Yes” by the proportion of times that Rater 2 said “Yes,” then multiplying the proportion of times that Rater 1 said “No” by the proportion of times that Rater 2 said “No,” and adding the two products (see the sketch after the list below).

For our example, this is calculated as:

  • P(“Yes”) = ((25+10)/70) * ((25+15)/70) = 0.285714
  • P(“No”) = ((15+20)/70) * ((10+20)/70) = 0.214286
  • pe = 0.285714 + 0.214286 = 0.5
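The same step in Python, continuing from the 2×2 table (again, variable names are only for illustration):

```python
total = 70

# Marginal totals from the 2x2 table
rater1_yes, rater1_no = 25 + 10, 15 + 20   # 35, 35
rater2_yes, rater2_no = 25 + 15, 10 + 20   # 40, 30

# Chance agreement: P(both say "Yes") + P(both say "No")
p_yes = (rater1_yes / total) * (rater2_yes / total)
p_no = (rater1_no / total) * (rater2_no / total)
pe = p_yes + p_no
print(round(pe, 4))  # 0.5
```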

Step 3: Calculate Cohen’s Kappa

Lastly, we’ll use po and pe to calculate Cohen’s Kappa:

  • k = (po – pe) / (1 – pe)
  • k = (0.6429 – 0.5) / (1 – 0.5)
  • k = 0.2857

Cohen’s Kappa turns out to be 0.2857. Based on the table from earlier, we would say that the two raters only had a “fair” level of agreement.
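If you would rather not do the calculation by hand, the result can be reproduced with scikit-learn’s cohen_kappa_score by rebuilding the individual rating pairs from the 2×2 counts (a sketch, assuming scikit-learn is installed):

```python
from sklearn.metrics import cohen_kappa_score

# Rebuild the 70 rating pairs from the 2x2 table:
# 25 (Yes, Yes), 10 (Yes, No), 15 (No, Yes), 20 (No, No)
rater1 = ["Yes"] * 25 + ["Yes"] * 10 + ["No"] * 15 + ["No"] * 20
rater2 = ["Yes"] * 25 + ["No"] * 10 + ["Yes"] * 15 + ["No"] * 20

print(round(cohen_kappa_score(rater1, rater2), 4))  # 0.2857
```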

Additional Resources

You can use an online Cohen’s Kappa calculator to automatically calculate Cohen’s Kappa for any two raters.
