Cramer’s V is a measure of association between two nominal variables. In Python, the contingency table can be built with the pandas.crosstab() function, which takes two columns of data as arguments and returns a cross-tabulated table. Cramer’s V is then computed by dividing the chi-square statistic by the total sample size, dividing that result by the smaller of (number of rows - 1) and (number of columns - 1), and taking the square root. The resulting value measures the strength of the association between the two nominal variables.
Cramer’s V is a measure of the strength of association between two nominal variables.
It ranges from 0 to 1 where:
- 0 indicates no association between the two variables.
- 1 indicates a perfect association between the two variables.
It is calculated as:
Cramer’s V = √((X2/n) / min(c-1, r-1))
where:
- X2: The Chi-square statistic
- n: Total sample size
- r: Number of rows
- c: Number of columns
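To make the formula concrete, here is a minimal sketch that computes the chi-square statistic directly from the observed and expected counts using only NumPy, then applies the formula. The function name cramers_v_by_hand and the two small tables are hypothetical, chosen only to illustrate the two ends of the 0 to 1 range.

```python
import numpy as np

def cramers_v_by_hand(table):
    """Cramer's V from a contingency table (hypothetical helper for illustration)."""
    observed = np.asarray(table, dtype=float)
    n = observed.sum()
    # Expected counts under independence: (row total * column total) / n
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    # Pearson chi-square statistic, without continuity correction
    chi2 = ((observed - expected) ** 2 / expected).sum()
    r, c = observed.shape
    return np.sqrt((chi2 / n) / min(r - 1, c - 1))

# Made-up table where each row maps to exactly one column: V should be 1
perfect = cramers_v_by_hand([[10, 0], [0, 10]])

# Made-up table whose rows are proportional to each other: V should be 0
independent = cramers_v_by_hand([[10, 20], [5, 10]])

print(perfect, independent)
```

This sketch assumes every row and column total is greater than zero and that the table has at least two rows and two columns; otherwise the divisions are undefined.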
This tutorial provides a couple of examples of how to calculate Cramer’s V for a contingency table in Python.
Example 1: Cramer’s V for a 2×2 Table
The following code shows how to calculate Cramer’s V for a 2×2 table:
```python
#load necessary packages and functions
import scipy.stats as stats
import numpy as np

#create 2x2 table
data = np.array([[7,12], [9,8]])

#Chi-squared test statistic, sample size, and minimum of rows and columns
X2 = stats.chi2_contingency(data, correction=False)[0]
n = np.sum(data)
minDim = min(data.shape)-1

#calculate Cramer's V
V = np.sqrt((X2/n) / minDim)

#display Cramer's V
print(V)

0.1617
```
Cramer’s V turns out to be 0.1617, which indicates a fairly weak association between the two variables in the table.
Example 2: Cramer’s V for Larger Tables
Note that we can use the same approach to calculate Cramer’s V for a table of any size.
The following code shows how to calculate Cramer’s V for a table with 3 rows and 2 columns:
```python
#load necessary packages and functions
import scipy.stats as stats
import numpy as np

#create 3x2 table
data = np.array([[6,9], [8, 5], [12, 9]])

#Chi-squared test statistic, sample size, and minimum of rows and columns
X2 = stats.chi2_contingency(data, correction=False)[0]
n = np.sum(data)
minDim = min(data.shape)-1

#calculate Cramer's V
V = np.sqrt((X2/n) / minDim)

#display Cramer's V
print(V)

0.1775
```
Cramer’s V turns out to be 0.1775.
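Both examples start from a table that has already been tabulated. When the data are instead stored as two raw columns, pandas.crosstab() can build the table first. The following is a minimal sketch under that assumption; the helper name cramers_v, the column names group and outcome, and the data are all hypothetical, with the data constructed to reproduce the counts from the 2×2 table in Example 1.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats

def cramers_v(x, y):
    """Cramer's V for two categorical columns (hypothetical helper, not a library function)."""
    # Cross-tabulate the two columns into a contingency table of counts
    table = pd.crosstab(x, y).to_numpy()
    # Chi-square statistic without Yates' continuity correction
    chi2 = stats.chi2_contingency(table, correction=False)[0]
    n = table.sum()
    min_dim = min(table.shape) - 1
    return np.sqrt((chi2 / n) / min_dim)

# Made-up raw data: 19 observations in group A, 17 in group B
df = pd.DataFrame({
    "group": ["A"] * 19 + ["B"] * 17,
    "outcome": ["yes"] * 7 + ["no"] * 12 + ["yes"] * 9 + ["no"] * 8,
})

v = cramers_v(df["group"], df["outcome"])
print(round(v, 4))
```

pandas.crosstab() orders the categories alphabetically, so the columns here come out in a different order than in Example 1. Cramer’s V does not depend on the order of rows or columns, so the result matches the value from Example 1.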