How can I use rcorr in R to generate a correlation matrix?

The rcorr function from the Hmisc package can be used to generate a correlation matrix in R. Once the data is loaded, apply rcorr to it as a numeric matrix and specify the desired type of correlation (Pearson or Spearman). The result contains the correlation coefficient for each pair of variables in the dataset, which can then be visualized or analyzed further to understand the relationships between variables.


You can use the rcorr function from the Hmisc package in R to create a matrix of correlation coefficients along with a matrix of p-values for variables in a data frame.

This function is particularly useful because the matrix of p-values allows you to see whether the correlation coefficient for each pairwise combination of variables is statistically significant.

This function uses the following basic syntax:

library(Hmisc)

#create matrix of correlation coefficients and matrix of p-values
rcorr(as.matrix(df))
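
If a data frame also holds non-numeric columns, as.matrix() returns a character matrix rather than a numeric one, so it can help to keep only the numeric columns first. The following is a minimal sketch of that step, assuming a hypothetical data frame named players with a text column named team (neither appears elsewhere in this tutorial):

```r
#hypothetical data frame with one non-numeric column (illustration only)
players <- data.frame(team=c("A", "B", "C", "D", "E"),
                      assists=c(4, 5, 5, 6, 7),
                      points=c(22, 24, 26, 26, 29))

#keep only the numeric columns, then convert to a matrix
num_matrix <- as.matrix(players[sapply(players, is.numeric)])

#confirm the result is a numeric matrix before passing it to rcorr()
is.numeric(num_matrix)
```

The resulting num_matrix can then be passed to rcorr in place of as.matrix(df).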

The following example shows how to use the rcorr function in practice.

Example: How to Use the rcorr Function in R

Suppose we have the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#view data frame
df

  assists rebounds points steals
1       4       12     22      5
2       5       14     24      6
3       5       13     26      7
4       6        7     26      7
5       7        8     29      8
6       8        8     32      5
7       8        9     20      3
8      10       13     14      4

We can use the following syntax to create a matrix of correlation coefficients and a matrix of corresponding p-values for this data frame:

library(Hmisc)

#create matrix of correlation coefficients and matrix of p-values
rcorr(as.matrix(df))

         assists rebounds points steals
assists     1.00    -0.24  -0.33  -0.47
rebounds   -0.24     1.00  -0.52  -0.17
points     -0.33    -0.52   1.00   0.61
steals     -0.47    -0.17   0.61   1.00

n= 8 


P
         assists rebounds points steals
assists          0.5589   0.4253 0.2369
rebounds 0.5589           0.1844 0.6911
points   0.4253  0.1844          0.1047
steals   0.2369  0.6911   0.1047 

The first matrix shows the correlation coefficient between each pairwise combination of variables in the data frame.

For example, we can see:

  • The correlation coefficient between assists and rebounds is -0.24.
  • The correlation coefficient between assists and points is -0.33.
  • The correlation coefficient between assists and steals is -0.47.

And so on.

The second matrix shows the corresponding p-value for each correlation coefficient from the first matrix.

For example, we can see:

  • The p-value for the correlation coefficient between assists and rebounds is 0.5589.
  • The p-value for the correlation coefficient between assists and points is 0.4253.
  • The p-value for the correlation coefficient between assists and steals is 0.2369.
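
The two matrices can also be pulled out of the result individually. The sketch below assumes the Hmisc package is installed and stores the result in an object named res (a name chosen here for illustration). rcorr returns a list whose r, P, and n elements hold the coefficients, p-values, and observation counts. As a sanity check, the sketch first recomputes one pair with base R's cor.test, which should agree with the values for assists and rebounds shown above.

```r
#recreate the data frame from the example
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#base R cross-check for a single pair of variables
check <- cor.test(df$assists, df$rebounds)
round(unname(check$estimate), 2)   #correlation coefficient
round(check$p.value, 4)            #p-value

#if Hmisc is installed, extract each matrix from the rcorr result
if (requireNamespace("Hmisc", quietly=TRUE)) {
  res <- Hmisc::rcorr(as.matrix(df))
  print(round(res$r, 2))   #correlation coefficients
  print(round(res$P, 4))   #p-values (diagonal is NA)
  print(res$n)             #number of observations used for each pair
}
```

Storing the result this way makes it easier to filter or export the coefficients and p-values separately.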

Note: By default, the rcorr function calculates the Pearson correlation coefficient, but you can specify type="spearman" if you would instead like to calculate the Spearman rank correlation coefficient.
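
As a sketch of the Spearman option, the call below requests rank-based correlations for the same data frame. It assumes the Hmisc package is installed. The base R line before it computes Spearman coefficients with cor() as a point of comparison, and the object name spearman_base is chosen here for illustration.

```r
#recreate the data frame from the example
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#Spearman coefficients with base R, for comparison
spearman_base <- cor(df, method="spearman")
round(spearman_base, 2)

#Spearman coefficients and p-values with rcorr, if Hmisc is installed
if (requireNamespace("Hmisc", quietly=TRUE)) {
  print(Hmisc::rcorr(as.matrix(df), type="spearman"))
}
```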
