In R, the error “dim(X) must have a positive length” means that apply() was given an object with no dimensions. apply() calls dim(X) on its first argument, and dim() returns NULL (a value of length zero) for a plain vector, such as a single column extracted with df$var1. The data itself is usually fine. The fix is to pass a matrix or data frame to apply(), or to call the summary function directly on the vector.
One error you may encounter in R is:
Error in apply(df$var1, 2, mean) : dim(X) must have a positive length
This error occurs when you attempt to use the apply() function to calculate some metric for a column of a data frame or matrix, yet provide a vector as an argument instead of a data frame or matrix.
This tutorial shares exactly how to fix this error.
How to Reproduce the Error
Suppose we have the following data frame in R:
#create data frame
df <- data.frame(points=c(99, 97, 104, 79, 84, 88, 91, 99),
                 rebounds=c(34, 40, 41, 38, 29, 30, 22, 25),
                 blocks=c(12, 8, 8, 7, 8, 11, 6, 7))
#view data frame
df
  points rebounds blocks
1     99       34     12
2     97       40      8
3    104       41      8
4     79       38      7
5     84       29      8
6     88       30     11
7     91       22      6
8     99       25      7
Now suppose we attempt to use the apply() function to calculate the mean value in the ‘points’ column:
#attempt to calculate mean of 'points' column
apply(df$points, 2, mean)
Error in apply(df$points, 2, mean) : dim(X) must have a positive length
An error occurs because the apply() function needs an object with dimensions, such as a data frame or matrix. In this example, df$points extracts a single column as a numeric vector, and dim() of a vector is NULL, so apply() has no rows or columns to iterate over.
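To see why the call fails, it can help to inspect the dimensions directly. The following is a minimal sketch that rebuilds the same data frame; the variable names vec_dim and err_msg are only illustrative.

```r
# Rebuild the example data frame
df <- data.frame(points=c(99, 97, 104, 79, 84, 88, 91, 99),
                 rebounds=c(34, 40, 41, 38, 29, 30, 22, 25),
                 blocks=c(12, 8, 8, 7, 8, 11, 6, 7))

# A data frame has two dimensions: rows and columns
dim(df)

# A single column extracted with $ is a plain vector, so dim() is NULL
vec_dim <- dim(df$points)
is.null(vec_dim)
length(vec_dim)

# Capture the error message instead of stopping the script
err_msg <- tryCatch(apply(df$points, 2, mean),
                    error = function(e) conditionMessage(e))
err_msg
```

Because length(dim(df$points)) is 0, apply() stops before it ever calls mean().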
How to Fix the Error
The way to fix this error is to simply provide the name of the data frame to the apply() function as follows:
#calculate mean of every column in data frame
apply(df, 2, mean)
  points rebounds   blocks 
  92.625   32.375    8.375 
From the output, we can see the mean value of each column in the data frame. For example, the mean value of the ‘points’ column is 92.625.
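For column means specifically, base R also offers colMeans() and sapply(), which return the same values for an all-numeric data frame like this one. This is a short sketch of those alternatives rather than a replacement for the approach above; the variable names are only illustrative.

```r
# Rebuild the example data frame
df <- data.frame(points=c(99, 97, 104, 79, 84, 88, 91, 99),
                 rebounds=c(34, 40, 41, 38, 29, 30, 22, 25),
                 blocks=c(12, 8, 8, 7, 8, 11, 6, 7))

# Three ways to compute the mean of every column
means_apply  <- apply(df, 2, mean)
means_col    <- colMeans(df)
means_sapply <- sapply(df, mean)

# All three should agree
all.equal(means_apply, means_col)
all.equal(means_apply, means_sapply)
```

One design note: apply() converts a data frame to a matrix first, so on a data frame with mixed column types sapply() is usually the safer choice.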
We can also use this function to find the mean of only specific columns in the data frame:
#calculate mean of 'points' and 'blocks' columns in data frame
apply(df[c('points', 'blocks')], 2, mean)
points blocks 
92.625  8.375 
Lastly, if we’d like to find the mean of just one column then we can use the mean() function without using the apply() function at all:
#calculate mean of 'points' column
mean(df$points)
[1] 92.625
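If apply() has to stay in the code, for example because the column name is chosen at run time, the single column can be kept as a one-column data frame rather than a vector. The sketch below shows two ways to do this, single-bracket indexing and drop = FALSE; the variable names one_col and one_col_drop are only illustrative.

```r
# Rebuild the example data frame
df <- data.frame(points=c(99, 97, 104, 79, 84, 88, 91, 99),
                 rebounds=c(34, 40, 41, 38, 29, 30, 22, 25),
                 blocks=c(12, 8, 8, 7, 8, 11, 6, 7))

# Single brackets return a one-column data frame, which still has dimensions
dim(df["points"])
one_col <- apply(df["points"], 2, mean)

# df[, "points"] would drop to a vector; drop = FALSE keeps the data frame
one_col_drop <- apply(df[, "points", drop = FALSE], 2, mean)

one_col
one_col_drop
```

Both calls return a named vector containing the mean of the 'points' column, matching the result of mean(df$points).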
The following tutorials explain how to troubleshoot other common errors in R:
How to Fix in R: longer object length is not a multiple of shorter object length