SAS PROC FREQ produces frequency tables for categorical variables. The closest equivalent in base R is the table function, which counts how often each category occurs in a variable, or how often each combination of categories occurs across several variables. Unlike PROC FREQ, table returns counts only: proportions come from wrapping the result in prop.table, and related statistics such as Chi-square tests and odds ratios come from separate functions such as chisq.test and fisher.test. Used together, these functions make table a useful tool for data exploration and analysis in R.
In SAS, you can use PROC FREQ to calculate frequencies for variables.
The easiest way to replicate this functionality in the R programming language is by using the table function.
The following example shows how to use this function in practice.
Example: How to Use Equivalent of SAS PROC FREQ in R
Suppose we have the following data frame in R that contains information about points scored by basketball players on various teams and positions:
#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position=c('G', 'G', 'G', 'F', 'G', 'F', 'F', 'C'),
                 points=c(23, 18, 14, 14, 13, 19, 34, 28))

#view data frame
df

  team position points
1    A        G     23
2    A        G     18
3    A        G     14
4    A        F     14
5    B        G     13
6    B        F     19
7    B        F     34
8    B        C     28
We can use the following code to calculate the frequency of each unique value in the position column:
#calculate frequency of each unique value in 'position' column
table(df$position)
C F G
1 3 4
The output shows the frequency of each unique value in the position column.
For example, we can see:
- The value C occurs 1 time.
- The value F occurs 3 times.
- The value G occurs 4 times.
If you would like to view the frequencies as proportions of the total, you can wrap the table in the prop.table function as follows:
#calculate proportion of each unique value in 'position' column
prop.table(table(df$position))

    C     F     G
0.125 0.375 0.500
The output shows the proportion of all values in the position column accounted for by each unique value. Multiplying a proportion by 100 gives the corresponding percentage.
For example, we can see:
- The value C represents 12.5% of all values in the position column.
- The value F represents 37.5% of all values in the position column.
- The value G represents 50% of all values in the position column.
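PROC FREQ reports counts, percentages, and their cumulative versions in a single table, whereas table and prop.table return them separately. The sketch below shows one way to combine them in base R. The object name freq_table and its column names are illustrative choices, not part of any R or SAS API, and the position values are re-entered so the snippet runs on its own.

```r
# Recreate the 'position' column from the example data frame
position <- c('G', 'G', 'G', 'F', 'G', 'F', 'F', 'C')

# Counts for each unique value
counts <- table(position)

# Combine counts, percentages, and cumulative versions into one data frame
# (column names here are illustrative, chosen to resemble PROC FREQ output)
freq_table <- data.frame(
  position    = names(counts),
  frequency   = as.vector(counts),
  percent     = as.vector(prop.table(counts)) * 100,
  cum_freq    = cumsum(as.vector(counts)),
  cum_percent = cumsum(as.vector(prop.table(counts))) * 100
)

freq_table
```

Each row of freq_table corresponds to one unique value of position, listed in the same alphabetical order that table uses.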
Lastly, if you’d like to create a two-way frequency table in R then you can include the names of two columns in the table function as follows:
#calculate frequencies of values in team and position columns
table(df$team, df$position)
    C F G
  A 0 1 3
  B 1 2 1
The output shows the frequency of each unique combination of values in the team and position columns.
For example, we can see:
- There were 0 occurrences of position C on team A.
- There was 1 occurrence of position F on team A.
- There were 3 occurrences of position G on team A.
And so on.
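A two-way table in PROC FREQ also shows totals and row percentages by default. As a rough sketch of how to get similar information in base R, the addmargins function appends row and column totals, and prop.table with margin = 1 converts each row to proportions. The object names two_way and row_props are illustrative, and the team and position values are re-entered so the snippet runs on its own.

```r
# Recreate the 'team' and 'position' columns from the example data frame
team     <- c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B')
position <- c('G', 'G', 'G', 'F', 'G', 'F', 'F', 'C')

# Two-way frequency table: teams in rows, positions in columns
two_way <- table(team, position)

# Append row and column totals (labelled 'Sum')
with_totals <- addmargins(two_way)
with_totals

# Row proportions: share of each position within each team
row_props <- prop.table(two_way, margin = 1)
row_props
```

Using margin = 2 instead would give column proportions, that is, the share of each team within each position.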