SAS: How can I use Proc Univariate by Group?

In SAS, proc univariate with a by statement analyzes the distribution of a variable separately for each group of observations. This makes it possible to compare the distribution of one variable across groups, or to examine several variables within each group. The procedure reports descriptive statistics for each group and, on request, plots such as box plots.


You can use proc univariate in SAS with the by statement to calculate descriptive statistics for each numeric variable in a dataset, grouped by a particular variable.

This procedure uses the following basic syntax, where the optional normal option requests tests for normality:

proc univariate data=my_data normal;
    by group_variable;
run;
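
Note that by-group processing expects the input dataset to be sorted by the by variable; otherwise SAS stops with an error saying the data are not sorted. A minimal sketch of the sorting step, assuming the dataset is named my_data and the grouping variable is group_variable as in the syntax above:

```sas
/*sort the dataset by the grouping variable before using a by statement*/
proc sort data=my_data;
    by group_variable;
run;

/*the by statement can now process one group at a time*/
proc univariate data=my_data;
    by group_variable;
run;
```

If the dataset is already in group order, as in the example that follows, the proc sort step can be skipped.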

The following example shows how to use this procedure in practice.

Example: Proc Univariate by Group in SAS

Suppose we have the following dataset in SAS that contains information about various basketball players:

/*create dataset*/
data my_data;
    input team $ points rebounds;
    datalines;
A 12 8
A 12 8
A 12 8
A 23 9
A 20 12
A 14 7
A 14 7
B 20 2
B 20 5
B 29 4
B 14 7
B 20 2
B 20 2
B 20 5
;
run;

/*view dataset*/
proc print data=my_data;
run;

Because the rows are already sorted by team, we can use proc univariate with the by statement directly to calculate descriptive statistics for the points and rebounds variables, grouped by the team variable:

proc univariate data=my_data;
    by team;
run;

This procedure will produce the following results:

  • Descriptive statistics for points for team A
  • Descriptive statistics for rebounds for team A
  • Descriptive statistics for points for team B
  • Descriptive statistics for rebounds for team B

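For each combination of variable and team, such as the points variable for team A, the default output consists of five tables: Moments, Basic Statistical Measures, Tests for Location, Quantiles, and Extreme Observations.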
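
The normal option shown in the basic syntax, and the box plots mentioned in the introduction, can be requested through options on the proc univariate statement. The following is a sketch rather than a required setup; it assumes ODS Graphics is available in your SAS session:

```sas
/*turn on ODS Graphics so the plots are drawn as graphics*/
ods graphics on;

/*normal adds a Tests for Normality table for each team*/
/*plots adds a histogram, box plot and normal probability plot for each team*/
proc univariate data=my_data normal plots;
    var points;
    by team;
run;
```

With only seven observations per team in this dataset, the normality tests have little power, so the plots are likely to be the more informative part of the output.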

If you only want to calculate descriptive statistics for one specific variable grouped by another variable, then you can use the var statement.

For example, you can use the following syntax to calculate descriptive statistics only for the points variable, grouped by the team variable:

proc univariate data=my_data;
    var points;
    by team;
run;
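
If you would rather collect the per-group statistics in a dataset than read them from the printed tables, the output statement can be combined with the by statement. This is a minimal sketch; the dataset name points_stats and the variable names n_points, mean_points, sd_points and median_points are chosen here for illustration:

```sas
/*noprint suppresses the printed tables; output writes one row per team*/
proc univariate data=my_data noprint;
    var points;
    by team;
    output out=points_stats n=n_points mean=mean_points
           std=sd_points median=median_points;
run;

/*view the summary dataset*/
proc print data=points_stats;
run;
```

The resulting dataset has one row for team A and one row for team B, which makes it easy to merge the statistics with other data or export them.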

You can list several variables in both the var and by statements. SAS then calculates descriptive statistics for every variable in the var statement within each combination of the by variables.
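
As a sketch of using several variables in both statements, the example below creates a small, made-up dataset. The dataset my_data2, the position variable and all of its values are hypothetical and are not part of the dataset used earlier:

```sas
/*hypothetical data: position is an invented column for illustration*/
data my_data2;
    input team $ position $ points rebounds;
    datalines;
A G 12 8
A G 14 7
A F 23 9
A F 20 12
B G 20 2
B G 14 7
B F 29 4
B F 20 5
;
run;

/*sort by both by variables, in the same order as in the by statement*/
proc sort data=my_data2;
    by team position;
run;

/*statistics for points and rebounds for each team and position*/
proc univariate data=my_data2;
    var points rebounds;
    by team position;
run;
```

Here the output contains one set of tables for each variable within each of the four team and position combinations.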
