How to use SAS’s WHERE option with SET

The WHERE option of the SET statement in SAS is used to subset observations from the data set to include in the data set being created. It uses a logical expression to determine which observations to include in the new data set. The WHERE option in the SET statement can be used to subset observations based on any combination of variables and logical operators. This is a powerful tool for manipulating and analyzing data in SAS.


You can use the WHERE option with SET in SAS to create a new dataset that only includes rows from another dataset where certain conditions are met.

Here are two common ways to use this option in practice:

Method 1: Use WHERE and SET with One Condition

data new_data;
    set my_data (where = (points>20));
run;

This example creates a new dataset called new_data that only includes rows from my_data where the value in the points column is greater than 20.

Method 2: Use WHERE and SET with Multiple Conditions

data new_data;
    set my_data (where = (points>20 or team="Rockets"));
run;

This example creates a new dataset called new_data that only includes rows from my_data where the value in the points column is greater than 20 or the value in the team column is equal to Rockets.

The following examples show how to use each method in practice with the following dataset in SAS:

/*create dataset*/
data my_data;
    input team $ points assists;
    datalines;
Mavs 22 10
Rockets 12 14
Spurs 29 8
Kings 13 10
Warriors 44 10
Heat 18 8
Magic 11 5
Pelicans 19 3
Blazers 12 8
;
run;

/*view dataset*/
proc print data=my_data;

Example 1: Use WHERE and SET with One Condition

We can use the following syntax to create a new dataset called new_data that only includes rows from my_data where the value in the points column is greater than 20:

/*create new dataset*/
data new_data;
    set my_data (where = (points>20));
run;

/*view new dataset*/
proc print data=new_data;

Notice that only the rows with a value greater than 20 in the points column are included in this dataset.

Example 2: Use WHERE and SET with Multiple Conditions

/*create new dataset*/
data new_data;
    set my_data (where = (points>20 or team="Rockets"));
run;

/*view new dataset*/
proc print data=new_data;

Notice that only the rows with a value greater than 20 in the points column or a value of “Rockets” in the team column are included in this dataset.

Note: You could use and instead of or to only include rows that meet multiple conditions.

The following tutorials explain how to perform other common tasks in SAS:

x