How to Use the KEEP and DROP Statements in SAS

The KEEP and DROP statements in SAS control which variables a DATA step writes to its output dataset. The KEEP statement retains only the variables you list, while the DROP statement discards the variables you list and retains all others. Both statements let you create a new dataset that contains only the variables relevant to an analysis, so later steps work with a smaller dataset instead of carrying every variable from a large one.


You can use the KEEP and DROP statements in SAS when creating a new dataset to keep or drop specific variables from an existing dataset.

These statements use the following basic syntax:

Method 1: Choose Which Columns to KEEP

data new_data;
    set original_data;
    keep var1 var3;
run;

Method 2: Choose Which Columns to DROP

data new_data;
    set original_data;
    drop var5;
run;

The following examples show how to use each method with the following dataset in SAS:

/*create dataset*/
data original_data;
    input team $ points rebounds;
    datalines;
Warriors 25 8
Wizards 18 12
Rockets 22 6
Celtics 24 11
Thunder 27 14
Spurs 33 19
Nets 31 20
;
run;

/*view dataset*/
proc print data=original_data;
run;

Example 1: Choose Which Columns to KEEP

The following code shows how to create a new dataset in SAS by using the KEEP statement to only keep the team and rebounds variables from the original dataset:

/*create new dataset*/
data new_data;
    set original_data;
    keep team rebounds;
run;

/*view new dataset*/
proc print data=new_data;
run;

Notice that we kept the team and rebounds variables from the original dataset and any other variables were simply dropped.
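
One way to confirm which variables ended up in the new dataset is to list its contents. The following is a minimal sketch that assumes the code from Example 1 has already been run, so that new_data exists; PROC CONTENTS is not part of the original example and is used here only as a check:

```sas
/*list the variables stored in new_data*/
/*assumes new_data was created by the KEEP example above*/
proc contents data=new_data varnum;
run;
```

The variable list in the output should show only team and rebounds.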

Example 2: Choose Which Columns to DROP

The following code shows how to create a new dataset in SAS by using the DROP statement to drop the rebounds variable from the original dataset:

/*create new dataset*/
data new_data;
    set original_data;
    drop rebounds;
run;

/*view new dataset*/
proc print data=new_data;
run;

Notice that the rebounds variable was dropped from the dataset while all other variables from the original dataset were kept.
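
Note that the DROP statement affects only the dataset that is written out. A dropped variable can still be read and used in calculations inside the same DATA step. The sketch below illustrates this with the same original_data; the dataset name totals_data and the variable name total are hypothetical names chosen for this illustration:

```sas
/*totals_data and total are hypothetical names used for illustration*/
data totals_data;
    set original_data;
    total = points + rebounds;  /*rebounds is still available here*/
    drop rebounds;              /*rebounds is excluded only from the output*/
run;

/*view the result: team, points and total*/
proc print data=totals_data;
run;
```

The same behavior applies to the KEEP statement: variables that are not kept remain available during the step and are left out only when the output dataset is written.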

KEEP vs. DROP: Which Statement Should You Use?

The KEEP and DROP statements accomplish the same outcome: They both allow you to create a new dataset by keeping or dropping certain variables from an existing dataset.

Typically we use the KEEP statement when we only want to keep a few variables from the original dataset since this is faster than typing out all of the variables we’d like to drop.

Conversely, we typically use the DROP statement when we only want to drop a few variables from the original dataset since this is faster than typing out all of the variables we’d like to keep.
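
To illustrate that the two statements can produce the same result, the following sketch builds one dataset with KEEP and one with DROP from original_data, then compares them. The dataset names keep_version and drop_version are hypothetical, and PROC COMPARE is used here only as one possible way to check the result:

```sas
/*keep_version and drop_version are hypothetical dataset names*/

/*keep team and points*/
data keep_version;
    set original_data;
    keep team points;
run;

/*drop rebounds, which leaves team and points*/
data drop_version;
    set original_data;
    drop rebounds;
run;

/*compare the two datasets variable by variable*/
proc compare base=keep_version compare=drop_version;
run;
```

Because original_data contains only team, points and rebounds, both DATA steps write the same two variables, and the comparison should report no differences in values.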
