How to Create New Variables in SAS (With Examples)

Creating new variables in SAS is a straightforward process. New variables are created inside a DATA step. The DATA statement names the output dataset, and an assignment statement of the form new_var = expression; creates a variable and gives it a value. The LENGTH statement sets how many bytes a variable stores, and the FORMAT statement controls how its values are displayed. Other common tools include the RETAIN statement, which keeps a value from one observation to the next (for example, to build a running total), the ARRAY statement, which lets you work with several variables as a group, and the LAG function, which returns a value stored by an earlier call, typically the value from the previous observation.
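
As a rough illustration of the statements mentioned above, the following sketch combines LENGTH, FORMAT, RETAIN, ARRAY, and the LAG function in a single DATA step. The dataset name sales_demo, the variable names, and the data values are all hypothetical and are used only for illustration.

```sas
/*hypothetical dataset and variables, for illustration only*/
data sales_demo;
    length region $ 12;              /*character variable that can hold up to 12 bytes*/
    input region $ sales;

    retain running_total 0;          /*value carries over between observations, starting at 0*/
    running_total = running_total + sales;

    prev_sales = lag(sales);         /*sales from the previous observation; missing on the first row*/

    array pct{2} pct_10 pct_20;      /*creates two numeric variables, pct_10 and pct_20*/
    do i = 1 to 2;
        pct{i} = sales * i * 0.10;   /*10% and 20% of sales*/
    end;
    drop i;                          /*the loop counter is not needed in the output*/

    format running_total comma10.;   /*display the running total with thousands separators*/
    datalines;
North 1200
South 950
East 1430
;
run;

/*view dataset*/
proc print data=sales_demo;
run;
```

The LENGTH statement appears before the INPUT statement so that the length of region is fixed before SAS first encounters the variable.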


Here are the two most common ways to create new variables in SAS:

Method 1: Create Variables from Scratch

data original_data;
    input var1 $ var2 var3;
    datalines;
A 12 6
B 19 5
C 23 4
D 40 4
;
run;

Method 2: Create Variables from Existing Variables

data new_data;
    set original_data;
    new_var4 = var2 / 5;
    new_var5 = (var2 + var3) * 2;
run;

The following examples show how to use each method in practice.

Example 1: Create Variables from Scratch

The following code shows how to create a dataset with three variables: team, points, and rebounds:

/*create dataset*/
data original_data;
    input team $ points rebounds;
    datalines;
Warriors 25 8
Wizards 18 12
Rockets 22 6
Celtics 24 11
Thunder 27 14
Spurs 33 19
Nets 31 20
;
run;

/*view dataset*/
proc print data=original_data;
run;

Note that you list the variable names in the INPUT statement and supply their values on the lines that follow the DATALINES statement.

Note: SAS assumes each new variable is numeric. To create a character variable, simply type a dollar sign “$” after the variable name like we did for the team variable in this example.
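
One detail worth knowing when reading character values this way: with simple list input, a character variable gets a default length of 8 bytes, so longer values are truncated. A minimal sketch of one way to avoid this is to declare the length first with a LENGTH statement. The dataset name long_names and the rows below are hypothetical.

```sas
/*hypothetical dataset with team names longer than 8 characters*/
data long_names;
    length team $ 12;                /*allow up to 12 bytes instead of the default 8*/
    input team $ points rebounds;
    datalines;
Mavericks 28 9
Timberwolves 30 11
;
run;

/*view dataset*/
proc print data=long_names;
run;
```

Without the LENGTH statement, these two names would be stored as Maverick and Timberwo.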

Example 2: Create Variables from Existing Variables

The following code shows how to use the SET statement to read an existing dataset into a new dataset and create new variables from its existing variables:

/*create new dataset*/
data new_data;
    set original_data;
    half_points = points / 2;
    avg_pts_rebs = (points + rebounds) / 2;
run;

/*view new dataset*/
proc print data=new_data;
run;
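
New variables built from existing ones do not have to be arithmetic expressions. As a hedged sketch, the code below derives a character variable with IF-THEN/ELSE logic and a total with the SUM function. The dataset name flagged_data, the variable names scoring_level and total, and the cutoff of 25 points are assumptions chosen only for illustration.

```sas
/*create new dataset with a conditional variable (hypothetical names and cutoff)*/
data flagged_data;
    set original_data;
    length scoring_level $ 4;        /*set the length before the first assignment*/
    if points >= 25 then scoring_level = "high";
    else scoring_level = "low";
    total = sum(points, rebounds);   /*SUM ignores missing values, unlike the + operator*/
run;

/*view new dataset*/
proc print data=flagged_data;
run;
```

Declaring the length up front avoids relying on the first assigned value to determine how many characters scoring_level can hold.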
