How to create dummy variables in SAS?

Creating dummy variables in SAS is a relatively straightforward process. Within a DATA step, you can use IF-THEN-ELSE statements to create a new numeric variable for each category of a categorical variable (other than the baseline category) and assign it a value of 1 or 0. The resulting dataset contains the original variables along with the dummy variables you just created.


A dummy variable is a type of variable that we create in regression analysis so that we can represent a categorical variable as a numerical variable that takes on one of two values: zero or one.

For example, suppose we have a dataset containing the income, age, and marital status of 11 individuals, and we would like to use age and marital status to predict income.

To use marital status as a predictor variable in a regression model, we must convert it into a dummy variable.

Since it is currently a categorical variable that can take on k = 3 different values (“Single”, “Married”, or “Divorced”), we need to create k-1 = 3-1 = 2 dummy variables.

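To create these dummy variables, we can let “Single” be our baseline value since it occurs most often. Thus, we would convert marital status into two dummy variables, married and divorced, as follows:

Single: married = 0, divorced = 0
Married: married = 1, divorced = 0
Divorced: married = 0, divorced = 1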

The following example shows how to create dummy variables for this exact dataset in SAS.

Example: Creating Dummy Variables in SAS

First, let’s create the following dataset in SAS:

/*create dataset*/
data original_data;
    input income age status $;
    datalines;
45 23 single
48 25 single
54 24 single
57 29 single
65 38 married
69 36 single
78 40 married
83 59 divorced
98 56 divorced
104 64 married
107 53 married
;
run;

/*view dataset*/
proc print data=original_data;
run;
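
To confirm how many categories the status variable has and which one occurs most often (and so makes a natural baseline), one option is a frequency table. The following is a minimal sketch using PROC FREQ, assuming the original_data dataset created above:

```sas
/*count how often each marital status occurs*/
proc freq data=original_data;
    tables status;
run;
```

For this dataset, the table should show 5 rows for single, 4 for married, and 2 for divorced.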

Next, we can use two IF-THEN-ELSE statements to create dummy variables for the status variable:

/*create new dataset with dummy variables*/
data new_data;
	set original_data;
	if status = "married" then married = 1;
	  else married = 0;
	if status = "divorced" then divorced = 1;
	  else divorced = 0;
run;

/*view new dataset*/
proc print data=new_data;
run;


Notice that the values for the two dummy variables (married and divorced) match the values we calculated in the introductory example.
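
As a more compact alternative to IF-THEN-ELSE, the same dummy variables can likely be created with logical expressions, since a comparison in a SAS DATA step evaluates to 1 when true and 0 when false. The sketch below assumes the original_data dataset from above; the dataset name new_data_alt is chosen here only for illustration:

```sas
/*alternative: create dummy variables with logical expressions*/
data new_data_alt;
    set original_data;
    /*each comparison evaluates to 1 when true and 0 when false*/
    married  = (status = "married");
    divorced = (status = "divorced");
run;

/*view alternative dataset*/
proc print data=new_data_alt;
run;
```

Both approaches compare against lowercase strings, so they rely on the status values being stored in lowercase, as they are in this dataset.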

We could then use these dummy variables in a regression model if we’d like since they’re both numeric.
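
As a rough sketch of that last step, the following uses PROC REG (part of SAS/STAT) to regress income on age and the two dummy variables. It assumes the new_data dataset created above and is one possible way to fit the model, not the only one:

```sas
/*fit regression model using age and the two dummy variables*/
proc reg data=new_data;
    model income = age married divorced;
run;
quit;
```

In a model like this, the coefficients for married and divorced are interpreted relative to the baseline category, single.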
