How to use the IN= Option in SAS?

The IN= option in SAS is used when you want to match observations from one data set to another. It assigns a value of 1 to observations that match and a value of 0 to those that do not. This can be used to create a new data set that includes only those observations that match the criteria in the WHERE statement, or to create a new data set that includes the matched observations along with additional variables from the other data set.


You can use the IN= option in SAS to create a Boolean variable that indicates whether or not the current observation comes from the input dataset.

The IN= option is commonly used when you append two datasets together and you’d like to know if a particular row in the resulting dataset came from one of the specific input datasets.

The following example shows how to use the IN= option in practice.

Example: How to Use the IN= Option in SAS

Suppose we have two datasets that contain information about basketball players in the Eastern and Western conferences of the NBA:

/*create East dataset*/
data east_data;
    input team $ points;
    datalines;
Celtics 22
Pistons 14
Nets 35
Hornets 19
Magic 22
;
run;

/*create West dataset*/
data west_data;
    input team $ points;
    datalines;
Mavs 40
Rockets 39
Warriors 23
Lakers 19
Clippers 25
;
run;

/*view datasets*/
proc print data=east_data;
proc print data=west_data;

We can use the following syntax to create a new dataset that appends these two datasets together:

/*create new dataset*/
data all_data;
    set east_data west_data;
run;

/*view new dataset*/
proc print data=all_data;

Notice that each row from each dataset belongs to the new dataset.

To know which dataset each row came from, we can use the IN= option as follows:

/*create new dataset*/
data all_data;
    set east_data west_data(in=i);
    if i then conf='West';
    else conf='East';
run;

/*view new dataset*/
proc print data=all_data;

By using the IN= option, we are able to create a new column called conf that takes on a value of “East” if the row came from the dataset called east_data and a value of “West” if the row came from the dataset called west_data.

Note that we could also use only an IF statement without an ELSE statement to create a new column that simply takes on a value of * if the row came from the dataset called east_data:

/*create new dataset*/
data all_data;
    set east_data(in=i) west_data;
    if i then east_conf='*';
run;

/*view new dataset*/
proc print data=all_data;

By using the IN= option, we are able to create a new column called east_conf that takes on a value of  * if the row came from the dataset called east_data and no value if the row came from the dataset called west_data.

The following tutorials explain how to perform other common tasks in SAS:

x