How to Perform an Inner Join in SAS (With Example)

An inner join in SAS is a type of join operation that combines two tables based on matching values in a common column. The syntax for an inner join in SAS includes the merge statement followed by the tables to be merged, their common columns, and the inner statement. An example of an inner join in SAS is the merge statement below, which merges the table ‘sales’ and ‘customer’ based on their common column ‘cust_id’: merge sales customer (in=in_sales); by cust_id; if in_sales; This statement produces a new combined table with all records where the value of ‘cust_id’ in both ‘sales’ and ‘customer’ matched.


You can use the following basic syntax to perform an inner join with two datasets in SAS:

proc sql;
    create table final_table as
    select * from data1 as x join data2 as y
    on x.ID = y.ID;
quit;

The following example shows how to use this syntax in practice.

Related:

Example: Inner Join in SAS

Suppose we have the following two datasets in SAS:

/*create datasets*/
data data1;
    input team $ points;
    datalines;
Mavs 99
Spurs 93
Rockets 88
Thunder 91
Warriors 104
Cavs 93
Nets 90
Hawks 91
;
run;

data data2;
    input team $ rebounds;
    datalines;
Mavs 21
Spurs 18
Warriors 27
Hawks 29
Knicks 40
Raptors 30
;
run;

/*view datasets*/
proc print data=data1;
proc print data=data2;

Notice that the two datasets share one variable in common: team.

We will use the following syntax to perform an inner join and create a new dataset that contains only the rows in which the team variable shows up in both datasets:

/*perform inner join*/
proc sql;
	create table final_table as
	select * from data1 as x join data2 as y
	on x.team = y.team;
quit;

/*view results of inner join*/
proc print data=final_table;

The resulting dataset contains only the rows in which the team variable appeared in both datasets.

If you refer to the two datasets from earlier, you’ll notice that there are only four teams that appear in both datasets: Mavs, Spurs, Warriors, and Hawks.

Since we chose to join the two datasets on the team variable, these are the four teams that also appear in the final dataset.

The following tutorials explain how to perform other common tasks in SAS:

x