SAS: How to merge a dataset to another dataset, if a not b

The MERGE statement in SAS can be used to combine two datasets by matching values in a common variable. The “if a not b” option allows you to keep all observations from dataset A that do not have a corresponding observation in dataset B. This option is useful when you want to keep all the observations from one dataset and only select the observations from the other dataset that meet certain criteria.


You can use the following basic syntax to merge two datasets in SAS and only return the rows where a value exists in the first dataset and not the second dataset:

data final_data;
  merge data1 (in = a) data2 (in = b);
  by ID;
  if a and not b;
run;

This particular example merges the datasets called data1 and data2 and only returns the rows where a value exists in data1 and not in data2.

The following example shows how to use this syntax in practice.

Example: Merge Two Datasets in SAS Using If A Not B

Suppose we have the following two datasets in SAS that contain information about sales associates at some company:

/*create first dataset*/
data data1;
    input ID Gender $;
    datalines;
1 Male
2 Male
3 Female
4 Male
5 Female
;
run;

title "data1";
proc print data = data1;

/*create second dataset*/
data data2;
    input ID Sales;
    datalines;
1 22
2 15
4 29
6 31
7 20
8 13
;
run;

title "data2";
proc print data = data2;

If we use the following merge statement to merge the two datasets based on the value in the ID column, all rows will be returned from both datasets:

/*perform merge*/
data final_data;
  merge data1 data2;
  by ID;
run;

/*view results*/
title "final_data";
proc print data=final_data;

However, we can use IN statements to only return the rows where a value exists in the first dataset and not in the second dataset:

data final_data;
  merge data1 (in = a) data2 (in = b);
  by ID;
  if a and not b;
run;

/*view results*/
title "final_data";
proc print data=final_data;

Notice that only the rows where a value exists in the first dataset and not the second dataset are returned.

Note: You can find the complete documentation for the SAS merge statement .

x