SAS: Use SELECT DISTINCT in PROC SQL what is the procedure to use SELECT DISTINCT in PROC SQL?

SELECT DISTINCT is a clause used in PROC SQL in SAS to remove duplicate rows from the result set of a query. It removes any rows that have the same values in all of the columns that are included in the SELECT clause. This can be useful when you want to obtain a result set that only contains unique values.


You can use the SELECT DISTINCT statement within PROC SQL in SAS to select only unique rows from a dataset.

The following example shows how to use this statement in practice.

Example: Using SELECT DISTINCT in SAS

Suppose we have the following dataset in SAS that contains information about various basketball players:

/*create dataset*/
data my_data;
    input team $ position $ points;
    datalines;
A Guard 14
A Guard 14
A Guard 24
A Forward 13
A Forward 13
B Guard 22
B Guard 22
B Forward 34
C Forward 15
C Forward 18
;
run;

/*view dataset*/
proc print data=my_data;

We can use the SELECT DISTINCT statement within PROC SQL to select all unique rows from the dataset:

/*select all unique rows*/
proc sql;
    select distinct *
    from my_data;
quit;

Note: The star ( * ) symbol after SELECT DISTINCT tells SAS to select all columns in the dataset.

Notice that all unique rows are shown in the output.

For example, there are multiple rows that have a team value of A, position value of Forward and points value of 13 but only one of these rows is shown.

Note that we can also specify which columns we’d like to select:

/*select all unique combinations of team and position*/
proc sql;
    select distinct team, position
    from my_data;
quit;

Notice that only the unique combinations of teams and positions are shown in the output.

x