PROC CLUSTER is a SAS procedure for hierarchical cluster analysis, the process of grouping objects into clusters based on their similarities. It helps users identify patterns and relationships within a dataset, which makes it useful for data exploration and classification tasks.
One example of using PROC CLUSTER in SAS is to group customer data based on their purchasing behavior. This can help businesses identify different segments of customers and tailor their marketing strategies accordingly. The procedure can be used to cluster customers based on variables such as purchase frequency, amount spent, and types of products purchased. The resulting clusters can then be further analyzed to understand the characteristics and preferences of each group. This information can be used to develop targeted marketing campaigns and improve customer satisfaction.
How to Use PROC CLUSTER in SAS (With Example)
Clustering is a technique in machine learning that attempts to find clusters of observations within a dataset.
The goal is to find clusters such that the observations within each cluster are quite similar to each other, while observations in different clusters are quite different from each other.
The easiest way to perform clustering in SAS is to use PROC CLUSTER.
The following example shows how to use PROC CLUSTER in practice.
Example: How to Use PROC CLUSTER in SAS
Suppose we have the following dataset that contains information about points, assists and rebounds for 20 different basketball players:
/*create dataset*/
data my_data;
    input points assists rebounds;
    player_ID = _N_; /*row number serves as the player ID used in later steps*/
    datalines;
18 3 15
20 3 14
19 4 14
14 5 10
14 4 8
15 7 14
20 8 13
28 7 9
30 6 5
31 9 4
35 12 11
33 14 6
29 9 5
25 9 5
25 4 3
27 3 8
29 4 12
30 12 7
19 5 6
23 11 5
;
run;
/*view dataset*/
proc print data=my_data;
run;

Suppose we would like to perform clustering to identify “clusters” of players that have similar stats to each other.
The following code shows how to use PROC CLUSTER in SAS to perform clustering:
/*perform clustering using points, assists and rebounds variables*/
/*outtree= saves the cluster tree so that PROC TREE can read it later*/
proc cluster data=my_data method=average outtree=clustd;
    var points assists rebounds;
    id player_ID;
run;
The first tables in the output provide information about how the clustering was performed:

A dendrogram is also produced so that we can visually inspect the similarity between observations in the dataset:

The y-axis shows the individual observations and the x-axis shows the average distance between clusters.
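The "average distance between clusters" on the x-axis comes from the average linkage method. As a sketch in standard notation (the symbols $C_K$, $C_L$, $N_K$, $N_L$ and $d$ are chosen here for illustration and are not part of the example code), the distance between two clusters is the mean of all pairwise dissimilarities between their members:

```latex
% Average linkage: mean pairwise dissimilarity between clusters C_K and C_L,
% which contain N_K and N_L observations respectively.
D_{KL} = \frac{1}{N_K N_L} \sum_{i \in C_K} \sum_{j \in C_L} d(x_i, x_j)
```

At each step the two clusters with the smallest $D_{KL}$ are merged. Whether $d$ is the Euclidean distance or its square depends on the procedure's options, so it is worth checking the SAS documentation before reading absolute values off the axis.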
From looking at this dendrogram, it appears that the observations naturally group themselves into three clusters:

We can then use PROC TREE with the ncl=3 option to tell SAS to assign each observation in the original dataset to one of three clusters. Here data=clustd refers to the tree dataset that PROC CLUSTER writes through its outtree= option:
/*assign each observation to one of three clusters*/
proc tree data=clustd noprint ncl=3 out=clusts;
    copy points assists rebounds;
    id player_ID;
run;
/*sort by cluster so that members of the same cluster are listed together*/
proc sort data=clusts;
    by cluster;
run;
/*view cluster assignments*/
proc print data=clusts;
    id player_ID;
run;
The resulting dataset shows each of the original observations along with the cluster they belong to:

For example, we can see that the players with IDs 2, 3, 1, 4, 5, 7, 6 and 19 all belong to cluster 1.
This tells us that these eight players are “similar” across the points, assists and rebounds variables.
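One way to check how the clusters differ is to summarize the variables within each cluster. The following is a minimal sketch, assuming the clusts dataset created by PROC TREE exists and contains the cluster variable along with the copied points, assists and rebounds variables; the choice of statistics is only an illustration:

```sas
/* count how many players fall into each cluster */
proc freq data=clusts;
    tables cluster;
run;

/* compare the mean, minimum and maximum of each variable by cluster */
proc means data=clusts mean min max maxdec=1;
    class cluster;
    var points assists rebounds;
run;
```

Clusters whose means differ clearly on at least one variable are easier to interpret than clusters that overlap on all three.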
Note: For this example we chose to use average as the linkage method for clustering. Refer to the SAS documentation for a complete list of other linkage methods you can use.
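To try a different linkage method, only the method= option needs to change. The sketch below uses Ward's minimum-variance method on the same variables; the output dataset name clustd_ward is a hypothetical name chosen for this illustration:

```sas
/* same variables, but Ward's method instead of average linkage */
/* clustd_ward is a hypothetical name for the saved cluster tree */
proc cluster data=my_data method=ward outtree=clustd_ward;
    var points assists rebounds;
run;
```

Other values accepted by method= include single, complete and centroid. Different methods can produce different dendrograms for the same data, so it can be worth comparing a few before settling on a number of clusters.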
Cite this article
stats writer. "How can PROC CLUSTER be used in SAS, and can you provide an example?." PSYCHOLOGICAL SCALES, 23 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-proc-cluster-be-used-in-sas-and-can-you-provide-an-example/.
