How do I Perform a Log Transformation in SAS?

A log transformation in SAS is typically performed in a DATA step using the LOG function, which returns the natural logarithm of its argument. This can reduce right skew in the data and may help a variable better meet the assumptions of certain statistical tests. Additionally, PROC TRANSREG supports a log transformation of variables as part of fitting a model.

Many statistical tests make the assumption that the values for a particular variable are normally distributed.

However, often values are not normally distributed. One way to address this issue is to transform the variable by taking the log of each value.

By performing this transformation, a right-skewed variable with strictly positive values typically becomes closer to normally distributed.
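
To see why this helps with right-skewed data, note that the log compresses large values more than small ones. The following is a minimal sketch (the variable name log_x is chosen here purely for illustration) that writes the natural log of a few values to the SAS log:

/*print the natural log of a few values to the SAS log*/
data _null_;
    do x = 1, 2, 8;
        log_x = log(x);
        put x= log_x=;
    end;
run;

The LOG function in SAS returns the natural logarithm, so the values 1, 2 and 8 map to roughly 0, 0.69 and 2.08. The gap between 2 and 8 shrinks from 6 to about 1.39, while the gap between 1 and 2 only shrinks from 1 to about 0.69.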

The following example shows how to perform a log transformation on a variable in SAS.

Example: Log Transformation in SAS

Suppose we have the following dataset in SAS:

/*create dataset*/
data my_data;
    input x;
    datalines;
1
1
1
2
2
2
2
2
2
3
3
3
6
7
8
;
run;

/*view dataset*/
proc print data=my_data;
run;

We can use PROC UNIVARIATE with the NORMAL option to perform normality tests on the variable x to determine if it is normally distributed, and add a HISTOGRAM statement to visualize the distribution of values:

/*create histogram and perform normality tests*/
proc univariate data=my_data normal; 
    histogram x;
run;

From the last table titled Tests for Normality we can see that the p-value for the Shapiro-Wilk test is less than .05, which provides evidence at the .05 significance level that the variable x is not normally distributed.
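
If only the Tests for Normality table is of interest, the output can be narrowed with an ODS SELECT statement. This is a sketch that assumes the table's ODS name is TestsForNormality, as listed in the PROC UNIVARIATE documentation; the selection applies only to the next procedure step:

/*show only the Tests for Normality table*/
ods select TestsForNormality;
proc univariate data=my_data normal;
    var x;
run;

Because no HISTOGRAM statement is included here, this step prints the test results without any plots.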

The histogram also shows that the distribution of values does not appear to be normally distributed:

We can attempt a log transformation on the original dataset to see if we can produce a dataset that is more normally distributed.

We can use the following code to create a new dataset in SAS in which we take the log of each of the original x values:

/*use log transformation to create new dataset*/
data log_data;
    set my_data;
    x = log(x);
run;

/*view log transformed data*/
proc print data=log_data;
run;

/*create histogram and perform normality tests*/
proc univariate data=log_data normal; 
    histogram x;
run;

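From the last table titled Tests for Normality we can see that the p-value for the Shapiro-Wilk test is now greater than .05, so we fail to reject the null hypothesis that the transformed variable is normally distributed.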

The histogram also shows that the distribution of values is slightly more normally distributed than it was before the transformation:

Based on the results of the Shapiro-Wilk test and the histogram shown above, we would conclude that the log transformation created a variable that is closer to normally distributed than the original variable.
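
Note that the DATA step above overwrites x with its log. As an alternative, the following sketch keeps the original variable, stores the log in a new variable, and compares the skewness of the two. The names log_data2 and log_x are hypothetical, and the x > 0 check is an added safeguard since the log is only defined for positive values:

/*keep x and store its natural log in a new variable*/
data log_data2;
    set my_data;
    /*leave log_x missing when x is zero, negative, or missing*/
    if x > 0 then log_x = log(x);
run;

/*compare summary statistics and skewness before and after the transformation*/
proc means data=log_data2 n mean median skewness;
    var x log_x;
run;

A skewness value closer to zero for log_x than for x would be consistent with the transformed variable being more symmetric.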
