The SUBSTR Function in SAS (With Examples)

The SUBSTR function in SAS extracts a portion of a character value. It takes up to three arguments: the source string, the starting position of the substring, and, optionally, the length of the substring. The syntax is SUBSTR(string, start, length), and the function returns the specified portion of the string. If the length is omitted, SUBSTR returns everything from the starting position to the end of the string.


You can use the SUBSTR function in SAS to extract a portion of a string.

This function uses the following basic syntax:

SUBSTR(Source, Position, N)

where:

  • Source: The string to analyze
  • Position: The starting position to read (the first character is position 1)
  • N: The number of characters to read (optional; if omitted, the rest of the string is read)
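Since N is optional, it may help to see what happens when it is left out. The following is a minimal sketch; the dataset name substr_demo and the variable rest are made up for illustration:

```sas
/*sketch: omit the third argument to read to the end of the string*/
data substr_demo;
    team = 'Warriors';
    rest = substr(team, 5);   /*start at position 5, no length given*/
    put rest=;                /*writes rest=iors to the log*/
run;
```

Here SUBSTR starts at the fifth character and returns everything after it.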

Here are the four most common ways to use this function:

Method 1: Extract First N Characters from String

data new_data;
    set original_data;
    first_four = substr(string_variable, 1, 4);
run;

Method 2: Extract Characters in Specific Position Range from String

data new_data;
    set original_data;
    two_through_five = substr(string_variable, 2, 4);
run;

Method 3: Extract Last N Characters from String

data new_data;
    set original_data;
    last_three = substr(string_variable, length(string_variable)-2, 3);
run;

Method 4: Create New Variable if Characters Exist in String

data new_data;
    set original_data;
    /*compare against a string of the same length as the substring (4 characters)*/
    if substr(string_variable, 1, 4) = 'some' then new_var = 'Yes';
    else new_var = 'No';
run;

The examples below show how to use each method with the following dataset in SAS:

/*create dataset*/
data original_data;
    input team $1-10;
    datalines;
Warriors
Wizards
Rockets
Celtics
Thunder
;
run;

/*view dataset*/
proc print data=original_data;
run;

Example 1: Extract First N Characters from String

The following code shows how to extract the first 4 characters from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    first_four = substr(team, 1, 4);
run;

/*view new dataset*/
proc print data=new_data;
run;

Notice that the first_four variable contains the first four characters of the team variable.

Example 2: Extract Characters in Specific Position Range from String

The following code shows how to extract the characters in positions 2 through 5 from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    two_through_five = substr(team, 2, 4);
run;

/*view new dataset*/
proc print data=new_data;
run;
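Note that the third argument is a length, not an end position. To read positions 2 through 5 you pass 4, because 5 - 2 + 1 = 4. The sketch below makes that arithmetic explicit; the names range_demo, first_pos and last_pos are hypothetical and used only for illustration:

```sas
/*sketch: derive the length argument from a start and end position*/
data range_demo;
    team = 'Warriors';
    first_pos = 2;
    last_pos  = 5;
    /*length = last position - first position + 1*/
    two_through_five = substr(team, first_pos, last_pos - first_pos + 1);
    put two_through_five=;    /*writes two_through_five=arri to the log*/
run;
```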

Example 3: Extract Last N Characters from String

The following code shows how to extract the last 3 characters from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    last_three = substr(team, length(team)-2, 3);
run;

/*view new dataset*/
proc print data=new_data;
run;
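One caveat: if a value has fewer than 3 characters, length(team)-2 is zero or negative. SUBSTR then writes an invalid-argument note to the log and returns a missing value. The following is one possible guard, not the only one. It assumes you want the whole value back when it is shorter than 3 characters, and the dataset name last_three_demo and the variable start are made up for illustration:

```sas
/*sketch: keep the starting position at 1 or higher for short values*/
data last_three_demo;
    length team $10 last_three $3;
    do team = 'Warriors', 'OK';
        start = max(1, length(team) - 2);   /*never below position 1*/
        last_three = substr(team, start);   /*read to end; $3 keeps at most 3 characters*/
        put team= last_three=;
        output;
    end;
run;
```

For 'Warriors' this gives 'ors', and for 'OK' it gives 'OK'.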

Example 4: Create New Variable if Characters Exist in String

The following code shows how to create a new variable called W_Team that takes a value of 'Yes' if the first character in the team name is 'W' or a value of 'No' if the first character is not 'W':

/*create new dataset*/
data new_data;
    set original_data;
    if substr(team, 1, 1) = 'W' then W_Team = 'Yes';
    else W_Team = 'No';
run;

/*view new dataset*/
proc print data=new_data;
run;
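In the code above, W_Team gets a length of 3 because 'Yes' is the first value assigned to it. If 'No' were assigned first, the variable would have a length of 2 and 'Yes' would be truncated to 'Ye'. A LENGTH statement avoids depending on the order of assignments. The following is a minimal sketch; the dataset name w_team_demo is made up for illustration:

```sas
/*sketch: declare the length of the new variable explicitly*/
data w_team_demo;
    length team $10 W_Team $3;   /*W_Team can hold 'Yes' regardless of assignment order*/
    do team = 'Warriors', 'Rockets';
        if substr(team, 1, 1) ne 'W' then W_Team = 'No';
        else W_Team = 'Yes';
        output;
    end;
run;
```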
