How do I extract a string before a space in R?

In R, the strsplit() function can be used to extract a string before a space. It splits each string into pieces at a specified separator (here, a space) and returns a list containing one character vector of pieces per input string, so the first piece is the text before the first space. Alternatively, the substr() function extracts a substring from a character vector given a start and end position, which works once you know where the first space is.
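As a minimal sketch of the two base R approaches just described, the snippet below applies each one to a single string. The value of my_string and the helper variable names are made up for illustration, and using regexpr() to locate the first space is one possible way to get the end position for substr(), not the only one.

```r
my_string <- "23.2 miles"  # hypothetical example value

# strsplit() returns a list; [[1]] pulls out the character vector of pieces
parts <- strsplit(my_string, " ", fixed = TRUE)[[1]]
first_piece <- parts[1]

# regexpr() reports the position of the first space (-1 if there is none)
space_pos <- as.integer(regexpr(" ", my_string, fixed = TRUE))

# substr() keeps the characters from position 1 up to just before the space
before_space <- substr(my_string, 1, space_pos - 1)

first_piece
before_space
```

Note that with this substr() approach a string containing no space would yield an empty string, so it may be worth checking for space_pos == -1 first.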


You can use the following methods to extract the string before the first space in R:

Method 1: Extract String Before Space Using Base R

gsub(" .*$", "", my_string)

Method 2: Extract String Before Space Using stringr Package

library(stringr)

word(my_string, 1)

Both of these examples extract the string before the first space in the string called my_string.
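To make the base R pattern concrete, here is a small sketch on a made-up character vector (the values in my_string are assumptions chosen to show three cases: one space, several spaces, and no space). The regular expression " .*$" matches from the first space through the end of the string, and that matched part is replaced with nothing.

```r
# Hypothetical input covering one space, several spaces, and no space
my_string <- c("23.2 miles", "14 miles per day", "marathon")

# Remove everything from the first space to the end of each string
before_space <- gsub(" .*$", "", my_string)

before_space
# [1] "23.2"     "14"       "marathon"
```

A string with no space is returned unchanged because the pattern finds nothing to replace.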

The following examples show how to use each method in practice with the following data frame:

#create data frame
df <- data.frame(athlete=c('A', 'B', 'C', 'D'),
                 distance=c('23.2 miles', '14 miles', '5 miles', '9.3 miles'))

#view data frame
df

  athlete   distance
1       A 23.2 miles
2       B   14 miles
3       C    5 miles
4       D  9.3 miles

Example 1: Extract String Before Space Using Base R

The following code shows how to extract the string before the space in each string in the distance column of the data frame:

#create new column that extracts string before space in distance column
df$distance_amount <- gsub(" .*$", "", df$distance)

#view updated data frame
df

  athlete   distance distance_amount
1       A 23.2 miles            23.2
2       B   14 miles              14
3       C    5 miles               5
4       D  9.3 miles             9.3

Notice that the new column called distance_amount contains the string before the space for each value in the distance column of the data frame.
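One thing worth keeping in mind is that the extracted values are still character strings. The sketch below rebuilds the same data frame so it runs on its own, then converts the result to numbers; the distance_num column is a hypothetical addition for illustration and is not part of the original example.

```r
# Rebuild the tutorial's data frame so this block is self-contained
df <- data.frame(athlete = c('A', 'B', 'C', 'D'),
                 distance = c('23.2 miles', '14 miles', '5 miles', '9.3 miles'))

# Extract the string before the space, as in Example 1
df$distance_amount <- gsub(" .*$", "", df$distance)

# The extracted values are text, not numbers
class(df$distance_amount)

# Hypothetical extra step: convert to numeric so arithmetic works
df$distance_num <- as.numeric(df$distance_amount)
mean(df$distance_num)
```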

Example 2: Extract String Before Space Using stringr Package

The following code shows how to extract the string before the space in each string in the distance column of the data frame by using the word() function from the stringr package in R:

library(stringr)

#create new column that extracts string before space in distance column
df$distance_amount <- word(df$distance, 1)

#view updated data frame
df

  athlete   distance distance_amount
1       A 23.2 miles            23.2
2       B   14 miles              14
3       C    5 miles               5
4       D  9.3 miles             9.3

The new distance_amount column again contains the string before the space for each value in the distance column, matching the result from Example 1.

Note that the word() function from the stringr package extracts words from a given string.

By supplying the value 1 to this function, we’re able to extract the first word found in a string, which is equivalent to extracting the string before the first space.
