How to Use str_match in R (With Examples)

str_match() is a function from the stringr package in R that extracts the parts of a string that match a regular expression. It takes a character vector and a pattern as arguments and returns a character matrix with one row per input string. The first column of the matrix holds the complete match and each additional column holds one capture group from the pattern. Typical uses include pulling a date out of a string or splitting a matched value into its components.


The str_match() function from the stringr package in R can be used to extract matched groups from a string.

This function uses the following syntax:

str_match(string, pattern)

where:

  • string: Character vector
  • pattern: Pattern to look for (interpreted as a regular expression by default)
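
Because str_match() returns matched groups, it may help to see a pattern that contains capture groups. The following is a minimal sketch, assuming a made-up vector named dates (not part of the original examples), that pulls the year, month, and day out of a date:

```r
library(stringr)

#hypothetical input strings; the second one contains no date
dates <- c('Game played on 2023-10-05', 'No date here')

#three capture groups: year, month, day
parts <- str_match(dates, '(\\d{4})-(\\d{2})-(\\d{2})')

#column 1 is the complete match, columns 2-4 are the capture groups
parts
#      [,1]         [,2]   [,3] [,4]
# [1,] "2023-10-05" "2023" "10" "05"
# [2,] NA           NA     NA   NA

#extract only the year (first capture group)
parts[, 2]
```

Each input string produces one row, and a string with no match produces a row of NA values.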

The following examples show how to use this function in practice:

Example 1: Use str_match with Vector

The following code shows how to use the str_match() function to extract matched patterns from a character vector:

library(stringr)

#create vector of strings
x <- c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers')

#extract the pattern 'avs' from each string
str_match(x, pattern='avs')

     [,1] 
[1,] "avs"
[2,] "avs"
[3,] NA   
[4,] NA   
[5,] NA  

The result is a matrix in which each row displays the matched pattern or an NA value if the pattern was not found.

For example:

  • The pattern ‘avs’ was found in the first element ‘Mavs’, so ‘avs’ was returned.
  • The pattern ‘avs’ was found in the second element ‘Cavs’, so ‘avs’ was returned.
  • The pattern ‘avs’ was not found in the third element ‘Heat’, so NA was returned.

And so on.
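
Since the result is a matrix, one common follow-up is to take its first column as a plain vector and use the NA values to see which elements matched. The sketch below reuses the vector x from this example; the names m and found are only illustrative:

```r
library(stringr)

#create vector of strings
x <- c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers')

#first column of the result holds the complete match for each element
m <- str_match(x, pattern='avs')[, 1]

#TRUE where the pattern was found, FALSE where NA was returned
found <- !is.na(m)

#keep only the elements that contain 'avs'
x[found]
# [1] "Mavs" "Cavs"
```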

Example 2: Use str_match with Data Frame

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(team=c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers'),
                 points=c(99, 104, 110, 103, 115))

#view data frame
df

     team points
1    Mavs     99
2    Cavs    104
3    Heat    110
4 Thunder    103
5 Blazers    115

The following code shows how to use the str_match() function to add a new column to the data frame that contains the matched pattern for each team name, or NA if the pattern was not found:

library(stringr)

#create new column
df$match <- str_match(df$team, pattern='avs')

#view updated data frame
df

     team points match
1    Mavs     99   avs
2    Cavs    104   avs
3    Heat    110  <NA>
4 Thunder    103  <NA>
5 Blazers    115  <NA>

The new column titled match contains either the pattern ‘avs’ or NA, depending on whether the pattern is found in the team column.
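
One detail worth knowing: because str_match() returns a matrix, assigning its result directly to a column stores a one-column matrix in the data frame. If an ordinary character column is preferred, a possible variation (a sketch, not the only way to do it) is to keep just the first column of the result:

```r
library(stringr)

#create data frame
df <- data.frame(team=c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers'),
                 points=c(99, 104, 110, 103, 115))

#take column 1 of the matrix so the new column is a plain character vector
df$match <- str_match(df$team, pattern='avs')[, 1]

#confirm the new column is not a matrix
is.matrix(df$match)
# [1] FALSE

#count how many team names contain the pattern
sum(!is.na(df$match))
# [1] 2
```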

