str_match() is a function from the stringr package in R that extracts the first match of a regular expression, along with any capture groups, from each element of a character vector. It returns a character matrix with one row per input string: the first column contains the full match and each additional column contains one capture group. Elements with no match yield NA. Typical uses include pulling a date out of a string or splitting a matched pattern into its component groups.
This function uses the following syntax:
str_match(string, pattern)
where:
- string: Character vector
- pattern: Pattern to look for
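When the pattern contains capture groups, the returned matrix gains one column per group. The following is a minimal sketch of that behavior; the input strings and the date pattern are made up for illustration:

```r
library(stringr)

#hypothetical input strings, made up for illustration
x <- c('Game on 2023-01-15', 'No date here')

#pattern with three capture groups: year, month, day
m <- str_match(x, pattern='(\\d{4})-(\\d{2})-(\\d{2})')

#column 1 holds the full match, columns 2 to 4 hold the groups
m
```

The first row should contain the full match "2023-01-15" followed by "2023", "01" and "15", while the second row contains only NA values because no date was found.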
The following examples show how to use this function in practice.
Example 1: Use str_match with Vector
The following code shows how to use the str_match() function to extract matched patterns from a character vector:
library(stringr)

#create vector of strings
x <- c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers')

#extract the pattern 'avs' from each string
str_match(x, pattern='avs')

     [,1]
[1,] "avs"
[2,] "avs"
[3,] NA
[4,] NA
[5,] NA
The result is a matrix in which each row displays the matched pattern or an NA value if the pattern was not found.
For example:
- The pattern ‘avs’ was found in the first element ‘Mavs’, so ‘avs’ was returned.
- The pattern ‘avs’ was found in the second element ‘Cavs’, so ‘avs’ was returned.
- The pattern ‘avs’ was not found in the third element ‘Heat’, so NA was returned.
And so on.
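Note that str_match() returns the matched pattern, not the original strings. If the goal is to keep only the strings that contain the pattern, one possible approach is to use the NA values in the first column as a filter. This is a sketch, and the names m and found are made up for illustration:

```r
library(stringr)

x <- c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers')
m <- str_match(x, pattern='avs')

#logical vector: TRUE where the pattern was found
found <- !is.na(m[, 1])

#keep only the original strings that contain 'avs'
x[found]
```

This should return ‘Mavs’ and ‘Cavs’. The stringr function str_subset() is another way to get the same result in one step.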
Example 2: Use str_match with Data Frame
Suppose we have the following data frame in R:
#create data frame
df <- data.frame(team=c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers'),
                 points=c(99, 104, 110, 103, 115))

#view data frame
df

     team points
1    Mavs     99
2    Cavs    104
3    Heat    110
4 Thunder    103
5 Blazers    115
The following code shows how to use the str_match() function to add a new column to the data frame that contains the matched pattern for each team name, or NA when the pattern is not found:
library(stringr)
#create new column (column 1 of the result holds the full match)
df$match <- str_match(df$team, pattern='avs')[, 1]
#view updated data frame
df

     team points match
1    Mavs     99   avs
2    Cavs    104   avs
3    Heat    110  <NA>
4 Thunder    103  <NA>
5 Blazers    115  <NA>
The new column titled match contains either the pattern ‘avs’ or NA, depending on whether the pattern is found in the team column.
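Because str_match() returns a matrix rather than a vector, it can help to check what type ends up in a data frame column. The following is a minimal sketch comparing direct assignment with taking the first column; the column names match_matrix and match_vector are made up for illustration:

```r
library(stringr)

df <- data.frame(team=c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers'),
                 points=c(99, 104, 110, 103, 115))

#direct assignment stores the whole one-column matrix
df$match_matrix <- str_match(df$team, pattern='avs')

#taking column 1 stores an ordinary character vector
df$match_vector <- str_match(df$team, pattern='avs')[, 1]

#check the type of each new column
is.matrix(df$match_matrix)
is.matrix(df$match_vector)
```

The first check should return TRUE and the second FALSE. Both columns print the same way, but a plain character vector is usually easier to work with in later steps.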