How can I use the nchar() function in R?

The nchar() function in R counts the number of characters in a string of text. Its main argument is a character vector (or an object that can be coerced to one), and it returns an integer vector giving the number of characters in each element. It is commonly used to determine the length of strings or to check whether their length falls within certain limits.


The nchar() function in R can be used to count the number of characters in each element of a string object.

This function uses the following basic syntax:

nchar(x, keepNA = NA)

where:

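  • x: A character vector (or an object that can be coerced to one)
  • keepNA: Whether to return NA when a missing value is encountered. The default (NA) returns NA for missing values when counting characters. If set to FALSE, a value of 2 is returned to represent the length of ‘NA’ as a string.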
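
As a quick sketch of this syntax (the vector x and its values are made up for illustration), nchar() can be called on a whole character vector and returns one count per element. The keepNA argument only changes what is returned for missing values:

```r
# hypothetical character vector with one missing value
x <- c("apple", "kiwi", NA)

# default (keepNA = NA): the missing value stays NA
default_counts <- nchar(x)
default_counts
# [1]  5  4 NA

# keepNA = FALSE: the missing value is counted as the two-character string "NA"
na_as_string_counts <- nchar(x, keepNA = FALSE)
na_as_string_counts
# [1] 5 4 2
```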

The following examples show how to use this function in practice.

Example 1: Use nchar() to Count Characters in Strings

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(player=c('J Kidd', 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))

#view data frame
df

          player points
1         J Kidd     22
2    Kobe Bryant     34
3 Paul A. Pierce     30
4     Steve Nash     17

The following code shows how to use the nchar() function to count the length of each string in the player column:

#create new column that counts length of characters in player column
df$player_length <- nchar(df$player)

#view updated data frame
df

          player points player_length
1         J Kidd     22             6
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10

The new column called player_length contains the length of each string in the player column.

Note that the nchar() function counts spaces and special characters as well.

For example, in the name ‘Paul A. Pierce’ the nchar() function counts the two spaces and the period along with all of the letters to get a total length of 14.
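
If only the letters should be counted, one possible approach (a sketch, not the only way to do it) is to strip the other characters with gsub() before calling nchar(). The variable names below are chosen for illustration:

```r
# name taken from the player column above
name <- "Paul A. Pierce"

# every character counts, including the two spaces and the period
total_length <- nchar(name)
total_length
# [1] 14

# remove everything that is not a letter, then count again
letters_only <- gsub("[^A-Za-z]", "", name)
letters_only
# [1] "PaulAPierce"

letter_count <- nchar(letters_only)
letter_count
# [1] 11
```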

Example 2: Use nchar() with NA Values

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(player=c(NA, 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))

#view data frame
df

          player points
1           <NA>     22
2    Kobe Bryant     34
3 Paul A. Pierce     30
4     Steve Nash     17

By default, the nchar() function returns NA for any missing value in the player column:

#create new column that counts length of characters in player column
df$player_length <- nchar(df$player)

#view updated data frame
df

          player points player_length
1           <NA>     22            NA
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10

However, if we use the argument keepNA=FALSE then a value of 2 will be returned for each missing value (NA):

#create new column that counts length of characters in player column
df$player_length <- nchar(df$player, keepNA=FALSE)

#view updated data frame
df

          player points player_length
1           <NA>     22             2
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10

Notice that a value of 2 is returned for the first player since this represents the length of ‘NA’ as a string.
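
Which setting is preferable depends on what the counts are used for. As a rough sketch, assuming the goal is to keep only names within a length limit (the limit of 11 characters is an arbitrary choice for illustration), leaving missing values as NA keeps them from being mistaken for real two-character names:

```r
# same data as in Example 2
df <- data.frame(player=c(NA, 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))

# count characters, leaving the missing name as NA (default behavior)
player_length <- nchar(df$player)

# hypothetical limit: keep names with at most 11 characters
# which() ignores the NA comparison, so the missing name is excluded
short_names <- df$player[which(player_length <= 11)]
short_names
# [1] "Kobe Bryant" "Steve Nash"
```

With keepNA=FALSE, the missing name would have a length of 2 and would pass the same check.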
