How to Convert Categorical Variables to Numeric in R

In R, a categorical variable stored as a factor can be converted to a numeric variable with functions such as unclass() or as.numeric(). Each of these returns the factor's underlying integer codes, so every category is assigned a number based on the order of its levels (alphabetical by default). This is useful for statistical analysis and machine learning because some algorithms require numeric input.
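As a minimal sketch of how categories map to numbers, the snippet below converts a small factor. The `sizes` vector is a hypothetical example and is not part of the data frame used later in this tutorial.

```r
#hypothetical factor used only for illustration
sizes <- factor(c('small', 'large', 'medium', 'small'))

#levels are sorted alphabetically by default
levels(sizes)

#as.numeric() returns the integer code of each value's level
size_codes <- as.numeric(sizes)
size_codes
```

Because the levels sort as 'large', 'medium', 'small', the value 'small' receives the code 3 rather than 1. The codes reflect level order, not any meaning of the labels.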


You can use one of the following methods to convert a categorical variable to a numeric variable in R:

Method 1: Convert One Categorical Variable to Numeric

df$var1 <- unclass(df$var1)

Method 2: Convert Multiple Categorical Variables to Numeric

df[, c('var1', 'var2')] <- sapply(df[, c('var1', 'var2')], unclass)

Method 3: Convert All Categorical Variables to Numeric

df[sapply(df, is.factor)] <- data.matrix(df[sapply(df, is.factor)])

The examples below show how to use each method with this data frame:

#create data frame with some categorical variables
df <- data.frame(team=as.factor(c('A', 'B', 'C', 'D')),
                 conf=as.factor(c('AL', 'AL', 'NL', 'NL')),
                 win=as.factor(c('Yes', 'No', 'No', 'Yes')),
                 points=c(122, 98, 106, 115))

#view data frame
df

  team conf win points
1    A   AL Yes    122
2    B   AL  No     98
3    C   NL  No    106
4    D   NL Yes    115

Method 1: Convert One Categorical Variable to Numeric

The following code shows how to convert one categorical variable in a data frame to a numeric variable:

#convert 'team' variable to numeric
df$team <- unclass(df$team)

#view updated data frame
df

  team conf win points
1    1   AL Yes    122
2    2   AL  No     98
3    3   NL  No    106
4    4   NL Yes    115

Notice that the values for the ‘team’ variable have been converted to numeric values.
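One detail worth knowing is that unclass() keeps the original labels attached to the result as a 'levels' attribute. The sketch below rebuilds the 'team' column as a standalone vector (the names `team_unclassed` and `team_int` are illustrative) and compares it with as.integer(), which returns the same codes as a plain integer vector.

```r
#rebuild the 'team' column from the example as a standalone factor
team <- as.factor(c('A', 'B', 'C', 'D'))

#unclass() keeps the original labels as a 'levels' attribute
team_unclassed <- unclass(team)
attributes(team_unclassed)

#as.integer() returns the same codes with no attributes attached
team_int <- as.integer(team)
attributes(team_int)
```

Either form prints the same way inside a data frame. as.integer() may be preferable if later code expects a bare integer vector.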

Method 2: Convert Multiple Categorical Variables to Numeric

The following code shows how to convert multiple categorical variables in a data frame to numeric variables:

#convert 'team' and 'win' variables to numeric
df[, c('team', 'win')] <- sapply(df[, c('team', 'win')], unclass)

#view updated data frame
df

  team conf win points
1    1   AL   2    122
2    2   AL   1     98
3    3   NL   1    106
4    4   NL   2    115

Notice that the values for the ‘team’ and ‘win’ variables have been converted to numeric values.
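An alternative sketch uses lapply() with as.integer(), which converts each column separately instead of passing through the matrix that sapply() builds. The data frame `df2` and the vector `cols` are illustrative names. This is one possible variation, not the approach used in the rest of this tutorial.

```r
#fresh copy of part of the example data, so the snippet runs on its own
df2 <- data.frame(team=as.factor(c('A', 'B', 'C', 'D')),
                  win=as.factor(c('Yes', 'No', 'No', 'Yes')),
                  points=c(122, 98, 106, 115))

#names of the columns to convert
cols <- c('team', 'win')

#convert each selected column to its integer codes
df2[cols] <- lapply(df2[cols], as.integer)

#check the resulting column types
str(df2)
```

The 'points' column is left untouched because it is not listed in `cols`.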

Method 3: Convert All Categorical Variables to Numeric

The following code shows how to convert all categorical variables in a data frame to numeric variables:

#convert all categorical variables to numeric
df[sapply(df, is.factor)] <- data.matrix(df[sapply(df, is.factor)])

#view updated data frame
df

  team conf win points
1    1    1   2    122
2    2    1   1     98
3    3    2   1    106
4    4    2   2    115

Notice that the values for each of the categorical variables in the data frame have been converted to numeric values.
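Once every factor has been replaced by its codes, the original labels are no longer stored in the data frame. The sketch below, which assumes you may want to translate codes back later, saves the levels of each factor column before converting. The names `df3`, `is_cat`, and `level_map` are illustrative.

```r
#fresh copy of part of the example data, so the snippet runs on its own
df3 <- data.frame(conf=as.factor(c('AL', 'AL', 'NL', 'NL')),
                  win=as.factor(c('Yes', 'No', 'No', 'Yes')),
                  points=c(122, 98, 106, 115))

#identify which columns are factors
is_cat <- sapply(df3, is.factor)

#store the level labels of each factor column before conversion
level_map <- lapply(df3[is_cat], levels)

#convert all factor columns to numeric codes
df3[is_cat] <- data.matrix(df3[is_cat])

#look up the original labels for the 'win' codes
level_map$win[df3$win]
```

Indexing the saved levels by the numeric codes recovers the original labels for that column.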
