How can we find unique rows across multiple columns using the R programming language?

Finding unique rows across multiple columns in R means extracting the distinct combinations of values that occur in a chosen set of columns of a data frame. The unique() function returns these distinct combinations directly, while duplicated() can be used to filter the original data frame so that the remaining columns are kept. In both cases, any row that repeats a combination already seen is removed.

R: Find Unique Rows Across Multiple Columns


You can use the following methods to find unique rows across multiple columns of a data frame in R:

Method 1: Find Unique Rows Across Multiple Columns (Drop Other Columns)

df_unique <- unique(df[c('col1', 'col2')])

Method 2: Find Unique Rows Across Multiple Columns (Keep Other Columns)

df_unique <- df[!duplicated(df[c('col1', 'col2')]),]

The examples below show how to use each method in practice with this data frame:

#create data frame
df <- data.frame(conf=c('East', 'East', 'East', 'West', 'West', 'West'),
                 pos=c('G', 'G', 'F', 'G', 'F', 'F'),
                 points=c(33, 28, 31, 39, 34, 40))

#view data frame
df

  conf pos points
1 East   G     33
2 East   G     28
3 East   F     31
4 West   G     39
5 West   F     34
6 West   F     40

Method 1: Find Unique Rows Across Multiple Columns (Drop Other Columns)

The following code shows how to find unique rows across the conf and pos columns in the data frame:

#find unique rows across conf and pos columns
df_unique <- unique(df[c('conf', 'pos')])

#view results
df_unique 

  conf pos
1 East   G
3 East   F
4 West   G
5 West   F

The result is four rows that are all unique.

Also notice that the points column was automatically dropped from the results.
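
As a quick sanity check, base R's anyDuplicated() can confirm that no repeated combinations remain. The sketch below assumes the same data frame as above; key_cols, first_dup_before and first_dup_after are illustrative names that are not part of the original example.

```r
#recreate the example data frame
df <- data.frame(conf=c('East', 'East', 'East', 'West', 'West', 'West'),
                 pos=c('G', 'G', 'F', 'G', 'F', 'F'),
                 points=c(33, 28, 31, 39, 34, 40))

#columns that define a unique row (illustrative name)
key_cols <- c('conf', 'pos')

#find unique rows across conf and pos columns
df_unique <- unique(df[key_cols])

#index of the first repeated row before deduplication (0 means none)
first_dup_before <- anyDuplicated(df[key_cols])

#same check after deduplication
first_dup_after <- anyDuplicated(df_unique)
```

Here first_dup_before is 2, because row 2 repeats the East/G combination from row 1, and first_dup_after is 0, which means no repeated combinations are left.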

Method 2: Find Unique Rows Across Multiple Columns (Keep Other Columns)

The following code shows how to find unique rows across the conf and pos columns in the data frame and keep the values in the points column:

#find unique rows across conf and pos columns
df_unique <- df[!duplicated(df[c('conf', 'pos')]),]

#view results
df_unique 

  conf pos points
1 East   G     33
3 East   F     31
4 West   G     39
5 West   F     34

Notice that each combination of conf and pos now appears only once and the values in the points column are kept.

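It’s important to note that only the first occurrence of each unique combination is kept.

For example, there were two rows that contained “East” and “G” across the first two columns, but only the points value (33) for the first occurrence of this combination was kept in the final data frame.

Similarly, there were two rows that contained “West” and “F” across the first two columns, but only the points value (34) for the first occurrence of this combination was kept.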
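
If the last occurrence is the one worth keeping, duplicated() accepts a fromLast argument that scans from the bottom of the data frame instead. The following is a minimal sketch, assuming the same data frame as above; df_last is an illustrative name.

```r
#recreate the example data frame
df <- data.frame(conf=c('East', 'East', 'East', 'West', 'West', 'West'),
                 pos=c('G', 'G', 'F', 'G', 'F', 'F'),
                 points=c(33, 28, 31, 39, 34, 40))

#keep the last occurrence of each conf and pos combination
df_last <- df[!duplicated(df[c('conf', 'pos')], fromLast=TRUE),]

#view results
df_last
```

This keeps rows 2, 3, 4 and 6, so the points values retained are 28, 31, 39 and 40 instead of 33, 31, 39 and 34.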

Cite this article

stats writer. "How can we find unique rows across multiple columns using the R programming language?" PSYCHOLOGICAL SCALES, 28 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-we-find-unique-rows-across-multiple-columns-using-the-r-programming-language/.