How can I find duplicates in MongoDB?

You can find duplicates in MongoDB with the aggregation framework. Group the documents by a specific field with the $group stage, count the documents in each group, and then use the $match stage to keep only the groups whose count is greater than one; each of those groups represents a duplicated value. A quicker check is the distinct() collection method, which returns the unique values of a given field. If it returns fewer values than there are documents containing that field, at least one value is duplicated, although distinct() alone does not tell you which one.

Find Duplicates in MongoDB


You can use the following syntax to find documents with duplicate values in MongoDB:

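db.collection.aggregate([
    // Group documents by field1 and count the documents in each group
    {"$group" : { "_id": "$field1", "count": { "$sum": 1 } } },
    // Keep only non-null values that appear in more than one document
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } },
    // Return each duplicated value as "name" and hide "_id"
    {"$project": {"name" : "$_id", "_id" : 0} }
])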

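Here’s what each stage does:

  • $group: groups all documents that have the same value in field1 and counts the documents in each group
  • $match: keeps only the groups that have a non-null value and more than one document
  • $project: returns the duplicated value of each remaining group in a name field and hides the _id field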
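
The three stages can also be mimicked in plain JavaScript, which may help when reasoning about what the pipeline returns. This is only an illustrative sketch: findDuplicateValues is a hypothetical helper (not a MongoDB API), it runs in memory without a database, and it assumes the field holds primitive values such as strings or numbers.

```javascript
// Plain JavaScript sketch of the $group -> $match -> $project steps.
// findDuplicateValues is a hypothetical helper, not part of MongoDB.
function findDuplicateValues(docs, field) {
  // "$group": count how many documents share each value of the field
  const counts = new Map();
  for (const doc of docs) {
    const value = doc[field] ?? null; // a missing field is grouped under null
    counts.set(value, (counts.get(value) ?? 0) + 1);
  }
  // "$match": keep non-null values that occur more than once
  // "$project": return each surviving value under the key "name"
  return [...counts]
    .filter(([value, count]) => value !== null && count > 1)
    .map(([value]) => ({ name: value }));
}

const sample = [
  { field1: "a" },
  { field1: "b" },
  { field1: "a" },
  { other: 1 } // no field1, so it lands in the null group and is dropped
];
console.log(findDuplicateValues(sample, "field1")); // prints [ { name: 'a' } ]
```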

This particular query finds duplicate values in the field1 field. To search a different field, replace $field1 with the name of that field, keeping the $ prefix.

The following example shows how to use this syntax with a collection named teams that contains the following documents:

db.teams.insertOne({team: "Mavs", position: "Guard", points: 31})
db.teams.insertOne({team: "Mavs", position: "Guard", points: 22})
db.teams.insertOne({team: "Rockets", position: "Center", points: 19})
db.teams.insertOne({team: "Rockets", position: "Forward", points: 26})
db.teams.insertOne({team: "Cavs", position: "Guard", points: 33})

Example: Find Documents with Duplicate Values

We can use the following code to find all of the duplicate values in the ‘team’ field:

db.teams.aggregate([
    {"$group" : { "_id": "$team", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

This query returns the following results (the order of the groups is not guaranteed):

{ name: 'Rockets' }
{ name: 'Mavs' }

This tells us that the values ‘Rockets’ and ‘Mavs’ occur multiple times in the ‘team’ field.
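
If you also want to know how often each value occurs and which documents carry it, the same pipeline can be extended. The following is a minimal sketch under a few assumptions: duplicatePipeline is a hypothetical helper name, and the count and ids output fields are illustrative choices rather than anything required by MongoDB. It uses $push to collect the _id of every document in a group.

```javascript
// Hypothetical helper: builds a duplicate-finding pipeline for any field name.
function duplicatePipeline(field) {
  return [
    // Group by the field, count the documents, and collect their _id values
    { $group: { _id: "$" + field, count: { $sum: 1 }, ids: { $push: "$_id" } } },
    // Keep only non-null values that occur more than once
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Rename _id to name and keep the count and the list of document ids
    { $project: { _id: 0, name: "$_id", count: 1, ids: 1 } }
  ];
}

// In mongosh, assuming the teams collection from this example:
// db.teams.aggregate(duplicatePipeline("team"))
```

The collected ids can then be used to inspect or clean up the duplicated documents.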

Note that we can simply change $team to $position to instead search for duplicate values in the ‘position’ field:

db.teams.aggregate([
    {"$group" : { "_id": "$position", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

This query returns the following results:

{ name: 'Guard' }
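
The same idea extends to combinations of fields by grouping on a compound key. The sketch below assumes a hypothetical helper named compoundDuplicatePipeline and an illustrative output field called key. The null check is left out because a compound _id is always a document, so documents that lack one of the fields may need separate handling.

```javascript
// Hypothetical helper: finds combinations of field values shared by several documents.
function compoundDuplicatePipeline(fields) {
  // Build a compound grouping key such as { team: "$team", position: "$position" }
  const key = Object.fromEntries(fields.map((f) => [f, "$" + f]));
  return [
    // Group by the combination of fields and count the documents in each group
    { $group: { _id: key, count: { $sum: 1 } } },
    // Keep only combinations that occur more than once
    { $match: { count: { $gt: 1 } } },
    // Return the combination under "key" together with its count
    { $project: { _id: 0, key: "$_id", count: 1 } }
  ];
}

// In mongosh, assuming the teams collection from this example:
// db.teams.aggregate(compoundDuplicatePipeline(["team", "position"]))
```

In the sample documents above, Mavs and Guard is the only team and position pair that appears in more than one document.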

Cite this article

stats writer (2024). How can I find duplicates in MongoDB?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-find-duplicates-in-mongodb/

stats writer. "How can I find duplicates in MongoDB?." PSYCHOLOGICAL SCALES, 2 Jul. 2024, https://scales.arabpsychology.com/stats/how-can-i-find-duplicates-in-mongodb/.
