When you’re working with raw data in Excel, it’s often a pain to have to check and remove any duplicate rows of data in a table. Of course, you can do this manually by sorting your table and deleting the duplicate rows one by one.
Luckily, Excel offers a few easy ways to do this without this manual pain - which can save you hours if you’re doing this for a large data set.
Using the remove duplicates tool
If you simply want to remove your duplicates, and it doesn’t matter to you which of the duplicates you remove, the built in ‘Remove Duplicates’ tool is a really easy way to get it done fast.
To use it:
1. First select your data table, either by selecting the columns, or from top left to bottom right of the table.
2. Go to the Data menu and click ‘Remove Duplicates’
3. In the menu that pops up, choose if your data selection has headers, and ensure that the only column checked is the column in which the duplicates are.
4. Click OK, and those duplicate rows will be deleted.
Choosing which row to keep
It may be that you need to determine which of your duplicate rows should be kept, if for example there is another column which contains a different value in each of the two rows, and you need to keep one of theses
When the Duplicate Removal Tool runs, the first row of any duplicate will be kept. So to keep a particular row, simply sort the data before you run the duplicate tool so that the row you wish to keep of any duplicates appears above the others.
Highlighting duplicate rows
Sometimes it’s useful not to delete the duplicate rows, but simply to be able to highlight them. This is quite simple using functions - and there’s a couple of ways to do it.
Add a new column and add a COUNTIF function.