HOW TO HIGHLIGHT DUPLICATES IN GOOGLE SHEETS

Introduction to Highlighting Duplicates

One of the most important features Google Sheets provides, is the ability to highlight duplicates.

Using the highlight function can help us profile our data visually; whether it is by exposing double work and therefore resources wastage or by identifying text entries that might appear to be duplicates and yet they are not. 

Let us take a brief look at how to use default relevant functions provided by Google Sheets and Flookup respectively.

The images we have used below are strictly for illustration purposes only.

An illustration showing the process of highlighting of duplicates in Google Sheets.

Whereas highlighting your data is a very good way to understand your data before performing irreversible operations like removing duplicates, unfortunately, at the time of writing this article, Google Sheets does not offer a predefined way of doing the same.

However, with custom formulas and conditional formatting, highlighting duplicates in your spreadsheet can be done in a just few clicks.

Highlighting Duplicates Using Conditional Formatting

A small spreadsheet section showing a list of fictional characters and their respective emails in Google Sheets.

This method of highlighting duplicates assumes that your data is relatively clean i.e. there are very few or no text-based differences.

In your Google Sheets spreadsheet, select the cells where you want to find duplicates. This can be a range with any number of rows or columns.

A section of the conditional format window in Google Sheets.

In your spreadsheet menu, head to Format > Conditional formatting. This will open a sidebar where you can enter your highlighting rules.

A section showing format rules for conditional formatting in Google Sheets.

In the side bar, you will see two tabs, “Single colour” and “Colour scale” and the former is selected by default.

First, ensure that the range you want to highlight is selected and correctly showing in the Apply to range field. If this is not the case, simply click the Range Picker icon at the right end of this field to select a new range.

Click the dropdown menu located under the "Format rules", scroll down to an item labelled “Custom formula is” and click it. The sidebar will then update to look like this:

Choosing a formatting style section in Google Sheets.

In the field with the “Value or formula” placeholder, enter this formula:

=COUNTIF($B$2:$B$10,B2) > 1

This is a COUNTIF formula with arguments specific to the range you selected. It will run over range B2:B10 and note if any item there occurs more than once. You can choose to adjust these arguments to suit any other range you might want to select in subsequent operations.

At this point you can choose the formatting style you would like to use for your results using this section.

Spreadsheet values highlighted using conditional formatting in Google Sheets.

When you click the button labelled “Done”, your results will look like the example shown here.

Highlighting Duplicates Using Flookup

Google’s efforts above are admirable, but sometimes you need to bring out the big guns when faced with problematic datasets [as real-world datasets often tend to be].

It is a well-established fact that nobody wants to make Bruce Banner angry and, to be completely honest, we really do not want any angry emails to come flying back at us from anyone else on our mailing list below.

Names with different spellings in a spreadsheet.

As you can see from this image, Barney's name is listed twice with a subtle spelling difference. Let us see how to go about highlighting cases that follow this pattern.

A sidebar that is used for highlighting duplicates by percentage similarity using Flookup.

Head to Extensions > Flookup Data Wrangler > Highlight duplicates > By percentage in your spreadsheet menu. This will open a sidebar that looks like this.

A spreadsheet section showing two columns of text entries, with one column of entries selected.
A small spreadsheet section of two columns showing two duplicates that have been highlighted by percentage similarity using Flookup.

The result will be duplicates that have been identified despite differences in spelling as shown here. Please note that you can also choose to highlight duplicates by sound similarity by clicking Extensions > Flookup Data Wrangler > Highlight duplicates > By sound and following steps similar to the ones discussed in this section.