Brief Introduction

One of the most important features Google Sheets has, is the ability to highlight duplicates.

Using the highlight function can help us profile our data easily; whether it is by preventing double work and therefore resources wastage or by identifying text entries that might appear to be duplicates and yet they are not. 

Let us take a brief look at how to use default functions provided in Google Sheets and Flookup.

The images we have used are for illustration purposes only.

An illustration showing the process of highlighting of duplicates in Google Sheets.

Highlighting your data is a very good way to get an overview of what your data looks like before performing destructive operations like removing duplicates or using that data for any other purpose. Unfortunately, at the time of writing this article, Google Sheets does not provide a predefined way of highlighting duplicates. However, with custom formulas and conditional formatting, highlighting duplicates in your spreadsheet can be done in a few clicks.

Highlighting Duplicates Using Conditional Formatting

This method of highlighting duplicates assumes that your data is relatively clean i.e. there are very few or no text-based differences.

In your Google Sheets spreadsheet, select the cells where you want to find duplicates. This can be a range with any number of rows or columns.

A small spreadsheet section showing a list of fictional characters and their respective emails in Google Sheets.

In your spreadsheet menu, head to Format > Conditional formatting. This will open a sidebar where you can enter your highlighting rules.

A section of the conditional format window in Google Sheets.

In the side bar, you will see two tabs, “Single colour” and “Colour scale” and the former is selected by default.

First, ensure that the range you want to highlight is selected and correctly showing in the Apply to range field. If this is not the case, simply click the Range Picker icon at the right end of this field to select a new range.

Click the dropdown menu located under the "Format rules", scroll down to an item labelled “Custom formula is” and click it. The sidebar will then update to look like this:

A section showing format rules for conditional formatting in Google Sheets.

In the field with the “Value or formula” placeholder, enter this formula:

=COUNTIF($B$2:$B$10,B2) > 1

This is a COUNTIF formula with arguments specific to the range you selected. It will run over range B2:B10 and note if any item there occurs more than once. You can choose to adjust these arguments to suit any other range you might want to select in subsequent operations. Please note that, as you adjust the ranges in this formula, Google Sheets will automatically select the relevant fields for you.

At this point you can choose the formatting style you would like to use for your results using this section:

Choosing a formatting style section in Google Sheets.

… and, after clicking “Done”, your results will look like this:

Spreadsheet values highlighted using conditional formatting in Google Sheets.

Highlighting Duplicates Using Flookup

Google’s efforts above are admirable, but sometimes you need to bring out the big guns when faced with problematic datasets [as real-world data often tends to be].

It is a well-established fact that nobody wants to make Bruce Banner angry and, to be completely honest, we really don’t want any angry emails to come flying back at us from anyone else on our list above… and one such person is old Barney.

As you can see from the screenshot below, Barneys name is listed twice with a subtle spelling difference. Let us see how to go about highlighting cases that look like this.

Names with different spellings in a spreadsheet.

Head to Extensions > Flookup Data Wrangler > Highlight Duplicates > By Percentage in your spreadsheet menu. This will open a sidebar that looks like this:

A sidebar that is used for highlighting duplicates by percentage similarity using Flookup.

Select the client names in the manner shown below:

A spreadsheet section showing two columns of text entries, with one column of entries selected.

After that, follow these steps to highlight the selected data:

A small spreadsheet section of two columns showing two duplicates that have been highlighted by percentage similarity using Flookup.

The result will be duplicates that have been identified despite differences in spelling. Please note that you can also choose to highlight duplicates by sound similarity by clicking Extensions > Flookup Data Wrangler > Highlight Duplicates > By Sound and following steps similar to the ones above.