In this tutorial we are going to look at a few examples of how to use Flookup to match or lookup fuzzy matches right inside Google Sheets. Flookup runs on a robust engine that powers through any text you might have, making it the perfect solution for your data quality needs.
HOW TO FUZZY MATCH IN GOOGLE SHEETS WITH FLOOKUP
One of the most popular functions of Flookup is the eponymous FLOOKUP and you can use it to find matches or merge data. In this example, we have two tables (shown in the image below): one contains a list of individuals and their respective monthly salaries, and another contains list of individuals alone:
We can use FLOOKUP to match names in the list titled "Work List" to the names in the list titled "Full Name", in order to return their respective salaries. By writing the formula like this
=FLOOKUP(D2,$A$2:$B$15,1,2), and with the default percentage similarity of 0.6, we get the following results:
As you can see, even at a 0.6 level of similarity, some matches are not caught. In order to fix this, we simply adjust the threshold parameter downwards until all possible matches are found. Highlighted in yellow are the results we get at a 0.5 similarity level:
And these are the results at a 0.3 similarity level, at which point all possible matches have been found. The matches highlighted aqua are found at the lowest threshold of the batch:
NOTE: The initial parameter, the lookupValue, can either be a single string or concatenated text. For example, suppose we have extra data in column F e.g. cities lived in, we can rewrite the function we already used like this:
=FLOOKUP(D2&F2,$A$2:$B$15,1,2). This function combines the values in D2 and F2 into one string. This is especially useful for distinguishing similar looking strings that are not duplicates.
And that concludes our brief tutorial on how to fuzzy match with Flookup. Please note that the principles demonstrated here also hold true for the HFLOOKUP function.
HOW TO REMOVE DUPLICATES IN GOOGLE SHEETS WITH ULIST
Duplicates in data can be a constant headache for every professional. This article by HubSpot highlights the enormous costs and issues that organisations face because of not dealing with duplicates and poor data quality in general. Luckily, with Flookup, removing duplicates for your data e.g. contacts, leads, mailing lists etc., is a simple and intuitive process.
Let's take a look at an example showing how to improve data quality by removing duplicates. Suppose that you have a list with names of individuals, the cities they live in and their respective salaries as shown below:
This list obviously contains duplicates under the column titled "city". In order to extract a list of individuals without the duplicated cities, we can use the ULIST formula as shown below:
And our results will look like this:
However, as we can see, there is a duplicate that has not been caught because it contains a spelling error; that duplicate is "New Yorke" vs "New York". In order to resolve this, we need to adjust the threshold parameter as follows:
And finally, we have a clean list with no duplicates:
If you are a fan of Google's UNIQUE function, you will love ULIST and the power it brings to your spreadsheet.
HOW TO HIGHLIGHT DUPLICATES IN GOOGLE SHEETS
Highlighting duplicates is one of the best ways to get an overview of the state of your data. Unfortunately, most of the data we work with contains partial matches, punctuation marks or are that variables which throw off traditional methods you might know of or use. However, with Flookup, you have the power to handle text differently.
We can see, in the illustration above, that the column showing cities (column B) has potential duplicates. In order to highlight duplicate rows in the entire data set based on this column, we start by selecting our range of interest and then heading to Add-ons > Highlight duplicates > By percentage as shown below:
In the next window we make necessary adjustments, specifically the column we want to analyse for duplicates and the minimum level of percentage similarity that we desire. This window also displays the number of columns you have selected to help guide you with adjusting the "Column index":
After running the function, duplicates are highlighted in yellow:
We can also highlight duplicates by similarity in sound. To do this, we follow the same steps shown above, but this time we access the function located at Add-ons > Highlight duplicates > By sound:
A window, which also shows the number of columns selected, pops up prompting you to enter the column to analyse. After running the function, all duplicates are highlighted in aqua:
That concludes our highlight tutorial. Please note that the same principles demonstrated here also apply for deleting duplicates with Flookup.