In this tutorial we are going to look at a few examples of how to use Flookup to match or lookup fuzzy matches right inside Google Sheets. Flookup runs on a robust engine that powers through any text you might throw at it, making it the perfect solution for your data quality needs.
HOW TO FUZZY MATCH IN GOOGLE SHEETS WITH FLOOKUP
One of the most popular functions of Flookup is the eponymous FLOOKUP and you can use it to find matches or merge data. The image below shows a table with names of individuals and their respective monthly salaries:
We are going to use FLOOKUP to match names in the list titled "Work List" to the names in the list titled "Full Name", in order to return their respective salaries as shown below:
We use this function
=FLOOKUP(D2,$A$2:$B$15,1,2), with the default percentage similarity of 0.6, and get these results:
As you can see, even at a 0.6 level of percentage similarity, some results are not caught. In order to fix this, we simply adjust the threshold parameter downwards until all possible matches are found. Highlighted in yellow are the results we get at a 0.5 percentage similarity level:
And these are the results at a 0.3 percentage similarity level, at which point all possible matches have been found:
The initial value, the lookupValue, can either be a single string or concatenated text.
For example, supposed we have extra data in column F e.g. cities lived in, we can rewrite the function we already used like this:
=FLOOKUP(D2&F2,$A$2:$B$15,1,2). This function combines the values in D2 and F2 into one string. This is especially useful when you are presented with similar looking strings that are not duplicates.
And that concludes our brief tutorial on how to fuzzy match with Flookup. Please note that the principles demonstrated here also hold true for the HFLOOKUP function.
HOW TO REMOVE DUPLICATES IN GOOGLE SHEETS WITH ULIST
Duplicates in data can be a constant headache for every professional. This article by HubSpot highlights the enormous costs and list of issues that organisations face because of not dealing with duplicates and poor data quality in general. Luckily, with Flookup, removing duplicates for your data e.g. contacts, leads, mailing lists etc., is a simple and intuitive process.
Let's see an example of how to improve data quality by removing duplicates. Suppose that you have a list with names of individuals, the cities they live in and their respective salaries as shown below:
This list obviously contains duplicates under the second column showing cities. In order to extract a list of individuals without the duplicated cities, we can use the ULIST formula as shown below:
And our results will look like this:
However, as we can see, there is a duplicate that's not caught because it is a typo. That duplicate is "New Yorke" vs "New York". In order to resolve this, we need to adjust the threshold parameter as follows:
And finally, we have a clean list with no duplicates:
If you are a fan of Google's UNIQUE function, you will sure love ULIST and the power it brings to your spreadsheet.
HOW TO HIGHLIGHT DUPLICATES IN GOOGLE SHEETS
Highlighting duplicates is one of the best ways to get an overview of the state of your data. Unfortunately, most of the data we work with contains partial matches, punctuation marks or are that variables which significantly throw off traditional methods you might know of or use. However, with Flookup, we handle text a little differently.
We can see that, in the illustration above, that the column showing cities (column B) has potential duplicates. In order to highlight duplicate rows in the entire data set based on this column, we start by selecting our range of interest and then heading to Add-ons > Highlight duplicates > By percentage as shown below:
In the next window we make necessary adjustments, specifically the column we want to analyse for duplicates and the minimum level of percentage similarity that we desire. This window also displays the number of columns you have selected to help guide you with adjusting the "Column index":
After running the function, duplicates are highlighted in yellow:
We can also highlight duplicates by similarity in sound. To do this, we follow the same steps shown above, but this time we access the function located at Add-ons > Highlight duplicates > By sound:
A window, which also shows the number of columns selected, pops up prompting you to enter the column to analyse. After running the function, all duplicates are highlighted in aqua:
That concludes our highlight tutorial. Please note that the same principles demonstrated here also apply for deleting duplicates with Flookup.