TUTORIAL

In this tutorial we are going to look at a few examples of how to use Flookup for Google Sheets. Flookup runs on a battle-tested algorithm that makes it the perfect solution for your data quality needs.

TIP: Easily import your data by using the API Connector from the G Suite Marketplace.

HOW TO FUZZY MATCH IN GOOGLE SHEETS

SUMMARY OF STEPS

=FLOOKUP(lookupValue, tableArray, lookupCol, indexNum, threshold, [rank])
    • Type a range or type a value as the lookupValue.
    • Select the tableArray.
    • Enter the lookupCol which is the column in the tableArray to analyse.
    • Enter indexNum, which is the column in tableArray from which results should be returned.
    • Press Return or Enter.

One of the most popular functions of Flookup is the eponymous FLOOKUP and it can be used to to find matches or merge data. In this example, we have two tables (shown in the image below): one contains a list of individuals and their respective monthly salaries, and another contains list of individuals only:

Flookup tutorial #GetFlookup

We can use FLOOKUP to match names in the list titled "Work List" to the names in the list titled "Full Name", in order to return their respective salaries. By writing the formula like this =FLOOKUP(D2,$A$2:$B$15,1,2), with the default percentage similarity of 0.6, we get the following results:

Flookup tutorial #GetFlookup

As you can see, even at a 0.6 level of similarity, some matches are not caught. In order to fix this, we simply adjust the threshold parameter downwards until all possible matches are found. Highlighted in yellow are the results we get at a 0.5 similarity level:

Flookup tutorial #GetFlookup

And these are the results at a 0.3 similarity level, at which point all possible matches have been found. The matches highlighted aqua are found at the lowest threshold of the batch:

Flookup tutorial #GetFlookup

NOTE: The initial parameter, the lookupValue, can either be a single string or concatenated text. For example, suppose we have extra data in column F e.g. cities lived in, we can rewrite the function we already used like this: =FLOOKUP(D2&F2,$A$2:$B$15,1,2). This function combines the values in D2 and F2 into one string. This is especially useful for distinguishing similar looking strings that are not duplicates.

And that concludes our brief tutorial on how to fuzzy match with Flookup. Please note that the principles demonstrated here also hold true for the HFLOOKUP function.

HOW TO REMOVE DUPLICATES IN GOOGLE SHEETS

SUMMARY OF STEPS

=ULIST(colArray, [indexNum], [threshold])
    • Type or select a range as the colArray.
    • (Optional) Enter indexNum, which is the column in colArray to analyse for unique values.
    • (Optional) Adjust the threshold, the percentage above which values are considered duplicates and removed.

Duplicates in data can be a constant headache for every professional. This article by HubSpot highlights the enormous costs and issues that organisations face because of not dealing with duplicates and poor data quality in general. Luckily, with Flookup, removing duplicates from your data is a simple and intuitive process.

Let's take a look at an example showing how to improve data quality by removing duplicates from it. In this example, we will be focusing on a function called ULIST.

Suppose that you have a list with names of individuals, the cities they live in and their respective salaries as shown below:

Flookup tutorial #GetFlookup

This list obviously contains duplicates under the column titled "city". In order to extract a list of individuals without the duplicated cities, we can use the ULIST formula as shown below:

Flookup tutorial #GetFlookup

And our results will look like this:

Flookup tutorial #GetFlookup

However, as we can see, there is a duplicate that has not been caught because it contains a spelling error; that duplicate is "New Yorke" vs "New York". In order to resolve this, we need to adjust the threshold parameter as follows:

Flookup tutorial #GetFlookup

And finally, we have a clean list with no duplicates:

Flookup tutorial #GetFlookup

If you are a fan of Google's UNIQUE function, then you will love the power ULIST brings to your spreadsheet.

HOW TO HIGHLIGHT DUPLICATES IN GOOGLE SHEETS

SUMMARY OF STEPS

    • Select a range.
    • Adjust the Column index, the column to analyse for duplicates.
    • Adjust the threshold, the percentage above which values are considered duplicates.
    • Click "Highlight duplicates".

Highlighting duplicates is one of the best ways to get an overview of the state of your data. Unfortunately, most of the data we work with contains partial matches, punctuation marks or are that variables which throw off traditional methods you might know of or use. However, with Flookup, you have the power to handle strings with such variations.

Flookup tutorial #GetFlookup

We can see, in the illustration above, that the column showing cities (column B) has potential duplicates. In order to highlight duplicate rows in the entire data set based on this column, we start by selecting our range of interest and then heading to Add-ons > Highlight duplicates > By percentage as shown below:

Flookup tutorial #GetFlookup

In the next window we make necessary adjustments, specifically the column we want to analyse for duplicates and the minimum level of percentage similarity that we desire. This window also displays the number of columns you have selected to help guide you with adjusting the "Column index":

Flookup tutorial #GetFlookup

After running the function, duplicates are highlighted in yellow:

Flookup tutorial #GetFlookup

We can also highlight duplicates by similarity in sound. To do this, we follow the same steps shown above, but this time we access the function located at Add-ons > Highlight duplicates > By sound:

Flookup tutorial #GetFlookup

A window, which also shows the number of columns selected, pops up prompting you to enter the column to analyse:

Flookup tutorial #GetFlookup

After running the function, all duplicates are highlighted in aqua:

Flookup tutorial #GetFlookup

That concludes our highlight tutorial. Please note that the same principles demonstrated here also apply for removing duplicates by deletion.

HOW TO CALCULATE PERCENTAGE SIMILARITY IN GOOGLE SHEETS

=FUZZYMATCH(string1, string2)

SUMMARY OF STEPS

    • Type any 2 strings or ranges as the first and second arguments, respectively.
    • Press Return or Enter.

Another highly popular feature of Flookup is the FUZZYMATCH function. You can use this function to calculate the percentage similarity between any two strings, especially strings that follow name-like patterns. FUZZYMATCH allows you to compare strings in three modes: string vs string, string vs list and list vs list. The last two modes cut down the time taken to calculate percentage similarities from minutes to mere seconds.

The following example shows how to use FUZZYMATCH in string vs string mode:

Flookup tutorial #GetFlookup

The formula shown in the image above works through one row at a time and yields the following results:

Flookup tutorial #GetFlookup

This formula is sufficient for most cases, but sometimes you need more power and efficiency. In such cases, especially if you are working with long lists, using FUZZYMATCH in the list vs list mode is the better option:

Flookup tutorial #GetFlookup

The above formula works through all the rows in one fell swoop and returns results exactly the same as the first mode. The key difference here is that, whereas the string vs string might take minutes to complete long lists, this list vs list mode completes the same work in seconds.

Flookup tutorial #GetFlookup

Finally, in some cases, you might simply want to compare one string to multiple strings. To accomplish this, we use the third mode of FUZZYMATCH, and that is string vs list.

Say you have just one string in column A and a list of strings in column B. In order to compare the two columns, we can use a formula like this:

    • =FUZZYMATCH(A1,B1:B15)

This formula will return an array of percentage similarities between the first and second arguments.

That concludes our text comparison tutorial. Be sure to check out the Documentation page for details about all Flookup functions.