Before You Really Start

Prepare Your Data

Fuzzy matching algorithms are processor intensive because, in most cases, each value of interest must be compared to every value in the lookup table. To speed up the process and improve your overall success rate, please try the following:

  • Use NORMALIZE to remove those unwanted words or punctuation marks that negatively affect the accuracy of the results, and to reduce the workload.

  • Use the FUZZYMATCH function to get a feel of the scale Flookup uses to grade percentage similarities between text entries. All our lookup functions, except SOUNDMATCH, depend on this function.

  • Enter ranges (e.g. A1:A1000) as the lookupValue value in order to significantly improve speed and efficiency. Doing this also prevents the spreadsheet from hanging.

  • Set the threshold value to 1 in order to eliminate the "low-hanging fruit" during your first run. You can gradually lower it for your subsequent runs.

  • Take advantage of the Long Run Mode. Custom functions are given exactly 30 seconds to return results. However, in Long Run Mode (LRM), they can run for 6 minutes (for regular Google accounts) or 30 minutes (for Google Workspace accounts) before timing out. To access this feature, click on the menu items located under Flookup > Long Run Mode.

These are a few simple tips to help you use Flookup better and we encourage you to apply them, if possible.

NORMALIZE

=NORMALIZE(lookupValue, [stopArray])

Use the NORMALIZE function to remove diacritical marks from text entries before applying FUZZYMATCH or any of the lookup functions. We highly recommend that you begin by using this function before running any further analysis on your data.

The lookupValue parameter can either be:

    • A single cell e.g. =NORMALIZE(A1) or,

    • A range of cells e.g. =NORMALIZE(A1:A3000).

You can also use the following function to remove unwanted words or punctuation marks. This is useful if you have words or punctuation marks that you would like Flookup to ignore during processing. To do this, you need to feed the stopArray parameter with an array of words or punctuations you want to ignore as shown below:

  • =NORMALIZE(A1:A3000, B1:B10)

  • =NORMALIZE(A1:A3000, "company, https, ltd, limited, org")

The second argument can be a range containing the unwanted words or punctuation marks (e.g. B1:B10) or a list of directly typed words or punctuation marks.

Note

  • NORMALIZE will create a new list without modifying the original list. It is therefore advised that you use this function in an empty column or sheet.

  • The second parameter is optional and, therefore, if you do not include it in your formula, the function will default to removing diacritical marks.

  • NORMALIZE will remove most common punction marks automatically but if any were missed, they can be removed based on a user input pattern like this: =NORMALIZE(A1:A3000, "-, &").

FUZZYMATCH

=FUZZYMATCH(string1, string2)

Use the FUZZYMATCH function to calculate the percentage similarity between text entries and return the result in decimal form.

FUZZYMATCH Parameters

  1. string1 [Required]. The primary text entry to be compared to string2.

  2. string2 [Required]. The secondary text entry to be compared to string1.

Here are a few things to note about this function:

  • The two arguments can be interchanged without affecting the result.

  • Both parameters can also be ranges e.g. =FUZZYMATCH(A1:A2000, B1:B2000) OR =FUZZYMATCH(A1, B1:B2000). The first formula compares a cell in one column, to a cell in a second column which is in the same row and the second formula compares a single cell in one column to all the cells in a second column.

  • FUZZYMATCH supports multiple languages.

Using the Long Run Mode

  1. Lookup range: Select the lookup range and click "Get selected range".

  2. Table to search: Select the table of one column or more you want to analyse and then click "Get selected range".

  3. Target column: Indicate the column in "Table to search", with values that you want to compare to values from "Lookup range".

  4. Click an empty cell where you want your results to be displayed.

  5. Click "Calculate percentage similarities".

Note

  • When the FUZZYMATCH LRM function times out, the results that have been processed up to that point will be displayed. Keep an eye on the tooltip popups to track progress.

  • This mode will compare cells on the same row to one another e.g. A1 to B1, C2 to D2 and so on.

The Lookup range cannot be longer than Table to search.

Further Reading

Please follow any of the links below to read more of our documentation: