# Flookup Documentation

The fuzzy matching algorithms are processor intensive because each value of interest must be compared to every value in the lookup table. To speed up the process, try the following:

Reduce the size of the data set by filtering it or removing stop words with NORMALIZE.

Enter arrays (e.g. A1:A1000) as the lookupValue value in order to significantly improve speed and efficiency. It also prevents the spreadsheet from hanging.

Set the threshold value to 1 in order to eliminate the "low-hanging fruit" during your first run.

## NORMALIZE

=NORMALIZE(lookupValue, stopArray)

Use the NORMALIZE function to remove diacritical marks from strings before applying FUZZYMATCH or any of the lookup functions.

The **lookupValue **parameter can either be:

A single cell e.g. NORMALIZE(A1) or,

A range of cells e.g. NORMALIZE(A1:A30000).

You can also use the following function to remove stop words. This is useful if you have words that you would like Flookup to ignore during processing. To do this, you need to feed the **stopArray** parameter with an array of words you want to ignore as shown below:

NORMALIZE(A1:A30000, B1:B500)

The second argument is a range containing the stop words (i.e. B1:B500). Please note that it is optional and, therefore, if you do not include it, the function will default to removing diacritical marks. Also note that, in this mode, NORMALIZE will return a new list of words in lowercase form, without modifying the original list.

## FUZZYMATCH

=FUZZYMATCH(string1, string2)

Use the** **FUZZYMATCH function to calculate the percentage similarity between strings and return the result in decimal form.

### FUZZYMATCH Parameters

**string1**[Required]. The string to be compared to string2.**string2**[Required]. The string to be compared to string1.

Here are a few things to note about this function:

The two arguments can be interchanged without affecting the result.

Numbers are treated as strings.

Both parameters can also be ranges e.g. =FUZZYMATCH(A1:A20000,B1:B20000) OR =FUZZYMATCH(A1,B1:B20000).

FUZZYMATCH supports multiple languages.

## FLOOKUP

=FLOOKUP(lookupValue, tableArray, lookupCol, indexNum, [threshold], [rank])

Use the FLOOKUP function to search any column of a range of cells, and then return a value from a specified column of that range. For example, suppose you have two separate lists of names contained in ranges **A2:A3000** and** B2:D5000**. In order to compare the text in both these columns you can simply type:

=FLOOKUP(A2, B2:D5000, 1, 3)

This formula searches for the value found in cell** A2 **in the first column of the range** B2:D5000**, and then** **returns the first match contained in the third column of that same range.

In order to retrieve a value and there is a chance that it might be spelled differently in **B2:D5000**, one can type:

=FLOOKUP(A2, B2:D5000, 1, 3, 0.5)

This formula searches for the value found in** **cell** A2 **in the first column** **of the range **B2:D5000** but, unlike the first example, this time the algorithm matches the first string with a 0.5** **percentage similarity minimum. This parameter is adjustable and it takes any value from 0 to 1.

Finally, unlike other solutions, Flookup is not limited to searching the left most column of a specified range. For example, one can simply search for **A2** in the second column of **B2:D5000** and return data from the first column by using the formula below**:**

=FLOOKUP(A2, B2:D5000, 2, 1)

... or search for **A2** in the second column of **B2:D5000** and return data from the third column like this:

=FLOOKUP(A2, B2:D5000, 2, 3)

### PRO TIP

For increased efficiency you can write your formula as shown below:

=FLOOKUP(A2:A1000, B2:D1000, 2, 3)

This variation takes an array (range of many cells) as the *lookupValue*, thereby improving performance.

### FLOOKUP Parameters

**lookupValue**[Required]. The value to search for in any column**tableArray**[Required]. The range of cells (database) that contains the data. The values in the user-specified column of tableArray are systematically compared to lookupValue. These values can be text, numbers, or other values.**lookupCol**[Required]. The column number in tableArray which must be searched by the algorithm. A lookupCol value of 1 searches the first column in tableArray, a lookupCol value of 2 searches second column in tableArray, and so on. Values must be greater than 0.**indexNum**[Required]. The column number in the tableArray argument from which the matching value must be returned. An indexNum argument of 1 returns the value in the first column in tableArray, an indexNum of 2 returns the value in the second column in tableArray, and so on. Values must be greater than 0.**threshold**[Optional]. A variable that specifies whether you want**rank**[Optional]. A variable that specifies the rank of all the possible matches, that you want returned from tableArray. Values range from 1 (first possible match) to*n*(the*nth*possible match). The default value is 1 and results are ranked in order of descending percentage similarity.

**NOTE:** FLOOKUP returns the lookup result and its respective percentage similarity. Please allow 2 columns for this to work.

## HFLOOKUP

=HFLOOKUP(lookupValue, tableArray, lookupRow, indexNum, [threshold], [rank])

Use the HFLOOKUP function to search any row of a range of cells, and then return a value from a specified row of that range. It works exactly like a transposed FLOOKUP function.

### HFLOOKUP Parameters

**lookupValue**[Required]. The value to search for in any row**tableArray**[Required]. The range of cells that you want to retrieve data from. The values in a user-specified row of tableArray are systematically compared to lookupValue. These values can be text, numbers, or other values.**lookupRow**[Required]. The row number in tableArray which must be searched by the algorithm. A lookupRow value of 1 searches the first row in tableArray, a lookupRow value of 2 searches second row in tableArray, and so on. Values must be greater than 0.**indexNum**[Required]. The row number in the tableArray argument from which the matching value must be returned. An indexNum argument of 1 returns the value in the first row in tableArray, an indexNum of 2 returns the value in the second row in tableArray, and so on. Values must be greater than 0.**threshold**[Optional]. A variable that specifies whether you want**rank**[Optional]. A variable that specifies the rank of all the possible matches, that you want returned from tableArray. Values range from 1 (first possible match) to*n*(the*nth*possible match). The default value is 1 and results are ranked in order of descending percentage similarity.

## SOUNDMATCH

=SOUNDMATCH(lookupValue, tableArray, lookupCol, indexNum, [rank])

Use the SOUNDMATCH function to perform lookup tasks and return values based on what their Soundex codes are.

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. It encodes similar sounding words so that they can be matched regardless of differences in spelling. You can find out more about it **here**.

SOUNDMATCH also allows you to return alternative matches using the rank parameter. These matches range from 1 (best match) to *n* (the *nth* best match).

### SOUNDMATCH Parameters

**lookupValue**[Required]. The value to search in the column of the table or range. The**tableArray**[Required]. The range of cells (database) that contains the data. You can use a reference to a range. The values in the first column of tableArray are the values searched by lookupValue. These values can be text, numbers, or other values.**lookupCol**[Required]. The column number in tableArray which must be searched by the algorithm. A lookupCol value of 1 searches the first column in tableArray, a lookupCol value of 2 searches second column in tableArray, and so on. Values must be greater than 0.**indexNum**[Required]. The column number in the tableArray argument from which the matching value must be returned. An indexNum argument of 1 returns the value in the first column in tableArray, an indexNum of 2 returns the value in the second column in tableArray, and so on. Values must be greater than 0.**rank**[Optional]. A variable that specifies the rank of all the possible matches, that you want returned from tableArray. Values range from 1 (first possible match) to*n*(the*nth*possible match). The default value is 1 and

## SUMLOOKUP

=SUMLOOKUP(lookupValue, colArray, lookupCol, [threshold])

SUMLOOKUP is the fuzzy match version of **SUMIF**. Use the SUMLOOKUP find the sum of values of a given column based on corresponding partial matches in another column.

For example, suppose you have a list of employee names containing typos or spelling variations in range **A2:A5000** and their respective salaries spanning several months contained in** B2:B5000**. In order to calculate the total remuneration due to and employee called "Michael Angel", you can type:

=SUMLOOKUP("Michael Angel", A2:A5000, B2:B5000)

Now let's suppose that his name is contained in cell **C2**. To get this sum, you can type:

=SUMLOOKUP(C2, A2:A5000, B2:B5000, 0.8)

You can also use SUMLOOKUP to merge duplicates into a single row using the same principles shown above.

### SUMLOOKUP Parameters

**lookupValue**[Required]. The value to search for in the specified column. The**colArray**[Required]. The column to be searched for possible matches.**lookupCol**[Required]. The column number from which values will be either merged (for text) or summed (for numbers).**threshold**[Optional]. The minimum level of percentage similarity below which matches in colArray will be ignored. The default value is 0.6.

## MEANLOOKUP

=MEANLOOKUP(lookupValue, colArray, lookupCol, [threshold])

MEANLOOKUP is the fuzzy match version of **AVERAGEIF**. Use the MEANLOOKUP find the average of values of a given column based on corresponding partial matches in another column.

For example, suppose you have a list of student names containing typos or spelling variations in range **A2:A5000** and test scores spanning several months contained in** B2:B5000**. In order to calculate the average test score of a student named "Gabriel Angel", you can type:

=MEANLOOKUP("Gabriel Angel", A2:A5000, B2:B5000)

Now let's suppose that his name is contained in cell **C2**. To get this average, you can type:

=MEANLOOKUP(C2, A2:A5000, B2:B5000, 0.8)

### MEANLOOKUP Parameters

**lookupValue**[Required]. The value to search for in the specified column. The**colArray**[Required]. The column to be searched for possible matches.**lookupCol**[Required]. The column number from which values will be averaged.**threshold**[Optional]. The minimum level of percentage similarity below which matches in colArray will be ignored. The default value is 0.6.

## ULIST

=ULIST(colArray, [indexNum], [threshold])

Use ULIST to remove duplicates and return unique values from a range that you have specified. This function does not modify the original range or values and they can either be exact matches or not.

### ULIST Parameters

**colArray**[Required]. The range from which you want to return unique values.**indexNum**[Optional]. The column index to analyse for unique values. The default value is 1.**threshold**[Optional]. The minimum percentage similarity between the colArray values that are not unique. Therefore a threshold value of 0.6 means that ULIST will look for partial matches and then eliminate any values with a 60 percent similarity and above. The default value is 1.

## Highlight Duplicate Values

Flookup can be used to highlight duplicates in Google Sheets based on partial matches from a single column.

To highlight duplicate values by percentage similarity, go to *Add-ons > Flookup > Highlight duplicates > By percentage* and to highlight duplicate values by sound similarity, go to *Add-ons > Flookup > Highlight duplicates > By sound*.

### By Percentage

With this function, Flookup highlights duplicate values which have a specific minimum percentage similarity between them. The steps to highlight duplicates by this method are:

Select a range.

Find and click the menu item labelled "By percentage".

A prompt is displayed requesting the user to enter the column index to analyse and the desired minimum percentage similarity. If no user input is made, values which have a 100% similarity match between them will be found and the first column of the selected range will be analysed.

All selected values with the specified percentage similarity between them will be highlighted yellow.

### By Sound

With this function, Flookup highlights duplicate values based on sound similarity between the them. The steps to highlight duplicates by this method are:

Select a range.

Find and click the menu item labelled "By sound".

A prompt is displayed requesting the user to enter the column index to analyse. If no user input is made, the first column of the selected range will be analysed.

All selected values with similar sounds will be highlighted aqua.

## Remove Duplicate Values

Flookup can be used to remove duplicates in Google Sheets based on partial matches from a single column.

To remove duplicates by percentage similarity, go to *Add-ons > Flookup > Remove duplicates > By percentage* and to remove duplicates by sound similarity, go to *Add-ons > Flookup > Remove duplicates > By sound*.

### By Percentage

With this function, Flookup finds and deletes selected rows that contain duplicate values with a specific minimum percentage similarity between them. The steps to remove duplicates by this method are:

Select values in a range.

Find and click the menu item labelled "By percentage".

A prompt is displayed requesting the user to enter the column index to analyse and the desired minimum percentage similarity. If no user input is made, values which have a hundred percent similarity match between them will be found and the first column of the selected range will be analysed.

All selected rows containing duplicate values, with the specified percentage similarity between them, will be deleted.

### By Sound

With this function, Flookup deletes selected rows that contain duplicate values based on the sound similarity between them. The steps to remove duplicates by this method are:

Select a range.

Find and click the menu item labelled "By sound".

A prompt is displayed requesting the user to enter the column index to analyse. If no user input is made, the first column of the selected range will be analysed.

All selected rows containing values with similar sounds will be deleted.