TUTORIAL

FUZZYMATCH SYNTAX

=FUZZYMATCH(string1, string2) 

Use the FUZZYMATCH function to compare strings. When executed, it returns the percentage similarity between the two strings, displayed in decimal form. Below is an illustration showing what it looks like:

Facebook icon
Twitter icon
Calculating percentage similarity

Calculating Percentage Similarity

  • string1 [Required]. The string to be compared to string2.
  • string2 [Required]. The string to be compared to string1.

NOTE:

    1. The two strings can be interchanged without affecting the result.
    2. Single letter words are ignored.
    3. Numbers are treated as strings.

FLOOKUP SYNTAX

=FLOOKUP(lookupValue, tableArray, lookupCol, indexNum, [threshold], [rank])

Use the FLOOKUP function to search the any column of a range of cells, and then return a value from a specified column of that range. For example, suppose you have a list of employee names contained in the range A2:A5 and in another range B2:D5. You can choose to arrange the data in the manner shown in the following illustration:

Left lookup

Left Lookup

If you know the employee's name, you can use the FLOOKUP function to return either the Department or the Employee ID. For example, to obtain the Employee ID for “Erez Mordecai”, you can use the formula =FLOOKUP(A2, B2:D5, 1, 3). This formula searches for the value found in cell A2 in the first column of the range B2:D5, and then returns the value that is contained in the third column of that range and on the same row as the lookup value.

To obtain the Employee ID for “Erez Mordecai”, and there’s a chance that the name might be spelled differently in the database (e.g. “Erez E. Modecai”), you can use the arguments below:

=FLOOKUP(A2, B2:D5, 1, 3, 0.5)

This formula searches for the value found in cell A2 in the first column of the range B2:D5 but, unlike the first example, this time the algorithm searches for a 0.5 similarity minimum. This parameter is adjustable and it is one of the features that make Flookup versatile.

Finally, suppose that the range you want to search is not the first column of the database, as shown in the illustration below:

Two-way lookup

Two-way Lookup

To find out the department “Erez Mordecai” works in, use the following arguments:

=FLOOKUP(A2, B2:D5, 2, 1)

[Breakdown: Search for A2 in the second column of B2:D5 and return data from the first column]

To find out his Employee ID, use the following arguments:

=FLOOKUP(A2, B2:D5, 2, 3)

[Breakdown: Search for A2 in the second column of B2:D5 and return data from the third column]

NOTE: You can use the rank parameter to return other possible matches, as long as the percentage similarity between those matches and the lookup value is not less than the threshold value.

FLOOKUP Parameters

  • LookupValue [Required]. The value to search in the any column of the table or range. The lookupValue argument can be a value or a reference.
  • TableArray [Required]. The range of cells (database) that contains the data. The values in the user-specified column of tableArray are systematically compared to lookupValue. These values can be text, numbers, or other values.
  • LookupCol [Required]. The column number in tableArray which must be searched by the algorithm. A lookupCol value of 1 searches the first column in tableArray, a lookupCol value of 2 searches second column in tableArray, and so on. Values must be greater than 0.
  • IndexNum [Required]. The column number in the tableArray argument from which the matching value must be returned. An indexNum argument of 1 returns the value in the first column in tableArray, an indexNum of 2 returns the value in the second column in tableArray, and so on. Values must be greater than 0.
  • Threshold [Optional]. A variable that specifies whether you want Flookup to find an exact match or an approximate match. It is the similarity level below which results WILL NOT be returned from tableArray. Values range from 0 (lowest possible match) to 1 (exact match). The default value is 0.6.
  • Rank [Optional]. A variable that specifies the rank, of all the possible matches, that you want returned from tableArray. Values range from 1 (first possible match) to n (the nth possible match). The default value is 1 and results are ranked in order of descending percentage similarity.

PLEASE SUPPORT US BY MAKING A DONATION TODAY

Tip: To help FLOOKUP work faster, start by setting threshold to 1. After getting all possible exact matches, reduce threshold for the new (and hopefully smaller) dataset to sort out the more difficult matches.

Facebook icon
Twitter icon

HFLOOKUP SYNTAX

=HFLOOKUP(lookupValue, tableArray, lookupRow, indexNum, [threshold], [rank])

Use the HFLOOKUP function to search the any row of a range of cells, and then return a value from a specified row of that range. It works exactly like a transposed FLOOKUP function.

HFLOOKUP Parameters

  • LookupValue [Required]. The value to search in the any row of the table or range. The lookupValue argument can be a value or a reference.
  • TableArray [Required]. The range of cells that you want to retrieve data from. The values in a user-specified row of tableArray are systematically compared to lookupValue. These values can be text, numbers, or other values.
  • LookupRow [Required]. The row number in tableArray which must be searched by the algorithm. A lookupRow value of 1 searches the first row in tableArray, a lookupRow value of 2 searches second row in tableArray, and so on. Values must be greater than 0.
  • IndexNum [Required]. The row number in the tableArray argument from which the matching value must be returned. An indexNum argument of 1 returns the value in the first row in tableArray, an indexNum of 2 returns the value in the second row in tableArray, and so on. Values must be greater than 0.
  • Threshold [Optional]. A variable that specifies whether you want HFLOOKUP to find an exact match or an approximate match. It is the similarity level below which results WILL NOT be returned from tableArray. Values range from 0 (lowest possible match) to 1 (exact match). The default value is 0.6.
  • Rank [Optional]. A variable that specifies the rank, of all the possible matches, that you want returned from tableArray. Values range from 1 (first possible match) to n (the nth possible match). The default value is 1 and results are ranked in order of descending percentage similarity.

Tip: To help HFLOOKUP work faster, start by setting threshold to 1. After getting all possible exact matches, reduce threshold for the new (and hopefully smaller) dataset to sort out the more difficult matches.

SOUNDMATCH SYNTAX

=SOUNDMATCH(lookupValue, tableArray, lookupCol, indexNum, [rank])

Use the SOUNDMATCH function to perform lookup tasks and return values based on what their Refined Soundex codes are.

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. It encodes similar sounding words so that they can be matched regardless of differences in spelling. You can find out more about it here.

SOUNDMATCH also allows you to return alternative matches using the rank parameter. These matches range from 1 (best match) to n (the nth best match).

Sound similarity lookup

Sound Similarity Lookup

In the above illustration, Soundex found two strings that had matches [TRUE] and two strings that had no matches [FALSE]. The column titled “Lookup Results” shows the matches SOUNDMATCH returned from column B, for each item in column A. If no match is found, it will return "N/A".

SOUNDMATCH Parameters

  • LookupValue [Required]. The value to search in the any column of the table or range. The lookupValue argument can be a value or a reference.
  • TableArray [Required]. The range of cells (database) that contains the data. You can use a reference to a range. The values in the first column of tableArray are the values searched by lookupValue. These values can be text, numbers, or other values.
  • LookupCol [Required]. The column number in tableArray which must be searched by the algorithm. A lookupCol value of 1 searches the first column in tableArray, a lookupCol value of 2 searches second column in tableArray, and so on. Values must be greater than 0.
  • IndexNum [Required]. The column number in the tableArray argument from which the matching value must be returned. An indexNum argument of 1 returns the value in the first column in tableArray, an indexNum of 2 returns the value in the second column in tableArray, and so on. Values must be greater than 0.
  • Rank [Optional]. A variable that specifies the rank, of all the possible matches, that you want returned from tableArray. Values range from 1 (first possible match) to n (the nth possible match). The default value is 1 and results are ranked alphabetically (ascending order).

PLEASE SUPPORT US BY MAKING A DONATION TODAY

LATINIZE SYNTAX

=LATINIZE(str)

Use the LATINIZE function to remove diacritical marks from strings before applying FUZZYMATCH or any of the lookup functions to them.

The str parameter can either be:

  • A single cell e.g. LATINIZE(A1) or,
  • A range of cells e.g. LATINIZE(A1:A20)

HIGHLIGHTING ROWS WITH DUPLICATE VALUES

Flookup can be used to find and highlight duplicates from a single column. The By percentage feature is powered by the FUZZYMATCH algorithm and the By sound feature is powered by Refined Soundex.

You can find duplicate values either by percentage similarity or by sound similarity. To find duplicate values by percentage similarity, use the menu item labelled "By percentage" [Flookup > Find duplicates > By percentage] and to find duplicate values by sound similarity, use the menu item labelled "By sound" [Flookup > Find duplicates > By sound].

Facebook icon
Twitter icon
Highlighting cells with duplicate values

Highlighting Cells with Duplicate Values

By Percentage

With this function, Flookup finds duplicate values which have a specific minimum percentage similarity between them. The steps to highlight duplicate values by this method are:

  1. Select a range.
  2. Find and click the menu item labelled "By percentage".
  3. A prompt is displayed requesting the user to enter the column index to analyse and the desired minimum percentage similarity. If no user input is made, values which have a 100% similarity match between them will be found and the first column the selected range will be analysed.

All selected values with the specified percentage similarity between them will be highlighted yellow.

By Sound

With this function, Flookup finds and highlights duplicate values based on sound similarity (as pronounced in English). The steps to highlight duplicate values by this method are:

  1. Select a range.
  2. Find and click the menu item labelled "By sound".
  3. A prompt is displayed requesting the user to enter the column index to analyse. If no user input is made, the first column the selected range will be analysed.

All selected values with similar sounds will be highlighted aqua.

PLEASE SUPPORT US BY MAKING A DONATION TODAY

DELETING ROWS WITH DUPLICATE VALUES

Flookup can be used to delete rows with duplicate values based on a single column. The By percentage feature is also powered by the FUZZYMATCH algorithm, whereas the By sound feature is powered by Refined Soundex.

You can delete rows either by percentage similarity or by sound similarity. To delete rows by percentage similarity, use the menu item labelled "By percentage" [Flookup > Delete duplicates > By percentage] and to delete rows by sound similarity, use the menu item labelled "By sound" [Flookup > Delete duplicates > By sound].

Deleting rows with duplicate values

Deleting Rows with Duplicate Values

By Percentage

With this function, Flookup finds and deletes rows containing duplicate values which have a specific minimum percentage similarity between them. The steps to delete rows by this method are:

  1. Select values in a range.
  2. Find and click the menu item labelled "By percentage".
  3. A prompt is displayed requesting the user to enter the column index to analyse and the desired minimum percentage similarity. If no user input is made, values which have a 100% similarity match between them will be found and the first column the selected range will be analysed.

All selected rows containing duplicate values, with the specified percentage similarity between them, will be deleted.

By Sound

With this function, Flookup finds and deletes rows containing duplicate values based on sound similarity (as pronounced in English). The steps to delete rows by this method are:

  1. Select a range.
  2. Find and click the menu item labelled "By sound".
  3. A prompt is displayed requesting the user to enter the column index to analyse. If no user input is made, the first column the selected range will be analysed.

All selected rows containing values with similar sounds will be deleted.