HOW TO REMOVE DUPLICATES

Introduction to Removing Duplicates

Flookup can remove fuzzy duplicates in Google Sheets based on matches from a single column. To do so, go to Extensions > Flookup > Remove duplicates and click either "By percentage" or "By sound".

How to Remove Duplicates By Percentage Similarity

    1. Click the menu item labelled "By percentage".

    2. Select the text entries of one or more columns in your spreadsheet.

    3. Click "Map columns in selection" in order to index the current columns in your selection.

    4. Enter the "Index One" value. If no user input is made, then the first column of the selected range will be analysed.

    5. Enter the "Threshold" value. If no user input is made, only exact matches will be deleted.

    6. Click "Remove duplicates".
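The effect of "Index One" and "Threshold" in the steps above can be sketched in Python. This is only an illustration: the similarity measure Flookup uses is not described here, so the standard library's `difflib.SequenceMatcher` stands in for it, and the names `similarity` and `remove_duplicates_by_percentage` are hypothetical. A value is treated as a duplicate when its similarity to an earlier kept value is higher than or equal to the threshold.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two strings are identical.

    Assumption: difflib's ratio is a stand-in for Flookup's own measure.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def remove_duplicates_by_percentage(rows, index_one=1, threshold=1.0):
    """Keep the first row of each group of similar values in one column.

    index_one is 1-based, mirroring the "Index One" field. The default
    threshold of 1.0 removes exact matches only.
    """
    kept = []
    for row in rows:
        value = str(row[index_one - 1])
        # Skip the row if it is similar enough to a row that was already kept.
        if any(similarity(value, str(k[index_one - 1])) >= threshold for k in kept):
            continue
        kept.append(row)
    return kept


rows = [["Acme Ltd", "London"], ["Acme Ltd.", "Leeds"], ["Bolt Inc", "Paris"]]
print(remove_duplicates_by_percentage(rows, index_one=1, threshold=0.9))
```

With a threshold of 0.9 the second row is dropped as a near match of the first; with the default of 1.0 all three rows are kept because none of the values are identical.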

How to Remove Duplicates By Sound Similarity

    1. Click the menu item labelled "By sound".

    2. Select the text entries of one or more columns in your spreadsheet.

    3. Click "Map columns in selection" in order to index the columns in your selection.

    4. Specify the "Index One" value. If no user input is made, the first column of the selected range will be analysed.

    5. Click "Remove duplicates".
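To give a feel for what matching by sound means, here is a minimal Python sketch. It assumes a Soundex-style phonetic code; the phonetic algorithm Flookup actually uses is not stated in this guide, and the names `soundex` and `remove_duplicates_by_sound` are hypothetical.

```python
# Letter-to-digit table for American Soundex; vowels, H, W and Y have no code.
SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if there are no letters."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            previous = digit
    return (code + "000")[:4]


def remove_duplicates_by_sound(rows, index_one=1):
    """Keep the first row for each phonetic code found in the chosen column."""
    seen = set()
    kept = []
    for row in rows:
        key = soundex(str(row[index_one - 1]))
        if key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return kept


print(remove_duplicates_by_sound([["Smith"], ["Smyth"], ["Jones"]]))
```

Under this assumption "Smith" and "Smyth" share one code, so only the first of the two is kept. No threshold is involved, which matches the absence of a "Threshold" step in this mode.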

Key Points A

  • When adjusting "Index One" or "Threshold", please use the arrow buttons on the extreme right of each input field.

  • "Index One" identifies the only column that will be analysed in this mode. It can be any integer that corresponds to a column inside your selection.

  • If you are running "By percentage", then duplicates will be values within "Index One" that have a level of similarity that is higher than or equal to the "Threshold" value.

  • When this function finishes running or times out, a message will be displayed indicating how many rows have been processed up to that point.

How to Remove Duplicates Across Two Different Columns

    1. Click the menu item labelled "By percentage" or "By sound".

    2. Select a range of two or more columns in your spreadsheet.

    3. Select the drop-down option labelled "Compare two different columns".

    4. Click "Map columns in selection" in order to index the columns in your selection.

    5. Specify your "Index One" and "Index Two". These are the two columns that will be compared to each other.

    6. If you are running "By percentage", adjust the "Threshold" value to match your needs.

    7. Click "Remove duplicates".

Key Points B

  • When adjusting "Index One", "Index Two" or "Threshold", please use the arrow buttons on the extreme right of each input field.

  • Duplicates are values in "Index One" that exist in "Index Two".

  • If you are running "By percentage", duplicates have a level of similarity between them that is higher than or equal to the "Threshold" value.

  • When this function finishes running or times out, a message will be displayed indicating how many rows have been processed up to that point.
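The two-column rule in the points above can be sketched in Python for the "By percentage" case. This is an illustration under stated assumptions: `difflib.SequenceMatcher` stands in for Flookup's similarity measure, blank cells are ignored, and the function name `find_cross_column_duplicates` is hypothetical. The sketch only reports which "Index One" values count as duplicates; it does not model how the add-on removes them from the sheet.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; difflib is an assumed stand-in measure."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def find_cross_column_duplicates(rows, index_one=1, index_two=2, threshold=1.0):
    """Return "Index One" values that match some "Index Two" value.

    Both indexes are 1-based. A value counts as a duplicate when its
    similarity to any value in the other column is >= threshold.
    """
    # Collect the non-blank comparison values from the second column.
    targets = [str(row[index_two - 1]) for row in rows if str(row[index_two - 1])]
    duplicates = []
    for row in rows:
        value = str(row[index_one - 1])
        if value and any(similarity(value, t) >= threshold for t in targets):
            duplicates.append(value)
    return duplicates


rows = [
    ["Acme Ltd", "Bolt Inc"],
    ["Bolt Inc.", "Zeta LLC"],
    ["Nova Co", "Acme Ltd"],
]
print(find_cross_column_duplicates(rows, threshold=0.9))
```

At a threshold of 0.9 both "Acme Ltd" and "Bolt Inc." are flagged, because each has a close counterpart in the second column. At the default of 1.0 only "Acme Ltd" is flagged, since "Bolt Inc." is not an exact match for "Bolt Inc".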

ULIST

=ULIST(colArray, [indexNum], [threshold])

Use ULIST to remove duplicates and return unique values from a range that you have specified. This function does not modify the original range or values.

ULIST Parameters

    • colArray [Required]. The range from which you want to return unique values.

    • indexNum [Optional]. The column index to analyse for unique values. The default value is 1.

    • threshold [Optional]. The minimum percentage similarity at which "colArray" values are treated as duplicates rather than unique. For example, a "threshold" value of 0.6 means that ULIST will eliminate any values that are at least 60 percent similar. The default value is 1, which treats only exact matches as duplicates.
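The way the three parameters interact can be mimicked in Python. The sketch below is not ULIST itself: `ulist` is a hypothetical stand-in, `difflib.SequenceMatcher` is an assumed similarity measure, and returning whole rows (rather than only the analysed column) is also an assumption. It does mirror the documented defaults of 1 for both optional parameters and leaves the input untouched.

```python
from difflib import SequenceMatcher


def ulist(col_array, index_num=1, threshold=1.0):
    """Return a new list of rows whose index_num column values are unique.

    index_num is 1-based. Values whose similarity is >= threshold are
    treated as duplicates; the first occurrence is the one that is kept.
    """
    unique = []
    for row in col_array:
        value = str(row[index_num - 1])
        is_duplicate = any(
            SequenceMatcher(None, value, str(u[index_num - 1]), autojunk=False).ratio()
            >= threshold
            for u in unique
        )
        if not is_duplicate:
            unique.append(list(row))  # copy so the caller's rows are not shared
    return unique


data = [["apple"], ["appel"], ["banana"]]
print(ulist(data))                 # defaults: only exact matches are removed
print(ulist(data, 1, 0.6))         # looser threshold: near matches are removed
```

With the defaults all three rows come back, since no two values are identical. With a threshold of 0.6, "appel" is dropped as a near match of "apple", and `data` itself is unchanged in both cases.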

Using Long Run Mode

  1. Head to Extensions > Flookup > Long Run Mode (LRM) > ULIST in your spreadsheet menu.

  2. Primary range: Select a range of one or more columns and click "Grab selected range".

  3. Index One: Enter the index of the column of values in "Primary range" that you want to analyse. If no user input is made, then the leftmost column of "Primary range" will be analysed.

  4. Threshold: Enter the minimum percentage similarity. If no user input is made, then values that are exact matches will be marked as duplicates and removed.

  5. Click an empty cell in a column where you want your results to be displayed.

  6. Click "Get unique values".

Key Points

  • When adjusting "Index One" or "Threshold", please use the arrow buttons on the extreme right of each input field.

  • When ULIST LRM completes or times out, the results that have been processed up to that point will be displayed.