REFERENCE GUIDE TO CUSTOM FUNCTIONS
Introduction to Custom Functions
Our custom functions offer a robust suite of tools for cleaning and standardising data directly within Google Sheets. These functions allow you to perform tasks such as normalising text entries, calculating similarity scores, conducting fuzzy lookups, matching data based on phonetic similarity and extracting unique values. Each function is designed to be intuitive and user-friendly, requiring only straightforward formulas to execute powerful data transformations.
NORMALIZE
=NORMALIZE(lookup_value, stop_array, threshold, "operation")
Use the NORMALIZE function to remove unwanted words, diacritics and punctuation marks from text entries. This improves the data cleaning process by removing parts of text entries or whole text entries that could negatively affect the accuracy of the fuzzy matching process. Moreover, with less data to process, the speed of the data cleaning process could be greatly improved.
NORMALIZE Parameters
- lookup_value: The text or range to be normalised.
- stop_array: Words to remove (optional).
- threshold: Similarity threshold (optional).
- operation: Type of normalisation (optional).
FUZZYSIM
=FUZZYSIM(left_string, right_string)
Use the FUZZYSIM function to calculate the percentage similarity between text entries and return the result in decimal form. You can compare single cells or ranges to one another, e.g. A1 vs B1, A1 vs B1:B100 or A1:A100 vs B1:B100, but left_string should never hold more rows of data than right_string.
FUZZYSIM Parameters
- left_string: The first text or range.
- right_string: The second text or range.
FLOOKUP
=FLOOKUP(lookup_value, table_array, lookup_col, index_num, threshold, merge_col)
Use the FLOOKUP function to search any column and then return a value from a specified column of that range, paired with its respective percentage similarity. For example, it can take values in range A1:A1000, compare it to values in C1:E5000 and return values from a single column like D1:D5000, based on the matches that are found.
FLOOKUP Parameters
- lookup_value: The value to search for.
- table_array: The range to search in.
- lookup_col: The column to match against.
- index_num: The column number to return a value from.
- threshold: Similarity threshold (optional).
- merge_col: Column to merge results (optional).
QLOOKUP
=QLOOKUP(lookup_value, table_array, lookup_col, index_num, threshold, "operation")
The QLOOKUP function is a variant of FLOOKUP, returning only the lookup result without the relative percentage similarity. The operation parameter determines the matching algorithm: by_set (for subtle differences) or by_union (for speed).
HFLOOKUP
=HFLOOKUP(lookup_value, table_array, lookup_row, index_num, threshold)
This function works exactly like a transposed FLOOKUP. It is designed for horizontally oriented datasets and only returns the match without its relative percentage similarity.
SOUNDMATCH
=SOUNDMATCH(lookup_value, table_array, lookup_col, index_num)
Use the SOUNDMATCH function to lookup values in any column and return values based on a refined version of the Soundex algorithm.
SOUNDMATCH Parameters
- lookup_value: The value to search for.
- table_array: The range to search in.
- lookup_col: The column to match against.
- index_num: The column number to return a value from.
ULIST
=ULIST(col_array, index_num, threshold, "operation")
You can use the ULIST function to return unique values from any list of text entries based on their relative percentage or sound similarities. If you like the UNIQUE function in Google Sheets, you will love this function.
ULIST Parameters
- col_array: The column or range to process.
- index_num: The column number to return a value from.
- threshold: Similarity threshold (optional).
- operation: Matching algorithm (optional).
Possible Error Messages
- !NOTICE: No access for your current plan. You have not activated the right plan for the function you are attempting to use.
- !ERROR: Invalid index value. You have entered an index number that is less than 1 or greater than the number of columns available in the search table.
- !NOTICE: No match found. There are no matches available for the current parameters you have set.
- !ERROR: Invalid threshold value. You have entered a figure that is less than 0 or greater than 1.
- !ERROR: Invalid operation. You are running a function that supports different operations but have entered a wrong or unrecognised operation.
Enhanced Menu Functions
To access Menu functions, navigate to Extensions > Flookup Data Wrangler in your spreadsheet menu. These functions offer a 6-minute timeout, significantly longer than the 30-second limit of custom functions, allowing you to clean much more data and faster. They also support advanced data cleaning operations tailored to your needs. Follow the links below: