What is Flookup?
Flookup is a fuzzy matching and lookup add-on for Google Sheets.
It can be used to:
- Calculate the percentage similarity between strings.
- Fuzzy lookup or search data sets.
- Highlight duplicate values.
- Remove duplicate values.
- Extract a list of unique values.
Advantages of Using Flookup
- It has a familiar, intuitive and easy-to-understand syntax.
- It is powered by a battle-tested algorithm with millions of rows of "dirty" data under its belt.
- It is completely flexible. You can search any column to the left or right of your target column with FLOOKUP, and you can also search any row above or below your target row.
- It is reliable. Flookup is built to compare text based on percentage similarities that are predictable and consistent with most estimates.
- It is fast. Flookup runs one of the fastest fuzzy matching algorithms in the world and all its functions are optimised to complete tasks in the shortest time possible.
- It is convenient. Working within the G Suite platform offers many benefits for individuals and organisations alike, and it is backed by Google.
- It is dynamic. If the first result doesn't suit your needs, you can instruct Flookup to return the next best match until all possible matches have been exhausted.
- It has range. You can combine any number of strings as the lookupValue, giving you options to increase the specificity of your query.
- It offers value for money. We currently follow a PWYW model so you get to pay what you think Flookup is worth.
- It is private and secure. Flookup neither stores the data it processes nor shares it with third parties, meaning that your data is safe from snooping. In addition, the Flookup code is hosted in a secure environment which can only be accessed through multi-factor authentication, thereby preventing unauthorised code manipulation.
We are currently offering Flookup for free, supported by donations from users like you. Please keep checking this page for any changes to that status.
For now, you can visit our donation page to find out how to support us.
Getting Started with Flookup
- Install Flookup from the G Suite Marketplace.
- Once installed, access the functions by typing a function name such as =FLOOKUP or =ULIST within any cell of your Google spreadsheet. You can also highlight or delete selected rows containing duplicate values by accessing the respective functions through the add-on menu.
- Visit the tutorial page to learn how to use the add-on.
- Visit the changelog to keep up to date with important changes to Flookup.
Popular Fuzzy Matching Algorithms
Fuzzy matching (also called approximate string matching) is a technique for comparing strings that may not match exactly. Fuzzy matching algorithms apply different techniques, and the most popular involve the use of wildcard characters, word or phrase comparisons, regular expressions and edit distance. Examples include:
- Levenshtein Distance: This algorithm calculates and returns the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into another.
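- Damerau–Levenshtein Distance: This algorithm extends Levenshtein distance with one additional edit: the transposition of two adjacent characters counts as a single edit.
- Jaro–Winkler Distance: The Jaro similarity between two words is computed from the number of characters they share within a limited window and the number of those shared characters that appear in a different order (transpositions). The Winkler variant raises the score for words that begin with the same prefix.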
- n-gram: This is a contiguous sequence of n items from a given sequence of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application. Two strings can be compared by how many n-grams they have in common.
- Soundex: This is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
- The Brain: "Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteers be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe."
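The two edit-distance entries in the list above can be sketched in a few lines of Python. This is a minimal illustration, not code from Flookup: the function names levenshtein and osa_distance are hypothetical, and the second function implements the restricted "optimal string alignment" form of Damerau–Levenshtein, in which no substring is edited more than once.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Levenshtein distance that also allows swapping two adjacent characters.

    Assumption: this is the restricted (optimal string alignment) variant.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
            # Adjacent transposition, e.g. "ab" -> "ba", counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

For example, levenshtein("ab", "ba") counts two substitutions, while osa_distance("ab", "ba") counts a single transposition.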
Flookup uses modified versions of n-gram and refined Soundex to analyse text, depending on the function being used.
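To give a feel for these two families of techniques, here is a rough Python sketch of a character n-gram similarity score and of classic American Soundex. It is an illustration under stated assumptions only: Flookup's modified n-gram and refined Soundex implementations are not published here and may differ, the names ngrams, ngram_similarity and soundex are hypothetical, and the Sorensen-Dice overlap is just one of several ways to turn shared n-grams into a percentage.

```python
from collections import Counter


def ngrams(text: str, n: int = 2) -> Counter:
    """Count the contiguous character n-grams in text (case-insensitive)."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Sorensen-Dice overlap of the two n-gram multisets, as a percentage."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:
        # Both strings are shorter than n, so fall back to exact comparison.
        return 100.0 if a.lower() == b.lower() else 0.0
    shared = sum((ga & gb).values())  # multiset intersection
    return 100.0 * 2 * shared / total


# Classic American Soundex digit groups (assumption: not the "refined" variant).
_CODES = {
    ch: digit
    for digit, group in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items()
    for ch in group
}


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if name has no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "hw":  # h and w do not separate letters with the same code
            prev = digit
    return (code + "000")[:4]
```

With this sketch, "night" and "nacht" share one bigram out of eight and score 25.0, while "Robert" and "Rupert" both encode to R163 and would therefore be treated as a phonetic match.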