FUZZY MATCHING AND LOOKUP FUNCTIONS FOR GOOGLE SHEETS
YOUR DATA QUALITY PROBLEM
If you deal with merging multiple sources of data, data migration, exporting and importing data, etc. then you have undoubtedly had data quality issues in your work. According to this infographic by IBM, poor data quality costs the United States economy about $3.1 trillion every year. That easily translates to an average of $100 per database record for businesses in comparative economies, U.S inclusive.
Data quality issues might arise when data is merged from different sources. This can create duplicate records that lower productivity, increase budget costs, lower customer satisfaction and reduce labour efficiency. You might also face data quality issues while attempting to compare data sets that contain typos, spelling variations, punctuation marks and other related complexities.
These are just some of the issues that you can avoid or resolve by adding Flookup to your software stack. Flookup will help you to increase productivity, reduce your error rate and save money.
Flookup is an advanced tool for performing fuzzy match and fuzzy lookup tasks in Google Sheets. With Flookup, you can perform the following actions regardless of variations in spelling or the existence of typos and partial matches in your data:
- Remove duplicates.
- Highlight duplicates.
- Find string matches.
- Calculate the percentage similarity between strings.
- Find the sum or average of numbers based on corresponding matches.
- Merge cells based on matches from a single column.
- Extract unique values from any column.
Flookup functions are not inhibited by the sort order of columns or rows, punctuation marks or the letter case of the text, thereby saving you from the time and hard work it takes to clean or normalize your data.
FLOOKUP IS ALREADY TRUSTED BY 39000+ INDIVIDUALS AND 70+ ORGANISATIONS ACROSS 90 COUNTRIES!
TEN REASONS WHY YOU SHOULD CHOOSE FLOOKUP
- It is simple. Flookup has familiar, intuitive and easy-to-use functions, making it easy to learn and to use.
- It is secure. Flookup is a verified add-on that does not expose, store or share any of the spreadsheet data it processes, meaning that your data remains private and for your eyes only.
- It is fast. Flookup runs on one of the fastest fuzzy matching algorithms in the world and all its functions are optimised to complete tasks in the shortest time possible.
- It is reliable. Flookup is powered by a battle-tested algorithm with millions of rows and columns of data under its belt.
- It is flexible. You can search any column with FLOOKUP and search any row with HFLOOKUP, not just the leftmost column or topmost row.
- It is dynamic. If the first match does not suit your needs, you can instruct Flookup to return the next best match until all possible matches have been exhausted.
- It has range. You can combine any number of strings as the lookupValue, giving you options to increase the specificity of your query.
- It speaks your language. Flookup is not limited to English; it can process text in most international languages.
- It offers value. Not only is Flookup free, but it also Flookup unlocks the most affordably priced premium experience when you complete your subscription.
- It is on the G Suite platform. Working out of the G Suite platform is full of benefits for individuals and organisations alike. Using Flookup taps directly into those benefits.
GETTING STARTED WITH FLOOKUP
- Install Flookup by clicking here.
- Activate premium features by submitting your email through the secure form in your spreadsheet menu located at: Add-ons > Flookup > Manage subscription. Please make sure that you are logged into the right account for authentication to succeed.
- Visit the Documentation to fully familiarise yourself with Flookup.
- Review the FAQ and Changelog in case you have any questions about Flookup.
NOTE: To view your subscription status, go to your spreadsheet menu and click: Add-ons > Flookup > About Flookup.
POPULAR FUZZY MATCHING ALGORITHMS
Fuzzy string matching (also called approximate or partial string matching) is a technique for comparing strings that might have a less than 100% match. There are different techniques that are applied by fuzzy matching algorithms and the most popular involve the use of wildcard characters (in combination with functions like VLOOKUP or INDEX/MATCH), word or phrase comparisons, regular expressions (RegEx/RegExp) and edit distance. Examples of such implementations include:
- Levenshtein Distance: This algorithm calculates and returns the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into another.
- Damerau–Levenshtein Distance: This algorithm is exactly like Levenshtein distance with one exception; it includes transpositions amongst its edits.
- Jaro–Winkler Distance: The Jaro distance between two words is the minimum number of single-character transpositions required to change one word into the other.
- n-gram: This is a contiguous sequence of n items from a given sequence of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application.
- Soundex: This is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
- However Your Brain Works: "Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteers be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe."
Flookup runs on the principles behind n-gram and Soundex.