FUZZY MATCHING AND LOOKUP FUNCTIONS FOR GOOGLE SHEETS
YOUR DATA QUALITY PROBLEM
If you deal with merging multiple sources of data, data migration, importing data into your CRM, etc. then you have undoubtedly had data quality issues in your work. According to this infographic by IBM, poor data quality costs the United States economy about $3.1 trillion every year. This easily translates to no less than $100 per record, on average.
Data quality issues might arise when two sources of data, that have similar records, are merged. This creates duplicate records that lead to lower productivity, expensive budgets, low customer satisfaction and inefficiency. These factors consequently affect marketing, finance and general business operations negatively.
You might also face data quality issues when comparing or extracting information from different sources. This task is made harder if your data set contains typos, spelling variations, punctuation marks and the like. Having correct and up-to-date records in your data is critical to the success of your business and Flookup can help you achieve that easily.
In summary, Flookup will help you increase productivity, increase efficiency, reduce your error rate and save money.
Flookup is an advanced tool for performing fuzzy match and fuzzy lookup tasks in Google Sheets. With Flookup, you can perform the following actions regardless of variations in spelling or the existence of typos and partial matches in your data:
- Remove duplicates.
- Highlight duplicates.
- Lookup matches of strings.
- Calculate the percentage similarity between strings.
- Find the sum or average of numbers that correspond to partial matches.
- Merge cells that contain partial matches from a single column.
- Extract unique values from any column.
Flookup functions are not inhibited by the sort order of columns or rows, punctuation marks or the letter case of the text, thereby saving you from the time and hard work it takes to clean or normalize your data.
FLOOKUP IS CURRENTLY USED BY 39000+ INDIVIDUALS ACROSS 90+ COUNTRIES!
TEN REASONS WHY YOU SHOULD CHOOSE FLOOKUP
- It is simple. Flookup has familiar, intuitive and easy-to-use functions, making it easy to learn and to use.
- It is secure. Flookup is a verified add-on that does not expose, store or share any of the spreadsheet data it processes, meaning that your data remains private and for your eyes only.
- It is fast. Flookup runs on one of the fastest fuzzy matching algorithms in the world and all its functions are optimised to complete tasks in the shortest time possible.
- It is reliable. Flookup is powered by a battle-tested algorithm with millions of rows and columns of data under its belt.
- It is flexible. You can search any column with FLOOKUP and search any row with HFLOOKUP, not just the leftmost column or topmost row.
- It is dynamic. If the first match does not suit your needs, you can instruct Flookup to return the next best match until all possible matches have been exhausted.
- It has range. You can combine any number of strings as the lookupValue, giving you options to increase the specificity of your query.
- It speaks your language. Flookup is not limited to English; it can process text in most international languages.
- It offers value. Flookup not only free to use, but it also unlocks the most affordably priced premium experience when you subscribe.
- It is on the G Suite platform. Working out of the G Suite platform is full of benefits for individuals and organisations alike. Using Flookup taps directly into those benefits.
GETTING STARTED WITH FLOOKUP
- Install Flookup by clicking here.
- Activate premium features by using the secure form in your spreadsheet menu located at: Add-ons > Flookup > Manage subscription.
- Visit the Tutorial and Documentation to fully familiarise yourself with Flookup.
- Review the FAQ and Changelog in case you have any questions about Flookup.
NOTE: To view your subscription status, go to your spreadsheet menu and click: Add-ons > Flookup > About Flookup.
POPULAR FUZZY MATCHING ALGORITHMS
Fuzzy string matching (also called partial or approximate string matching) is a technique for comparing strings that might have a less than 100% match. There are different techniques that are applied by fuzzy matching algorithms and the most popular involve the use of wildcard characters (in combination with functions like VLOOKUP or INDEX/MATCH), word or phrase comparisons, regular expressions (RegEx/RegExp) and edit distance. Examples of such implementations include:
- Levenshtein Distance: This algorithm calculates and returns the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into another.
- Damerau–Levenshtein Distance: This algorithm is exactly like Levenshtein distance with one exception; it includes transpositions amongst its edits.
- Jaro–Winkler Distance: The Jaro distance between two words is the minimum number of single-character transpositions required to change one word into the other.
- n-gram: This is a contiguous sequence of n items from a given sequence of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application.
- Soundex: This is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
- However Your Brain Works: "Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteers be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe."
Flookup runs on the principles behind n-gram and Soundex.