THIS HAS BEEN OUR JOURNEY, SO FAR
Hello World 👋
Hello, I'm Andrew Apell and I'm the creator of Flookup Data Wrangler. I have a background in Statistics and over 10 years of experience in Data Analytics. Part of this experience has involved coming up with innovative solutions for cleaning dirty data within short periods of time, and one of those solutions was Flookup.
I designed Flookup to be that lean but powerful solution for your data cleaning needs and, having used it personally for 8 years, I'm confident that it will be a valuable asset for you and your organisation.
How Flookup Came To Be
Flookup was created out of necessity. I was part of a team working on a project that involved cleaning and standardising thousands of rows of data. This data was some of the "dirtiest" we had ever come across, and cleaning it manually usually took each team member about a week.
So, in order to improve our results, I introduced my team to the Levenshtein and Damerau-Levenshtein algorithms that I had adapted for the project. This brought about a major improvement in overall accuracy, but the process was quite slow. Next, I turned to the Jaro-Winkler algorithm, but we quickly dropped it because the improvement over the previous two algorithms was not significant.
Eventually, I settled on developing the initial version of Flookup, which was based on the n-gram model. When the add-on was released, it reduced our task time to 30 minutes and our error rate never exceeded 1% after that point. The great benefits Flookup brought to our team's performance compelled me to share it with the world as our first public product, and thus Flookup for Google Sheets was born.
While developing Flookup, I drew inspiration from different fuzzy string matching algorithms and reinforced its core with lessons I had learned from our experiences with them.
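For readers unfamiliar with the n-gram idea mentioned above, the following is a minimal illustrative sketch in Python. It is not Flookup's actual algorithm, which is not described here. It assumes character bigrams and a Dice coefficient as the similarity score, and the function names `char_ngrams`, `ngram_similarity` and `best_match` are hypothetical names chosen for this example.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams in a lower-cased, whitespace-normalised string."""
    cleaned = " ".join(text.lower().split())
    if len(cleaned) < n:
        # Strings shorter than n are treated as a single gram (or none if empty).
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Dice coefficient over the two n-gram sets: 0.0 (no overlap) to 1.0 (identical sets)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # Assumption: two empty strings count as a perfect match.
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def best_match(query, candidates, n=2):
    """Return the candidate whose n-gram set overlaps most with the query."""
    return max(candidates, key=lambda c: ngram_similarity(query, c, n))


if __name__ == "__main__":
    names = ["John Smith", "Jane Smythe", "Joan Smits"]
    print(best_match("Jon Smith", names))
```

Because the score depends only on which short character sequences two strings share, a misspelling such as "Jon Smith" can still be paired with its closest clean counterpart without comparing the strings character by character.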
Flookup for Today and Tomorrow
Flookup Data Wrangler is fiercely independent, debt-free and completely bootstrapped. We will never sell or use information from or about our clients for any form of profit, financial or otherwise. We are a fully funded team of two and profitable through our subscription-based model. This frees us to focus on building solutions that put our clients first, without compromise.
Our focus for the near future is to increase the amount of data Flookup can process by bypassing Google's timeout policy. We are also exploring Machine Learning with a view to updating the Flookup core algorithm where necessary. In other words, we still have a long journey ahead of us.
In the meantime, you can connect with me on Twitter. Let's journey along together.