PREPROCESS DATA BY TEXT SIMILARITY

Introduction to Data Preprocessing

In this section, you will learn how to use two powerful Flookup functions that can make your data cleaning easier, faster and less error-prone: NORMALIZE and FUZZYMATCH.

NORMALIZE can improve the quality and consistency of your data by removing or formatting text entries that might interfere with the fuzzy matching process.
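To give you a feel for what this kind of cleanup involves, here is a minimal Python sketch of text normalization. It is an illustration only and not Flookup's implementation: the helper name `normalize_text` and the specific rules (lowercasing, dropping accents and punctuation, collapsing spaces) are assumptions chosen for the example.

```python
import re
import unicodedata


def normalize_text(value: str) -> str:
    """Hypothetical helper, not a Flookup function: basic text cleanup."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Replace anything that is not a word character or whitespace with a space.
    cleaned = re.sub(r"[^\w\s]", " ", without_marks.lower())
    # Collapse runs of whitespace and trim both ends.
    return " ".join(cleaned.split())


if __name__ == "__main__":
    for raw in ["  Café,  Ltd. ", "ACME-Corp!", "New   York"]:
        print(repr(raw), "->", repr(normalize_text(raw)))
```

Entries that differ only in case, accents, punctuation or spacing come out identical after this step, which is why cleaning of this sort tends to help any later comparison.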

FUZZYMATCH can help you understand your data better by showing you how similar your text entries are. It also gives you a glimpse of the underlying mechanism that drives the other Flookup functions.
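As a rough illustration of what a similarity score between two text entries looks like, the sketch below uses `difflib.SequenceMatcher` from the Python standard library. This is an assumption made for the example: Flookup's own matching algorithm is not described here and may work differently, and the helper name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: similarity ratio between 0.0 and 1.0."""
    # Lowercase and trim both entries so trivial differences are ignored.
    left = a.lower().strip()
    right = b.lower().strip()
    return SequenceMatcher(None, left, right).ratio()


if __name__ == "__main__":
    pairs = [
        ("Acme Ltd", "acme ltd"),
        ("Jon Smith", "John Smith"),
        ("Apple", "Zebra"),
    ]
    for first, second in pairs:
        print(f"{first!r} vs {second!r}: {similarity(first, second):.2f}")
```

A score near 1.0 means the two entries are almost identical, while a score near 0.0 means they share very little. Looking at scores like these across your data can help you decide how strict a match you need.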

NORMALIZE

NORMALIZE modes can be divided into two broad categories:


To normalize text entries using a mode from the first category, follow the steps below:

NORMALIZE Modes Explained

Key Points on NORMALIZE

FUZZYMATCH

Key Points on FUZZYMATCH