STANDARDIZE DATA BY TEXT SIMILARITY

Introduction to Standardizing Data

Unlock the power of clean data with Flookup's "Standardize Data" feature, an essential tool for data standardization and text normalization within Google Sheets. Whether you need to preprocess datasets by removing punctuation, eliminating unwanted words, stripping diacritics or normalizing URLs, this function streamlines data cleaning in Google Sheets. It can either modify your original dataset in place or create a new, standardized version that leaves your original data untouched. The guides below cover each function mode: removing punctuation or unwanted words, removing diacritical marks, and keeping only the domain or path of a URL.


Step-by-Step Guide: Removing Punctuation or Unwanted Words

  1. Open the function sidebar
    Go to Extensions > Flookup Data Wrangler > Transformation functions > Preprocess and standardize in your spreadsheet menu.
  2. Select the mode to run
    Choose the mode for the function from the drop-down menu.
  3. Select the entries to standardize
    Highlight the column of text entries to standardize and click Grab selected range.
  4. Select the entries to remove
    Highlight the cell containing the stop words (comma-separated) or the punctuation marks (with no separators) and click Grab selected range.
  5. Specify the result location
    Click an empty cell to indicate where the results should be displayed. This cell can be in any sheet within the same spreadsheet.
  6. Standardize the dataset
    Click the Standardize text entries button.
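
To make the effect of these two modes concrete, here is a minimal Python sketch of the kind of transformation involved. It is not Flookup's implementation: the function names `remove_punctuation` and `remove_words` are hypothetical, and the case-insensitive word matching and whitespace tidying are assumptions made for illustration. The inputs mirror the formats described in step 4: punctuation marks with no separators, and stop words separated by commas.

```python
def remove_punctuation(text: str, marks: str) -> str:
    """Delete every character listed in `marks`, then collapse extra spaces."""
    table = str.maketrans("", "", marks)  # map each mark to "deleted"
    return " ".join(text.translate(table).split())


def remove_words(text: str, stop_words: str) -> str:
    """Drop whole words found in the comma-separated `stop_words` string.

    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    unwanted = {w.strip().lower() for w in stop_words.split(",") if w.strip()}
    kept = [w for w in text.split() if w.lower() not in unwanted]
    return " ".join(kept)


entries = ["Acme, Inc.", "The Acme Company", "ACME (Ltd)"]

print([remove_punctuation(e, ",.()") for e in entries])
# ['Acme Inc', 'The Acme Company', 'ACME Ltd']

print([remove_words(e, "the, company") for e in entries])
# ['Acme, Inc.', 'Acme', 'ACME (Ltd)']
```

Each mode changes only what it is asked to remove, so an entry that needs both treatments would go through two passes.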

Step-by-Step Guide: Removing Diacritical Marks, Keeping URL Domain or Path

  1. Open the function sidebar
    Go to Extensions > Flookup Data Wrangler > Transformation functions > Preprocess and standardize in your spreadsheet menu.
  2. Select the mode to run
    Choose the mode for the function from the drop-down menu.
  3. Select the entries to standardize
    Highlight the column of entries to standardize and click Grab selected range.
  4. Standardize the text entries
    Click the Standardize text entries button.
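
As a rough illustration of what these modes produce, the following Python sketch strips diacritical marks and extracts the domain or path of a URL using only the standard library. It is an approximation under stated assumptions, not Flookup's own logic: the function names are hypothetical, and details such as whether a leading "www." is kept in the domain are choices made for this example.

```python
import unicodedata
from urllib.parse import urlsplit


def strip_diacritics(text: str) -> str:
    """Decompose accented letters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def _split(url: str):
    # urlsplit only recognizes the host when a scheme or "//" is present,
    # so bare entries such as "example.com/about" get a "//" prefix.
    return urlsplit(url if "//" in url else "//" + url)


def keep_domain(url: str) -> str:
    """Return only the host part of the URL (lowercased by urlsplit)."""
    return _split(url).hostname or ""


def keep_path(url: str) -> str:
    """Return only the path part of the URL, without query or fragment."""
    return _split(url).path


print(strip_diacritics("Café Zürich"))   # Cafe Zurich

url = "https://www.example.com/products/item?id=7"
print(keep_domain(url))                  # www.example.com
print(keep_path(url))                    # /products/item
```

Letters that have no decomposed form, such as "ø", pass through this sketch unchanged, so a production tool may handle them differently.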

Important Notes on Data Standardization


