STANDARDISE DATA BY TEXT SIMILARITY

How to Standardise Datasets

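This function can either modify the original dataset in place or leave the original dataset unchanged, depending on the mode you run. The modes covered in the steps below are removing punctuation or unwanted words, removing diacritical marks, and keeping only the domain or path of a URL.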


Step-by-Step: Remove Punctuation or Unwanted Words

  1. Open the function sidebar
    Go to Extensions > Flookup Data Wrangler > Transformation functions > Preprocess and standardise in your spreadsheet menu.
  2. Select the mode to run
    Choose the mode for the function from the drop-down menu.
  3. Select the entries to normalise
    Highlight the column of text entries to normalise and click Grab selected range.
  4. Select the entries to remove
    Highlight the cell containing the stop words (comma-separated) or the punctuation marks (with no separator between them) and click Grab selected range.
  5. Specify the result location
    Click an empty cell to indicate where the results should be displayed. This cell can be in any sheet within the same workbook.
  6. Normalise the dataset
    Click the Normalise text entries button.
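
To make the effect of these steps concrete, here is a minimal Python sketch of the same kind of clean-up. It is not Flookup's implementation: the function names remove_punctuation and remove_stop_words and the sample entries are hypothetical, and the matching rules (case-insensitive, whole words only) are assumptions. It only mirrors the input formats described above, with punctuation marks given as one unseparated string and stop words given as a comma-separated string.

```python
def remove_punctuation(text: str, marks: str) -> str:
    """Delete every character listed in `marks` (no separators) from `text`."""
    return text.translate(str.maketrans("", "", marks))


def remove_stop_words(text: str, stop_words: str) -> str:
    """Drop whole words that appear in a comma-separated stop-word string.

    Assumption: matching is case-insensitive and applies to whole words only.
    """
    blocked = {w.strip().lower() for w in stop_words.split(",") if w.strip()}
    kept = [w for w in text.split() if w.lower() not in blocked]
    return " ".join(kept)


# Hypothetical sample column of text entries.
entries = ["Acme, Inc.", "The Widget Co.", "Globex (Ltd)"]

# Strip the punctuation marks first, then the stop words.
no_marks = [remove_punctuation(e, ".,()") for e in entries]
cleaned = [remove_stop_words(e, "the, inc, ltd, co") for e in no_marks]

print(cleaned)  # ['Acme', 'Widget', 'Globex']
```

Removing punctuation before stop words matters in this sketch: "Inc." only matches the stop word "inc" once the trailing full stop is gone.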

Step-by-Step: Remove Diacritical Marks, Keep URL Domain or Path

  1. Open the function sidebar
    Go to Extensions > Flookup Data Wrangler > Transformation functions > Preprocess and standardise in your spreadsheet menu.
  2. Select the mode to run
    Choose the mode for the function from the drop-down menu.
  3. Select the entries to normalise
    Highlight the column of entries to normalise and click Grab selected range.
  4. Normalise the text entries
    Click the Normalise text entries button.
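
As a rough illustration of what these modes produce, the following Python sketch uses only the standard library. It is an approximation under stated assumptions, not Flookup's own logic: the function names are hypothetical, and details such as whether a "www." prefix or a query string is kept may differ in the add-on.

```python
import unicodedata
from urllib.parse import urlsplit


def remove_diacritics(text: str) -> str:
    """Strip combining marks, e.g. 'é' becomes 'e'.

    Letters with no decomposed form (such as 'ø') are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def keep_url_domain(url: str) -> str:
    """Return the network location of a URL.

    Assumption: the URL includes a scheme such as https://; without one,
    urlsplit returns an empty domain.
    """
    return urlsplit(url).netloc


def keep_url_path(url: str) -> str:
    """Return only the path of a URL, without the query string or fragment."""
    return urlsplit(url).path


print(remove_diacritics("Crème Brûlée"))  # Creme Brulee
print(keep_url_domain("https://www.example.com/docs/guide?page=2"))  # www.example.com
print(keep_url_path("https://www.example.com/docs/guide?page=2"))  # /docs/guide
```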
