HOW TO SCHEDULE DATA CLEANING FUNCTIONS
Introduction to Scheduling Functions
Automate your data cleaning by scheduling Flookup's powerful functions to run on their own. This feature uses Google Apps Script triggers to keep your datasets consistently clean and up-to-date, saving you time and reducing errors with minimal manual effort.
You can set core operations like Fuzzy Match and Extract Unique Values to run at flexible intervals, from once an hour to once a month, perfectly tailoring the automation to your workflow. Before scheduling, it is best to run the function manually on a small sample of your data to find the optimal parameters.
How to Schedule a Function
- Open the scheduling sidebar
Navigate to Extensions > Flookup Data Wrangler > Schedule data cleaning functions in your spreadsheet menu. -
Select the main function
From the top dropdown menu, choose the primary operation you want to automate. The form will dynamically update to show relevant options. The available functions are:- Fuzzy match by percentage: Finds the best match in a table based on text similarity.
- Fuzzy match by sound: Finds matches based on phonetic similarity i.e. how words sound.
- Extract unique values by percentage: Creates a unique list from a column based on text similarity.
- Extract unique values by sound: Creates a unique list based on phonetic similarity.
- Standardize text entries: Cleans text by removing stop words, punctuation or diacritics.
- Compare string similarity: Calculates the similarity score between two columns of text.
-
Choose the processing mode
- Process data to the end: The task will run until all rows in your input range are processed and then it will stop automatically.
- Process data in a loop: The task will process the input range and then repeat the process on the next scheduled run. This is ideal for data that is constantly updated.
- Set the required ranges and parameters
For range inputs e.g.Lookup_values,Table_values,Input range, first select the desired range directly in your Google Sheet. Then, click the corresponding Grab selected range button in the sidebar to populate the input field.
Adjust other parameters that appear, such as:- Lookup_column: Specify the column index e.g. "1" in your
Table_valuesrange to search for a match. - Return_column: Specify the column index in your
Table_valuesrange from which to return a result. - Stop_array: Optionally provide a cell range containing custom words or punctuation that you want to remove during standardization.
- Threshold: Set the minimum similarity score, from "0.0" to "1.0", required for a match between the lookup values and the table values.
- Lookup_column: Specify the column index e.g. "1" in your
-
Set the schedule frequency and details
- Click HOURLY or DAILY to set the basic interval.
- For daily schedules, set the approximate Time of day and the number of days between runs e.g. "1" for every day.
- For hourly schedules, set the number of hours between runs e.g. "6" for every 6 hours.
- Lock the output position
Select the cell in your Google Sheet where you want the results to start. Then, click the Grab selected cell button in the sidebar to lock this output position for each run. - Schedule the function
Click Schedule to create the trigger and close the sidebar.
Notes About Scheduling Functions
- To stop or cancel a schedule, select your function and click Reset. This will delete the trigger and clear all parameters.
- Hourly schedules may exhaust your daily quota i.e. 90 minutes for Consumer accounts, 6 hours for Workspace accounts.
- Do not change the sheet name or modify parameters after scheduling.
Issues You Might Encounter
- Triggers fail to run: Triggers have a six minute execution limit per hour, shared across all triggers. If multiple triggers refresh at once, some may be skipped or delayed.
- Excessive Google triggers: There is an undocumented limit of 300 triggers per user per project. Reduce triggers if you encounter issues.
- Inaccurate function parameters: Scheduled functions may fail if parameters are incorrect.
- Failed authorisation: Triggers run under the authority of any user with editing rights or who has opened the sidebar at least once. If a trigger does not run, check user permissions.
- Processed data appears in the wrong sheet: Schedule only one function per spreadsheet per user to avoid quota errors.
For the Visual Learners
Labels might differ slightly but the steps are the same.