WHY AUTOMATE DEDUPLICATION?
Duplicates silently reduce trust in your data and cost teams time. Manual cleanup is error-prone and doesn't scale — automation makes deduplication repeatable, auditable and schedulable.
SAMPLE DATASET AND GOALS
We'll use a contacts sheet with name, email and phone columns. The goal: find likely duplicate rows, mark them with a confidence score, and optionally merge or flag them for review.
APPS SCRIPT WORKFLOW (COPY-PASTE)
Drop this script into Extensions → Apps Script in your sheet. The example below reads rows from a Contacts sheet, computes match signals (name normalisation, email similarity and optional phonetic keys), and writes back a match confidence score and suggested merge action in adjacent columns.
Key ideas:
- Generate lightweight match keys for fast pairwise scans.
- Use Flookup (add-on or API) to compute high-quality similarity when available.
- Write suggestions to a review sheet to avoid destructive edits.
- Batch work to stay within Apps Script quotas.
/**
* Dedupe runner: read rows, compute signals, call Flookup API for scored matches.
* Adjust column indices to match your sheet. This writes results to columns F..H.
*/
function runDedupe() {
const ss = SpreadsheetApp.getActive();
const sh = ss.getSheetByName('Contacts');
const lastRow = sh.getLastRow();
if (lastRow < 2) return; // nothing to process
const rows = sh.getRange(2,1,Math.max(0,lastRow-1),4).getValues();
const out = [];
for (let i=0;i 0) {
sh.getRange(2,6,out.length,3).setValues(out);
}
}
INTEGRATING WITH FLOOKUP
Flookup provides higher-quality similarity scores and contextual matching than simple key comparisons. Use the add-on for quick, UI-driven workflows or the Flookup API for scheduled, programmatic deduplication.
When calling the API from Apps Script: keep requests small (batch 10-50 records), cache responses where possible, and store the raw match payload in a review sheet so humans can inspect suggested merges before applying them.
// Example: minimal UrlFetchApp call (already shown in runDedupe above)
// Remember to protect your API key and store it in PropertiesService rather than inlined.
SCHEDULING, NOTIFICATIONS AND APPROVALS
Add a time-based trigger to run the dedupe regularly. Best practice: run a conservative scoring pass daily that writes suggestions to a Review sheet, then run a weekly manual approval that applies merges. You can also email a short summary when the number of high-confidence matches exceeds a threshold.
function scheduleDedupe(){
ScriptApp.newTrigger('runDedupe').timeBased().everyDays(1).atHour(2).create();
}
For notifications, generate a CSV summary and email it via MailApp.sendEmail() or integrate with Slack using an incoming webhook.
TESTING, ROLLBACK AND METRICS
Always run the script on a copy first. Store suggested edits in a separate sheet with a proposed_action column. Keep a change log (timestamp, user, rows affected) and provide a one-click rollback that replays the log to revert changes.
Track metrics: matches found, merges applied, false positive rate (from manual review). These numbers inform threshold tuning.
FINAL THOUGHTS
Automating deduplication saves time and reduces risk — but start slow. Use suggested matches in a review workflow, monitor false positives, and tune thresholds. When ready, automate the merge step for high-confidence matches only.
If you'd like, I can add a downloadable Apps Script file and a sample Google Sheet to the post so readers can copy/paste and test immediately.