Standardise Phone Numbers Without Writing Formulas
The Problem: Phone Number Format Chaos
Phone numbers should be simple. Dial the number, it connects. In theory. In practice, phone numbers stored in spreadsheets are a chaos of formats.
Your CRM might store a number as +1-555-123-4567. A CSV import uses (555) 123-4567. Legacy systems store 5551234567 without any separators. International contacts are stored as +44 20 7946 0958. Some entries include extensions: 555-123-4567 x123. Others include notes: 555-123-4567 (main line).
When these numbers flow into your phone system, email integrations or appointment scheduling tools, the mismatches cause problems. Your phone system cannot dial (555) 123-4567 because it expects +15551234567. Your email integration cannot find phone numbers wrapped in text. Your API rejects formats it does not recognise.
The business impact is real: contact lists become unreliable, customer communications are delayed and integration failures go unnoticed until they break critical workflows.
Why Phone Number Formulas Are Complex
The obvious solution is to write a formula. Unfortunately, phone number formatting is deceptively complex.
Approach 1: Simple Removal tries to strip all non-digit characters: =REGEX(A2,"[^0-9]","","g"). This works for basic cases but loses important information. Is 15551234567 a US number with country code or missing a digit? You cannot tell.
Approach 2: Conditional Logic attempts to detect the format, remove extraneous characters, then reformat: if it starts with +1, keep it; if it is 10 digits, add the country code; if it has 11 digits, check if the first digit is 1. This logic gets long and fragile. Every new format requires code modifications.
Approach 3: Complex Regex uses regular expressions to capture different phone number patterns and reformat them. For a single country, this works. For international numbers, multiple formats and legacy data? The regex becomes unmaintainable.
All three approaches share a common problem: they anticipate format variations upfront. When a new variation appears, the formula breaks or produces incorrect results.
Introducing Learn from Examples
The Learn from Examples feature takes a different approach. Instead of writing rules, you show the system what you want.
You provide a few examples of phone numbers in various formats and what the standardised version should be. The pattern synthesis engine analyses these examples, identifies the underlying transformation logic and applies it to your entire contact list. Complex phone number logic emerges from simple examples.
Case Study Walkthrough
Step 1: Identify Format Variations
The telecommunications company had over 85,000 phone numbers across multiple contact systems. They identified the most common formats: domestic US numbers with various separators, international numbers with country codes and legacy numbers without formatting.
Step 2: Create Example Pairs
In a dedicated sheet, the team created two columns. Column A contained the messy phone numbers exactly as they appeared. Column B contained the desired standardised format: country code plus area code plus local number, all digits with spaces between sections.
| Dirty (Original Format) | Clean (Standardised) |
|---|---|
+1-555-123-4567 |
+1 555 123 4567 |
(555) 123-4567 |
+1 555 123 4567 |
555-123-4567 |
+1 555 123 4567 |
5551234567 |
+1 555 123 4567 |
+44 20 7946 0958 |
+44 207 946 0958 |
555-123-4567 x123 |
+1 555 123 4567 |
Step 3: Test the Pattern
The team selected both columns and clicked Test. The pattern synthesis engine detected: Remove all non-digit characters. If missing country code, add +1. Reformat to country code plus 3-digit area code plus 3-digit exchange plus 4-digit number, separated by spaces. Remove extensions.
A preview showed how the system would transform a sample of contact numbers. The results matched the intended output.
Step 4: Configure and Schedule
The team selected their full phone number column as input and set the output range. They configured the schedule to process all 85,000 numbers in one batch (the auto-chaining feature handled the execution limit) and set it to repeat monthly for new incoming contacts.
Step 5: Verify and Deploy
The first run completed in approximately 4 minutes. All 85,000 phone numbers were standardised to the consistent format. The team spot-checked 50 random rows and verified accuracy. International numbers, US numbers, legacy formats and numbers with extensions were all handled correctly.
How Pattern Synthesis Works
The Learn from Examples feature applies sophisticated analysis to identify phone number transformation logic:
- Analyse the Differences: The system compares each dirty number to its clean counterpart and identifies what changed. Dashes removed? Parentheses stripped? Characters added or reordered?
- Extract Common Patterns: It looks across all your example pairs to find patterns common to all transformations. If 100 percent of your examples strip non-digit characters and add country codes, that becomes part of the learned rule.
- Build a Rule Pipeline: The system constructs a pipeline of transformations (remove characters, reorder sections, add prefixes, trim suffixes) that, when applied in sequence, standardise all variations to your desired format.
- Validate Against Examples: Before applying the rule to your full contact list, the system verifies that the discovered rule correctly transforms all your example pairs.
- Apply to Full Dataset: Once validated, the rule is applied to every phone number in your input range.
The Business Impact
For the telecommunications company, the standardisation delivered measurable improvements:
| Metric | Before Learn from Examples | After Learn from Examples | Improvement |
|---|---|---|---|
| Phone number format consistency | 47% | 100% | 53 percentage points |
| Phone system integration success rate | 82% | 99% | 17 percentage points |
| Manual format corrections per month | 340 | 12 | 96% reduction |
| Time to process new contacts | Manual review | Automatic | 100% automated |
| API integration errors | 2.3% failure rate | 0.08% failure rate | 96% fewer errors |
The largest operational benefit was eliminating recurring manual format corrections. Customer support no longer needed to request contact information be reformatted. Integrations worked reliably and new contacts were automatically standardised as they entered the system.
Getting Started with Your Own Data
Assess Your Phone Number Formats
Before you start, gather 5 to 7 phone numbers that represent the different formats in your contact list. Note the variations: country codes, separators, extensions, text annotations. This diversity ensures your examples capture the full transformation.
Create Example Pairs
In a new sheet, create two columns. Column A has the messy phone numbers, column B has the standardised versions. Decide on your target format upfront: domestic with country code? International with spaces? Just digits? Your examples should all follow the same target format.
Test and Validate
Use the Test function to preview the detected rule. If it matches your intention, proceed. If not, add another example pair that clarifies the expected transformation and test again.
Schedule and Monitor
Select your full phone number column and configure the frequency. For ongoing contact list updates, choose Process in a loop. Monitor the first execution to verify that all numbers are standardised correctly.
For detailed instructions, see the Learn from Examples documentation.
Frequently Asked Questions
What if my phone numbers include extensions?
If you want to preserve extensions, show that in your examples: 555-123-4567 x123 becomes +1 555 123 4567 x123. If you want to remove extensions, show that: 555-123-4567 x123 becomes +1 555 123 4567. The system learns from your examples.
Can it handle international numbers?
Yes. Provide examples that show your international numbers in the desired format. If you want country codes preserved, show that. If you want them stripped for certain countries, show examples of that too. The system learns the pattern from your examples.
What if I have mixed domestic and international numbers?
You can create one schedule if your standardised format handles both (for example, all numbers have country codes). If your standardised format differs between domestic and international, you may need to create separate schedules for each type or pre-filter your data.
Does it handle numbers with text annotations?
If your examples show annotations being removed, the system will remove them. If some annotations are important and should be preserved, you can show that in your examples and the system will learn which annotations to keep and which to discard.