EXTRACTING UNIQUE VALUES: UNIQUE VERSUS ULIST
Creating a clean record for datasets like contact addresses, marketing leads, employee records, etc. is critical to operations of any business. It vastly improves efficiency and reduces costs that would otherwise be high if left ignored. Here, we shall look at two different ways we can remove duplicates from any list when faced with such a challenge.
UNIQUE by Google
Google provides a default function called UNIQUE that can be used to remove duplicates.
Let us assume you have a dataset like the one below:
In the example above, Column A contains some duplicate text entries. To extract unique values from this list, type the following formula in Cell B2 and press ENTER/RETURN: =UNIQUE($A$2:$A$15)
This will return a list of unique values as shown below:
UNIQUE has two optional parameters i.e. "by_column" and "exactly_once". You can read more about them in the official documentation.
This function might be convenient and sufficient for some cases but, unfortunately, real-world data is rarely this well-behaved. Instead, you are more likely to come across something that looks like this:
If you take a close look at the table above, you will notice that I have adjusted the duplicates we had before so that, this time, they will have different spellings. In such cases, the UNIQUE function will fail to extract the duplicates.
ULIST by Flookup
Fortunately, Flookup uniquely suited to handle this kind of problem. To extract duplicates from this dataset, we shall use the ULIST function.
Type the following formula in Cell B2 and press ENTER/RETURN: =ULIST($A$2:$A$15)
Your data should look like this afterwards:
In the example above, ULIST considered any text entries with a 0.6 level of similarity to other text entries in the list, to be duplicates and this is its default behaviour. We are, therefore, left with a list of unique values despite having strange spelling differences in our original list.
ULIST has two optional parameters i.e. "indexNum" and "threshold". The former allows you to determine which column to extract unique values from and the latter allows you to set the level of similarity between text entries, above which they will be considered duplicates. You can read more about them in the official documentation.
Please note that ULIST will work down the list vertically, so it will retain unique values based on which text entry appeared first.