Fuzzy Match for Excel
Fuzzy matching versions of XMATCH and XLOOKUP, with unlimited free use.
Overview
Provides functions FUZZY.LOOKUP and FUZZY.MATCH which are similar to the native Excel functions XLOOKUP and XMATCH, but are able to perform approximate text matching (aka fuzzy matching) of the lookup_value to the lookup_array or range. Algorithms run in Excel so your data never leaves your spreadsheet, and you have unlimited free use.
FEATURES
🎯 Similar to XLOOKUP and XMATCH.
🆓 Unlimited free use.
🔒 No data is shared outside your spreadsheet.
🚀 Fast algorithm handles large datasets.
✅ Outputs lookup and match for quick comparison.
FUNCTIONS
FUZZY.MATCH (like XMATCH, with fuzzy matching)
➡️ Inputs: lookup_value, lookup_array, [threshold]
⬅️ Outputs: Lookup, Match, Similarity, and Row as custom data type
🔍 Searches for the lookup_value in the lookup_array using fuzzy text matching.
🎚️ Threshold input lets you adjust how close the match must be.
📅 Output is match row in lookup_array, use with INDEX(), etc.
FUZZY.LOOKUP (like XLOOKUP, with fuzzy matching)
➡️ Inputs: lookup_value, lookup_array, return_array, [threshold]
⬅️ Outputs: Lookup, Match, Similarity, and values from matching row in return_array
🔍 Searches for the lookup_value in the lookup_array using fuzzy text matching.
🎚️ Threshold input lets you adjust how close the match must be.
📅 Output return_array retrieves selected columns from the matching row.
APPLICATIONS
- Find and remove duplicates: Assuming you have a table with a column Name, using a formula like =BOARDFLARE.FUZZY.MATCH([@Name],FILTER([Name],ROW([Name]) <> ROW()),0.7) will enable you to see the most similar match in Name, other than @Name, which obviously matches perfectly. This is done using FILTER([Name],ROW([Name]) <> ROW()) to create an array of all the rows except the current row.
- Match leads with your CRM: If you have marketing leads and want to match them with companies already in your CRM, you can use FUZZY.LOOKUP where the lookup_value is the name of the company from the lead, and the lookup_array is the column of all of the companies in your CRM. Setting the return_array to the CustomerID column in the CRM table will return the CustomerID for the matches to enable easy import to your CRM. You can use a similar process to match the names of the leads with contacts in the CRM, since people don't use their business email address when filling out lead forms as that would just make the matching too easy.😉
- Find applicants in HR system: Let's say you have a list of job applicants and you want to see which ones are already in your HR system. Using FUZZY.LOOKUP, the lookup_value is the name of the applicant, and the lookup_array is all of the previous applicants from the HR system. The return_array is the additional columns from the HR system you want retuned with a match (e.g. date of last application, last position applied for, etc.).
- Master Data Management (MDM): Fuzzy matching can be a powerful tool in MDM to identify and link related records across different systems or datasets. This can help in deduplicating, entity resolution, linking, cleansing, and standardizing data, thereby improving the overall quality of your master data.
LIMITATIONS
The main limitation of the current version is that the algorithm works better looking for substrings in lookup_array that match the lookup_value, than the reverse. For example, a lookup_value of “Microsoft” matches a lookup_array element of “Microsoft Corporation” with an almost perfect match similarity of 0.99, because “Microsoft” is a substring of “Microsoft Corporation”. However, when done in reverse, a lookup_value of “Microsoft Corporation” will only match a lookup_array element of “Microsoft” with 0.43 similarity. So where possible, try to use the shorter string as the lookup_value. One way to achieve this is use Excel functions to remove words that are not needed for the matching (e.g. Corporation, Inc., etc.) from the lookup_value.
Also, remember that the accuracy of fuzzy matching, like most things in machine learning, might be awesome in one situation and useless in another.🙂
See Fuzzy Match for Excel on our website for more information.
Installation
The Fuzzy Match tool is available for Excel on Windows, Mac, and the web. It is available on the Microsoft AppSource store at the link below, or can be installed directly from Excel.
Functions
=BOARDFLARE.FUZZY.MATCH
Inputs | Description |
---|---|
lookup_value | text to search for in the lookup_array |
lookup_array | range of text cells containing possible matches |
threshold | minimum similarity: 0 (anything) - 1 (exact match), default is 0.4 |
Outputs | Description |
---|---|
Lookup | original lookup_value |
Match | best match found in the lookup_array |
Similarity | similarity between the Lookup and the Match |
Row | Row number of the match in lookup_array |
=BOARDFLARE.FUZZY.LOOKUP
Inputs | Description |
---|---|
lookup_value | text to search for in the lookup_array |
lookup_array | range of text cells containing possible matches |
return_array | additional columns to return with match |
threshold | minimum similarity: 0 (anything) - 1 (exact match), default is 0.4 |
Outputs | Description |
---|---|
Lookup | original lookup_value |
Match | best match found in the lookup_array |
Similarity | similarity between the Lookup and the Match |
Return Row | matching row in return_array |
Algorithm
The tool is currently using the Bitap algorithm, which uses Levenshtein distance. See Fuzzy Matching for more details on other algorithms we are considering.
FAQ
How is the similarity calculated?
Changelog
Here are the release notes.
1.1 - 2024-05-27 (current)
- Switched to using custom functions, FUZZY.MATCH and FUZZY.LOOKUP.
1.0 - 2024-05-15
- Initial version with task pane only.