Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 3 Next »

Matching Data from Different Tables

When Merging two or more Tables, we can use the built in index type to perform different types of merging. This is available on a pipe. For complete details search for the Index Type on the Pipe help page. Here will look at the following two matches:

  1. Exact Match: The pipe retrieves data from its cache based on an "Exact Match" lookup with the values provided.
  2. Near Match: The pipe will retrieve data from its cache based on a "Near Match" lookup, taking into account the numeric value provided by Maximum Number of Edits Expression which sets a tolerance for near match. For example, if the index key is "Smyhte" and the result of the index expression is "Smith" this would still be a match providing that the allowed number of edits is 3 or more because the two values are 3 changes away from one another. Specifically, substitute the 'i' for a 'y', transpose the 't' and the 'h' and insert an 'e' at the end.

This is standard functionality however you can create your own attributes in your tables specifically for matching, for example creating a version of a company name with business suffixes removed such as Ltd and Limited to improve matching. Consider other standardisations that can be applied, such as replacing accented characters with a normalised version so that Anthóny become Anthony which will standardise matching.

When matching with a Near Match it is also possible pass the Maximum Number of Edits Expression a variable, this allows you to allow a number of edits based on the length of your value.

For example, if you hard code a value of 3 for your Maximum Number of Edits Expression, names such as Lee and Zoe would match giving you a lot of false positive matches. Consider using the function stringLength to calculate a percentage change you would allow e.g. 33%, you can use the round function or one of its variants to create an integer.

Maximum Number of Edits Expression
do(
	// Calculate 33% of the String
	$i = stringLength(_out.myString)/0.33,

	// Round the value to an integer
	$i = round($i, 0),

	// Output
	$i
)

Deduplicating Data in a Single Table

self referencing pipes and updates


Checking if a Value Already Exists



  • No labels