PhixFlow Help

Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 2 Current »

Inexact Matching Strings

Scenario

Often, data will get out of alignment, especially in cases where the data is being typed into multiple systems. Names, Addresses and other free form entries can contain errors, and matching up records where there are such errors can be very difficult.

Create 2 streams. Join / Merge the two streams together using a text value as the key, but all joins where the 2 strings are "almost" the same.

Solution:

  • Merge Streams based on exact string matches
  • Excluding exact matches, cycle thru unmatched records looking for near matches.
  • Additional features of this pattern may include :
    • Best Match only
    • Consume Matches. i.e. once a match is made, that record cannot be matched against any other record.
    • Auto-Match threshold. i.e. above a certain level of matching (to be defined by the user/client), automatically match records. below that level, create a report / mechanism for manually tagging matches.

 

  • No labels