Inexact Matching DateTimes
Scenario
Record written to different systems are timestamped when they are written. Timestamps may not match for a variety of reasons (system clocks out of sync, delays in writing, batch processing etc), but there may still be a need to reconcile an match records.
Create 2 streams, with at least 1 DateTime field, and one other field to act as a key. Merge the two streams such that you pick up the records where the keys match and the datetime values are within a set tolerence range.
Solution:
- Create a stream for each input.
- Create an calculate stream to perform the join.
- Make the pipe from one of the streams a pull pipe and the pipe from the other is a lookup pipe.
- In the join stream, for each record on the pull pipe, cycle thru all the records from the lookup pipe for that key and flag those records where there is a match based on a date range.
- Date ranges should be calculated based on converting dates to integers. Note : there is a dateDiff function in PhixFlow that will give the difference in milliseconds between 2 dates