/
Merging Similar Data Sets

PhixFlow Help

Merging Similar Data Sets

Scenario

Combining 2 sets of data that are a similar size and have a common key. For each pair of matching records from the data sets, a single record is produced in the output.

Example

Comparing a stream of thousands of invoice totals with a stream of thousands of payments for each customer.

In order to merge 2 sets of data into 1 set of data (the equivalent of a join in SQL), add a pull pipe from each of your source streams to the result stream. Group your pipes by the key fields. The key fields should contain the same sort of data in the same format in both source streams, but do not have to have the same attribute names. The below screenshot shows 2 pulls pipes that have been configured to create a merge on 3 key fields.

Set your result stream to the stream type 'Merge'.

The attribute expressions for your key fields will be _key[n], where n is the order number of your key field grouping.

All other attributes will be referred to by prefixing their pipe name. For example, to return the contact name from the contacts pipe, you would write 'contacts.ContactName'.

The below screenshot shows the configuration of attributes in a merge stream, which is taking data from the activity and contacts pipes.

Watch out for the multiple records in each group. You will either need to:

  • choose a record in each group (e.g. contacts.1.ContactName will choose the first record in the group),
  • use an aggregate function to find the best record in the the group (e.g. max(contacts.ContactName),
  • or tick 'multiplier' on the pipe, which will cause the resulting stream to display one row per record in the group, rather than one record for the group.

If you set one of the pipes to mandatory, then you will only get all records from that pipe and only records from the other pipe that match with data from the mandatory pipe.

Please let us know if we could improve this page feedback@phixflow.com