Merges consolidate records from two or more tables. This page works through a simple merge scenario to explain how merges work. All resources used in this example are available here:
We have two sets of company information, Businesses 1.xlsx and Businesses 2.xlsx, that we want to merge into a single table with duplicates removed. There are several challenges to resolve to achieve this:
The attribute names in each file are different.
There are duplicate companies that exist in both files.
There are duplicates within the Businesses 1.xlsx file.
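To make the three challenges concrete, here is a minimal sketch in plain Python. It is not PhixFlow configuration; it only illustrates the kind of result the merge is meant to produce. The column names (`Company`, `Town`, `BusinessName`, `City`), the sample rows, and the helper functions are all hypothetical, since the real attribute names come from the spreadsheets.

```python
# Hypothetical rows standing in for the two Excel files.
businesses_1 = [
    {"Company": "Acme Ltd", "Town": "Leeds"},
    {"Company": "Acme Ltd", "Town": "Leeds"},       # duplicate within file 1
    {"Company": "Birch & Co", "Town": "York"},
]
businesses_2 = [
    {"BusinessName": "Acme Ltd", "City": "Leeds"},  # duplicate across files
    {"BusinessName": "Cedar Plc", "City": "Hull"},
]

# Challenge 1: map the second file's attribute names onto the first file's.
RENAME_B2 = {"BusinessName": "Company", "City": "Town"}


def normalise(row, rename=None):
    """Return a copy of the row with its attribute names translated."""
    rename = rename or {}
    return {rename.get(name, name): value for name, value in row.items()}


def merge_unique(*sources, key="Company"):
    """Keep the first record seen for each key value (challenges 2 and 3)."""
    merged = {}
    for rows in sources:
        for row in rows:
            merged.setdefault(row[key].strip().lower(), row)
    return list(merged.values())


merged = merge_unique(
    [normalise(row) for row in businesses_1],
    [normalise(row, RENAME_B2) for row in businesses_2],
)
```

Here `merged` holds one record per company. Keeping the first record seen for each key is a choice made for this sketch, not a statement about how the merge table picks a surviving record.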
Solution
Import the data
Using a file collector, import each of the Businesses Excel documents into its own table, then run analysis on each table to populate it with data.
Make both of the tables static using the Static option from the table hover over toolbar.
This ensures we only read the files once.
Create the Merge
From the Create section of the analysis toolbar, click Show Tables.
Drag the Merge Table onto your model.
In the settings, set the Name, for example Businesses Merged.
Click OK to save and close the settings for the merge table.
Connect the Inputs
In the analysis model, hover over the Businesses 1 table.
In the hover over toolbar, select Connector, then click the end of the arrow onto the merge table. This connector will pipe data into the merge table.
In the properties window that opens for the pipe in, set the name to B1 to better indicate the use of the pipe. It is good practice to utilise meaningful names.
In the Sort/Group section, click Show Attributes. PhixFlow displays the list of attributes (columns).
Drag
Similarly, add a connector from the table Businesses 2 to the merge.
Configure groupings on the pipes
Double-click on the pipe from CustomerAddresses to Customer All Details to open its settings.
In the Basic Settings section, set:
Name: this pipe is automatically named in.
In the Sort/Group section, click Show Attributes. PhixFlow displays the list of attributes (columns).
Drag CustomerRef from the list into the settings.
Close the list of attributes.
Double-click on the pipe from CustomerAddresses to SOURCE_CUSTOMER_PHONE_NUMBERS. This pipe will be called in_2.
In the Sort/Group section, click Show Attributes. PhixFlow displays a list of the attributes (columns) in the CustomerAddresses data.
Drag CUSTOMER_REF into the Sort/Group section.
Close the list of attributes.
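One way to picture what grouping both pipes on the customer reference achieves is sketched below. This is plain Python, not PhixFlow configuration, and the pairing logic is an assumption made for illustration. `CustomerRef`, `CUSTOMER_REF`, `in` and `in_2` are taken from the steps above; the `Town` and `Phone` attributes, the sample rows, and the `group_by` helper are hypothetical.

```python
from collections import defaultdict


def group_by(rows, key):
    """Collect rows into lists keyed by the value of one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return groups


# Hypothetical sample data for the two inputs.
addresses = [
    {"CustomerRef": 1, "Town": "Leeds"},
    {"CustomerRef": 2, "Town": "York"},
]
phones = [
    {"CUSTOMER_REF": 1, "Phone": "0000 000 001"},
    {"CUSTOMER_REF": 3, "Phone": "0000 000 003"},
]

# Each pipe is grouped on its own attribute name for the same key.
in_groups = group_by(addresses, "CustomerRef")
in_2_groups = group_by(phones, "CUSTOMER_REF")

# For every key value seen on either pipe, line up the matching records.
matched = {
    ref: {"in": in_groups.get(ref, []), "in_2": in_2_groups.get(ref, [])}
    for ref in sorted(set(in_groups) | set(in_2_groups))
}
```

In this sketch customer 1 has records on both pipes, while customers 2 and 3 each have records on only one, so the other list is empty. Grouping on the same key on every pipe is what lets records from different inputs be lined up against each other.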
Now add attributes from CustomerAddresses to the merge stream:
Hover over CustomerAddresses and in the pop-up toolbar, click Show Attributes.
Drag all attributes from this list and drop them onto the stream icon for Customer All Details.
How to select all the attributes in a list
To select all attributes, you can:
either click the first attribute then Shift+click the last attribute
or click the check box next to the Name header.
PhixFlow asks for confirmation. If it is not already selected, select Use original attribute name. Then click the arrow to confirm.
Similarly, drag all attributes from SOURCE_CUSTOMER_PHONE_NUMBERS to Customer All Details.
To see the resulting settings for Customer All Details, double-click its icon to open the settings tab and look at the Attributes section. You can see a list of 8 attributes (data columns).
Run analysis and view the stream data
To run analysis on your new stream, hover your mouse pointer over the stream Customer All Details, and in the pop-up toolbar, click Run Analysis.
To view the data you just loaded, hover over the stream and click Show view. In the drop-down list, select Default View. You will see the address data you loaded from the file combined with the phone numbers you loaded from the database.