Versions Compared
Version | Old Version 1 | New Version 2 |
---|---|---|
Changes made by | ||
Saved on |
Key
- This line was added.
- This line was removed.
- Formatting was changed.
Insert excerpt | ||||||||
---|---|---|---|---|---|---|---|---|
|
Introduction
Merges can be used to consolidate records from two or more tables. In this page we will work through a simple merge scenario to explain how merges work. All resource used in this example are available here:
Scenario
We have two sets of company information, Businesses 1.xlsx and Businesses 2.xlsx, that we want to merge together and remove duplicates. There are a number of challenges we must resolve to achieve this:
- The attribute names in each file are different.
- There are duplicate companies that exist in both files.
- There are duplicates within the Businesses 1.xlsx file.
Solution
- Import the data
- Using a file collector import each of the Busiesss Excel documents into their own table and run analysis on each to populate them with data.
- Make both of the tables static using the
option from the table hover over toolbar.Insert excerpt _static _static nopanel true - This ensures we only read the files once.
- Create the Merge
- From the Create section of the analysis toolbar, click
Insert excerpt _datasource_show_tables _datasource_show_tables nopanel true - drag the Image Modified Merge Table onto your model.
- In the settings, set the Name for example
Businesses Merged
. - Click Image Modified OK to save and close the settings for the merge table.
- From the Create section of the analysis toolbar, click
- Connect the Inputs
- In the analysis model, hover over
CustomerAddresses
stream- the Businesses 1 table.
- In the
- hover over toolbar, select Image Modified Connector, then click the end of the arrow onto the
Customer All Details
- merge table. This connector will pipe data into the
- merge table.
- In the properties window that opens for the pipe
in
, set the name to B1 to better indicate the use of the pipe. It is good practice to utilise meaningful names. - In the Sort/Group section click
and PhixFlow displays the list of attributes (columns).Insert excerpt _attributes_show _attributes_show nopanel true - Drag
- Similarly, add a connector from the
SOURCE_CUSTOMER_PHONE_NUMBERS
to Customer All Details
- table
Businesses 2
to the merge.
- table
Configure groupings on the pipes
- Double-click on the pipe from
CustomerAddresses
toCustomer All Details
to open its settings.- In the Basic Settings section, set:
- Name: this pipe is automatically named
in
.
- Name: this pipe is automatically named
In the Sort/Group section click Image Modified Show Attributes. PhixFlow displays the list of attributes (columns).
- Drag
CustomerRef
from the list into the settings. - Close the list of attributes.
- In the Basic Settings section, set:
- Double-click on the pipe from
CustomerAddresses
to
. This pipe will be calledSOURCE_CUSTOMER_PHONE_NUMBERS
in_2.
In the Sort/Group section, click Image Modified Show Attributes. PhixFlow displays a list of the attributes (columns) in the
CustomerAddresses
data.- Drag
CUSTOMER_REF
into the Sort/Group section. - Close the list of attributes.
- Now add attributes from
CustomerAddresses
to the merge stream:- Hover over
CustomerAddresses
and in the pop-up toolbar, click Image Modified Show Attributes. Drag all attributes from this list and drop them onto the stream icon for
Customer All Details
.How to select all the attributes in a list
To select all attributes, you can:
- either click the first attribute then Shift+click the last attribute
- or click the check box next to the Name header.
Image Modified
- PhixFlow asks for confirmation. If it is not already selected, select Use original attribute name. Then click the arrow Image Modified to confirm.
Image Modified
- Hover over
- Similarly, drag all attributes from
toSOURCE_CUSTOMER_PHONE_NUMBERS
Customer All Details.
- To see the resulting settings for
Customer All Details
, double-click its icon to open the settings tab and look at the Attributes section. You can see a list of 8 attributes (data columns).
Run analysis and view the stream data
- To run analysis on your new stream, hover your mouse pointer over the stream
Customer All Details
, and in the pop-up toolbar, click Image Modified Run Analysis. - To view the data you just loaded, hover over the stream and click Image Modified Show view. In the drop-down list, select Image Modified Default View. You will see the address data you loaded from the file combined with the phone numbers you loaded from the database.