Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Insert excerpt
_Banners
_Banners
nameanalysis
nopaneltrue

Create a merge stream

Introduction

Merges can be used to consolidate records from two or more tables. In this page we will work through a simple merge scenario to explain how merges work. All resource used in this example are available here: 


Scenario

We have two sets of company information, Businesses 1.xlsx and Businesses 2.xlsx, that we want to merge together and remove duplicates. There are a number of challenges we must resolve to achieve this:

  1. The attribute names in each file are different.
  2. There are duplicate companies that exist in both files.
  3. There are duplicates within the Businesses 1.xlsx file.


Solution

  1. Import the data
    1. Using a file collector import each of the Busiesss Excel documents into their own table and run analysis on each to populate them with data.
    2. Make both of the tables static using the 
      Insert excerpt
      _static
      _static
      nopaneltrue
       option from the table hover over toolbar.
      1. This ensures we only read the files once.
  2.  Create the Merge
    1. From the Create section of the analysis toolbar, click 
      Insert excerpt
      _datasource_show_tables
      _datasource_show_tables
      nopaneltrue
    2. drag the Image Modified Merge Table  onto your model.
    3. In the settings, set the Name for example Businesses Merged.
    4. Click Image Modified OK to save and close the settings for the merge table.
  3. Connect the Inputs
    1. In the analysis model, hover over
the CustomerAddresses stream
    1. the Businesses 1 table.
    2. In the
pop-up
    1. hover over toolbar, select Image Modified Connector, then click the end of the arrow onto the
stream Customer All Details
    1. merge table. This connector will pipe data into the
stream.
    1. merge table.
    2. In the properties window that opens for the pipe in, set the name to B1 to better indicate the use of the pipe. It is good practice to utilise meaningful names.
    3. In the Sort/Group section click 
      Insert excerpt
      _attributes_show
      _attributes_show
      nopaneltrue
       and PhixFlow displays the list of attributes (columns).
    4. Drag
    5. Similarly, add a connector  from the
stream SOURCE_CUSTOMER_PHONE_NUMBERS to Customer All Details
    1. table Businesses 2 to the merge.

Configure groupings on the pipes

  1. Double-click on the pipe from CustomerAddresses  to Customer All Details to open its settings.
    1. In the Basic Settings section, set:
      • Name: this pipe is automatically named in.
    2. In the Sort/Group section click Image Modified Show Attributes. PhixFlow displays the list of attributes (columns).

    3. Drag CustomerRef from the list into the settings.
    4. Close the list of attributes.
  2. Double-click on the pipe from CustomerAddresses to SOURCE_CUSTOMER_PHONE_NUMBERS. This pipe will be called in_2.
    1. In the Sort/Group section, click Image Modified Show Attributes. PhixFlow displays a list of the attributes (columns) in the CustomerAddresses  data.

    2. Drag CUSTOMER_REF into the Sort/Group section.
    3. Close the list of attributes.
  3. Now add attributes from CustomerAddresses to the merge stream:
    1. Hover over CustomerAddresses and in the pop-up toolbar, click Image Modified Show Attributes.
    2. Drag all attributes from this list and drop them onto the stream icon for Customer All Details.

      How to select all the attributes in a list

      To select all attributes, you can:

      • either click the first attribute then Shift+click the last attribute
      • or click the check box next to the Name header. 

      Image Modified

    3. PhixFlow asks for confirmation. If it is not already selected, select Use original attribute name. Then click the arrow Image Modified to confirm.
      Image Modified
  4. Similarly, drag all attributes from SOURCE_CUSTOMER_PHONE_NUMBERS to Customer All Details.
  5. To see the resulting settings for Customer All Details, double-click its icon to open the settings tab and look at the Attributes section. You can see a list of 8 attributes (data columns).

Run analysis and view the stream data

  1. To run analysis on your new stream, hover your mouse pointer over the stream Customer All Details, and in the pop-up toolbar, click Image Modified Run Analysis.
  2. To view the data you just loaded, hover over the stream and click Image Modified Show view. In the drop-down list, select Image Modified Default View. You will see the address data you loaded from the file combined with the phone numbers you loaded from the database.