3 Combining Data Sets
- Chris Welford
- Anthony George
- Fiona Sargeant (Unlicensed)
By the end of this chapter you will be able to:
You will also learn how to use an internal variable, _pipeName.
In this exercise you will produce a single list of updates to customers' Cable TV packages. Customers can update their packages both through a customer care system and in shops. You will bring all these updates together into a single list.
Read in data sets from files
For this exercise you need two files from the zip file you downloaded in 2 Using Excel Templates, step 5. The files contain updates from the two sources: the customer care system and the shop.
<path>\train\inputData\ChannelPackages\CustomerCareUpdates\CustomerCareUpdates.txt
<path>\train\inputData\ChannelPackages\ShopUpdates\ShopUpdates.txt
- Create a new model with the Name,
Channel Package Check
. - Add a file collector to your model and load the data as follows:
- From the model toolbar, drag a File Collector into the model.
- PhixFlow displays the settings tab for the new file collector. Set the Name to
Customer Care Updates
. - In the model, hover your mouse pointer over the
Channel Package Check
file collector to display the context toolbar and click Upload File. - Navigate to the directory
<path>\train\inputData\ChannelPackages\CustomerCareUpdates
. - Select the file
CustomerCareUpdates.txt
, click Open, then click Upload File. - PhixFlow adds a new table to the model, called
CustomerCareUpdates
. - In the model, hover over the new
CustomerCareUpdates
table and click Run Analysis - Check the data is loaded. There should be 8 lines of data, with columns for Customer Ref, Sales Date and Package.
- Now the data is loaded into the table, set the table to Static.
- Add another file collector to your model and repeat the process described above, this time:
- Set the file collector Name to
Shop Updates
. - Upload
<path>\train\inputData\ChannelPackages\ShopUpdates\ShopUpdates.txt
. - PhixFlow loads 7 rows of data to this table.
- Remember to set the table to Static.
- Set the file collector Name to
Screenshot of the Channel Package Check
model so far:
Combine the data sets
- Create a new table in your model and populate its attributes:
- Hover over
CustomerCareUpdates
and click Create New Table. - In the settings, set the Name to
Combined Updates
. - In the model, show the table attributes for
CustomerCareUpdates
. - Select all the attributes and drag them into the properties for the
Combined Updates
table. Drop them into the Attributes section. - Click OK.
- Hover over
- To save your model, in the model toolbar click Save.
- Connect the
ShopUpdates
table to theCombined Updates
table:- Hover over
ShopUpdates
and click Connector. Click
Combined Updates
to link the pipe to the table.When you connect tables, PhixFlow automatically adds a reference to the input pipe name. This appears in the attribute expression. To merge this data successfully, you must update the attribute expressions to remove the
in.
prefix; see step 4 below.
- Hover over
- Fix all the attribute expressions.
- Double-click on
Combined Updates
to open its settings tab. For each attribute: - In the Attributes section, double-click on an attribute to open its settings.
- In the Basic Settings section, edit the Expression to remove
in.
and click OK
- Double-click on
- In the
Combined Updates
table settings tab, click OK to save and close. - In the model, hover your mouse pointer over
Combined Updates
and click Run Analysis. - When the table has run, check that the table has:
- the columns CustomerRef, Sales Date and Package
- 15 rows
- all cells have data.
The default view:
Snapshot of the Chanel Package Check model so far:
Record the source for each update
You will now update your settings to record the source for each update, in the set of combined updates. To do this:
- To update the name of the pipe from the
CustomerCareUpdates
table to theCombined Updates
table:- In the model, click on the pipe to open its settings tab.
- Set Name to
CC
. - Click OK to save and close the pipe settings.
- Update the name of the pipe from the
ShopUpdates
file collector to theShopUpdates
table.- In the model, click on the pipe to open its settings tab.
- Set Name to shop
- Click OK to save and close pipe settings.
- Double-click the table
Combined Updates
to open its settings tab. - In the Attributes section, click Create New add a new attribute and set:
- Name:
Source
. Expression:
if (_pipeName == "CC", "Customer Care" , "Shop" )
- Click OK to save and close the attribute settings.
- Name:
- To make
Source
the first attribute in the table, drag it to the top of the attributes list. - Click OK to save and close the table settings.
- Run analysis on
Combined Updates
. - Check the output data set and make sure that the source has been recorded on each record correctly.