8 Merging data sets directly from database tables
- Chris Welford
- Anthony George
- Fiona Sargeant (Unlicensed)
By the end of this chapter you will be able to:
- Merge directly from database collectors
- Populate attributes with key values using the '_key[n]' notation
In this exercise you will repeat the comparison you did in the previous exercise, of the full channel list in your model against the list in the inventory database – but this time using a merge:
- Add a new merge table to your model, by dragging in Merge Table from the analysis toolbar.
- Set this table Name
Channel List Reconciliation 2
- Add an input pipe from
Latest Package All Channels
, with Namesales
- Drag the following attributes from
Latest Package All Channels
toChannel List Reconciliation 2
:CustomerRef
Channel
- Add an pipe from the database collector
SOURCE_ALL_CHANNELS_LIST
toChannel List Reconciliation 2
Call this pipe
inv
- Set up the grouping attributes on the input pipes:
- Double click on the pipe
sales
- Go to the Sort/Group section.
- Press Show Attributesand drag in the attributes
CustomerRef
andChannel.
- Press OK to save your changes.
- Double-click on the pipe
inv
- Go to the Sort/Group section.
- Press .
- In the new grouping attribute form that pops up:
- Select the attribute
CustomerRef
from the drop down list. - Tick the Group flag
- Press OK.
- Select the attribute
- Add the grouping attribute
Channel
in the same way. - Back in the main pipe configuration form, press OK.
- Double click on the pipe
Now you will complete the configuration of the merge table, using the _key internal variable to write key values into the table:
- Double click on
Channel List Reconciliation 2
to open the configuration form. - Double click the attribute
CustomerRef.
- In the attribute details form that pops up, update Expression to:
_key[1]
- This writes the first key value into this attribute.
- Press OK
- Double click the attribute
Channel
- In the attribute details form that pops up, update Expression to:
_key[2]
- This writes the second key value into this attribute.
- Press OK.
- Add an attribute
SalesCheck
, and set the Expression to
- In the attribute details form that pops up, update Expression to:
if (countElements(sales) == 1, "YES" , "NO" )
3. Add an attribute InvCheck
, and set the Expression to:
if (countElements(inv) == 1, "YES" , "NO" )
4. Press OK in the main table configuration form to save your changes.
5. Update the query in the database collector to:
select * from SOURCE_ALL_CHANNELS_LIST order by CustomerRef, Channel (the ordering is needed to support the merge)
When merging directly from database tables –you have to repeat the ordering of data across the query and the pipe; that is, the ordering of data in the database collector query must match the grouping on the pipe from the database collector |
6. n Analysis on Channel List Reconciliation 2
7. In your results you should see an additional 3 records that were not included in the reconciliation you did in the previous exercise – channels that are recorded in the inventory, but not in the channels list, are included in your output.
This is because a merge will generate an output record for every key value it finds across all input pipes, whereas if you do a data enrichment using a lookup pipe – you will only get an output record for every record from the input pipe.
Filter unmatched rows
To help you find unmatched rows in your output data, you will now add a filter:
- Open the recordset you have just generated in
Channel List Reconciliation 2
- Create a filter:
- Where ANY of the following are true
SalesCheck
equalsNO
InvCheck
equalsNO
- Where ANY of the following are true
- Press - Apply the filter without saving
- You will get a list of all non-matched records, that is, all records where the
SalesCheck
value isNO
or theInvCheck
value isNO.