PhixFlow Help
Building a history with a Self-Updating Stream
By the end of this chapter you will be able to:
- Build a Self-Updating Stream to build a history for sparse input data
In this exercise you will build a history from "sparse" data: data where each new data set contains a record for only a small portion of the key values.
The input files include a set of billing reports, in the directory:
…\inputData\input\BillingReports
Each file contains the results from a billing run – and in each billing run, only a small portion of the overall set of accounts is billed.
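As a rough illustration of what "sparse" means here, the following Python sketch uses made-up billing runs and works out what fraction of all known accounts each run covers. The run names, account numbers and amounts are hypothetical and are not taken from the exercise files.

```python
# Hypothetical billing runs: each run bills only some of the accounts.
# Keys are account numbers, values are billed amounts.
billing_runs = {
    "run_1": {"A1": 10.0, "A4": 7.5},
    "run_2": {"A2": 20.0},
    "run_3": {"A1": 15.0, "A3": 5.0},
}

# The overall set of accounts is the union of the accounts seen in any run.
all_accounts = set().union(*billing_runs.values())

# Fraction of the overall account set that each run touches.
coverage = {
    run: len(rows) / len(all_accounts)
    for run, rows in billing_runs.items()
}

for run, fraction in coverage.items():
    print(f"{run}: {fraction:.0%} of accounts billed")
```

No single run covers every account, which is why a history has to be accumulated across runs.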
You will read in a copy of these files on the PhixFlow server, and use the data to build a view of all billing activity across all accounts:
- Add a File Collector with Name Billing Reports
- The rest of the configuration of the File Collector is similar to the File Collector you created in exercise 7 – create a sequence number to read in the files
- From the File Collector, create a Stream with Name Billing Reports
- Add a Merge Stream with Name Billed Totals
- Complete configuration of Billed Totals:
- Add a pipe from Billing Reports into Billed Totals, with Name in
- Drag all Attributes from Billing Reports into Billed Totals
- Add a pipe from the Stream into itself
- Edit details for the loop pipe:
- Set the Name to prev
- Set Data to Read to All
This means that this pipe brings back all previous Stream Sets, not just the most recent one.
- Add the grouping key AccountNum
- Make the Stream Self-Updating:
- On the Details tab of the Stream configuration form, tick the flag Track Superceded Date
- In the Using Pipe field that appears, select the pipe prev
- Complete configuration of the loop pipe:
- Add the Filter:
Superceded On is null
This will bring back only the "active" record for each account, that is, the one record for each account that has not been superseded.
- In the pipe in:
- In the Details tab, tick the flag Mandatory
- Add the grouping key AccountNum
- In the stream Billed Totals:
- Set the expression on the Attribute AccountNum to _key[1]
- Edit the attribute BillDate – update the name to LatestBillDate
- Edit the attribute BillAmt – update the name to LatestBillAmt
- Add an attribute TotalBillAmt, and set the expression to: prev.LatestBillAmt + in.BillAmt
This expression adds the value from the previous "active" record for the current key value to the new value being read in for this key value.
PhixFlow will then mark the previous "active" record as superseded (by setting the Superceded Date to the current date and time).
Since the new record for this key value has a blank Superceded Date, it becomes the new "active" record for this key value.
- Save your changes.
Run Analysis on Billed Totals six times. Review the data in the Stream Sets generated. You will see that each Stream Set contains records only for the account numbers that appear in the latest billing report file read in. That is, there are results only for those account numbers that have an update to record.
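The behaviour configured above can be sketched outside PhixFlow in a few lines of Python. This is only an approximation of the logic described in the steps, not PhixFlow's implementation. The function run_analysis, the field name SupercededOn and the sample data are all hypothetical. The sketch assumes each billing report holds at most one row per account, and it treats a missing prev record as 0, which is an assumption about what happens on an account's first appearance.

```python
from datetime import datetime


def run_analysis(history, billing_report, run_time):
    """Simulate one analysis run and return the new stream set.

    history        -- list of all records from earlier runs (mutated in place)
    billing_report -- rows read through the "in" pipe for this run
    run_time       -- timestamp used to mark superseded records
    """
    # "prev" pipe: all earlier stream sets, filtered to active records only,
    # grouped by AccountNum (assumes one active record per account).
    active = {r["AccountNum"]: r for r in history if r["SupercededOn"] is None}

    new_set = []
    for row in billing_report:
        prev = active.get(row["AccountNum"])
        # Assumption: with no previous record, the previous amount counts as 0.
        prev_amt = prev["LatestBillAmt"] if prev is not None else 0
        new_set.append({
            "AccountNum": row["AccountNum"],
            "LatestBillDate": row["BillDate"],
            "LatestBillAmt": row["BillAmt"],
            # Mirrors the expression as written: prev.LatestBillAmt + in.BillAmt
            "TotalBillAmt": prev_amt + row["BillAmt"],
            "SupercededOn": None,  # the new record is the active one
        })
        if prev is not None:
            prev["SupercededOn"] = run_time  # mark the old record superseded

    history.extend(new_set)
    return new_set


history = []
run1 = run_analysis(
    history,
    [
        {"AccountNum": "A1", "BillDate": "2024-01-31", "BillAmt": 10.0},
        {"AccountNum": "A2", "BillDate": "2024-01-31", "BillAmt": 20.0},
    ],
    datetime(2024, 2, 1),
)
run2 = run_analysis(
    history,
    [{"AccountNum": "A1", "BillDate": "2024-02-29", "BillAmt": 15.0}],
    datetime(2024, 3, 1),
)
print([r["AccountNum"] for r in run2])
```

In this sketch the second run produces a record for A1 only. The earlier A1 record is stamped with the run time, and the A2 record from the first run stays active because no new data arrived for it.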
You will now add a Stream to generate an overview of the billing activity across all accounts:
- Create a new Stream from Billed Totals, with Name Billing Latest
- In the pipe:
- Set Data to Read to All
- Add a filter: Superceded On is null
- Also in the Details tab, tick the flag Static
Setting the pipe to Static is not needed to read from a Self-Updating Stream, but it is convenient to use it here because it means that you can generate the report of all billing activity without having to go back and re-run the previous Streams.
- Save your changes.
Run Analysis on Billing Latest. Look at the Stream Set generated. You will see that you have a single record for each account – the "active" record for each account, from Billed Totals, with the latest details for that account.
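The effect of reading all Stream Sets with the filter Superceded On is null can be sketched in Python as follows. The function billing_latest, the field names and the sample records are hypothetical stand-ins for the Billed Totals data, intended only to show why the result has one record per account.

```python
def billing_latest(stream_sets):
    """Flatten every stream set and keep only the active records."""
    return [
        record
        for stream_set in stream_sets
        for record in stream_set
        if record["SupercededOn"] is None
    ]


# Hypothetical Billed Totals output: two stream sets from two runs.
stream_sets = [
    [
        {"AccountNum": "A1", "LatestBillAmt": 10.0, "SupercededOn": "2024-03-01T09:00"},
        {"AccountNum": "A2", "LatestBillAmt": 20.0, "SupercededOn": None},
    ],
    [
        {"AccountNum": "A1", "LatestBillAmt": 15.0, "SupercededOn": None},
    ],
]

latest = billing_latest(stream_sets)
by_account = {r["AccountNum"]: r["LatestBillAmt"] for r in latest}
print(by_account)
```

Reading only the most recent stream set would miss A2 here, which is why the pipe reads all Stream Sets and relies on the filter to discard superseded records.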