Here are the links to pages about streams.
Difference between filter and order/index lookups
See also:
- Properties for
Types of Stream
There are several types of stream.
Calculate streams are the most basic stream type in PhixFlow. An output record will be produced for each input record.
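The one-in, one-out behaviour can be pictured with a small Python analogy. This is only a sketch, not PhixFlow syntax: streams are configured in PhixFlow itself, and the `calculate` helper and the field names below are hypothetical.

```python
# Analogy only: field names (account, net, vat_rate) are hypothetical.

def calculate(records):
    """Produce exactly one output record for each input record."""
    return [
        {"account": r["account"], "gross": round(r["net"] * (1 + r["vat_rate"]), 2)}
        for r in records
    ]

inputs = [
    {"account": "A1", "net": 100.0, "vat_rate": 0.2},
    {"account": "B2", "net": 50.0, "vat_rate": 0.0},
]
outputs = calculate(inputs)
print(len(outputs) == len(inputs))
```

The number of output records always matches the number of input records; only the attributes on each record change.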
Merge streams combine sets of input data. In each input pipe a grouping is defined, and an output record is produced for each key value combination that is produced by this grouping applied across all inputs.
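The grouping behaviour described above might be sketched in Python as follows. This is an analogy rather than PhixFlow configuration: the `merge` helper, the field names, and the totals it calculates are hypothetical, and it assumes that a key value appearing in any input produces an output record.

```python
from collections import defaultdict

def merge(inputs, key):
    """Group every input by `key` and emit one record per distinct key value."""
    # For each key value, hold the matching records from each named input.
    groups = defaultdict(lambda: {name: [] for name in inputs})
    for name, records in inputs.items():
        for r in records:
            groups[r[key]][name].append(r)
    return [
        {key: k, **{f"{name}_total": sum(r["amount"] for r in rs)
                    for name, rs in g.items()}}
        for k, g in sorted(groups.items())
    ]

invoices = [
    {"customer": "C1", "amount": 100},
    {"customer": "C1", "amount": 40},
    {"customer": "C2", "amount": 70},
]
payments = [
    {"customer": "C1", "amount": 140},
    {"customer": "C3", "amount": 10},
]
merged = merge({"invoices": invoices, "payments": payments}, key="customer")
for row in merged:
    print(row)
```

Here three customers appear across the two inputs, so three output records are produced, each with access to the grouped records from both inputs.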
Aggregate streams aggregate input data. In the input pipe a grouping is defined, and an output record is produced for each key value combination that is produced by the grouping.
Tip: Simple aggregations are better performed using aggregate pipes.
Aggregate streams are functionally identical to merge streams. By convention, an aggregate stream is used when there is only one input, and it is displayed with its own icon on the model view. This often helps to clarify the purpose of the stream in the model.
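As a rough Python analogy for aggregating a single input, the sketch below groups a task list by owner and returns one record per group. It is not PhixFlow syntax; the `aggregate` helper and the `owner` and `due` fields are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def aggregate(records, key):
    """Group one input by `key` and emit one summary record per group."""
    ordered = sorted(records, key=itemgetter(key))  # groupby needs sorted input
    results = []
    for k, group in groupby(ordered, key=itemgetter(key)):
        rows = list(group)
        results.append({
            key: k,
            "earliest": min(r["due"] for r in rows),  # ISO dates sort as text
            "count": len(rows),
        })
    return results

tasks = [
    {"owner": "ann", "due": "2024-03-01"},
    {"owner": "bob", "due": "2024-01-15"},
    {"owner": "ann", "due": "2024-02-10"},
]
summary = aggregate(tasks, key="owner")
for row in summary:
    print(row)
```

Three input records collapse into two output records, one for each distinct owner.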
Cartesian streams perform a cartesian join across all inputs. Although this can be useful in some cases, it is usually simpler to multiply output records using either an output multiplier, which can be configured for any stream type, or a multiplier pipe.
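A cartesian join pairs every record from one input with every record from each other input. The Python sketch below illustrates the idea only; the `cartesian` helper and the field names are hypothetical and do not reflect how PhixFlow is configured.

```python
from itertools import product

def cartesian(*inputs):
    """Emit one combined record for every combination of input records."""
    return [
        {k: v for rec in combo for k, v in rec.items()}
        for combo in product(*inputs)
    ]

products = [{"product": "P1"}, {"product": "P2"}]
regions = [{"region": "north"}, {"region": "south"}, {"region": "east"}]
rows = cartesian(products, regions)
print(len(rows))
```

With 2 products and 3 regions the join yields 2 × 3 = 6 records, which shows how quickly the output grows as inputs get larger.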
Calculate by Set streams are like calculate streams in that an output record is produced for each input record. In addition, a grouping can be configured on the input pipe, so that related rows can be included in the calculations for each record processed.
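The following Python sketch is one way to picture this: each record still produces one output, but the calculation can see earlier records in the same group. It is an analogy only; the `calculate_by_set` helper and the field names are hypothetical, and it assumes records are processed in the order they are supplied.

```python
from collections import defaultdict

def calculate_by_set(records, key):
    """One output per input; each calculation can use rows from the same group."""
    seen = defaultdict(list)  # earlier records, grouped by key value
    out = []
    for r in records:
        group = seen[r[key]]
        previous = group[-1]["amount"] if group else None
        out.append({
            **r,
            "change": None if previous is None else r["amount"] - previous,
            "running_total": sum(x["amount"] for x in group) + r["amount"],
        })
        group.append(r)
    return out

txns = [
    {"account": "A", "amount": 100},
    {"account": "B", "amount": 20},
    {"account": "A", "amount": -30},
    {"account": "A", "amount": 50},
]
result = calculate_by_set(txns, key="account")
for row in result:
    print(row)
```

Four records go in and four come out, each carrying the difference from the previous record for the same account and a cumulative total for that account.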
Which Stream to Use
Scenario | Stream and pipe | Example |
---|---|---|
You only have one source and want 1 record per input record. See When to Use a Calculate Stream | Calculate | You have a comma separated file that you want to load into PhixFlow. |
You only have one source, but you want to group the data and only pull back aggregated information for each group. | Aggregate | You want to find the earliest entry in a task list. |
Combine data from 2 sources into 1 set of data. For each record in each data set, you get one record. See Merging Two Data Sets | Calculate and Merge | You have a set of customers stored in one system. You have a set of customers in another system. There are no overlaps. You want all your customers in one list. |
Combining 2 sets of data that are a similar size and have a common key. For each pair of matching records from the data sets, a single record is produced in the output. See Merging Similar Data Sets | Merge | Comparing a stream of thousands of invoice totals with a stream of thousands of payments for each customer. |
Finding records in a large stream that have the same key as records in another large stream. For each pair of matching records from the data sets, a single record is produced in the output. See Deduplicating Similar Data Sets | Merge with directed pipe | Finding account details for 1 million records in a reference list of all (~20m) accounts. |
Combining a large stream with data from a small stream, where the values of the small stream will be repeated throughout the result. For each pair of matching records from the data sets, a single record is produced in the output. See Enriching Data with Data From Another Set | Calculate with lookup pipe with order/index set | Find the description for each code in a stream of thousands from a stream containing mapping data. There are only ~100 possible codes. |
Combining a large stream with data from a small stream, where values in the small stream will only be used once in the result. For each pair of matching records from the data sets, a single record is produced in the output. See Combining Data Using a Lookup Pipe | Calculate with lookup pipe with filter | You have a stream containing all attendees of an upcoming football match and a small stream of people who are banned from attending matches. |
Combining a large stream with data from a small stream, where the same rows of the small stream are repeated throughout the result, but the filter values change slightly. For each pair of matching records from the data sets, a single record is produced in the output. See Combining Data Using a Cache Extraction Filter Lookup Pipe | Calculate | You have a price list for 4 different products with different prices between different dates. |
You want to look back at a previous record within a group in a stream, or create a cumulative total per group. You get the same number of records as you put in. See Grouping and Referencing Data Using Calculate By Set Stream | Calculate by set | For a given account, you want to find the difference between each consecutive debit/credit to the account. |
Publishing Streams
When you make changes to a stream's properties or its attributes, PhixFlow publishes the changes to the stream data tables in the PhixFlow database. This happens automatically in the background. Publishing many streams or streams with many attributes can take some time, and may slow performance.
If the stream properties are set incorrectly, PhixFlow will not be able to publish the stream data to the database. If this happens, the system console will report the publishing error. PhixFlow will also display an error message if you try to interact with the stream, for example to view its data or to run analysis. You must correct the stream properties so that PhixFlow can retry publishing the stream.
During the publishing process, PhixFlow may create temporary tables in its database. These are kept for a period, then automatically removed when a system task runs. For information about:
- the system task, see Using Tasks and Task Plans
- configuring the period that temporary tables are kept, see System Configuration → Delete Temp Tables after Days.