PhixFlow Help
Understanding Streams and Pipes
Pages in this topic:
- Which Stream to Use
- Stream Set List
- Copy or Move Stream Data
- Using Directed Pipes to Read from Large Data Sets
- Properties
- Rolling back stream set
See also: Getting Data Into and Out of Analysis Models
Types of Stream
There are several types of stream:
Calculate Stream
Calculate streams are the most basic stream type in PhixFlow. An output record will be produced for each input record.
Merge Stream
Merge streams combine sets of input data. A grouping is defined on each input pipe, and one output record is produced for each combination of key values that this grouping produces across all inputs.
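The following is a rough illustration only, written in plain Python rather than PhixFlow configuration. It sketches the merge behaviour described above: each input is grouped by a key, and one output record is emitted per key value. The function name merge_stream, the pipe names, the field names and the sample records are all hypothetical, and the sketch assumes that a key found in any one input is enough to produce an output record.

```python
from collections import defaultdict

def merge_stream(inputs, key_fields):
    """Group every input by key_fields and emit one record per distinct key.

    inputs: dict mapping a (hypothetical) pipe name to a list of dict records.
    """
    # For each key, keep the matching rows from every input pipe.
    groups = defaultdict(lambda: {name: [] for name in inputs})
    for name, records in inputs.items():
        for rec in records:
            key = tuple(rec[f] for f in key_fields)
            groups[key][name].append(rec)
    # One output record per key value combination seen in any input.
    # Here the output simply counts the rows each pipe contributed.
    return [
        {"key": key, **{name: len(rows) for name, rows in by_pipe.items()}}
        for key, by_pipe in sorted(groups.items())
    ]

billing = [{"account": "A1", "amount": 10}, {"account": "A2", "amount": 5}]
usage = [{"account": "A1", "minutes": 30}, {"account": "A3", "minutes": 12}]
merged = merge_stream({"billing": billing, "usage": usage}, ["account"])
```

In this sketch, merged contains three records, one each for accounts A1, A2 and A3, even though only A1 appears in both inputs.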
Aggregate Stream
Aggregate streams aggregate input data. A grouping is defined on the input pipe, and one output record is produced for each combination of key values that the grouping produces.
Simple aggregations are better performed using aggregate pipes.
Aggregate streams are functionally identical to merge streams. By convention, an aggregate stream is used when there is only one input; it is displayed with its own icon on the model view, which often helps to clarify the purpose of the stream in the model.
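As an illustration only, the single-input grouping can be sketched in plain Python. This is not PhixFlow syntax: the function name aggregate_stream, the field names and the sample records are hypothetical, and summing is just one possible aggregation.

```python
from collections import defaultdict

def aggregate_stream(records, key_fields, value_field):
    """Emit one output record per distinct key, summing value_field."""
    totals = defaultdict(float)
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        totals[key] += rec[value_field]
    return [{"key": key, "total": total} for key, total in sorted(totals.items())]

calls = [
    {"account": "A1", "cost": 1.5},
    {"account": "A1", "cost": 2.5},
    {"account": "A2", "cost": 4.0},
]
summary = aggregate_stream(calls, ["account"], "cost")
```

Three input records produce two output records here, one per account.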
Cartesian Stream
Cartesian streams perform a Cartesian join across all inputs. Although this can be useful in some cases, it is usually easier and simpler to multiply output records using either an output multiplier, which can be configured for any stream type, or a multiplier pipe.
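To show what a Cartesian join produces, here is a minimal sketch in plain Python (not PhixFlow configuration). The function name cartesian_stream and the sample data are hypothetical. Every record from each input is paired with every record from the others, so the output size is the product of the input sizes.

```python
from itertools import product

def cartesian_stream(*inputs):
    """Combine every record of each input with every record of the others."""
    return [
        # Merge the fields of the records in this combination into one record.
        {k: v for rec in combo for k, v in rec.items()}
        for combo in product(*inputs)
    ]

products = [{"product": "P1"}, {"product": "P2"}]
regions = [{"region": "North"}, {"region": "South"}, {"region": "East"}]
rows = cartesian_stream(products, regions)
```

With two products and three regions, rows holds six records. The output grows multiplicatively with each additional input, which is one reason a multiplier is often the simpler choice.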
CalculateBySet Stream
Calculate by Set streams are like calculate streams in that an output record is produced for each input record. In addition, a grouping can be configured on the input pipe so that, for each record processed, related rows can be included in calculations.
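The difference from an aggregate stream can be sketched in plain Python (again, an illustration rather than PhixFlow syntax). The function name calculate_by_set, the field names and the share_of_set calculation are hypothetical. The point is that the output keeps one record per input record, while each calculation can see the other rows in the same group.

```python
from collections import defaultdict

def calculate_by_set(records, key_field, value_field):
    """Emit one output per input record, using the record's whole set."""
    # Build the sets of related rows, keyed by the grouping field.
    sets = defaultdict(list)
    for rec in records:
        sets[rec[key_field]].append(rec)
    output = []
    for rec in records:
        related = sets[rec[key_field]]
        set_total = sum(r[value_field] for r in related)
        # Each output record carries a value derived from its related rows.
        output.append({**rec, "share_of_set": rec[value_field] / set_total})
    return output

calls = [
    {"account": "A1", "cost": 1.0},
    {"account": "A1", "cost": 3.0},
    {"account": "A2", "cost": 2.0},
]
result = calculate_by_set(calls, "account", "cost")
```

Three input records give three output records, each with its share of the total cost for its account.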
Publishing Streams
When you make changes to a stream's properties or its attributes, PhixFlow publishes the changes to the stream data tables in the PhixFlow database. This happens automatically in the background. Publishing many streams or streams with many attributes can take some time, and may slow performance.
If the stream properties are set incorrectly, PhixFlow will not be able to publish the stream data to the database. If this happens, the Console will report the publishing error. PhixFlow will also display an error message if you try to interact with the stream, for example to view its data or to run analysis. You must correct the stream properties, so that PhixFlow can retry publishing the stream.
During the publishing process, PhixFlow may create temporary tables in its database. These are kept for a period, then automatically removed when a system task runs. For information about:
- the system task, see Using Tasks and Task Plans
- configuring the period that temporary tables are kept, see System Configuration → Delete Temp Tables after Days.
To ensure that PhixFlow can publish data changes, its database must have enough space to hold a copy of the largest stream. For the different databases, the space needs to be in:
- Oracle: temporary table space
- SQL Server: temporary file group
- MariaDB: the file system.