Stream

Streams are a key model component. A stream is a structured store of data within PhixFlow: it receives data from one or more components, processes it, then stores it.

The toolbar has the standard icons. For information about the sections Parent Details, Analysis Models, Description and Audit Summary, see Common Properties. For information about other property tabs, see Property Tabs.

The tab toolbar has the following additional buttons: Show Stream Sets, Stream Views, Run Analysis.

The following fields are configured for Streams:

Field	Description
Basic Settings
Name	The name of the stream.
Enabled	Whether or not the stream is enabled to run.
Static Data	If ticked, this stream can be used to hold static data. This is usually reference, or "look up" data used as part of a stream calculation. In technical terms, setting this means that the stream will only update itself when the user requests an analysis run on the stream directly: either via the model or via a task plan. Unlike all other types of stream, when this stream is part of a larger analysis run it will not attempt to update itself. From version 8.0.9 onwards, the stream will also update itself via an incoming push pipe that has been triggered.
Period	The period of the stream, which can be either regular or variable. There are four possible settings. Transactional: allows multiple users to run independent analysis tasks at the same time. Daily: generate or collect data every day. Monthly: generate or collect data every month. Variable: generate or collect data from the most recent run of the stream to the current date. When the period is first set to Transactional, the UID stream attribute will be created if it does not already exist.
Stream Type	The type of function used to generate this stream. Possible types are: Aggregate Stream, Calculate Stream, CalculateBySet Stream, Cartesian Stream, Merge Stream.
Supersede Items on Pipe	You can select a "loop" pipe - that is, a pipe linking the stream back into itself - in this field. If you do, new records will be compared to existing records, using the selected loop pipe, and if a repeated record is found, the old one will be marked as 'superseded'.
Audit Manual Changes	Only applies when the period is Transactional. If ticked, updates and deletes initiated by stream actions (not those carried out by analysis runs) will automatically mark the existing record as superseded and create a new stream set. The new versions of the updated records will be placed in the new stream set. Inserts will simply create a new stream set, and add the inserted record into that stream set. When Audit Manual Changes is first set, the attributes `UpdateAction`, `UpdatedByName`, `UpdatedByID` and `UpdatedTime` will be created if they do not already exist. If you do not require these attributes, delete them. For the `UpdatedByName` attribute, PhixFlow creates a field of 50 characters in versions up to 8.0.4, and of 250 characters in versions from 8.0.5 onwards. `UpdateAction` must be set to the type of action, such as INSERT, UPDATE or DELETE. The other attributes will be populated if they exist on the stream: `UpdatedByName`, the name of the user that performed the update; `UpdatedByID`, the internal id of the user that performed the update; `UpdatedTime`, the date and time the update was made.
Attributes
A list of the stream attributes in the stream. The toolbar on this section has the options: Show list of File Collectors and Shows the list of Streams. To edit the properties for an attribute, double-click the attribute name. To edit only the expression: right-click an attribute name to display the context menu; select Edit the expression field, and PhixFlow opens a simple text editor box; make changes to the attribute's expression; then save your changes.
Name	Name of the Attribute.
Type	This can be one of: String, Bigstring, Integer, Float, Decimal, Date, Datetime, Graphic, TrueFalse.
Length/ Precision/Significant Figures	For a String, the maximum length of the String. For an Integer or Decimal, the maximum number of digits.
Scale	Only applies to Decimal types. The number of digits after the decimal point; must be less than the number of significant figures.
Local	Only tick this box if this attribute is only required as part of the stream calculation, and it is not necessary to keep the result.
Order	The order of the attributes in the stream. This is important because the stream attribute expressions are evaluated in this order. If the results of an attribute expression, or a $ variable calculated during its calculation, are required in the expression of a second attribute - the second attribute must come after the first in the attribute list.
Expression	The expression used to generate the attribute value. This is written as a PhixFlow Expression. It must evaluate to a single value, of the type specified in the Type field.
Advanced
Indexed	Tick this option if this field should be indexed in the underlying database. Use an indexed field to increase performance on very large streams in the following situations: when one or more output pipes from the stream use the field to 'Filter' the stream; when one or more output pipes from the stream use the field in a 'Sort/Group' by action.
Filter conditions are case-independent by default	If ticked, new filter conditions on this field are case-insensitive by default. The filter window → Ignore Case check box inherits this setting; see Filters on Data Views. For case-insensitive filters, there is no difference if the attribute is also indexed. This option affects the behaviour of filters for PhixFlow instances running on Oracle or MariaDB (MySQL) databases. For PhixFlow instances running on a SQL Server database, filters are always case-independent.
Key	For in-memory streams, whether this field will be used as a key value.
Cache Key	If a cache key is set, the value of this attribute persists throughout the stream calculation, rather than being created from scratch for each stream item as normal. This allows you to keep track of the calculation as it progresses. The cache key is an expression that is evaluated for each stream item, and it can use the existing value of the attribute, that is, the value it had in the previous stream item processed in the stream calculation. This allows expressions to use the "persistent" attribute value on subsequent stream items. Because the Cache Key expression is evaluated for each stream item, the "persistent" attribute can hold multiple values, keyed on the Cache Key.
Description
Description	A free text description of the attribute.
Multipliers and Filters
Input Multiplier	The input multiplier expression should evaluate to a list of one or more values. For each value in the list, the internal variable _inputMultiplier will be set to that value and the whole stream processing will be repeated i.e. the pull pipes will be read and the data from those pipes processed to generate output stream items to be added to the current stream set. For example : do ( $aRange = [], addElement($aRange, rng.RangeFrom), addElement($aRange, rng.RangeTo), $bRange = [], addElement($bRange, $aRange), $bRange ) Where rng.RangeFrom = 500 and rng.RangeTo = 1000, the above example evaluates to [[500,1000]], which is a list containing 1 element, which is itself a list containing 2 elements. An Input Multiplier that evaluates to [3,4,7,8] would run the Stream 4 times. Because Input Multipliers are evaluated first in the PhixFlow Timing Cycle they are often used to look up values that can be passed to Database or other Collectors.
Output Multiplier	This field is an expression which should evaluate to an array of values. A separate output record will be produced for each value in the array and this value is available as _outputMultiplier in each of the stream output attribute expressions (each value in the array is also available through _type, although this is not recommended usage). In effect this will multiply each of the output records by the number of elements from the returned list. For example : ifNull(in.ASSET, [1,10,12] , // else do [5,7] ) will create 3 records for every record in the stream if in.ASSET contains a value (setting _type = 1, 10 and 12 in each case). Otherwise it will create 2 records for every record in the stream (and set _type = 5 and 7). An Output Multiplier may also evaluate to a record, or a group of records. For example an Output Multiplier with the expression: do( lookup(lkin, $num = in.BNumber), lkin ) will return a list of records which match the lookup on the lkin pipe. In this case the required data can be extracted from the Output Multiplier using the following expression : do ( $values = _type, $values.account_num ) If the output multiplier expression evaluates to _NULL, an empty list of values or an empty list of records then a single output record will be produced with _type set to _NULL, _NULL or an empty record respectively.
Output Filter	This field is an expression which should evaluate to true or false (equivalently 1 or 0). Records created as output from the stream function can be filtered before they are written to the stream. Any attribute of the output record can be used in the expression. If an output filter expression is provided then the output record will only be written to the database if the expression evaluates to true or 1. A common pattern for example is to have an attribute on the output record (for example called 'keep') which evaluates to 1 if you wish to keep the record and 0 if you wish to discard it. The output filter expression is then _out.keep.
Actions
A list of the stream actions on the stream. See Action.
Views
A list of the views on the stream. See Stream View.
Sort Orders
A list of the sort orders on the stream. See Stream View Order.
Filters
A list of the filters on the stream. See Filters on Data Views. Any filter defined on the stream may appear in the dropdown list of filters accessible from the header of each stream view. To make a filter available in a view, the filter must be added to the list of filters for that view. See Stream View for details. All filters defined on this tab will be available on the system generated Default View for this Stream.
Inputs
A list of pipes into the stream. It is possible for this list to include pipes that have no input. This occurs if the source stream has been deleted, or if a model has been moved to a different PhixFlow instance (export/import), leaving behind a referenced stream. Any pipes with no input are highlighted in yellow. To resolve pipes with no input you can: recreate the missing stream; import the missing stream; keep the connection, if it will be restored when the model is moved to a different PhixFlow instance; or delete the pipe, if it is no longer required.
Archive Settings
Keep for X Days	The number of days of data to keep in the stream. When an archive task runs for a stream, stream data is deleted if it is at least Keep for X Days old or if it is older than the Keep for Y Stream Sets most recent valid stream sets. If both Keep for X Days and Keep for Y Stream Sets are set, stream data will be deleted only if it meets both conditions. If neither is set, stream data is kept indefinitely. If Save Archive to File is ticked, deleted items are first saved to archive files. The age of data in a stream set is its 'to' date relative to the 'to' date of the newest valid stream set in the stream. See here for how to set up and schedule an Archive Task. Please see the section below on Archiving Examples to see how this value can be used within Archiving strategies.
Keep for Y StreamSets	The number of stream sets of data to keep in the stream. See Keep for X Days for the main description of archiving.
Keep Superseded for X Days	The number of days for which to keep superseded data in the stream. This field becomes visible and enabled if Track Superseded Data is ticked. In a stream where the superseded date is tracked, the stream data will contain a mixture of superseded records and "active" records, that is, records that have not been superseded. When an archive task runs for a stream, records that were marked as superseded more than Keep Superseded for X Days days ago or more than Keep Superseded for Y Stream Sets stream sets ago are deleted. If both Keep Superseded for X Days and Keep Superseded for Y Stream Sets are set, superseded records will be deleted only if they meet both conditions. If neither is set, superseded records are not deleted. This means, for example, that if you have set Keep Superseded for X Days to 4, you will be able to roll back 3 days, making the 4th day the latest valid day. If Save Archive to File is ticked, deleted items are first saved to archive files. Please see the section below, Archiving Examples for Superseded Stream Sets, to see how this value can be used within Archiving strategies.
Keep Superseded for Y StreamSets	The number of stream sets for which to keep superseded data in the stream. If Track Superseded Data is ticked, then this field will become visible/enabled. See Keep Superseded for X Days for the main description of archiving superseded records.
Save Archive to File	If Save Archive to File is set, archived data will be written to compressed archive files before being deleted.
Apply Archive Filter	If this flag is ticked then a dialog box appears within which a filter can be created. This filter will be applied during archiving and only the records which match the filter will be archived and deleted.
Access Permissions
All Users Can View Data	If checked, this specifies that all users can view this data by default (provided they have the basic privilege to view streams). If this field is not checked, then access to the underlying data is controlled by dropping user groups onto the stream's "User Group" tab. Note that the default setting for this field on streams is controlled by the system parameter allowAccessToDataByDefault.
Analysis Models
A list of the analysis models that this stream appears on.
Advanced
Advanced Properties	The advanced properties field should only be set by, or under the guidance of, PhixFlow support.
Index Scheme	This determines how indexes on the Stream are organised. There are four possible settings. All: indexes on the Stream are optimised for selecting from all stream sets (non-historied reads). Latest: indexes on the Stream are optimised for selecting from the latest stream set (i.e. for historied reads). Superseded: indexes on the Stream are optimised for self-updating streams which have mostly superseded records. None: no indexes are created on the Stream.
Storage Type	Specifies how data for the Stream should be stored. Database: store the data in a regular table within the PhixFlow database; this is the most common option. Database (Partitioned): store the data in a partitioned table within the PhixFlow database; this option provides improved performance for rollback and archiving of very large Streamsets, and is only available if "partitioning" is available within your database installation. In Memory: data for the Stream will not be written to the database; this option can be used (for example) when you want to aggregate large amounts of unsorted data which can then be written to a stored Stream.
Start Date	The date that this stream starts. Data will be populated into the stream from this date onwards.
In Memory Cache Size	The size of the cache that will be maintained when you are using an in-memory stream.
Allow Partial Set Processing	If ticked, when analysis reaches the end of a buffer block it submits the candidate set for processing, even if the next buffer block has a different key.
Prevent Parallel Processing	This field only appears if the Period is set to Transactional. If ticked, it ensures that only a single stream set can be generated at a time even if the stream receives several concurrent requests to generate data. This can be useful where you want to make sure that two analysis runs don't attempt to update the same records at the same time e.g. as a result of two people selecting the same records in a view and then hitting the same action button at the same time to process those records.
Run Alone	If this flag is ticked then whenever the analysis engine needs to generate data for this stream it will first wait for all running tasks to complete before it starts. Any additional analysis tasks submitted while this stream is waiting to start, or while it is generating data, will wait until this stream has completed its analysis before they start.
Key Tolerances	When building a candidate set, data can be grouped together using a specific key value, e.g. Account Number. If the key value is a number, setting a key tolerance will identify numbers within the given tolerance as the same key value.
Write in Single Transaction	If this is ticked, all of the records for a Stream Set will be written to the database in a single transaction.
Maximum Records to Write	The maximum number of records per Stream Set if Write in Single Transaction is ticked.
Default View	The default view selected for the stream. See help on Views for details of creating views on streams.
Last Run Date (Read only)	The date and time that analysis was last run for this stream. This date is taken from the "to date" of the most recent stream set for this stream.
Last Run By	The user that last ran this stream.
Description
Description	A free text description field for the stream.
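
The Input Multiplier, Output Multiplier and Output Filter fields described in the table above can be pictured as nested loops around the stream calculation. The Python sketch below is an illustration only, not PhixFlow's implementation. The function `run_stream` and its parameters are hypothetical names, the attribute expressions are stood in for by a plain callable, and the special cases for a _NULL or empty Output Multiplier are simplified to a single `None` value.

```python
from typing import Any, Callable, Iterable, List


def run_stream(
    input_multiplier: List[Any],
    read_pipes: Callable[[Any], Iterable[dict]],
    output_multiplier: Callable[[dict], Any],
    build_record: Callable[[dict, Any, Any], dict],
    output_filter: Callable[[dict], Any],
) -> List[dict]:
    """Hypothetical model of one stream run; every name here is illustrative."""
    written = []
    for in_mult in input_multiplier:        # whole stream processing repeats per value
        for item in read_pipes(in_mult):    # pull pipes are read again for each value
            values = output_multiplier(item)
            if not values:                  # simplification: _NULL or empty list
                values = [None]             # still yields a single output record
            for out_mult in values:         # one output record per multiplier value
                record = build_record(item, in_mult, out_mult)
                if output_filter(record):   # written only if the filter is true / 1
                    written.append(record)
    return written


# An Input Multiplier of [3, 4, 7, 8] runs the processing four times.
records = run_stream(
    input_multiplier=[3, 4, 7, 8],
    read_pipes=lambda m: [{"ASSET": "A"}, {"ASSET": None}],
    # Three values when ASSET contains a value, otherwise two.
    output_multiplier=lambda item: [1, 10, 12] if item["ASSET"] is not None else [5, 7],
    # The 'keep' attribute follows the common Output Filter pattern.
    build_record=lambda item, m, t: {
        "inputMultiplier": m,
        "type": t,
        "keep": 0 if t == 12 else 1,
    },
    output_filter=lambda rec: rec["keep"],
)
print(len(records))  # 16
```

In this sketch each of the four passes produces five candidate records (three plus two), and the filter discards the one whose multiplier value is 12, leaving four records per pass.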

Archiving Examples for Full Stream Sets

The table below assumes the stream to be archived currently contains 8 stream sets: two from the current day and one from each of the previous 6 days.
In the table below, the value null means that no value has been entered into the field.
Note that archiving will always retain the maximum active stream sets in the data such that no conflicting stream sets will be archived.

Keep for X Days	Keep for Y StreamSets	Resulting Stream Sets Archived/Retained
null	null	No stream sets will be archived.
0	null	All stream sets will be archived.
1	null	The last day of valid stream sets will be retained. All earlier stream sets will be archived. In our example the 2 latest stream sets will be retained with the earliest 6 stream sets archived.
X	null	All stream sets which are older than X days before the latest valid stream set will be archived.
null	0	All stream sets will be archived.
null	1	The last valid stream set will be retained, all other stream sets will be archived.
null	Y	The most recent Y valid stream sets will be retained, all other stream sets will be archived.
0	0	All stream sets will be archived.
0	1	The last valid stream set will be retained, all other stream sets will be archived.
1	0	The last day of valid stream sets will be retained. All earlier stream sets will be archived.
1	1	The last day of valid stream sets will be retained, regardless of whether there is more than 1. If there are no stream sets in the last day then the first previous stream set will be retained instead.
X	Y	Will retain the maximum active stream sets in the data such that no conflicting stream sets will be archived. If X=3,Y=6 then although X says only archive stream sets more than 3 days old, we must keep a Y minimum of 6 stream sets. Hence the earliest 2 stream sets will be archived and the 6 latest retained. If X=3,Y=1 then although Y says only retain 1 stream set, we must retain all stream sets less than X (3) days old. Hence the earliest 4 stream sets will be archived and the 4 latest retained.
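
The rows of the table above can be reproduced with a small Python sketch. It is an illustration under stated assumptions, not PhixFlow's archiving code. The function `sets_to_archive` is a hypothetical name, every stream set is assumed to be valid, and age is counted in whole calendar days from the 'to' date of the newest stream set.

```python
from datetime import date, timedelta
from typing import List, Optional


def sets_to_archive(
    to_dates: List[date],
    keep_days: Optional[int],
    keep_sets: Optional[int],
) -> List[date]:
    """Return the 'to' dates of the stream sets that would be archived.

    Hypothetical helper: keep_days models Keep for X Days and
    keep_sets models Keep for Y StreamSets (None means not set).
    """
    if not to_dates or (keep_days is None and keep_sets is None):
        return []                                 # neither set: keep indefinitely
    ordered = sorted(to_dates, reverse=True)      # newest stream set first
    newest = ordered[0]
    archived = []
    for rank, to_date in enumerate(ordered):
        # At least X days old, measured from the newest stream set.
        old_enough = keep_days is None or (newest - to_date).days >= keep_days
        # Outside the Y most recent stream sets.
        beyond_count = keep_sets is None or rank >= keep_sets
        if old_enough and beyond_count:           # both must hold when both are set
            archived.append(to_date)
    return archived


today = date(2024, 1, 8)                          # arbitrary example date
# Two stream sets today, plus one on each of the previous six days.
example = [today, today] + [today - timedelta(days=n) for n in range(1, 7)]

print(len(sets_to_archive(example, 1, None)))     # 6
print(len(sets_to_archive(example, 3, 6)))        # 2
print(len(sets_to_archive(example, 3, 1)))        # 4
```

Because a stream set is archived only when it fails both retention rules, setting both fields always retains at least as many stream sets as setting either field alone.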

Archiving Examples for Superseded Stream Sets

In the case where only the Keep Superseded for X Days and Keep Superseded for Y StreamSets fields are populated, the same logic in the table above will apply to the superseded records. Note that again archiving will always retain the maximum superseded stream sets in the data such that no conflicting stream sets will be archived.

Where a mixture of the full archive fields Keep for X Days and Keep for Y StreamSets and the superseded archive fields Keep Superseded for X Days and Keep Superseded for Y StreamSets is populated, the full archive values are applied first and the resulting stream item records are archived and deleted. Only then are the Keep Superseded ... values applied to the remaining stream sets, as a further condition, to archive and delete any remaining non-qualifying superseded records.

Attribute Types

Bigstring

A Bigstring is used for strings over 4000 characters long. Bigstring is a different data type to string and has some restrictions on filtering, sorting and aggregation.

For instances using an Oracle database, Bigstrings cannot be sorted or aggregated. On Oracle, Bigstrings may only be filtered with the conditions (not) contains, is (not) null, (not) starts with or (not) ends with.

The maximum Bigstring size can be configured in System Configuration.

Decimal

Decimal is a non-integer number which is stored to a set level of precision. Decimals have a significant figures property, which is the number of digits stored, and a decimal places property, which is the number of digits after the decimal point.

The maximum number of integer digits is therefore significant figures minus decimal places. If the number of integer digits is greater than the limit, analysis will fail. Decimal places will be stored to the scale specified.

By default, decimals have 10 significant figures and 2 decimal places, and therefore 8 integer digits.
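
The integer digit limit described above is simple arithmetic, sketched here in Python. The function names `integer_digit_limit` and `fits` are hypothetical, and the sketch only checks the number of integer digits; it does not model how PhixFlow stores or rounds the decimal places.

```python
from decimal import Decimal


def integer_digit_limit(significant_figures: int = 10, decimal_places: int = 2) -> int:
    """Maximum integer digits = significant figures minus decimal places."""
    return significant_figures - decimal_places


def fits(value: str, significant_figures: int = 10, decimal_places: int = 2) -> bool:
    """Check whether the integer part of a value is within the digit limit."""
    whole = int(abs(Decimal(value)))              # integer part, sign ignored
    digits = len(str(whole)) if whole else 0      # a zero integer part uses no digits
    return digits <= integer_digit_limit(significant_figures, decimal_places)


print(integer_digit_limit())      # 8
print(fits("12345678.99"))        # True
print(fits("123456789.00"))       # False
```

With the default settings, a value with nine integer digits is the kind of input that would cause analysis to fail.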