
This page is for data modellers. It provides an introduction to tables and pipes.

Overview

In an analysis model, a dataset is represented by a table. A table is a bit like an Excel spreadsheet, in that it contains a set of data with:

  • columns - these are the attributes
  • rows - these are the data records.

In PhixFlow you connect two tables with a pipe. A pipe sends data from the input table to the output table. Pipes also connect other types of modelling object, such as a datasource or file exporter. Usually, a pipe has default settings. This means it passes all attributes and records on to the next object. However, you can use the pipe properties to control which attributes and records from the input object you want to pass through.
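As a rough illustration of what a pipe does, the following Python sketch treats a table as a list of records and a pipe as a function that passes on selected attributes and records. This is not PhixFlow code. The `pipe` function and its `attributes` and `record_filter` parameters are hypothetical names made up for this example.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, object]  # one row: attribute name -> value


def pipe(
    records: Iterable[Record],
    attributes: Optional[List[str]] = None,
    record_filter: Optional[Callable[[Record], bool]] = None,
) -> List[Record]:
    """Pass records from an input table to an output table.

    With the defaults, every attribute and every record is passed on.
    """
    out = []
    for rec in records:
        if record_filter is not None and not record_filter(rec):
            continue  # this record is not passed through
        if attributes is None:
            out.append(dict(rec))
        else:
            out.append({name: rec[name] for name in attributes})
    return out


customers = [
    {"id": 1, "name": "Ann", "country": "UK"},
    {"id": 2, "name": "Bob", "country": "FR"},
]

# Default settings: everything is passed on.
all_data = pipe(customers)

# Restricted pipe: only two attributes, only UK records.
uk_names = pipe(
    customers,
    attributes=["id", "name"],
    record_filter=lambda r: r["country"] == "UK",
)
```

Here `all_data` holds the same records as `customers`, while `uk_names` holds only the `id` and `name` of the UK customer.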

When you run an analysis model, PhixFlow uses information in each object's properties to process the data. This means each table and pipe can transform the data in the analysis model. With each analysis run, the data set in a table can change, so PhixFlow keeps a snapshot for each run. The snapshot is called a recordset. If there is a problem in the analysis run, you can delete the new recordset by rolling back the run. PhixFlow reverts the data to the selected, previous recordset. You can also copy or move data from a recordset.
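The idea of one snapshot per run can be sketched in plain Python. This is only a conceptual model: the `Table` class and its methods are hypothetical, and the sketch rolls back just the most recent run, whereas PhixFlow lets you select which previous snapshot to revert to.

```python
class Table:
    """Hypothetical table that keeps one snapshot per analysis run."""

    def __init__(self):
        self.recordsets = []  # oldest first; each entry is a list of records

    def run_analysis(self, new_records):
        # Each run stores its own snapshot instead of overwriting the last one.
        self.recordsets.append(list(new_records))

    def current(self):
        # The newest snapshot is the data you currently see in the table.
        return self.recordsets[-1] if self.recordsets else []

    def rollback(self):
        # Delete the newest snapshot; the previous one becomes current again.
        if self.recordsets:
            self.recordsets.pop()


t = Table()
t.run_analysis([{"id": 1, "total": 10}])
t.run_analysis([{"id": 1, "total": 999}])  # suppose this run went wrong
t.rollback()
```

After the rollback, `t.current()` returns the data from the first run.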

To look at the data in a table, you use a view. The default view shows data in a grid. You can also create different views such as graphs and charts. View properties have many options to control which attributes are included in the view, and how to sort the records.




Other links

See also: Importing and Exporting Data

Types of Stream

There are several types of stream.

Calculate Stream

Calculate streams are the most basic type. An output record will be produced for each input record.

Merge Stream

Merge streams combine sets of input data. In each input pipe a grouping is defined, and an output record is produced for each key value combination that is produced by this grouping applied across all inputs.
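To make the grouping behaviour concrete, here is a minimal Python sketch of a merge over two inputs. It is not PhixFlow code: the `merge` function, the pipe names, and the sample data are all hypothetical. The sketch also assumes that a key value found in any one input produces an output record, even if the other inputs have no matching records.

```python
from collections import defaultdict


def merge(inputs, key_funcs):
    """Group every input by its key and build one output per distinct key.

    inputs    -- dict of pipe name -> list of records
    key_funcs -- dict of pipe name -> function giving a record's key
    """
    groups = defaultdict(lambda: {name: [] for name in inputs})
    for name, records in inputs.items():
        for rec in records:
            groups[key_funcs[name](rec)][name].append(rec)
    # One output record per key value found in any input.
    return [{"key": key, **by_pipe} for key, by_pipe in sorted(groups.items())]


orders = [
    {"cust": "A", "amount": 5},
    {"cust": "A", "amount": 7},
    {"cust": "B", "amount": 1},
]
payments = [{"customer": "A", "paid": 12}, {"customer": "C", "paid": 3}]

merged = merge(
    {"orders": orders, "payments": payments},
    {"orders": lambda r: r["cust"], "payments": lambda r: r["customer"]},
)
```

The result has three output records, one each for keys `A`, `B` and `C`. Each output record carries the group of matching records from every input, which may be empty.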

Aggregate Stream

Aggregate streams aggregate input data. In the input pipe a grouping is defined, and an output record is produced for each key value combination that is produced by the grouping.

Tip

Simple aggregations are better performed using aggregate pipes.

Aggregate streams are functionally identical to merge streams but, by convention, an aggregate stream is used when there is only one input. It has its own icon on the model view, which often helps to clarify the purpose of the stream in the model.
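A single-input aggregation corresponds to an ordinary group-by. The Python sketch below uses made-up call records and attribute names to show one output record per key value. It illustrates the concept only and does not use any PhixFlow API.

```python
from itertools import groupby
from operator import itemgetter

calls = [
    {"account": "A", "minutes": 3},
    {"account": "B", "minutes": 10},
    {"account": "A", "minutes": 4},
]

# groupby only groups adjacent items, so sort by the grouping key first.
by_account = sorted(calls, key=itemgetter("account"))

totals = []
for account, group in groupby(by_account, key=itemgetter("account")):
    records = list(group)
    # One output record for each key value produced by the grouping.
    totals.append(
        {
            "account": account,
            "calls": len(records),
            "minutes": sum(r["minutes"] for r in records),
        }
    )
```

Three input records become two output records, one for account `A` and one for account `B`.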

Cartesian Stream

Cartesian streams perform a cartesian join across all inputs. Although this can be useful in some cases, it is usually simpler to multiply output records either with an output multiplier, which can be configured for any stream type, or with a multiplier pipe.
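For readers unfamiliar with the term, a cartesian join pairs every record from one input with every record from each other input. The Python sketch below shows the effect with two small, made-up inputs. It is an illustration of the join itself, not of how PhixFlow implements it.

```python
from itertools import product

products = [{"sku": "P1"}, {"sku": "P2"}]
regions = [{"region": "North"}, {"region": "South"}, {"region": "West"}]

# Every record from one input is paired with every record from the other.
combined = [{**p, **r} for p, r in product(products, regions)]
```

Two product records and three region records give six output records, so the output grows quickly as inputs get larger.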

CalculateBySet Stream

Calculate by Set streams are like calculate streams in that an output record is produced for each input record. In addition, you can configure a grouping on the input pipe so that, for each record processed, related rows can be included in calculations.

For details about when to use each type of stream, see Which Stream to Use and the linked pages.
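The difference from an aggregation is that the number of records does not shrink: each input record still yields one output record, but its calculation can look at the other rows in its group. A minimal Python sketch, using hypothetical sales data and attribute names rather than PhixFlow expressions:

```python
from collections import defaultdict

sales = [
    {"region": "North", "rep": "Ann", "amount": 30},
    {"region": "North", "rep": "Bob", "amount": 70},
    {"region": "South", "rep": "Cy", "amount": 50},
]

# Grouping on the input: collect the related rows for each region.
by_region = defaultdict(list)
for rec in sales:
    by_region[rec["region"]].append(rec)

# One output record per input record, calculated with access to its set.
output = []
for rec in sales:
    region_total = sum(r["amount"] for r in by_region[rec["region"]])
    output.append({**rec, "share_of_region": rec["amount"] / region_total})
```

The output has the same three records as the input, each extended with its share of the regional total.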


Streams and Time Periods

At any given moment, a stream contains a set of records. This is the record-set. You can set the time period over which PhixFlow collects records. The available periods are listed in Table Properties.

For non-transactional periods, PhixFlow checks for incomplete record-sets and reports an error if it finds them. However, pipes from transactional streams allow incomplete record-sets, as data is constantly changing.

Publishing Streams

When you make changes to a stream's properties or its attributes, PhixFlow publishes the changes to the PhixFlow database. This happens automatically in the background. Publishing many streams or streams with many attributes can take some time, and may slow performance.

If the stream properties are set incorrectly, PhixFlow will not be able to publish the stream to the database. If this happens, the console will report the publishing error. PhixFlow will also display an error message if you try to interact with the stream, for example to view its data or to run analysis. You must correct the stream properties so that PhixFlow can retry publishing the stream.

During the publishing process, PhixFlow may create temporary streams in its database. These are kept for a period, then automatically removed when a system task runs.