PhixFlow Help

Managing Stream Data

This page is for data modellers or administrators who need to manage stream data retention and deletion.

How to Configure Stream Data Retention and Deletion

When a data modeller creates a stream, they should set up the Data Retention Settings to specify the:

  • number of days to keep stream sets and superseded records
  • number of stream sets or stream sets with superseded records to keep.

The older data is deleted when you run a stream-data-delete task that acts on the stream; see Using Tasks and Task Plans and Task.

You can set up a stream-data-delete task in one of two ways:

  • For one or more specific streams. You may choose to do this, for example:
    • to manage the data in several related streams
    • to run frequently on a stream that contains a large amount of data
    • to run occasionally on streams with low volumes of data that change rarely.
  • With the All Streams option ticked. The task then runs on all streams that:
    • are not in another stream-data-delete task
    • have Data Retention Settings configured.
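
The scope rules above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not PhixFlow code: the `Stream` and `DeleteTask` classes, the `streams_for_task` function and the stream and task names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    name: str
    has_retention_settings: bool  # True if Data Retention Settings are configured


@dataclass
class DeleteTask:
    name: str
    all_streams: bool = False                    # the All Streams option
    streams: list = field(default_factory=list)  # names of specific streams


def streams_for_task(task, all_tasks, stream_list):
    """Return the names of the streams a stream-data-delete task would act on."""
    if not task.all_streams:
        return list(task.streams)
    # Streams named in any other stream-data-delete task are excluded.
    claimed = {
        name
        for other in all_tasks
        if other is not task
        for name in other.streams
    }
    # Only streams with Data Retention Settings configured are included.
    return [
        s.name
        for s in stream_list
        if s.has_retention_settings and s.name not in claimed
    ]


# Hypothetical set-up: one task for a large stream, one catch-all task.
streams = [
    Stream("orders", True),
    Stream("invoices", True),
    Stream("audit", False),
    Stream("archive", True),
]
big_stream_task = DeleteTask("delete-orders", streams=["orders"])
catch_all_task = DeleteTask("delete-everything-else", all_streams=True)
tasks = [big_stream_task, catch_all_task]

print(streams_for_task(big_stream_task, tasks, streams))
print(streams_for_task(catch_all_task, tasks, streams))
```

In this sketch the catch-all task skips "orders" because another task already handles it, and skips "audit" because it has no Data Retention Settings.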

Managing Stream Sets

The following table shows the different combinations of settings in stream properties → Data Retention Settings. It assumes that a stream currently contains 8 stream sets:

  • 2 from the current day 
  • 1 from each of the previous 6 days.

The values are:

  • N: a number of days
  • X: a number of stream sets
  • null: indicates no value has been entered for this option.

PhixFlow always retains the maximum number of active and superseded stream sets in the data, so that no conflicting stream sets will be deleted.

Delete After N Days | Keeping Latest X StreamSets | Action taken when stream-data-delete task runs
null | null | All stream sets are retained.
0 | null | All stream sets are deleted.
1 | null | The last day of valid stream sets is retained. All earlier stream sets are deleted. In our example, the 2 most recent stream sets are retained and the 6 older stream sets are deleted.
N | null | All stream sets older than N days before the latest valid stream set are deleted.
null | 0 | All stream sets are deleted.
null | 1 | The most recent valid stream set is retained; all other stream sets are deleted.
null | X | The most recent X valid stream sets are retained; all other stream sets are deleted.
0 | 0 | All stream sets are deleted.
0 | 1 | The most recent valid stream set is retained; all other stream sets are deleted.
1 | 0 | The last day of valid stream sets is retained. All older stream sets are deleted.
1 | 1 | The last day of valid stream sets is retained, regardless of whether it contains more than 1 stream set. If there are no stream sets in the last day, the most recent earlier stream set is retained instead.
N | X | Retains the maximum number of active stream sets in the data, such that no conflicting stream sets are deleted. If N=3 and X=6 then, although N says to delete stream sets more than 3 days old, a minimum of 6 stream sets must be kept. Hence the oldest 2 stream sets are deleted and the 6 most recent are retained. If N=3 and X=1 then, although X says to retain only 1 stream set, all stream sets less than 3 days old must be retained. Hence the 4 oldest stream sets are deleted and the 4 most recent are retained.
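
The combinations in the table can be checked with a short Python sketch. It is a reading of the table, not PhixFlow's implementation: the function name `retained_stream_sets` is hypothetical, `None` stands in for null, age is assumed to be counted in calendar days back from the latest stream set, and the sketch does not model whether a stream set is valid.

```python
from datetime import datetime, timedelta


def retained_stream_sets(created, keep_days=None, keep_count=None):
    """Return the creation times of the stream sets that survive, newest first.

    None stands for "null" in the table (no value entered).
    """
    ordered = sorted(created, reverse=True)
    if keep_days is None and keep_count is None:
        return ordered  # nothing configured: every stream set is retained

    keep = set()  # indices into `ordered`
    if keep_days is not None and ordered:
        # Assumption: age is counted in calendar days back from the latest stream set.
        cutoff = ordered[0].date() - timedelta(days=keep_days)
        keep.update(i for i, ts in enumerate(ordered) if ts.date() > cutoff)
    if keep_count is not None:
        keep.update(range(min(keep_count, len(ordered))))
    # A stream set survives if either setting says to keep it.
    return [ordered[i] for i in sorted(keep)]


# The example stream: 2 stream sets today, 1 on each of the previous 6 days.
today = datetime(2024, 1, 8, 9, 0)  # arbitrary illustrative date
example = [today, today + timedelta(hours=6)]
example += [today - timedelta(days=d) for d in range(1, 7)]

for n, x in [(None, None), (0, None), (1, None), (None, 1), (3, 6), (3, 1)]:
    kept = retained_stream_sets(example, n, x)
    print(f"N={n}, X={x}: {len(kept)} retained, {len(example) - len(kept)} deleted")
```

For the example stream, this sketch retains 6 stream sets for N=3, X=6 and 4 stream sets for N=3, X=1, which agrees with the two worked cases in the table.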


Superseded Stream Sets

If only the Keep Superseded for N Days and Keep Superseded for X StreamSets fields are populated, the logic in the table above applies to the superseded records.

If a stream has values for all of the properties:

  • Keep for N Days
  • Keep for X StreamSets
  • Keep Superseded for N Days
  • Keep Superseded for X StreamSets 

then the values for stream sets are applied first, deleting whole stream sets. The values for superseded records are then applied to delete any remaining superseded records.
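
The order of the two passes can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `StreamSet` class, the `keep_indices` and `apply_retention` functions, the use of an age in days instead of a timestamp, and the reading that the superseded count applies only to stream sets that still hold superseded records.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StreamSet:
    age_days: int    # 0 = same day as the newest stream set
    superseded: int  # number of superseded records it still holds


def keep_indices(ages, days, count):
    """Indices (newest first) kept by a days/count pair; None means null."""
    if days is None and count is None:
        return set(range(len(ages)))
    keep = set()
    if days is not None:
        keep |= {i for i, age in enumerate(ages) if age < days}
    if count is not None:
        keep |= set(range(min(count, len(ages))))
    return keep


def apply_retention(stream_sets, keep_days, keep_sets, sup_days, sup_sets):
    ordered = sorted(stream_sets, key=lambda s: s.age_days)
    # Pass 1: the stream set values delete whole stream sets.
    kept = keep_indices([s.age_days for s in ordered], keep_days, keep_sets)
    survivors = [s for i, s in enumerate(ordered) if i in kept]
    # Pass 2: the superseded values act only on the stream sets that are left.
    with_sup = [i for i, s in enumerate(survivors) if s.superseded > 0]
    kept_sup = keep_indices(
        [survivors[i].age_days for i in with_sup], sup_days, sup_sets
    )
    purge = {i for pos, i in enumerate(with_sup) if pos not in kept_sup}
    return [
        replace(s, superseded=0) if i in purge else s
        for i, s in enumerate(survivors)
    ]


# 8 stream sets (2 today, 1 per previous day), each with 5 superseded records.
example = [StreamSet(age, 5) for age in (0, 0, 1, 2, 3, 4, 5, 6)]
result = apply_retention(example, keep_days=None, keep_sets=6, sup_days=None, sup_sets=2)
for s in result:
    print(s)
```

With these hypothetical values, the first pass removes the 2 oldest stream sets entirely, and the second pass clears the superseded records from all but the 2 most recent of the 6 survivors.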

