This page is for data modellers or administrators who need to manage stream data retention and deletion.

How to Configure Stream Data Retention and Deletion

When a data modeller creates a stream, they should set up the Data Retention Settings to specify the:

  • number of days to keep stream sets and superseded records
  • number of stream sets or stream sets with superseded records to keep.

The older data is deleted when you run a stream-data-delete task that acts on the stream; see 

When you set up a stream-data-delete task, it must have either:

  • one or more streams added, for example:
    • to manage the data in several related streams
    • to run frequently on a stream that contains a large amount of data
    • to run occasionally to manage streams with low volumes of data that change rarely
  • or the All Streams option ticked, in which case the task runs on all streams.

Examples

Full Stream Sets

The table below assumes that the stream to be archived currently contains 8 stream sets: two from the current day and one from each of the previous 6 days.
In the table, null means that no value has been entered in the field.
Note that when the two settings disagree, archiving always retains the larger number of active stream sets, so that no stream set that either setting requires to be kept is archived.

| Archive After X Days | Keeping Latest Y StreamSets | Resulting Streams Archived/Retained |
|---|---|---|
| null | null | No stream sets will be archived. |
| 0 | null | All stream sets will be archived. |
| 1 | null | The last day of valid stream sets will be retained. All earlier stream sets will be archived. In our example the 2 latest stream sets will be retained and the earliest 6 stream sets archived. |
| X | null | All stream sets that are more than X days older than the latest valid stream set will be archived. |
| null | 0 | All stream sets will be archived. |
| null | 1 | The last valid stream set will be retained. All other stream sets will be archived. |
| null | Y | The most recent Y valid stream sets will be retained. All other stream sets will be archived. |
| 0 | 0 | All stream sets will be archived. |
| 0 | 1 | The last valid stream set will be retained. All other stream sets will be archived. |
| 1 | 0 | The last day of valid stream sets will be retained. All earlier stream sets will be archived. |
| 1 | 1 | The last day of valid stream sets will be retained, even if there is more than 1. If there are no stream sets in the last day, the first previous stream set will be retained instead. |
| X | Y | Will retain the maximum active stream sets in the data such that no conflicting stream sets will be archived. If X=3, Y=6 then, although X says to archive only stream sets more than 3 days old, we must keep a minimum of Y (6) stream sets. Hence the earliest 2 stream sets will be archived and the 6 latest retained. If X=3, Y=1 then, although Y says to retain only 1 stream set, we must retain all stream sets less than X (3) days old. Hence the earliest 4 stream sets will be archived and the 4 latest retained. |
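The rule that the table describes can be sketched in a few lines of Python. This is only an illustrative model, not the product's implementation: the function name split_stream_sets and its parameters are hypothetical, each stream set is represented only by its date, and the sketch assumes that every stream set is valid and that age is measured in whole days back from the latest stream set.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple


def split_stream_sets(
    set_dates: List[date],
    archive_after_days: Optional[int],  # "Archive After X Days"; None means null
    keep_latest: Optional[int],         # "Keeping Latest Y StreamSets"; None means null
) -> Tuple[List[date], List[date]]:
    """Hypothetical model: return (retained, archived) dates, newest first."""
    ordered = sorted(set_dates, reverse=True)

    # Both fields empty: nothing is archived.
    if archive_after_days is None and keep_latest is None:
        return ordered, []
    if not ordered:
        return [], []

    # Number of stream sets that X alone would keep (assumed: age is
    # counted in whole days back from the latest stream set).
    by_age = 0
    if archive_after_days is not None:
        cutoff = ordered[0] - timedelta(days=archive_after_days)
        by_age = sum(1 for d in ordered if d > cutoff)

    # Number of stream sets that Y alone would keep.
    by_count = keep_latest or 0

    # Retain the larger of the two, so neither setting is violated.
    keep = min(len(ordered), max(by_age, by_count))
    return ordered[:keep], ordered[keep:]


# The example stream: 2 stream sets today, 1 on each of the previous 6 days.
today = date(2024, 1, 10)  # arbitrary example date
example = [today, today] + [today - timedelta(days=n) for n in range(1, 7)]

for x, y in [(None, None), (1, None), (3, 6), (3, 1)]:
    retained, archived = split_stream_sets(example, x, y)
    print(x, y, len(retained), len(archived))
# prints:
# None None 8 0
# 1 None 2 6
# 3 6 6 2
# 3 1 4 4
```

The last two lines of output correspond to the X=3, Y=6 and X=3, Y=1 cases described in the table.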


Superseded Stream Sets

When only the Keep Superseded for X Days and Keep Superseded for Y StreamSets fields are populated, the logic in the table above applies to the superseded records. Again, archiving always retains the larger number of superseded stream sets when the two settings disagree, so that no stream set that either setting requires to be kept is archived.

...