Deleting Old Data

Overview

Every time you run analysis on an analysis model, data is processed to create a new recordset. By default all previous recordsets are retained. It is important to configure the criteria PhixFlow uses to delete old recordsets that are no longer required.

Configuring Data Retention or Deletion

When a data modeller creates a table, the properties include the Data Retention Settings. The data modeller can specify the:

  • number of days to keep recordsets and superseded records
  • number of recordsets, or recordsets with superseded records, to keep.

The older data is deleted when you run a table-data-delete task that acts on the table; see Using Tasks and Task Plans and Task.

You can set up a table-data-delete task to act on:

  • one or more specific tables. You may choose to do this, for example:
    • to manage the data in several related tables
    • to run frequently on a table that contains a large amount of data
    • to run occasionally on tables with low volumes of data that change rarely.
  • all tables, by ticking the All Tables option. The task then runs on all tables that:
    • are not in another table-data-delete task
    • have Data Retention Settings configured.
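
The All Tables rule can be read as a simple filter over the tables. The following Python sketch is illustrative only: the Table class, its fields and the all_tables_scope function are hypothetical names invented for this example and are not part of PhixFlow.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Table:
    # Hypothetical stand-in for a PhixFlow table; not a PhixFlow API.
    name: str
    has_retention_settings: bool


def all_tables_scope(tables: List[Table], in_other_tasks: Set[str]) -> List[str]:
    """Return the names of the tables an All Tables task would act on."""
    return [
        t.name
        for t in tables
        # Skip tables already handled by another table-data-delete task
        # and tables with no Data Retention Settings.
        if t.name not in in_other_tasks and t.has_retention_settings
    ]


tables = [
    Table("Customers", True),
    Table("Invoices", True),   # assumed to belong to another delete task
    Table("Audit", False),     # no Data Retention Settings configured
]
print(all_tables_scope(tables, in_other_tasks={"Invoices"}))
```

In this sketch only Customers is selected: Invoices is excluded because another task already covers it, and Audit is excluded because it has no retention settings.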

Managing Recordsets Example

The following table shows the different combinations of settings in the table properties → Data Retention Settings. It assumes that a table currently contains 8 recordsets:

  • 2 from the current day 
  • 1 from each of the previous 6 days.

The values are:

  • N: a number of days
  • X: a number of recordsets
  • null: indicates no value has been entered for this option.

Where the settings overlap, PhixFlow always retains the maximum number of active and superseded recordsets, so that a recordset that any setting requires to be kept is not deleted.

Delete After N Days | Keeping Latest X Recordsets | Action taken when table-data-delete task runs
null | null | All recordsets are retained.
0 | null | All recordsets are deleted.
1 | null | The valid recordsets from the last day are retained. All earlier recordsets are deleted. In our example, the 2 most recent recordsets are retained and the 6 older recordsets are deleted.
N | null | All recordsets older than N days before the latest valid recordset are deleted.
null | 0 | All recordsets are deleted.
null | 1 | The most recent valid recordset is retained. All other recordsets are deleted.
null | X | The most recent X valid recordsets are retained. All other recordsets are deleted.
0 | 0 | All recordsets are deleted.
0 | 1 | The most recent valid recordset is retained. All other recordsets are deleted.
1 | 0 | The valid recordsets from the last day are retained. All older recordsets are deleted.
1 | 1 | The valid recordsets from the last day are retained, even if there is more than 1. If there are no recordsets in the last day, the most recent earlier recordset is retained instead.
N | X | PhixFlow retains the maximum number of active recordsets that the two settings allow, such that no conflicting recordsets are deleted. See the two cases that follow.

If N=3 and X=6, N alone would keep only the recordsets less than 3 days old, but X requires a minimum of 6 recordsets to be kept. Hence the 2 oldest recordsets are deleted and the 6 most recent are retained.

If N=3 and X=1, X alone would retain only 1 recordset, but N requires all recordsets less than 3 days old to be kept. Hence the 4 oldest recordsets are deleted and the 4 most recent are retained.
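
The rows of the table can be summarised as "keep a recordset if either setting requires it, and keep everything if neither setting has a value". The following Python sketch models that reading against the 8-recordset example. It is a simplified illustration, not PhixFlow's implementation: the function name retained_recordsets and its parameters are hypothetical, and it assumes that age is counted in whole days, with 0 being the day of the latest valid recordset.

```python
from typing import List, Optional


def retained_recordsets(
    ages: List[int],
    keep_days: Optional[int],
    keep_recordsets: Optional[int],
) -> List[int]:
    """Return the indexes of the recordsets that are kept.

    ages lists the age in days of each recordset, newest first.
    None plays the role of "null" (no value entered) in the table.
    """
    if keep_days is None and keep_recordsets is None:
        return list(range(len(ages)))  # nothing configured: keep everything

    keep = set()
    if keep_days is not None:
        # Keep recordsets less than keep_days days old.
        keep |= {i for i, age in enumerate(ages) if age < keep_days}
    if keep_recordsets is not None:
        # Keep the most recent keep_recordsets recordsets.
        keep |= set(range(min(keep_recordsets, len(ages))))
    return sorted(keep)


# The example table: 2 recordsets from the current day, 1 from each of
# the previous 6 days.
AGES = [0, 0, 1, 2, 3, 4, 5, 6]

for n, x in [(None, None), (1, None), (None, 1), (3, 6), (3, 1)]:
    kept = retained_recordsets(AGES, n, x)
    print(f"N={n}, X={x}: {len(kept)} retained, {len(AGES) - len(kept)} deleted")
```

Under these assumptions the sketch retains 6 recordsets for N=3 and X=6, and 4 recordsets for N=3 and X=1, which matches the two worked cases above.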


Superseded Recordsets

If only the Keep Superseded for N Days and Keep Superseded for X Recordsets fields are populated, the logic in the table above applies to the superseded records.

If a table has values for all of the properties:

  • Keep for N Days
  • Keep for X Recordsets
  • Keep Superseded for N Days
  • Keep Superseded for X Recordsets 

then the values for recordsets are applied first, deleting full recordsets. Then the table values for superseded records are applied to delete any remaining superseded records.