Deleting Old Data
Overview
Every time you run analysis on an analysis model, data is processed to create a new recordset. By default all previous recordsets are retained. It is important to configure the criteria PhixFlow uses to delete old recordsets that are no longer required.
Configuring Data Retention or Deletion
When a data modeller creates a table, the properties include the Data Retention Settings. The data modeller can specify the:
- number of days to keep recordsets and superseded records
- number of recordsets, or recordsets with superseded records, to keep.
The older data is deleted when you run a table-data-delete task that acts on the table; see Using Tasks and Task Plans and Task.
You can set up a table-data-delete task with:
- either one or more specific tables. You may choose to do this, for example:
  - to manage the data in several related tables
  - to run frequently on a table that contains a large amount of data
  - to run occasionally on tables with low volumes of data that change rarely.
- or with the All Tables option ticked. The task will run on all tables:
  - that are not in another table-data-delete task
  - and have Data Retention Settings configured.
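The All Tables rule above can be sketched in Python. This is only an illustration of the two conditions, not PhixFlow code: `TableInfo`, `all_tables_targets`, the field names and the table names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableInfo:
    """Hypothetical stand-in for a table and its Data Retention Settings."""
    name: str
    keep_days: Optional[int] = None        # None models "no value entered"
    keep_recordsets: Optional[int] = None  # None models "no value entered"

    def has_retention_settings(self) -> bool:
        return self.keep_days is not None or self.keep_recordsets is not None


def all_tables_targets(tables, names_in_other_tasks):
    """Names of the tables an All Tables task would act on, per the two rules above."""
    return [
        t.name
        for t in tables
        if t.name not in names_in_other_tasks and t.has_retention_settings()
    ]


tables = [
    TableInfo("Orders", keep_days=30),
    TableInfo("Invoices", keep_recordsets=5),
    TableInfo("Lookup"),  # no retention settings, so it is skipped
]
# "Orders" already belongs to a task that names specific tables.
targets = all_tables_targets(tables, names_in_other_tasks={"Orders"})
print(targets)
```

In this sketch only `Invoices` qualifies: `Orders` is handled by another task and `Lookup` has no retention settings.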
Managing Recordsets Example
The following table shows the different combinations of settings in the table properties → Data Retention Settings. It assumes that a table currently contains 8 recordsets:
- 2 from the current day
- 1 from each of the previous 6 days.
The values are:
- N: a number of days
- X: a number of recordsets
- null: indicates no value has been entered for this option.
When both options have a value, PhixFlow applies whichever one retains more active and superseded recordsets, so that a recordset one option would keep is never deleted by the other.
Delete After N Days | Keeping Latest X Recordsets | Action taken when table-data-delete task runs |
---|---|---|
null | null | All recordsets are retained. |
0 | null | All recordsets are deleted. |
1 | null | The valid recordsets from the last day are retained. All earlier recordsets are deleted. In our example, the 2 most recent recordsets are retained and the 6 older recordsets are deleted. |
N | null | All recordsets older than N days before the latest valid recordset are deleted. |
null | 0 | All recordsets are deleted. |
null | 1 | The most recent valid recordset is retained. All other recordsets are deleted. |
null | X | The most recent X valid recordsets are retained. All other recordsets are deleted. |
0 | 0 | All recordsets are deleted. |
0 | 1 | The most recent valid recordset is retained. All other recordsets are deleted. |
1 | 0 | The valid recordsets from the last day are retained. All older recordsets are deleted. |
1 | 1 | The valid recordsets from the last day are retained, even if there is more than 1. If there are no recordsets in the last day, the most recent earlier recordset is retained instead. |
N | X | Whichever option retains more active recordsets is applied, so that no recordset one option would keep is deleted by the other. |
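
The rows above can be read as "keep a recordset if either option would keep it". A minimal Python sketch of that reading follows. It is an interpretation of the table, not PhixFlow's implementation: the function `recordsets_to_keep` and its parameters are hypothetical, recordsets are reduced to calendar dates, and every input is assumed to be a valid recordset. Days are counted back from the latest recordset unless a `reference` date is passed in.

```python
from datetime import date, timedelta
from typing import Optional


def recordsets_to_keep(
    recordset_dates: list[date],
    keep_days: Optional[int],
    keep_count: Optional[int],
    reference: Optional[date] = None,
) -> list[date]:
    """Return the dates of the recordsets that survive, newest first.

    None models a null setting. Assumption: a recordset survives if
    either setting would keep it.
    """
    newest_first = sorted(recordset_dates, reverse=True)
    if keep_days is None and keep_count is None:
        return newest_first  # nothing configured: retain everything
    if not newest_first:
        return []
    ref = reference or newest_first[0]
    kept = set()  # indexes into newest_first
    if keep_days is not None:
        cutoff = ref - timedelta(days=keep_days)
        kept |= {i for i, d in enumerate(newest_first) if d > cutoff}
    if keep_count is not None:
        kept |= set(range(min(keep_count, len(newest_first))))
    return [newest_first[i] for i in sorted(kept)]


# The example from this page: 2 recordsets today, 1 on each of the previous 6 days.
today = date(2024, 1, 8)  # arbitrary illustrative date
example = [today, today] + [today - timedelta(days=i) for i in range(1, 7)]

for days, count in [(None, None), (0, None), (1, None), (None, 1), (1, 1), (3, 5)]:
    kept = recordsets_to_keep(example, days, count)
    print(days, count, "->", len(kept), "kept,", len(example) - len(kept), "deleted")
```

Passing a `reference` later than the newest recordset models the 1 | 1 case where the last day has no recordsets: the day window keeps nothing, and the recordset count keeps the most recent earlier recordset.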
Superseded Recordsets
If only the Keep Superseded for N Days and Keep Superseded for X Recordsets fields are populated, the logic in the table above applies to the superseded records.
If a table has values for all the properties:
- Keep for N Days
- Keep for X Recordsets
- Keep Superseded for N Days
- Keep Superseded for X Recordsets
then the values for recordsets are applied first, deleting full recordsets. Then the table values for superseded records are applied to delete any remaining superseded records.
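The order of the two passes can be sketched in Python. This is a simplified illustration under stated assumptions, not PhixFlow's implementation: `StoredRecord` and `apply_retention` are hypothetical, only the recordset-count settings are modelled, and a superseded record is assumed to be counted against the recordset it belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredRecord:
    """Hypothetical record: a recordset sequence number (higher is newer) and a flag."""
    recordset: int
    superseded: bool


def apply_retention(records, keep_recordsets, keep_superseded_recordsets):
    # Pass 1: recordset settings. Whole recordsets outside the newest
    # `keep_recordsets` are deleted, active and superseded records alike.
    ids_newest_first = sorted({r.recordset for r in records}, reverse=True)
    kept_sets = set(ids_newest_first[:keep_recordsets])
    survivors = [r for r in records if r.recordset in kept_sets]

    # Pass 2: superseded settings. Among the survivors, superseded records
    # outside the newest `keep_superseded_recordsets` are deleted.
    surviving_ids = sorted({r.recordset for r in survivors}, reverse=True)
    kept_superseded_sets = set(surviving_ids[:keep_superseded_recordsets])
    return [
        r for r in survivors
        if not r.superseded or r.recordset in kept_superseded_sets
    ]


# Four recordsets, each holding one active and one superseded record.
records = [StoredRecord(rs, sup) for rs in range(1, 5) for sup in (False, True)]
result = apply_retention(records, keep_recordsets=3, keep_superseded_recordsets=1)
for r in result:
    print(r)
```

Here the first pass removes recordset 1 entirely, and the second pass then removes the superseded records from recordsets 2 and 3, leaving their active records in place.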