Data Held on PhixFlow

The PhixFlow system comprises a database and system files, which hold data as follows:

Database

  • Configuration and system state data
  • Analysis results
  • Application data
  • Alarm and task status
  • PhixFlow and application audit trail
  • PhixFlow task logging information
  • Database logs

File system

  • PhixFlow software
  • Files transferred to PhixFlow for processing
  • Files exported by PhixFlow
  • Archive data exported by PhixFlow
  • PhixFlow system logs
  • Template files, image files, files uploaded to be shown in PhixFlow applications ("attachments")

Backup Options

Backup requirements depend on the level of risk that is acceptable and the role that PhixFlow fulfils:

Snapshot

In a “standalone” role, in which PhixFlow does not update other business systems, a simple daily snapshot backup of the database may be sufficient. The risk of this approach is that, after a catastrophic machine failure, PhixFlow loses all new configuration data entered that day, plus state information and any new analysis results. To recover after a database restore, the new configuration data must be re-entered manually and the analysis tasks rerun. This is possible provided the source data is still available and import files are still in their import directories.

Hot Backup

More typically, PhixFlow is used to update other business systems, either directly (e.g. by calling APIs or updating / inserting data via SQL) or indirectly (e.g. by passing files or creating jobs on external work queues). In this case, losing a day’s worth of data would require remedial action on the other systems, so the risk of a simple backup is too great. PhixFlow recommends that a hot backup mechanism is put in place, and that redo logs and archived redo logs are kept for long enough to recover completely from a catastrophic machine failure. When doing a hot backup, remember to place the database into backup mode to ensure complete blocks are written to redo.

Backup Volumes

The total backup storage requirement for one week using the hot backup approach on a standard-sized system (i.e. a tablespace size of 100 GB) is given below:

Item                      Volume    Weekly Volume
Full backup               100 GB    100 GB
Daily redo log archive    20 GB     140 GB
Total                               240 GB

A recommended backup strategy is to retain data as follows:

Item                                 Volume
Current week archive logs            140 GB
Current week full backup             100 GB
Prior week (week -1) archive logs    140 GB
Prior week (week -1) full backup     100 GB
Week -2 full backup                  100 GB
Week -3 full backup                  100 GB
Total                                680 GB

This strategy enables the administrator to restore the database to any point within at least the last week, and provides weekly restore points for the previous 4 weeks. For larger systems, these values should be provided by the sizing process.
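The arithmetic behind the two tables above can be reproduced with a short calculation. The following is a minimal sketch only: the class and method names are hypothetical, and the defaults (four weekly full backups and two weeks of archive logs) are taken from the retention table rather than from any PhixFlow tool.

```python
from dataclasses import dataclass

DAYS_PER_WEEK = 7


@dataclass(frozen=True)
class HotBackupSizing:
    """Illustrative sizing inputs; names are assumptions, not PhixFlow settings."""

    full_backup_gb: int  # size of one full backup (the tablespace size)
    daily_redo_gb: int   # archived redo log volume generated per day

    def weekly_redo_gb(self) -> int:
        # One week of daily redo log archives.
        return self.daily_redo_gb * DAYS_PER_WEEK

    def weekly_total_gb(self) -> int:
        # One full backup plus one week of redo log archives.
        return self.full_backup_gb + self.weekly_redo_gb()

    def retained_total_gb(self, full_backup_weeks: int = 4, redo_weeks: int = 2) -> int:
        # Storage needed to keep several weekly full backups and
        # several weeks of archive logs at the same time.
        return (full_backup_weeks * self.full_backup_gb
                + redo_weeks * self.weekly_redo_gb())


standard = HotBackupSizing(full_backup_gb=100, daily_redo_gb=20)
print(standard.weekly_total_gb())    # 240
print(standard.retained_total_gb())  # 680
```

Substituting the figures from your own sizing process for the 100 GB and 20 GB inputs gives a first estimate for a larger system.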

...

At the planning phase of the deployment, determine the backup options based on:

  • Recovery Time Objective (RTO): how long can the recovery take (for example, 5 minutes or 1 hour)?
  • Recovery Point Objective (RPO): to what point must data be recovered? For example, point-in-time recovery (PITR) or recovery to the overnight backup. For the latter, can you accept losing the data entered today?

For more details on this topic see: https://en.wikipedia.org/wiki/Disaster_recovery_and_business_continuity_auditing

Application server backups

  • Perform an application server backup each time PhixFlow is upgraded.
  • Alternatively, to simplify the process, take weekly application server backups and delete old backups after N weeks.
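The "delete old backups after N weeks" rule can be sketched as a small selection function. This is an illustration under stated assumptions: the function name, the backup names and the dates are all hypothetical, and the function only reports which backups are past the retention period, leaving the actual deletion to whatever backup tooling you use.

```python
from datetime import datetime, timedelta


def expired_backups(backups: dict[str, datetime], now: datetime, keep_weeks: int) -> list[str]:
    """Return the names of backups taken more than keep_weeks weeks before now, oldest first."""
    cutoff = now - timedelta(weeks=keep_weeks)
    old = [(taken, name) for name, taken in backups.items() if taken < cutoff]
    return [name for taken, name in sorted(old)]


# Hypothetical backup catalogue: name -> time the backup was taken.
now = datetime(2024, 3, 1)
backups = {
    "app-2024-01-05": datetime(2024, 1, 5),
    "app-2024-02-09": datetime(2024, 2, 9),
    "app-2024-02-23": datetime(2024, 2, 23),
}

print(expired_backups(backups, now, keep_weeks=4))  # ['app-2024-01-05']
```

A backup taken exactly on the cutoff is kept, which errs on the side of retaining data.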

Details of the solution have an impact on the need for backup and recovery of the application server. In most cases, the only data that must be kept for the integrity of the solution is in the database. In many cases, artifacts (software, template files, image files, uploaded files) and data (imported and exported files) kept on the application server can be recovered from other sources, or are not needed for long-term retention.

Discuss this with your PhixFlow implementation team or PhixFlow support to determine exactly what you need to retain for the long term, and what the recovery needs are for the application server. 

It is usually important, and in some cases vital (for example, where corporate standards and/or independent compliance frameworks apply), to retain log files for a certain period of time. However, if you ingest your log files into an off-server log store, there may be no need to make provision for the backup and recovery of log files on the application server itself.

In many cases, a platform backup for the disk or server, of the type offered by many virtualisation solutions, will be sufficient, and this may not be needed as frequently as that for the database. Daily backup is typical, and not usually associated with much higher cost than a less frequent backup schedule.

Database server backups

Define the backup requirements for your RTO and RPO so that your Database Administrator (DBA) can best advise on a suitable backup strategy for your setup. For example, your requirements will determine whether you need a "hot" backup (PITR) or a "cold" backup (full backup daily).
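As a rough illustration of how the RPO drives this choice, the sketch below compares the RPO with the interval between cold backups. It assumes that a cold strategy means one full backup per day, so up to a day of changes could be lost. The function name and the threshold are assumptions for discussion with your DBA, not a PhixFlow rule.

```python
from datetime import timedelta

# Assumption: a "cold" strategy takes one full backup per day,
# so the worst-case data loss is one day of changes.
COLD_BACKUP_INTERVAL = timedelta(days=1)


def suggest_backup_type(rpo: timedelta) -> str:
    """Suggest "hot" when the tolerated data loss is shorter than the cold backup interval."""
    if rpo < timedelta(0):
        raise ValueError("RPO cannot be negative")
    return "cold" if rpo >= COLD_BACKUP_INTERVAL else "hot"


print(suggest_backup_type(timedelta(minutes=15)))  # hot
print(suggest_backup_type(timedelta(days=1)))      # cold
```

The RTO is not modelled here; a short RTO may call for a different approach even when the RPO allows daily backups.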

Predicting "undo" and "redo" space is often difficult, and observing the solution at work is the best way to measure the space required. To avoid unnecessary costs, do not over-specify the server and each partition at the outset; instead, grow these if needed once the solution has been established.

Some virtualisation platforms support RPOs much shorter than standard snapshot/backup options. For example, for running SQL Server on Azure, see: https://docs.microsoft.com/en-us/azure/backup/backup-azure-sql-database

Database backup volume planning

If you opt for database-level backups (rather than backups on a virtualisation layer), you will need to plan for additional disk storage. However, for long-term retention of these backups, you can use a lower cost disk. These sizings are only indicative, and the exact amount you need will depend on the details of your solution.

Weekly backup volume                     Small    Medium    Large
"Cold" backups (full backups only)       25 GB    100 GB    400 GB
"Hot" backups (redo backups for PITR)    50 GB    200 GB    800 GB
Total                                    75 GB    300 GB    1200 GB

General recommendations

  • If you are working on a virtualisation platform, make use of the backup options available if you can - these are often more straightforward to manage, recover from, and test recovery from, than traditional database backups.
  • If you are using snapshot technology to take backups (that is, backing up disks that are not idle), choose application-consistent snapshots/backups if possible.