This topic is for data modellers, application designers and system administrators. The pages in this topic explain how to create and run automated tasks in PhixFlow. They also explain strategies for tasks that affect streams.

Overview

When working with data, applications and IT systems, there are routine processes that you need to run. PhixFlow makes it easy for you to set up and manage these processes using tasks and task plans. For example:

  • application designers or system administrators can automatically update data in applications
  • data modellers can run analysis on models or specific streams
  • administrators can clear old information from the PhixFlow system, by archiving data from:
    • stream sets
    • processing logs
    • performance statistics.

Use the task plan properties tab to configure Task Plans. Set the options and add individual Tasks.

If you need to temporarily prevent scheduled task plans from running during system maintenance, see Preventing Task Plans From Running.

Structuring Tasks

When setting up task plans with tasks that affect streams, you must consider whether the streams have any dependencies. This affects your options when structuring tasks in task plans.

Either each task affects one stream

If there are dependencies between streams, configure PhixFlow to have one task per stream and configure each task to be mandatory. 

  1. Create a set of tasks.
    • Each task works on a single stream.
    • Set the tasks to be mandatory.
  2. Add the set of tasks to the task plan.

PhixFlow processes the tasks, and therefore the streams, in order. PhixFlow must finish a task successfully before it starts to update the next stream. This is the safest way to set up tasks that affect streams. Always use this method when there are dependencies between streams.

Example

A model contains connected streams that should be run twice daily to update a model with the latest data from an external database. A scheduled task plan contains:

  • TaskA to run stream-a
  • TaskB to run stream-b
  • TaskC to run stream-c.

PhixFlow will process TaskA until changes to stream-a are finished. As this now has the latest data, when TaskB runs, it pulls the new data into stream-b. Finally TaskC pulls new data from stream-b into stream-c.
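
The ordering described above can be sketched in Python. This is an illustrative model only, not PhixFlow's API: the `Task` class, the `run_plan` function and the `action` callables are hypothetical names. The sketch shows how a plan of mandatory tasks stops at the first failure, so a later stream never reads from a stream that failed to update. The handling of a failed non-mandatory task is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """Hypothetical stand-in for a task that updates one stream."""
    name: str
    action: Callable[[], None]  # raises an exception if the update fails
    mandatory: bool = True


def run_plan(tasks: List[Task]) -> List[str]:
    """Run tasks in listed order; return the names of tasks that completed."""
    completed = []
    for task in tasks:
        try:
            task.action()
        except Exception:
            if task.mandatory:
                break  # a failed mandatory task blocks every later task
            continue  # assumption: a failed non-mandatory task is skipped
        completed.append(task.name)
    return completed


def fail():
    raise RuntimeError("stream-b update failed")


plan = [
    Task("TaskA", lambda: None),
    Task("TaskB", fail),
    Task("TaskC", lambda: None),
]
print(run_plan(plan))
```

In this sketch TaskB fails, so TaskC never runs and stream-c is not updated from stale stream-b data.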


Or one task affects multiple streams

If there are no dependencies between streams, you can add multiple streams to the task. In this case you have no control over the order in which streams will be processed. Only use this method if you are sure there are no dependencies between the streams.

  1. Create a task.
  2. Add multiple streams to the task.
  3. Add the task to the task plan.

This has the advantage that you only need to configure one task to change multiple streams.

Example

Three streams, a, b and c, are in different models and have no data dependencies. A task plan to update the stream data contains a single task which runs analysis on stream-a, stream-b and stream-c. PhixFlow processes all the streams concurrently.
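
As a rough illustration of why this approach is only safe for independent streams, the following Python sketch processes several streams concurrently. It is not PhixFlow code: `run_analysis` and `run_task` are hypothetical names, and a thread pool is used here simply to show that results arrive in completion order rather than in the order the streams were listed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_analysis(stream: str) -> str:
    """Hypothetical placeholder for running analysis on one stream."""
    return f"{stream} updated"


def run_task(streams):
    """Process all streams of one task concurrently; order is not defined."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_analysis, s) for s in streams]
        # as_completed yields results in finish order, not submission order
        return [f.result() for f in as_completed(futures)]


results = run_task(["stream-a", "stream-b", "stream-c"])
print(sorted(results))
```

Because the finish order can differ from run to run, nothing in the task may rely on one stream being updated before another.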

Mandatory Tasks

The tasks are run in the order they are listed in the task plan.

You can specify that a task in a task plan is mandatory. This means the task must complete before the next task in the sequence can run. If a mandatory task fails, PhixFlow will not run the following tasks. You must resolve the issue preventing the task from running and then restart the task.

Note

Rerunning a task plan that previously failed means it continues that run from where it failed.

If data related to a previous task has changed, this is not reflected in the results of the rerun.

For this reason, after you have rerun a previously failed task, consider running the task again to ensure all data and processes are up to date.
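
The rerun behaviour in the note can be pictured with a short Python sketch. The function `rerun_from_failure` and its parameters are hypothetical and only model the description above: the rerun resumes at the task that failed, and tasks that completed before the failure are not run again.

```python
def rerun_from_failure(tasks, failed_index, run):
    """Resume a plan at the task that failed; earlier tasks are not re-run."""
    executed = []
    for name in tasks[failed_index:]:
        run(name)
        executed.append(name)
    return executed


tasks = ["TaskA", "TaskB", "TaskC"]
log = []
executed = rerun_from_failure(tasks, failed_index=1, run=log.append)
print(executed)  # ['TaskB', 'TaskC']
```

TaskA is skipped on the rerun, so any change to its source data since the original run is not picked up.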

Types of Task

  • Analysis Tasks
    Use an analysis task to run analysis on the stream(s) in the task.

  • Rollback Tasks
    Use a rollback task to undo the effect of running analysis on a stream. When you run a rollback task, it rolls back all data in each of the listed streams, deleting the stream sets. Afterwards, the list of stream sets is empty and there are no data records in the stream. For information about how to roll back streams manually, see Rollback Window.

  • Configuration Export Tasks
    Use a configuration export task to export selected applications and packages or a full configuration. The exported file is saved to the download area; see Using the Download Area.

  • System Tasks
    Use a system task to perform system-wide housekeeping activities. These include deleting:

    • managed file entries
    • email entries
    • temporary files created by file exporters to send by email
    • data from incomplete stream sets
    • log messages, optionally archiving them first
    • from the PhixFlow database:
      • stream views for which there is no stream
      • stream tables for which there is no stream, when they are older than the period set in System Configuration → Delete Orphaned Stream Table after Days. 
        Stream tables become orphaned when their stream is deleted or removed by an import process.
      • temporary tables created during publishing, when they are older than the period set in System Configuration → Delete Temp Tables after Days.
        Temporary tables are created by PhixFlow during the publishing process.

    Note

    The system task should be run regularly, for example daily or weekly. When the system task is not run regularly, incomplete stream sets can accumulate. If these are large, or many have accumulated, this can slow down PhixFlow's performance. Depending on your PhixFlow database, queries that have to exclude many incomplete stream sets can reach system limits. This can prevent PhixFlow and its applications from running.

    How long PhixFlow keeps system information, and whether to archive log messages before they are deleted, are controlled by parameters in System Configuration.

  • Stream-Data-Delete Tasks
    Use a stream-data-delete task to delete data records and stream sets that you no longer need in PhixFlow. How long PhixFlow keeps stream data is configured in stream properties → Data Retention Settings.
    Running a stream-data-delete task on a large number of streams, or on a stream with a large data set, can take some time. We recommend scheduling these tasks for times when the system is quiet, for example overnight.
    When streams are deleted from the repository, their names are automatically removed from tasks that refer to them. This means a saved task can become empty.

    Note

    PhixFlow only runs a stream-data-delete task on streams that have Data Retention Settings specified. If no data retention values are set, PhixFlow never deletes the data. Stream data will accumulate, leading to space and performance issues.

    See also Managing Stream Data

Preventing Task Plans From Running 

During upgrade and maintenance, you may want to run your PhixFlow instance whilst preventing task plans from running. There are two options you can use to achieve this:

Disabling Scheduled Task Plans

You can disable scheduled task plans:

  • from within PhixFlow, using the System Configuration → Advanced → Disable Scheduled Tasks option
  • without logging in to PhixFlow, by setting disableScheduledTasks=true in the server.properties file.

Warning

Only disable scheduled task plans if it is essential. Restore scheduled task plans to being able to run as soon as possible. 

Task plans do not run retrospectively when the option is cleared. You must manually start any task plans that missed their scheduled start time.

From Within PhixFlow

To disable all scheduled task plans, in system configuration tick Disable Scheduled Tasks. As long as this option is ticked, scheduled task plans will not start. Scheduled task plans that are already running are not affected.

To allow scheduled task plans to run again, untick Disable Scheduled Tasks.

Before Restarting PhixFlow

To disable all scheduled task plans:

  1. Edit the file tomcat\webapps\<phixflow installation directory>\WEB-INF\classes\server.properties 
  2. Set disableScheduledTasks=true
  3. Restart PhixFlow. 
    • Disable Scheduled Tasks is automatically ticked.
    • Scheduled task plans do not run.

Setting server.properties disableScheduledTasks=true whilst PhixFlow is running has no effect. PhixFlow must be restarted. Once PhixFlow is running, it uses Disable Scheduled Tasks to decide whether or not to run scheduled task plans. 

To allow task plans to run again:

  1. In the server.properties file, set disableScheduledTasks=false.
    You must do this because:
    • clearing the Disable Scheduled Tasks check box does not change the setting in the server.properties file
    • the next time the server restarts, scheduled task plans will run as normal.
  2. Log into PhixFlow.
  3. In system configuration, untick Disable Scheduled Tasks.
    This restarts the scheduled task plans.

Note

Setting disableScheduledTasks=false and then restarting the server does not untick Disable Scheduled Tasks.
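
The interaction between the server.properties setting and the Disable Scheduled Tasks option can be summarised in a small Python model. This is a sketch of the behaviour described in this section, not PhixFlow's implementation: the `SchedulerState` class and its method names are hypothetical.

```python
class SchedulerState:
    """Hypothetical model of how the two settings interact."""

    def __init__(self, checkbox_ticked: bool = False):
        # Persisted "Disable Scheduled Tasks" option in System Configuration
        self.checkbox_ticked = checkbox_ticked

    def restart(self, property_value: bool) -> None:
        """Apply disableScheduledTasks from server.properties at startup."""
        if property_value:
            self.checkbox_ticked = True  # true ticks the option
        # false leaves the option as it was; it never unticks it

    def scheduled_plans_run(self) -> bool:
        # Once running, only the ticked option is consulted
        return not self.checkbox_ticked


state = SchedulerState()
state.restart(property_value=True)
print(state.scheduled_plans_run())   # False: plans are disabled

state.restart(property_value=False)
print(state.scheduled_plans_run())   # False: the option is still ticked

state.checkbox_ticked = False        # untick in system configuration
print(state.scheduled_plans_run())   # True: plans run again
```

The model makes the two-step recovery explicit: the property must be set to false so that the next restart does not tick the option again, and the option must be unticked by hand so that plans resume.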