This page is for data modellers, application designers and system administrators. It explains how to create a task and the different types of task. It also explains strategies for tasks that affect streams.

Overview

When working with data, applications and IT systems, there are routine processes that you need to run. A task is a specific job, often linked to a stream. You do not run tasks directly. You must add them to a task plan; see Task Plans.

The task properties tab is not available from the repository. It opens from the Task Plans properties tab.


Creating a Task

  1. In the Tasks section of a task plan's properties, click the Add button.
  2. Select one of the task types from the menu; see Types of Task, below.
  3. PhixFlow opens a new task properties tab.
  4. Enter a name in Basic Settings → Name.
  5. Specify the order in which the task must be run. This is important if a stream depends on the data in other streams.
  6. Click the Save button.
  7. Optionally set other properties and add streams; see Properties, below.

Mandatory Tasks

The tasks are run in the configured order. 

You can specify that a task in a task plan is mandatory. This means the task must complete before the next task in the sequence can run. If a mandatory task fails, PhixFlow will not run the following tasks. You must resolve the issue preventing the task from running and then restart the task.

Note

Rerunning a task plan that previously failed means it continues that run from where it failed.

If data related to a previous task has changed, this is not reflected in the results of the rerun.

After you have rerun a previously failed task, consider running the task again to ensure that all data and processes are up to date.
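
The behaviour described above can be modelled with a short Python sketch. This is only an illustration of the rules on this page, not PhixFlow code: the `Task` class, the `run_plan` function and the `resume_from` value are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """Hypothetical stand-in for a task in a task plan."""
    name: str
    order: int
    mandatory: bool
    action: Callable[[], None]  # raises an exception to signal failure


def run_plan(tasks: List[Task], start_order: int = 1) -> dict:
    """Run tasks in ascending order, stopping after a failed mandatory task.

    Returns the names of tasks that completed, the names that failed, and
    the order number a rerun would continue from (None if nothing blocked).
    """
    completed, failed = [], []
    for task in sorted(tasks, key=lambda t: t.order):
        if task.order < start_order:
            continue  # finished in an earlier run, so not repeated on rerun
        try:
            task.action()
            completed.append(task.name)
        except Exception:
            failed.append(task.name)
            if task.mandatory:
                # Later tasks are not attempted until the issue is resolved.
                return {"completed": completed, "failed": failed,
                        "resume_from": task.order}
    return {"completed": completed, "failed": failed, "resume_from": None}
```

In this model a rerun calls `run_plan` again with `start_order` set to the returned `resume_from` value, so earlier tasks are skipped. That is why changes to data handled by an earlier task are not picked up by the rerun.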

Structuring Tasks

When setting up task plans with tasks that affect streams, you must consider whether the streams have any dependencies. This affects your options when structuring tasks in task plans.

Either each task affects one stream

If there are dependencies between streams, configure PhixFlow to have one task per stream and configure each task to be mandatory.
  1. Create a set of tasks.
  2. Each task works on a single stream.
  3. Set the tasks to be mandatory.
  4. Add the set of tasks to the task plan.

PhixFlow processes the tasks, and therefore the streams, in order. PhixFlow must finish a task successfully before it starts to update the next stream. This is the safest way to set up tasks that affect streams. Always use this method when there are dependencies between streams.

Example

A model contains connected streams that should be run twice daily to update a model with the latest data from an external database. A scheduled task plan contains:

  • TaskA to run stream-a
  • TaskB to run stream-b
  • TaskC to run stream-c.

PhixFlow will process TaskA until changes to stream-a are finished. As this now has the latest data, when TaskB runs, it pulls the new data into stream-b. Finally TaskC pulls new data from stream-b into stream-c.
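
One way to work out a safe task order for dependent streams is a topological sort. The sketch below uses Python's standard `graphlib` module. The `reads_from` map and the `plan_tasks` function are hypothetical and only mirror the example above; PhixFlow does not expose them.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each stream lists the streams it reads from.
reads_from = {
    "stream-a": set(),
    "stream-b": {"stream-a"},
    "stream-c": {"stream-b"},
}


def plan_tasks(reads_from):
    """Return one mandatory task per stream, ordered so that every stream
    is processed after the streams it depends on."""
    ordered = TopologicalSorter(reads_from).static_order()
    return [
        {"order": position, "stream": stream, "mandatory": True}
        for position, stream in enumerate(ordered, start=1)
    ]


for task in plan_tasks(reads_from):
    print(task["order"], task["stream"])
```

If the streams depended on each other in a loop, `static_order` would raise `graphlib.CycleError`, which is a useful signal that no simple task order exists.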

Or one task affects multiple streams

If there are no dependencies between streams, you can add multiple streams to the task. In this case you have no control over the order in which streams will be processed. Only use this method if you are sure there are no dependencies between the streams.

  1. Create a task.
  2. Add multiple streams to the task.
  3. Add the task to the task plan.

This has the advantage that you only need to configure one task to change multiple streams.

Example

Three streams, a, b and c, are in different models and have no data dependencies. A task plan to update the stream data contains a single task which runs analysis on stream-a, stream-b and stream-c. PhixFlow processes all the streams concurrently.
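
Before grouping streams into one task, it can help to check that none of them depends on another, directly or through intermediate streams. The following Python sketch shows one way to do that, assuming you have recorded the dependencies yourself. The `can_share_task` function and the `reads_from` map are hypothetical names for this example.

```python
def can_share_task(streams, reads_from):
    """Return True if no stream in the group depends, directly or
    indirectly, on a stream in the same group."""
    group = set(streams)
    for stream in group:
        seen = set()
        stack = list(reads_from.get(stream, ()))
        while stack:
            upstream = stack.pop()
            if upstream in seen:
                continue
            seen.add(upstream)
            if upstream in group:
                return False  # processing order would matter
            stack.extend(reads_from.get(upstream, ()))
    return True


# Hypothetical dependency map: each stream lists the streams it reads from.
reads_from = {
    "stream-a": [],
    "stream-b": ["stream-x"],
    "stream-x": ["stream-a"],
    "stream-c": [],
}

print(can_share_task(["stream-a", "stream-c"], reads_from))
print(can_share_task(["stream-a", "stream-b"], reads_from))
```

If the check returns False for a group, use one mandatory task per stream instead.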

Types of Task

Analysis Tasks

Use an analysis task to run analysis on the stream(s) in the task. 

Rollback Tasks

Use a rollback task to undo the effects of running analysis on a stream. When you run a rollback task, it rolls back all data in each of the listed streams, deleting the stream sets. Afterwards, the list of stream sets is empty and there are no data records in the stream. For information about how to roll back streams manually, see Rollback.

Archive Tasks

Use an archive task to delete or archive stream data or stream sets that you no longer need in PhixFlow. Archived data is saved into a zip file. You can reload archived data into PhixFlow using Restore Archive.

Whether PhixFlow deletes or archives stream sets depends on how the archive settings in the stream are configured.

Note

PhixFlow can only run an archive task on streams that have Archive Settings specified. If a stream does not have any archive settings, PhixFlow never deletes or archives the data. Stream sets will accumulate, leading to space and performance issues.

Running an archive task on a large number of streams, or on a stream with a large data set, can take some time. We recommend scheduling archiving tasks for times when the system is quiet, for example overnight.

Archiving Specific Streams

You can create an archiving task for a specific stream or streams. PhixFlow creates a single zip file containing the specified streams.  For example, you might want to create a single archive for:

  • data from several related streams
  • a stream that contains a large amount of data  
  • a stream that you want to be able to restore from the backup. This is easier when the data is in its own zip.

Archiving Any Stream

You can create a single archive task to delete or archive the data in streams:

  • that are not in another archive task
  • and have archiving settings configured.

To do this, select the Archive All option.

When you run the task, PhixFlow:

  • deletes stream sets that are older than the specified archive settings in the stream
  • creates a zip archive of all the stream sets within the specified archive settings in the stream.
    This is a single archive containing all the data from all the streams processed.

Archive tasks must have:

  • either at least one stream added
  • or have the Archive All option ticked.
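
The two rules above, which streams an Archive All task covers and when an archive task can be saved, can be expressed as a short Python sketch. It is a model of the rules as described on this page, not PhixFlow's implementation. `ArchiveTask`, `is_valid` and `archive_all_targets` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class ArchiveTask:
    """Hypothetical stand-in for an archive task."""
    name: str
    archive_all: bool = False
    streams: List[str] = field(default_factory=list)


def is_valid(task: ArchiveTask) -> bool:
    """An archive task needs at least one stream or Archive All ticked."""
    return task.archive_all or bool(task.streams)


def archive_all_targets(tasks: Iterable[ArchiveTask],
                        has_archive_settings: Dict[str, bool]) -> Set[str]:
    """Streams handled by an Archive All task: those with archive settings
    configured that are not listed in any archive task."""
    listed = {stream for task in tasks for stream in task.streams}
    return {
        stream
        for stream, configured in has_archive_settings.items()
        if configured and stream not in listed
    }


tasks = [
    ArchiveTask("Archive billing", streams=["billing"]),
    ArchiveTask("Archive everything else", archive_all=True),
]
settings = {"billing": True, "usage": True, "scratch": False}
print(sorted(archive_all_targets(tasks, settings)))
```

In this model, a stream without archive settings, such as `scratch`, is never picked up, which matches the warning that its data will accumulate.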

If streams are deleted from the repository, their names will automatically be removed from tasks that refer to them. This means a saved task can become empty and redundant. PhixFlow will no longer run the task.

Finding the Archive file

When a stream is archived, PhixFlow logs information in the system console → Archive Log. This includes information about the data archived and any archive file created. This is a single archive containing all the data from the streams processed. Optionally, the archived data can be saved to a file. If this is done, the data can be reloaded into PhixFlow using Restore Archive.

Warning

For data in a stream to be deleted or archived, the stream must have Archive Settings applied. Streams that do not have any Archive Settings will never have their data deleted or archived. This can cause large amounts of data to build up in PhixFlow, affecting performance. Remember to configure archive settings for your streams.

System Tasks

Use a system task to perform system-wide housekeeping activities. These include deleting:

  • managed file entries
  • email entries
  • temporary files created by file exporters to send by email
  • data from incomplete stream sets
  • log messages, optionally archiving them first.

Note

The system task should be run daily, or at a minimum every week. If incomplete stream sets are not deleted, they can slow down PhixFlow's performance. Depending on your PhixFlow database, queries that have to exclude many incomplete stream sets can reach system limits. This can prevent PhixFlow and its applications from running.

The archiving periods, and whether to archive log messages before they are deleted, are controlled by the following parameters in System Configuration:

Configuration Export Tasks

Use a configuration export task to export the configuration of selected items or the full configuration. The exported file will be available in the File Download Area for users in the groups listed in the User Groups section. The File Settings section allows you to specify the name and description for the exported configuration file. The description will appear in the File Download Area and also in the header of the configuration export file.

Task Properties


Basic Settings

Name

Enter the name of the task.

Mandatory

Tick to specify that this task must succeed before attempting the next task in the task plan.

Untick to specify that, even if this task fails, the next task in the task plan can run.

Order

Specify the order in which the task runs in the task plan. For example, if there are a total of 4 tasks in the task plan, and you want this task to run third, enter 3.

Task Type

PhixFlow displays the type of this task, which can be:

  • Analysis Task
  • Archive Task
  • Rollback Task
  • System Task

See Types of Task for details.

Tip

You select the task type when you first add the task. This field is visible after the task is saved for the first time. You cannot change the task type.


Archive All

Available when Task Type is Archive Task.

Tick to delete or archive data according to a stream's Archive Settings. This applies to all streams that are not listed in any archive task.

Untick to apply this task to one or more streams. Add the streams in the Streams section.

To save an archive task, it must:

  • either have this option ticked
  • or have at least one stream.

Streams

This section is only available for analysis tasks and archive tasks. Use this section to specify the streams that the task affects. This section has:

  • a toolbar with standard buttons
  • a grid that shows the list of streams assigned to the task.

Tip

When you are creating a new task, remember to name and save the task properties so that the Streams section toolbar becomes available.

There are different ways to add streams to the list.

  • In the section toolbar, click the button that shows the list of streams. PhixFlow shows all the streams in the repository. Drag a stream into the grid.
  • From a model, hover your mouse pointer over the stream to display the pop-up toolbar. Click the drag icon and drag it into the grid.
  • From the repository, display the list of the streams in the repository. Drag a stream from the list into the grid.
  • From a model, double-click on a stream to open its properties. From the stream properties, drag the stream icon from the top left of the properties tab into the grid.

If you add multiple streams to the task, PhixFlow will process the streams in any order when the task runs. Only use this method if you are sure there are no dependencies between the streams.

