Analysis Models for Batch Processing Data

Analysis Models Overview

Analysis models are the graphical representation of the data processing that PhixFlow carries out. Designed to handle large data sets, analysis models define how data is imported, transformed, viewed, analysed and exported. Each task is performed by a modelling object. For example, a Datasource is an object that allows PhixFlow to connect to an external database, and a Table allows you to write functions to manipulate data, such as calculating a value.

Running Analysis

Analysis models can be run in one of the following ways:

  1. Run Analysis: An ad-hoc request from a model, initiated by clicking Run Analysis in a table's popup toolbar.
  2. Scheduling a task: Initiated by a Task Plan.
  3. Triggered by an actionflow: Actionflows can call an analysis model.

As PhixFlow runs the analysis model, the steps are recorded in the System Console. Open it at any time by clicking Administration, then System Console.

Analysis Model Window Layout

  1. The Toolbar provides screen-specific options, such as displaying existing tables which can be added to the canvas and the option to create new tables or file collectors.
  2. The Repository and Property Editors are displayed to the right of the canvas. They provide navigation to locate existing items, the ability to create new items and access to the options for each editable item.
  3. The Canvas area is where the Analysis Model is configured. New items, such as Tables and File Collectors, are added to the canvas by dragging the associated toolbar icon; existing items are added by dragging them from the repository onto the canvas.
  4. Modelling Objects, such as Tables and Datasources, appear on the canvas and are connected by pipes. Pipes perform different roles, such as allowing data to flow through (solid lines) or performing lookups (dashed lines) to retrieve additional information, such as looking up a product code and returning its name.

Importing Data

PhixFlow supports a range of methods for importing data including files, emails, databases and APIs. See 2. Importing Data.

Transforming Data

Transforming and enriching data is at the heart of analysis modelling, from looking up reference information to performing fuzzy logic deduplication of customers. Many options and strategies are available; start with Transforming Data.

Transformations are performed using a combination of modelling objects, such as tables, and Functions within those objects, such as replaceAll.
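As a rough illustration of what a replaceAll-style transformation does, the following Python sketch applies a pattern replacement to one field of every record. It is an analogy only: it uses Python's re.sub rather than PhixFlow's own expression syntax, and the record layout and the normalise_phone name are hypothetical.

```python
import re


def normalise_phone(records):
    """Return copies of the records with non-digit characters removed from 'phone'."""
    cleaned = []
    for record in records:
        updated = dict(record)  # copy so the input rows are left untouched
        # Replace every run of non-digit characters with an empty string.
        updated["phone"] = re.sub(r"\D+", "", record["phone"])
        cleaned.append(updated)
    return cleaned


# Hypothetical input rows, standing in for the records held in a table.
rows = [
    {"id": 1, "phone": "(020) 555-0100"},
    {"id": 2, "phone": "020 555 0101"},
]
result = normalise_phone(rows)
```

The transformation is applied to the whole set of rows in one pass, which mirrors how a table processes its input data as a batch.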

Enriching data covers tasks such as structuring unstructured address data into individual address lines, or deriving additional information, such as asset categories based on information found in a description. See Enriching Data.

Reconciliation can be performed in analysis models, from simple master data checks, which provide details of records processed vs records output with supporting information, through to transactional reconciliation, where calculations are performed on data to ensure the results of the processed data match the expected result. Reconciliation is particularly useful for data migrations to validate the accuracy of the data moved. See Reconciliation.
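The two kinds of check described above can be sketched in Python as follows. This is a minimal illustration, not PhixFlow configuration: the reconcile function, the field names and the sample rows are all assumptions made for the example. It compares record counts (a simple master data check) and compares amount totals (a transactional check).

```python
from decimal import Decimal


def reconcile(source_rows, output_rows, amount_field="amount"):
    """Compare record counts and amount totals between source and output rows."""
    source_total = sum((Decimal(r[amount_field]) for r in source_rows), Decimal("0"))
    output_total = sum((Decimal(r[amount_field]) for r in output_rows), Decimal("0"))
    return {
        "records_in": len(source_rows),
        "records_out": len(output_rows),
        "count_matches": len(source_rows) == len(output_rows),
        # A non-zero difference means the output does not match the expected result.
        "total_difference": source_total - output_total,
    }


# Hypothetical data: one record was lost during a migration.
source = [
    {"id": 1, "amount": "10.50"},
    {"id": 2, "amount": "4.25"},
    {"id": 3, "amount": "1.00"},
]
migrated = [
    {"id": 1, "amount": "10.50"},
    {"id": 2, "amount": "4.25"},
]
report = reconcile(source, migrated)
```

Decimal is used instead of float so that monetary totals compare exactly.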

Lookups read data from other tables in order to enrich the current one. For example, a lookup could check the status of an order. For more information on looking up data from other tables, see Enriching Data → Lookup Information.
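The order status example can be sketched in Python as below. This is only an illustration of the lookup idea, assuming two in-memory lists of rows; the enrich_with_status function and the field names are hypothetical and do not represent how a lookup pipe is configured in PhixFlow.

```python
def enrich_with_status(order_lines, orders):
    """Add an 'order_status' field to each order line by looking up its order."""
    # Index the lookup table once by its key so each lookup is a dictionary read.
    status_by_order = {order["order_id"]: order["status"] for order in orders}
    enriched = []
    for line in order_lines:
        row = dict(line)  # copy so the input rows are left untouched
        # .get returns None when the order is not found in the lookup table.
        row["order_status"] = status_by_order.get(line["order_id"])
        enriched.append(row)
    return enriched


# Hypothetical data: the second line refers to an order that does not exist.
order_lines = [
    {"line": 1, "order_id": "A100"},
    {"line": 2, "order_id": "A999"},
]
orders = [{"order_id": "A100", "status": "Shipped"}]
enriched = enrich_with_status(order_lines, orders)
```

Building the index once, rather than scanning the orders for every line, is the same idea as the in-memory lookups mentioned under Performance.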


Performance

Performance is key when handling large data sets, so PhixFlow provides a number of features to help, including caching data, memory lookups and indexing. See Performance and Tuning.

Modelling Objects

Modelling Objects, such as Tables and Datasources, appear on the canvas and are connected by pipes.

Pipes perform different roles, such as allowing data to flow through (solid lines) or performing lookups (dashed lines) to retrieve additional information, such as looking up a product code and returning its name.

See Analysis Properties.

Candidate Sets

Candidate sets are a fundamental concept of function calculation in PhixFlow.

Every time a function calculation is carried out, all the required input data is brought together and organised into sets of data, with one set for each Key group.

The Key groups are worked out using the Pipe Grouping Attributes defined on the input Pipe for each table.
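The grouping idea can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not a description of PhixFlow internals: the build_candidate_sets function, the pipe names and the attribute names are hypothetical. Each input pipe declares its grouping attributes, and records from all pipes that share the same key values end up in the same candidate set.

```python
from collections import defaultdict


def build_candidate_sets(inputs, grouping):
    """Group records from every input pipe into one candidate set per key.

    inputs:   {pipe_name: [record, ...]}
    grouping: {pipe_name: [attribute_name, ...]} (the grouping attributes per pipe)
    """
    # Each candidate set holds a (possibly empty) list of records for every pipe.
    candidate_sets = defaultdict(lambda: {name: [] for name in inputs})
    for pipe_name, records in inputs.items():
        for record in records:
            key = tuple(record[attr] for attr in grouping[pipe_name])
            candidate_sets[key][pipe_name].append(record)
    return dict(candidate_sets)


# Hypothetical pipes: customers keyed on customer_id, invoices keyed on cust.
inputs = {
    "customers": [
        {"customer_id": "C1", "name": "Acme"},
        {"customer_id": "C2", "name": "Birch"},
    ],
    "invoices": [
        {"cust": "C1", "total": 100},
        {"cust": "C1", "total": 40},
        {"cust": "C3", "total": 7},
    ],
}
grouping = {"customers": ["customer_id"], "invoices": ["cust"]}
sets = build_candidate_sets(inputs, grouping)
```

In this sketch a key that appears on only one pipe still produces a candidate set, with an empty list for the other pipe.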

Recordsets

Each time you run analysis on a model, PhixFlow creates a new set of data in each table; see Table. A recordset is a collection of data within a table for a given period. Recordsets, and the data they contain, remain in the table until you archive the data (see Task) or manually remove it (see Rollback Recordsets).

Within analysis models, each modelling object processes all of its data before processing moves on to the next modelling object. This is in contrast to actionflows, which process each record in its entirety before moving to the next one.
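The difference in processing order can be sketched in Python as below. Both functions are hypothetical illustrations rather than PhixFlow code: they produce the same results, but the trace shows that the first finishes a stage for every record before starting the next stage, while the second finishes every stage for one record before starting the next record.

```python
def run_stage_by_stage(records, stages):
    """Analysis-model style: every record passes a stage before the next stage starts."""
    trace = []
    for stage in stages:
        next_records = []
        for index, record in enumerate(records):
            trace.append((stage.__name__, index))
            next_records.append(stage(record))
        records = next_records
    return records, trace


def run_record_by_record(records, stages):
    """Actionflow style: each record passes through every stage before the next record."""
    trace = []
    results = []
    for index, record in enumerate(records):
        for stage in stages:
            trace.append((stage.__name__, index))
            record = stage(record)
        results.append(record)
    return results, trace


# Two hypothetical processing stages.
def double(value):
    return value * 2


def add_one(value):
    return value + 1


batch_result, batch_trace = run_stage_by_stage([1, 2], [double, add_one])
row_result, row_trace = run_record_by_record([1, 2], [double, add_one])
```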

Periods

The time period of a table determines how data in the table will be handled. The period is typically set to:

  1. Transactional: allows multiple users to run independent analysis tasks at the same time.
  2. Variable: generates or collects data from the most recent run of the table up to the current date.

Data Range

The data range determines which recordsets are displayed:

  • Latest displays the records from the most recent recordset only.
  • All displays the records from every recordset.
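The two settings can be sketched in Python as a filter over a table's recordsets. This is a minimal illustration, assuming each recordset is a dictionary with a creation date and a list of records; the select_records function and that layout are hypothetical and are not PhixFlow's storage model.

```python
from datetime import date


def select_records(recordsets, data_range):
    """Return the records to display for a data range of 'Latest' or 'All'."""
    if not recordsets:
        return []
    if data_range == "Latest":
        # Only the most recently created recordset contributes records.
        latest = max(recordsets, key=lambda rs: rs["created"])
        return list(latest["records"])
    if data_range == "All":
        # Every recordset contributes records, oldest first.
        ordered = sorted(recordsets, key=lambda rs: rs["created"])
        return [record for rs in ordered for record in rs["records"]]
    raise ValueError(f"Unknown data range: {data_range!r}")


# Hypothetical table contents after two analysis runs.
recordsets = [
    {"created": date(2024, 1, 1), "records": ["a", "b"]},
    {"created": date(2024, 2, 1), "records": ["c"]},
]
latest_records = select_records(recordsets, "Latest")
all_records = select_records(recordsets, "All")
```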

Rolling back

To remove data, the recordset must be rolled back. See Rollback Recordsets.

Exports

PhixFlow supports a range of methods for exporting data, including files, direct writes to databases and APIs. Details of each can be found in Exporting Data from an Analysis Model.

Scheduling

When working with data, applications and IT systems, there are routine processes that you need to run on a schedule. PhixFlow makes it easy for you to set up and manage these processes using Task Plans, to which you add tasks. See Task Plans.

Example Analysis Model

The example analysis model takes business data from multiple locations and merges it into a single refined data set.

What Next

The PhixFlow Fundamentals course provides a practical guide to using PhixFlow, including analysing and transforming data using Analysis Models.

Already started PhixFlow Fundamentals?

Return to Analysis Fundamentals
