- Created by Former user, last modified by Fiona Sargeant (Unlicensed) on Mar 17, 2022
You are viewing an old version of this page. View the current version.
Compare with Current View Page History
« Previous Version 44 Next »
This page is for data modellers. It explains the properties for database collectors.
Overview
A database collector reads data from a Datasource. It holds the SQL that will be sent to the datasource to retrieve data. The query columns must match up to any tables saving the data.
To add a new database collector to an analysis model:
- Go to the model's toolbar → Create group.
- Click Database to expand the menu.
- Drag a Database Collector onto the analysis model.
To add an existing database collector to an analysis model, in the model diagram toolbar:
- Go to the model toolbar → List group.
- Click Database to expand the menu.
- Click Database Collector to open the list of available database collectors.
- Drag a database collector into the analysis model.
Properties Tab
Parent Details
If this item is within or belongs to another, its parent name is shown here. See the Parent Details section on the Common Properties page for more details.
Basic Settings
Field | Description |
---|---|
Name | Enter the name of the database collector. |
Datasource | Select the datasource that this database collector will collect from. To select from a list, click Show the List of Datasources. |
Statement Expression | Enter an SQL query string, which can include PhixFlow Expressions. The database collector processes the query to retrieve data from the database; see Statement Expressions, below, for details. |
Allow Non-Scheduled Collection | Tick to allow this database collector to be run as part of any ad-hoc analysis run that requires this data. Untick to restrict this database collector to being run as part of a scheduled Task Plan. |
Enabled | Tick when the configuration is complete and the database collector is ready to be used. |
Analysis Models
If this item is used by an analysis model, its name is listed here. See the Common Properties page, Analysis Model section for more details.
Advanced
Field | Description |
---|---|
Timeout | Specify a time period in seconds, after which a query will timeout. |
Datasource Instance Expression | The datasource to which this database collector is connected may list multiple database instances from which the data may be read. Each Datasource Instance is identified by a unique string. The database collector needs to know which specific datasource instance to use. Enter an expression that evaluates to a plain text string that corresponds to one of the unique datasource instances. If this expression is blank then the database collector assumes that there is only one instance to use. If there is more than one instance the database collector will be unable to determine which instance to use and PhixFlow will report an error. |
Log Traffic | You can set this option when system logging → Allow Logging is ticked; see System Logging Configuration.
To change this property, you must have the Modify System Logging Configuration privilege. Untick to prevent logging. Ticking this box has no effect when the system logging option is ticked. Tick to log details of communication. Ticking this box has no effect when the system configuration option Allow Logging is not ticked. |
Description
We recommend that you always enter a description to explain the purpose of this item.
Audit Tab
Audit Summary
See the Common Properties page, Audit Summary section.
Statement Expressions
Testing Expressions
When the database collector is enabled, the properties menu at has an option Run Query and See Result. If you cannot see the option, click on More Options to expand the full menu.
Use Run Query and See Result to test the SQL query string in the expression. PhixFlow runs the SQL and opens a table showing the first page of returned data. You can Run Query even if the database collector has not been saved.
Query Strings
The query string must return data in fields with the same name as the attributes you wish to populate. So to collect data into a table with the attributes account_number, cust_ref and eventsource from a table with fields account_number, customer_ref and eventsource, use the query string:
select account_number, customer_ref as cust_ref, eventsource from billing_products where from_dat >= to_date({_fromDate}, 'yyyymmdd') and to_dat <= to_date({_toDate}, 'yyyymmdd')
Notice that the query string directly includes PhixFlow Expressions that calculate values.
Expressions must be
enclosed in curly brackets, starting with a {
and ending with }
take the form of a standard PhixFlow Expression. Expressions can include PhixFlow's Internal Variables. In the example above, {_fromDate}
and {_toDate}
are expressions that use internal variables to reference the table's period.
If the dates include a time element, use the query string:
select account_number, customer_ref as cust_ref, eventsource from billing_products where from_dat >= to_date({_fromDate}, 'yyyymmdd.hh24miss') and to_dat <= to_date({_toDate}, 'yyyymmdd.hh24miss')
Do not include the standard sqlplus terminator. (For example, in Oracle, the terminator is a semi-colon).
Complex Replacement Expressions
You can use more complex PhixFlow expressions in queries where necessary. To do this, precede the expression (inside curly braces) with an = sign. For example
select account_number, customer_ref from account{=accBlockString(_out.BlockID)}
There are some disadvantages when using = to precede an expression, as noted below, and therefore you should only use this when really necessary. For example, where the table or column names can vary, and are controlled by input data to the query - as in the above example, where part of the name of the table being queried is returned by a Macro "accBlockString", based on an input value "_out.BlockID".
Note: when you use the = sign to precede an expression, this may result in slower performance. If you are using this to apply a data transformation within the query, it is better to apply this transformation to the input data before you pass it into the query. This is because without the = sign PhixFlow can make use of more efficient query techniques (such as using bind variables) to optimise performance of the queries.
Note: when you use the = sign to precede an expression, PhixFlow does not automatically map PhixFlow types to database types. This means, for example, that if you pass a String PhixFlow type into a query without = you need only
{_out.CustName}
but with = you need to format the string in the query as required by the syntax of the source database, for example
{="'" + _out.CustName + "'"}
Similarly dates in expressions preceded by = must be formatted as required by the syntax of the source database.
Merging Directly from a Database Collector
If you are collecting data directly into a merge, your query must match the ordering represented by the grouping in the input pipe from the collector into the merge. For example, if you have specified Customer and Product as the grouping attributes on the pipe connecting the collector to the table then you must include the appropriate order by expression in the SQL to ensure the data is sorted by customer and then product.
Generate a Tablefrom a Query
From the hover menu from a database collector on a Analysis Model Window and Toolbars, clicking creates a new Table with the attributes implied by the query. If the Datasource Instance Expression has been entered then the expression will be used to select the datasource, otherwise the first datasource instance alphabetically will be used. The new table will be shown on the model, connected to this database collector.
To use this feature, the supplied query must be pure SQL i.e. should not contain embedded expressions (as described above).
PhixFlow wraps the query with a "where 1= 0" to avoid a long running statement.
Time Data
PhixFlow recognises date and date-time formats, but does not recognise time-only formats, for example HH:mm:ss. When a database collector loads time data, it applies the string data-type. This applies to data from MariaDB and SQL Server databases as there is no TIME data-type in Oracle DB.
Learn More
For links to all pages in this topic, see Analysis Models for Batch Processing Data.
Terminology changes in progress
As part of the redesign of PhixFlow, we are changing the following terms:
dashboard → screen
stream → table
stream attributes → attributes
stream item → record
stream set → recordset
stream view → view
stream item action → record-action
stream action → table-action
driver class → database driver
- No labels