...

Field: Description
Minimum Files: Specifies the minimum number of files expected to be found whenever the collector runs. If fewer files are found, this is treated as an error.
Maximum Files: Specifies the maximum number of files that will be processed whenever the collector runs.
Max Records Per File: Specifies the maximum number of records that will be read from each file processed.
Errors Before Rollback: The maximum number of errors PhixFlow will permit while processing a file. Once this number has been exceeded, PhixFlow abandons the attempted file load.
Parallel Readers: The number of files to process in parallel. If blank, this defaults to 1.

If the collector is configured to read files in sequence, this field is ignored and a single file reader is used.

Unreadable Directories: The action to take on finding an unreadable directory when searching a directory hierarchy for files to import.
  • Error: unreadable directories will be reported, and if any are found, the file search will fail.
  • Warning: unreadable directories will be reported, but the file search will continue unaffected.
  • Ignore: unreadable directories will be silently ignored.
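
The three options can be pictured with a small Python sketch. This is only an illustration of the behaviour described above, not PhixFlow's implementation; `UnreadablePolicy`, `find_files` and the `is_readable` parameter are hypothetical names introduced for the example.

```python
import os
from enum import Enum


class UnreadablePolicy(Enum):
    """Hypothetical stand-in for the Unreadable Directories setting."""
    ERROR = "Error"
    WARNING = "Warning"
    IGNORE = "Ignore"


def find_files(root, policy, is_readable=lambda path: os.access(path, os.R_OK)):
    """Search root for files and return (files, reported_directories).

    Unreadable directories are skipped. They are collected in the report
    unless the policy is IGNORE, and the search fails at the end if the
    policy is ERROR and at least one was reported.
    """
    files, reported = [], []
    stack = [root]
    while stack:
        directory = stack.pop()
        if not is_readable(directory):
            if policy is not UnreadablePolicy.IGNORE:
                reported.append(directory)
            continue
        for entry in sorted(os.listdir(directory)):
            path = os.path.join(directory, entry)
            if os.path.isdir(path):
                stack.append(path)
            else:
                files.append(path)
    if policy is UnreadablePolicy.ERROR and reported:
        raise PermissionError(f"unreadable directories: {reported}")
    return sorted(files), reported
```

With Warning, the caller still gets the files from the readable directories along with the list of directories that were skipped; with Error, the same list is reported but no files are returned.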
Character Set: The character encoding to be used.
Select a value from the drop-down list. If Other is selected, a new box opens in which a character set can be entered. A full list of available character sets can be found here (Canonical Names from both columns can be used).
Column Separator: Select a value from the drop-down list. If Other is selected, a new box opens in which a column separator can be entered.
Separator Character: This field is only available if Column Separator = Other. Allows a custom column separator to be entered.
Quote Style: Select a value from the drop-down list. If Other is selected, a new box opens in which a quote character can be entered.
Quote Character: This field is only available if Quote Style = Other. Allows a custom quote character to be entered.
Ignore Missing Columns: This field is only available if File Type = Comma Separated Values.

If this flag is set then PhixFlow will not throw an error if the record being read contains fewer columns than expected. If this flag is not set then an error will be reported if there are too few columns.

Ignore Extra Columns: This field is only available if File Type = Comma Separated Values.

If this flag is set then PhixFlow will not throw an error if the record being read contains more columns than expected. If this flag is not set then an error will be reported if there are too many columns.
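
As a rough illustration of how the two flags interact, the following Python sketch checks a parsed record against the expected column count. The function name `check_columns` is hypothetical, and what happens to a tolerated record (padding missing columns with empty values, dropping extra ones) is an assumption made for the example; the text above only states whether an error is reported.

```python
def check_columns(record, expected, ignore_missing=False, ignore_extra=False):
    """Return the record fitted to `expected` columns, or raise ValueError.

    Assumption for this sketch: tolerated short records are padded with
    None and tolerated long records are truncated.
    """
    if len(record) < expected:
        if not ignore_missing:
            raise ValueError(
                f"too few columns: expected {expected}, found {len(record)}")
        return record + [None] * (expected - len(record))
    if len(record) > expected:
        if not ignore_extra:
            raise ValueError(
                f"too many columns: expected {expected}, found {len(record)}")
        return record[:expected]
    return record
```

The flags are independent: setting Ignore Missing Columns does not make a record with too many columns acceptable, and vice versa.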

Import Rows Matching: An expression that must resolve to a Regular Expression. PhixFlow will attempt to match each line in the file against the expression, and only lines that match will be imported.
Replace Text Matching: The fields Replace Text Matching and With are both expressions that must resolve to Regular Expressions. In each imported line, all occurrences of the text matched by Replace Text Matching are replaced with With.
With: See the description of Replace Text Matching above.
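
The effect of these three fields can be sketched in Python as a filter followed by a substitution. This is an illustration under stated assumptions, not PhixFlow's code: it uses Python's `re` syntax (PhixFlow's regular expression dialect may differ), it treats a line as matching if the pattern is found anywhere in it, and it applies the filter before the replacement. The function and parameter names are hypothetical.

```python
import re


def preprocess_lines(lines, import_matching=None,
                     replace_matching=None, replace_with=""):
    """Yield the lines to import after filtering and replacement.

    Assumptions: a line matches if the pattern is found anywhere in it,
    and filtering happens before replacement.
    """
    keep = re.compile(import_matching) if import_matching else None
    replace = re.compile(replace_matching) if replace_matching else None
    for line in lines:
        if keep is not None and keep.search(line) is None:
            continue  # line does not match Import Rows Matching
        if replace is not None:
            # replace every occurrence, as described for Replace Text Matching
            line = replace.sub(replace_with, line)
        yield line


lines = ["D,2024-01-01,10", "H,header", "D,2024-01-02,20"]
result = list(preprocess_lines(lines, r"^D,", r"-", "/"))
```

Here only the lines starting with `D,` are kept, and every `-` in them is replaced with `/`.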
Excel Data Expression: This field is only available if File Type = Excel Spreadsheet.

This field is an expression that must evaluate to a range, or a list of ranges, in the format "WorksheetName!TopLeftCell:BottomRightCell".

If this field is left blank, PhixFlow will read the first worksheet it finds in the Excel file (even if this is a hidden sheet) with a range covering the whole sheet.

E.g. if just a single range is needed: "DailyCallsSheet!A1:G100"

E.g. if a list of ranges is required: ["DailyCallsSheet!A1:G100", "A1:B20", "Calls!A1:C100"]

Because this field is an expression, the resulting list can be generated with any valid PhixFlow expression. You can also use the internal variable _worksheets, which gives the list of worksheets that PhixFlow found in the current Excel file. Any value that is not a valid PhixFlow expression (strings are valid PhixFlow expressions) will cause the file collector to fail. The examples below show how to populate this field to select various worksheets and cell ranges.

Remember that in all cases PhixFlow will only read the columns that have been defined in the File Columns tab.

Examples

  • All rows and columns in the default 1st worksheet:- leave this field empty, as this is the default behaviour
  • Specified columns only on the default 1st worksheet:- "A:C"
  • Specified cell range only on the default 1st worksheet:- "B1:G10"
  • Specified cell range on a specified worksheet:- "DailyCallsSheet!A2:F20"
  • List of specified cell ranges on multiple worksheets:- ["DailyCallsSheet!A2:F20", "Calls!A1:C400", "Accounts!A5:F50"]
 
  • Examine the list of worksheets that have been found and specify ranges for only certain sheets, if they exist:
do (
    $rangeList = [],
    forEach($sheet, _worksheets,
        if ( listContains(["sheetA", "sheetB"], $sheet),
            addElement($rangeList, $sheet + "!A1:B10")
        )
    ),
    $rangeList
)

This expression evaluates to a list of at most two ranges. If both worksheets sheetA and sheetB exist, it is equivalent to ["sheetA!A1:B10", "sheetB!A1:B10"]. Crucially, if a sheet is not found, its range is not included. This is important for error handling: if you specify a range that is not in the Excel file, PhixFlow will report an error. So if you are not sure that a worksheet will always be present, write an expression like this to check, and only specify the range when the sheet is found.
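
For readers more familiar with general-purpose languages, the same logic can be written as a short Python sketch. It is not PhixFlow syntax; the `worksheets` argument stands in for the _worksheets internal variable, and the function name is hypothetical.

```python
def ranges_for_existing_sheets(worksheets, wanted=("sheetA", "sheetB"),
                               cells="A1:B10"):
    """Build "Sheet!Range" strings only for wanted sheets that exist.

    `worksheets` plays the role of _worksheets: the sheet names actually
    found in the file, in the order they were found.
    """
    return [f"{sheet}!{cells}" for sheet in worksheets if sheet in wanted]
```

For a file containing the sheets Summary and sheetA, this returns only the sheetA range, so no range is requested for the missing sheetB.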

Related internal variables

See notes for the internal variables _worksheet and _range below. These can be used in stream attribute expressions to record the source worksheet and range for data you have loaded into PhixFlow.

Constraints

Note that if a worksheet is specified, then the full cell range must also be specified. Hence it is not possible to select a 'worksheet only' or 'columns only for a specified worksheet', e.g. DailyCallsSheet or DailyCallsSheet!A:C are not supported.
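
The constraint can be made concrete with a small Python check that accepts the forms shown in the examples and rejects the two unsupported ones. This is a sketch based only on the examples in this section; the patterns and the name `is_supported_range` are assumptions, and PhixFlow's own validation may accept or reject other forms.

```python
import re

CELL = r"[A-Za-z]{1,3}[0-9]+"   # e.g. A1, G100
COLUMN = r"[A-Za-z]{1,3}"       # e.g. A, C

# A named worksheet must be followed by a full cell range.
_SHEET_RANGE = re.compile(rf"^(?P<sheet>[^!]+)!(?P<cells>{CELL}:{CELL})$")
# Without a worksheet, a cell range or a column range is allowed.
_DEFAULT_RANGE = re.compile(rf"^(?:{CELL}:{CELL}|{COLUMN}:{COLUMN})$")


def is_supported_range(text):
    """Return True if text looks like one of the supported range forms."""
    return bool(_SHEET_RANGE.match(text) or _DEFAULT_RANGE.match(text))
```

Under these assumptions, "A:C", "B1:G10" and "DailyCallsSheet!A2:F20" pass, while "DailyCallsSheet" and "DailyCallsSheet!A:C" do not.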

Ignore Undefined Values: This checkbox is only available if File Type = Excel Spreadsheet.

Tick this checkbox if all unsupported Excel values, such as #N/A, #REF!, #DIV/0! etc., should be ignored and replaced with null values during processing. In this case a single warning message is displayed once processing has completed, stating the number of unsupported cell values found and giving a detailed message about the first one.

If this checkbox is unticked, each unsupported Excel value is reported as an individual warning/error in the console, and processing is terminated if the maximum number of errors/warnings is exceeded.
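
The difference between the two modes can be sketched in Python as follows. This is an illustration only: the set of error strings, the message wording, the `max_errors` parameter and the choice to store None for an unsupported value in both modes are assumptions for the example, not documented PhixFlow behaviour.

```python
# Assumed set of Excel error values for this sketch.
EXCEL_ERRORS = {"#N/A", "#REF!", "#DIV/0!", "#VALUE!", "#NAME?", "#NUM!", "#NULL!"}


def clean_cells(rows, ignore_undefined=True, max_errors=10):
    """Return (cleaned_rows, messages) for a grid of cell values.

    Ticked (ignore_undefined=True): one summary message at the end.
    Unticked: one message per unsupported value, and processing stops
    once more than max_errors messages have been produced.
    """
    cleaned, found = [], []
    for r, row in enumerate(rows, start=1):
        out = []
        for c, value in enumerate(row, start=1):
            if isinstance(value, str) and value in EXCEL_ERRORS:
                found.append(f"row {r}, column {c}: {value}")
                if not ignore_undefined and len(found) > max_errors:
                    raise RuntimeError("maximum number of errors exceeded")
                value = None  # assumption: stored as a null value
            out.append(value)
        cleaned.append(out)
    if ignore_undefined:
        messages = ([f"{len(found)} unsupported values; first at {found[0]}"]
                    if found else [])
    else:
        messages = found
    return cleaned, messages
```

In the ticked mode the caller receives a single summary regardless of how many unsupported values were found, which is what keeps a large spreadsheet with many #N/A cells from hitting the error limit.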

XPath Expression: This field is only available if File Type = XML File or HTML File.

This field should be populated according to valid XPath syntax. Please see XPath Examples for how to use XPath expressions and how the returned data can be used and evaluated in the corresponding stream attribute expressions.

...