...

Field	Description
Minimum Files	Specifies the minimum number of files that are expected to be found whenever the collector runs. If fewer files are found then this is treated as an error.
Maximum Files	Specifies the maximum number of files that will be processed whenever the collector runs.
Max Records Per File	Specifies the maximum number of records that will be read from each file processed.
Errors Before Rollback	The maximum number of errors PhixFlow will permit during the processing of a file. Once this number has been exceeded, PhixFlow will abandon the attempted file load.
Parallel Readers	The number of files to process in parallel. If blank, this defaults to 1. If the collector is configured to read files in sequence, this field is ignored and a single file reader is used.
Unreadable Directories	The action to take on finding an unreadable directory when searching a directory hierarchy for files to import. Error: unreadable directories will be reported, and if any are found, the file search will fail. Warning: unreadable directories will be reported, but the file search will continue unaffected. Ignore: unreadable directories will be silently ignored.
Character Set	The character encoding to be used. Select a value from the drop down list. If Other is selected, a new box opens and a new character set can be entered. A full list of available character sets can be found here (Canonical Names from both columns can be used).
Column Separator	Select a value from the drop down list. If Other is selected, a new box opens and a new column separator can be entered.
Separator Character	This field is only available if Column Separator = Other. Allows a custom column separator to be entered.
Quote Style	Select a value from the drop down list. If Other is selected, a new box opens and a new quote character can be entered.
Quote Character	This field is only available if Quote Style = Other. Allows a custom quote character to be entered.
Ignore Missing Columns	This field is only available if File Type = Comma Separated Values. If this flag is set then PhixFlow will not throw an error if the record being read contains fewer columns than expected. If this flag is not set then an error will be reported if there are too few columns.
Ignore Extra Columns	This field is only available if File Type = Comma Separated Values. If this flag is set then PhixFlow will not throw an error if the record being read contains more columns than expected. If this flag is not set then an error will be reported if there are too many columns.
Import Rows Matching	An expression that must resolve to a Regular Expression. PhixFlow will attempt to match each line in the file against this expression, and only those lines that match will be imported.
Replace Text Matching	Replace Text Matching and With are both expressions that must resolve to Regular Expressions. In each imported line, all occurrences of the text matched by Replace Text Matching are replaced with With.
With	See description of Replace Text Matching above.
Excel Data Expression	This field is only available if File Type = Excel Spreadsheet. Populate it using the syntax "WorksheetName!TopLeftCell:BottomRightCell", e.g. "DailyCallsSheet!A1:G100", or a list of such expressions, e.g. ["DailyCallsSheet!A1:G100", "A1:B20", "Calls!A1:C100"]. PhixFlow expressions, such as _worksheets, can also be used. Any value that is not a valid PhixFlow expression (strings are valid PhixFlow expressions) will cause this file collector to fail. The following examples show how to populate this field to select various Excel worksheets and cell ranges. All rows and columns on the default first worksheet: leave this field empty, as this is the default behaviour. Specified columns only on the default first worksheet: "A:C". Specified cell range only on the default first worksheet: "B1:G10". Specified cell range on a specified worksheet: "DailyCallsSheet!A2:F20". List of specified cell ranges on multiple worksheets: ["DailyCallsSheet!A2:F20", "Calls!A1:C400", "Accounts!A5:F50"]. Note that if a worksheet is specified, then the full cell range must also be specified. Hence it is not possible to select a worksheet only, or columns only on a specified worksheet; e.g. DailyCallsSheet and DailyCallsSheet!A:C are not supported.
Ignore Undefined Values	This checkbox is only available if File Type = Excel Spreadsheet. Tick this checkbox if all unsupported Excel values, such as #N/A, #REF!, #DIV/0! etc., should be ignored and replaced with null values during processing. In this case a single warning message will be displayed to the user once processing has completed, stating the number of unsupported cell values found during processing and giving a detailed message about the first unsupported cell value. If this checkbox is unticked then each unsupported Excel value will be reported as an individual warning/error in the console, and processing will be terminated if the maximum number of errors/warnings is exceeded.
XPath Expression	This field is only available if File Type = XML File or HTML File. This field should be populated according to valid XPath syntax. Please see XPath Examples for how to use XPath expressions and how the returned data can be used and evaluated in the corresponding stream attribute expressions.
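
The combined effect of Import Rows Matching, Replace Text Matching and With can be pictured as a line filter followed by a substitution. The Python sketch below is an illustration only, not PhixFlow's implementation. The helper name preprocess_lines and its parameters are hypothetical, Python's re dialect may differ from the one PhixFlow uses, and the sketch assumes that a line is imported when the pattern matches anywhere in it and that filtering happens before replacement.

```python
import re
from typing import Iterable, Iterator, Optional


def preprocess_lines(
    lines: Iterable[str],
    import_rows_matching: Optional[str] = None,
    replace_text_matching: Optional[str] = None,
    with_text: str = "",
) -> Iterator[str]:
    """Hypothetical helper: keep matching lines, then replace matched text."""
    keep = re.compile(import_rows_matching) if import_rows_matching else None
    find = re.compile(replace_text_matching) if replace_text_matching else None
    for line in lines:
        # Assumption: a line is imported if the pattern matches anywhere in it.
        if keep is not None and keep.search(line) is None:
            continue
        # Replace every occurrence of the matched text in the imported line.
        if find is not None:
            line = find.sub(with_text, line)
        yield line


# Made-up sample data, for illustration only.
sample = ["CALL,2021-01-01,00:05", "HEADER,ignored", "CALL,2021-01-02,01:10"]
result = list(preprocess_lines(sample, r"^CALL,", r"-", "/"))
```

With this sample, only the two lines starting with CALL, are kept, and every hyphen in them is replaced with a slash. Leaving all three optional arguments unset passes every line through unchanged.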

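The Excel Data Expression rules in the table above (an optional worksheet name, a columns-only or full cell range, and no worksheet-only or worksheet-plus-columns form) can be checked with a small parser. This Python sketch is an assumption-laden illustration rather than PhixFlow's own parsing logic. The names ExcelRange and parse_excel_range are hypothetical, only upper-case column letters are accepted, and worksheet names that contain "!" or need quoting are not handled.

```python
import re
from typing import NamedTuple, Optional

# Optional "Worksheet!" prefix, then a range such as "A:C" or "B1:G10".
_RANGE = re.compile(
    r"^(?:(?P<sheet>[^!]+)!)?"
    r"(?P<c1>[A-Z]+)(?P<r1>\d+)?:(?P<c2>[A-Z]+)(?P<r2>\d+)?$"
)


class ExcelRange(NamedTuple):
    worksheet: Optional[str]  # None means the default first worksheet
    top_left: str
    bottom_right: str


def parse_excel_range(expr: str) -> ExcelRange:
    """Hypothetical parser for one range expression from the table above."""
    m = _RANGE.match(expr)
    if m is None:
        raise ValueError(f"not a range expression: {expr!r}")
    has_rows = m["r1"] is not None and m["r2"] is not None
    columns_only = m["r1"] is None and m["r2"] is None
    if not (has_rows or columns_only):
        raise ValueError(f"mixed cell/column range: {expr!r}")
    # Rule from the table: a worksheet name requires a full cell range.
    if m["sheet"] is not None and not has_rows:
        raise ValueError(f"worksheet given without a full cell range: {expr!r}")
    return ExcelRange(
        m["sheet"], m["c1"] + (m["r1"] or ""), m["c2"] + (m["r2"] or "")
    )


# A list of expressions is parsed element by element.
ranges = [
    parse_excel_range(e)
    for e in ["DailyCallsSheet!A2:F20", "Calls!A1:C400", "A:C"]
]
```

Under these assumptions, "DailyCallsSheet" and "DailyCallsSheet!A:C" both raise ValueError, mirroring the unsupported forms noted in the table.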
...

Attribute	Description
_fileName	The name of the file.
_lineNumber	The line number of the record within the file it was read from. The _lineNumber attribute is not available for File Collectors of Type File Details Only.
_modifiedDate	The date and time at which the file was last modified. PhixFlow cannot determine the last modified time of a single file residing within a .gz or .tgz container; instead, the date and time at which the corresponding gz/tgz container was created is returned.
_path	The full path to the file, which is the result of concatenating the _rootDirectory and _subDirectory values.
_rootDirectory	The root base directory (if specified) concatenated with the value evaluated in the collector's 'Input Directory Expr' field.
_size	The size of the file in bytes. PhixFlow cannot determine the size of a single file residing within a .gz or .tgz container; instead, a size of -1 is returned.
_subDirectory	The sub directory relative to the _rootDirectory in which the corresponding file resides.
_worksheet	The name of the current worksheet of the Excel file. The _worksheet attribute is only available if the file type is 'excel'.
_range	The Excel range expression that was used. The _range attribute is only available if the file type is 'excel'.
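
The relationship between _rootDirectory, _subDirectory, _path and _size can be sketched as a small data structure. This Python sketch follows the table literally and is not taken from PhixFlow itself. The names FileAttributes and describe_file are hypothetical, POSIX-style separators and relative directory values are assumed, and _path is built from the two directory values only, since the table does not say whether the file name is appended.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class FileAttributes:
    """Hypothetical container mirroring some attributes from the table."""
    file_name: str
    root_directory: str
    sub_directory: str
    size: int

    @property
    def path(self) -> str:
        # _path: concatenation of _rootDirectory and _subDirectory.
        return posixpath.join(self.root_directory, self.sub_directory)


def describe_file(
    base_dir: str,
    input_dir: str,
    sub_dir: str,
    file_name: str,
    size_bytes: int,
    in_gz_container: bool = False,
) -> FileAttributes:
    # _rootDirectory: base directory joined with the evaluated input directory.
    # Assumption: input_dir is relative; an absolute value would replace base_dir.
    root = posixpath.join(base_dir, input_dir)
    # _size: -1 when the file sits inside a .gz or .tgz container.
    size = -1 if in_gz_container else size_bytes
    return FileAttributes(file_name, root, sub_dir, size)


# Made-up values, for illustration only.
plain = describe_file("/data", "incoming", "2021/01", "calls.csv", 2048)
packed = describe_file("/data", "incoming", "2021/01", "calls.csv", 2048, True)
```

Here plain.path evaluates to "/data/incoming/2021/01", and packed.size is -1 because the file is treated as residing in a container.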

...
