...
Field | Description |
---|---|
Minimum Files | Specifies the minimum number of files that are expected to be found whenever the collector runs. If fewer files are found then this is treated as an error. |
Maximum Files | Specifies the maximum number of files that will be processed whenever the collector runs. |
Max Records Per File | Specifies the maximum number of records that will be read from each file processed. |
Errors Before Rollback | The maximum number of errors PhixFlow will permit while processing a file. Once this number has been exceeded, PhixFlow will abandon the attempted file load. |
Parallel Readers | The number of files to process in parallel. If blank, this defaults to 1. If the collector is configured to read files in sequence, this field is ignored and a single file reader is used. |
Unreadable Directories | The action to take on finding an unreadable directory when searching a directory hierarchy for files to import. |
Character Set | The character encoding to be used. Select a value from the drop down list. If Other is selected, a new box opens and a new character set can be entered. A full list of available character sets can be found here (canonical names from both columns can be used). |
Column Separator | Select a value from the drop down list. If Other is selected, a new box opens and a new column separator can be entered. |
Separator Character | This field is only available if Column Separator = Other. Allows a custom column separator to be entered. |
Quote Style | Select a value from the drop down list. If Other is selected, a new box opens and a new quote character can be entered. |
Quote Character | This field is only available if Quote Style = Other. Allows a custom quote character to be entered. |
Ignore Missing Columns | This field is only available if File Type = Comma Separated Values. If this flag is set then PhixFlow will not throw an error if the record being read contains fewer columns than expected. If this flag is not set then an error will be reported if there are too few columns. |
Ignore Extra Columns | This field is only available if File Type = Comma Separated Values. If this flag is set then PhixFlow will not throw an error if the record being read contains more columns than expected. If this flag is not set then an error will be reported if there are too many columns. |
Import Rows Matching | An expression that must resolve to a regular expression. PhixFlow will attempt to match each line in the file against the expression, and only those lines that match will be imported. |
Replace Text Matching | Replace Text Matching and With are both expressions that must resolve to regular expressions. In each imported line, every occurrence of text matched by Replace Text Matching is replaced with the value of With. |
With | See description of Replace Text Matching above. |
Excel Data Expression | This field is only available if File Type = Excel Spreadsheet. Populate it using the syntax "WorksheetName!TopLeftCell:BottomRightCell", e.g. "DailyCallsSheet!A1:G100", or a list of such expressions, e.g. ["DailyCallsSheet!A1:G100", "A1:B20", "Calls!A1:C100"]. PhixFlow expressions can also be used, such as: _worksheets. Any value that is not a valid PhixFlow expression (strings are valid PhixFlow expressions) will cause this file collector to fail. Note that if a worksheet is specified, then the full cell range must also be specified. Hence it is not possible to select a 'worksheet only' or 'columns only for a specified worksheet', e.g. DailyCallsSheet or DailyCallsSheet!A:C are not supported. |
Ignore Undefined Values | This checkbox is only available if File Type = Excel Spreadsheet. Tick this checkbox if all unsupported Excel values such as #N/A, #REF!, #DIV/0! etc. should be ignored and replaced with null values during processing. In this case a single warning message is displayed once processing has completed, stating the number of unsupported cell values found and giving a detailed message about the first unsupported cell value. |
XPath Expression | This field is only available if File Type = XML File or HTML File. This field should be populated according to valid XPath syntax. Please see XPath Examples for how to use XPath expressions and how the returned data can be used and evaluated in the corresponding stream attribute expressions. |
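
The effect of Import Rows Matching, Replace Text Matching and With can be pictured as a per-line filter followed by a substitution. The following is a minimal sketch in Python, not PhixFlow's own implementation. It uses Python's `re` module, whose syntax and matching rules may differ from the regular expression engine PhixFlow uses. It also assumes that a line "matches" when the pattern is found anywhere in it, and that filtering happens before replacement. The function name `import_lines`, the sample lines and the patterns are all hypothetical.

```python
import re


def import_lines(lines, row_pattern=None, replace_pattern=None, replacement=""):
    """Keep lines matching row_pattern, then substitute replace_pattern.

    Illustrative only. Assumptions (not confirmed by the documentation):
    - a line matches if the pattern is found anywhere in it (re.search);
    - the row filter is applied before the text replacement.
    """
    row_re = re.compile(row_pattern) if row_pattern else None
    sub_re = re.compile(replace_pattern) if replace_pattern else None

    imported = []
    for line in lines:
        # Import Rows Matching: skip lines that do not match.
        if row_re is not None and row_re.search(line) is None:
            continue
        # Replace Text Matching / With: replace every occurrence in the line.
        if sub_re is not None:
            line = sub_re.sub(replacement, line)
        imported.append(line)
    return imported


# Hypothetical input file contents.
sample = [
    "# exported 2024-01-01",
    "CALL;0123;00:05",
    "CALL;0456;00:12",
    "SMS;0789;",
]

rows = import_lines(sample, row_pattern=r"^CALL;", replace_pattern=r";", replacement=",")
print(rows)
```

With these hypothetical settings, only the two `CALL` lines are kept and their semicolons are turned into commas, which is the kind of clean-up these fields are typically used for before the column separator is applied.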
...
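
The Excel Data Expression syntax described in the table above can be checked before it is entered into the collector. The sketch below is a hypothetical helper written in Python, not part of PhixFlow. It accepts an optional worksheet name followed by a full cell range, and rejects the 'worksheet only' and 'columns only' forms that the documentation lists as unsupported. The names `parse_excel_range` and `_RANGE_RE` are assumptions introduced for illustration, and the pattern is deliberately strict: other forms that PhixFlow may accept, such as PhixFlow expressions like _worksheets, are outside its scope.

```python
import re

# Optional "WorksheetName!" prefix, then TopLeftCell:BottomRightCell,
# where each cell is column letters followed by a row number (e.g. A1, G100).
_RANGE_RE = re.compile(
    r"^(?:(?P<sheet>[^!]+)!)?"
    r"(?P<top_left>[A-Za-z]+[0-9]+):(?P<bottom_right>[A-Za-z]+[0-9]+)$"
)


def parse_excel_range(expr):
    """Split an expression into (worksheet, top_left, bottom_right).

    The worksheet is None when no worksheet name is given.
    Raises ValueError for forms such as 'DailyCallsSheet' or
    'DailyCallsSheet!A:C', where the full cell range is missing.
    """
    match = _RANGE_RE.match(expr)
    if match is None:
        raise ValueError(f"unsupported Excel data expression: {expr!r}")
    return match.group("sheet"), match.group("top_left"), match.group("bottom_right")


# The list form from the documentation, parsed one expression at a time.
expressions = ["DailyCallsSheet!A1:G100", "A1:B20", "Calls!A1:C100"]
parsed = [parse_excel_range(e) for e in expressions]
print(parsed)
```

Passing `"DailyCallsSheet"` or `"DailyCallsSheet!A:C"` to `parse_excel_range` raises `ValueError`, mirroring the rule that a worksheet must always be accompanied by a full cell range.
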
Attribute | Description |
---|---|
_fileName | The name of the file. |
_lineNumber | The line number of the record within the file it was read from. The _lineNumber attribute is not available for File Collectors of Type File Details Only. |
_modifiedDate | The datetime of when the file was last modified. PhixFlow cannot determine the last modified time of a single file residing within a .gz or a .tgz container. Instead, the datetime of when the corresponding gz/tgz container was created is returned. |
_path | The full path to the file which is the result of concatenating the _rootDirectory and the _subDirectory values. |
_rootDirectory | The root base directory (if specified) concatenated with the value evaluated in the collector's 'Input Directory Expr' field. |
_size | The size of the file in bytes. PhixFlow cannot determine the size of a single file residing within a .gz or a .tgz container. Instead, a size of -1 is returned. |
_subDirectory | The sub directory relative to the _rootDirectory in which the corresponding file resides. |
_worksheet | The name of the current worksheet of the Excel file. The _worksheet attribute is only available when the file type is Excel Spreadsheet. |
_range | The Excel range expression that was used. The _range attribute is only available when the file type is Excel Spreadsheet. |
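
The relationship between _path, _rootDirectory and _subDirectory, and the special _size value of -1, can be illustrated with a short sketch. This is Python written for illustration only, not PhixFlow expression syntax. The helper names `build_path` and `size_is_known` and the sample directory values are hypothetical, and joining the two parts with a single "/" is an assumption, since the documentation only says the values are concatenated.

```python
import posixpath


def build_path(root_directory, sub_directory):
    """Combine _rootDirectory and _subDirectory into a _path-like value.

    Assumption: the two parts are joined with a single "/" separator.
    """
    return posixpath.join(root_directory, sub_directory)


def size_is_known(size):
    """Return False for the -1 placeholder reported for files inside
    a .gz or .tgz container, True otherwise."""
    return size != -1


# Hypothetical attribute values for two collected files.
print(build_path("/data/incoming", "2024/01"))
print(size_is_known(2048), size_is_known(-1))
```

A check like `size_is_known` is useful in downstream logic that sums or compares file sizes, so that the -1 placeholder is not mistaken for a real byte count.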
...