Data Nodes

Record Node 

Record Nodes represent records in the file. Record Nodes can be specified as target nodes, which will generate output records of the File Collector.

Record Nodes are made up of a collection of nodes, of various types. The Attribute Nodes under a Record Node generate the File Columns in the output records of the File Collector.

In addition to the common attributes, Record Nodes may have the following attributes:

Attribute name

Description

Possible Values

Mandatory

Default

prefix

If the Record Node has a Name Node, setting this attribute will apply the value of the attribute, followed by an underscore, as a prefix to the name read from the file into the Name Node.

This can be useful if the value read can start with a number, or other value which would be invalid when used in a PhixFlow Attribute Expression.

Any string

No

None

discriminator

If a record has a child Tag Node, then this string will be used as the discriminator for the Tag Node, unless a discriminator is also set on the Tag Node itself, which will override it.

See Attribute Node for details of the discriminator attribute.

Any:

  • string, or
  • hex string, e.g. A0FF08, if the hexDiscriminator attribute is set to T.

No

None

hexDiscriminator

Whether the discriminator will be given as a hex string.

If the discriminator cannot be specified as an ASCII string then it can be specified as a hex string, where each pair of characters in the string represents one byte.

T for true or F for false

No

F

length

The overall length, as a number of bytes, expected for all Attribute Nodes within the hierarchy of the Record Node.

This is helpful if the fields that make up the record in the file include optional fields – setting the length helps the grammar to determine where the record ends.

Any of:

  • a number (the number of bytes)
  • a variable name, the variable holding the number of bytes – this variable must have been declared in the variables attribute of this or an ancestor node
  • an Expression inside curly braces which will calculate a whole number. The Expression may refer to Attribute Nodes which have been declared in the scriptVariables attribute of this or an ancestor node.

No

None
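
As an illustration of the hex string form only (this is not PhixFlow code), the following Python sketch shows how a hex discriminator such as A0FF08 maps to bytes: every two characters give one byte. The helper name hex_discriminator_to_bytes is hypothetical.

```python
def hex_discriminator_to_bytes(hex_string: str) -> bytes:
    """Convert a hex discriminator string into the bytes it stands for."""
    if len(hex_string) % 2 != 0:
        raise ValueError("a hex string needs an even number of characters")
    # Each pair of hex characters becomes one byte.
    return bytes.fromhex(hex_string)


expected = hex_discriminator_to_bytes("A0FF08")
print(list(expected))  # [160, 255, 8]
```

Here the six characters produce three bytes, which is the byte sequence the file reader would then look for.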

Attribute Node 

Attribute Nodes (including the five subtypes: Tag Nodes, Length Nodes, Name Nodes, Value Nodes and Bytes Nodes) are the only grammar nodes that make the file reader read bytes from the file.

An Attribute Node may have child nodes. For example, ASN.1 files have fields which consist of a tag value, followed by a length value, followed by the actual value of the field. This can be expressed in a grammar by placing child Tag, Length and Value (or Bytes) Nodes inside the Attribute Node.

If an Attribute Node has a child Value or Bytes Node then it will not itself read any data from the file, but will instead take its value from the Value or Bytes Node. In this case the bytes, bits, type, stringType, byteOrder and nibbleOrder attributes on the Attribute Node will merely act as defaults for the child Value or Bytes Node (although these defaults can be overridden by the child node if any of these attributes are set on the child node itself).

In addition to the common attributes, Attribute Nodes may have the following attributes:

Attribute name

Description

Possible Values

Mandatory

Default

prefix

If the Attribute Node has a child Name Node, setting this attribute will apply the value of the attribute, followed by an underscore, as a prefix to the name read from the file into the Name Node.

This can be useful if the value read can start with a number, or other value which would be invalid when used in a PhixFlow Attribute Expression.

Any string

No

None

discriminator

The discriminator specifies the full value of this field (i.e. the field represented by this Attribute Node) in the file.

The discriminator will be converted, according to the type attribute, into the bytes that will be read in the file when this field is found.

The discriminator can be used to determine whether optional fields are included in the file.

Any:

  • string, or
  • hex string, e.g. A0FF08, if the hexDiscriminator attribute is set to T.
    Note that the file reader will expect the minimum number of bytes required to hold the value of the discriminator. For example, if the discriminator is the number 2, the file reader will expect a single byte. If the field uses more than the minimum number of bytes, specify the full value of the discriminator as a hex string. For example, if the discriminator is the number 2 but is represented in the file with 4 bytes, padded with leading zeros, the discriminator would be "00000002".

No

None

hexDiscriminator

Whether the discriminator will be given as a hex string.

If the discriminator cannot be specified as an ASCII string then it can be specified as a hex string, where each pair of characters in the string represents one byte.

T for true or F for false

No

None

bytes

The number of bytes to be read from the file for this field.

Any of:

  • a number (the number of bytes)
  • a variable name, the variable holding the number of bytes – this variable must have been declared in the variables attribute of this or an ancestor node
  • an Expression inside curly braces which will calculate a whole number. The Expression may refer to Attribute Nodes which have been declared in the scriptVariables attribute of this or an ancestor node.

    This attribute is not mandatory if any of the following applies:
  • the discriminator attribute is set, since the number of bytes can be determined directly from the discriminator
  • this node has a child Value or Bytes Node with bytes attribute set
  • the bits attribute is set
  • this node has a child Length Node
  • this node has a preceding sibling Length Node

No, in certain cases (see notes in Possible Values column). Otherwise yes.

None

bits

The number of bits to be read from the file. This value will take precedence over the number of bytes if both are specified.

Any of:

  • a number (the number of bits)
  • a variable name, the variable holding the number of bits – this variable must have been declared in the variables attribute of this or an ancestor node
  • an Expression inside curly braces which will calculate a whole number. The Expression may refer to Attribute Nodes which have been declared in the scriptVariables attribute of this or an ancestor node.

    This attribute is not mandatory if any of the following applies:
  • the discriminator attribute is set, since the number of bits can be determined directly from the discriminator
  • this node has a child Value or Bytes Node with bits attribute set
  • the bytes attribute is set
  • this node has a child Length Node
  • this node has a preceding sibling Length Node

No, in certain cases (see notes in Possible Values column). Otherwise yes.

None

type

The type of the field to be read. This attribute dictates how the bytes read from the file will be converted.

Any of:

  • BCD - the field is a binary coded decimal. Commonly used for telephone numbers where each byte contains two numerals, each numeral represented by 4 bits. The result is returned as a string.
  • Integer - a standard integer.
  • String - a string.
  • Float - a floating point number. This type assumes that the number has been encoded as a string which needs to be converted to a float.
  • DateTime - a date and time. This type assumes that the date and time have been encoded as a string which needs to be converted to a date and time.
  • Date - a date. This type assumes that the date has been encoded as a string. The string may include a time component but this will be set to zero (midnight) in the output.
  • BERTag – the field is encoded as an ASN.1 tag in BER format. You do not need to specify a number of bytes for a BERTag field since the length is encoded as part of the data.
  • BERLength - this node is an ASN.1 length value encoded in BER format. You do not need to specify a number of bytes for a BERLength attribute since the length is encoded as part of the data.
    The type you specify in the grammar is not case sensitive, e.g. it can be "STRING", "String" or "string".

No, if this node has a child Value or Bytes Node with the type attribute set. Otherwise yes.

None

dateFormat

The format of a date in the file. This should only be set if the type has been set as Date or DateTime. If no date format is specified then the reader will try a variety of possible date formats in turn. It is therefore more efficient to specify a date format, if possible.

A valid date format string. Valid date formats are documented in the PhixFlow online help.

No

None

stringType

The character set to use when converting the bytes read from the file into a string, or converting a discriminator value into the bytes that will be read in the file.

Any string which specifies a valid Java character set, such as UTF8 (or for MySQL UTF8mb3).

No

Taken from Grammar Node

byteOrder

The order that bytes will be read from the file when converting into a value.

L (for little endian) or B (for big endian).

No

Taken from Grammar Node

nibbleOrder

The order that nibbles (4 bit blocks) will be read from the file when converting into a value.

L (for little endian) or B (for big endian).

No

Taken from Grammar Node

tagType

If this grammar is for an ASN.1 file and this node is for a tag value – the type of tag.

The ASN.1 definition document that describes the file using ASN.1 notation should specify the type of each tag. If not, see the notes at the end of the Tag Node description.

This setting dictates how the tag value is translated.

Any of:

  • Universal
  • Application
  • Context
  • Private
    The tagType you specify in the grammar is not case sensitive, e.g. it can be "CONTEXT", "Context" or "context".

No

Context

berConstruct

If this grammar is for an ASN.1 file and this node is for a tag value - this flag indicates whether the tag is for a constructed value (i.e. a record with sub values) or a simple value.

The ASN.1 definition document that describes the file using ASN.1 notation should specify whether this tag is for a construct variable. If not, see the notes at the end of the Tag Node description.

T for true or F for false

No

F
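
The note on minimum byte counts in the discriminator row can be made concrete with a small Python sketch. It is an illustration rather than PhixFlow's implementation, and it assumes a non-negative integer stored in big endian order; the function name integer_discriminator_bytes is hypothetical.

```python
def integer_discriminator_bytes(value: int) -> bytes:
    """Smallest big-endian byte string that can hold a non-negative integer."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    # At least one byte, even for zero; otherwise round the bit length up to whole bytes.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")


minimal = integer_discriminator_bytes(2)  # what the reader expects by default
padded = bytes.fromhex("00000002")        # the padded form, given as a hex string
print(minimal.hex(), padded.hex())        # 02 00000002
```

The two results differ in length, which is why a field padded to 4 bytes needs the full hex string with hexDiscriminator set to T.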

Tag Node 

The Tag Node is a sub type of the Attribute Node. It has the same attributes as the Attribute Node, and will read bytes from the file to calculate a tag value.

Tag Node discriminators

The only difference between a Tag Node and an Attribute Node is that a Tag Node will use the discriminator from its most immediate Sequence, Attribute or Record Node ancestor (any intervening Control Nodes being ignored), if no discriminator has been set on the Tag Node itself. If the most immediate Sequence/Attribute/Record Node ancestor does not have a discriminator specified, then the Tag Node will keep looking up the ancestor tree until it finds a Sequence/Attribute/Record Node with the discriminator set.

This is useful when creating templates such as the following:

<Attribute name="attrTemplate" template="T">
  <Tag name="tag" type="berTag"/>
  <Length name="length" type="berLength"/>
  <Value name="value"/>
</Attribute>


The discriminator is not set on the Tag Node in the template since the template may be used in several places in the grammar, with different discriminator values. For example:

<Attribute definedBy="attrTemplate" name="attr1" type="String" discriminator="27"/>
<Attribute definedBy="attrTemplate" name="attr2" type="Integer" discriminator="99"/>


Each of the two Attribute Node definitions above uses the template, and so inherits all the child nodes of the template. However, across these two cases the child Tag Nodes must use different values for the discriminator. This is possible because each child Tag Node will inherit the discriminator set on its parent Attribute Node.
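
The lookup rule described above can be sketched in Python as a walk up the ancestor tree. This is a simplified model for illustration, not PhixFlow's implementation; the Node class, the kind labels and the function name effective_discriminator are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Node kinds whose discriminator a Tag Node may inherit, per the text above.
CARRIER_KINDS = {"Sequence", "Attribute", "Record"}


@dataclass
class Node:
    kind: str
    name: str
    discriminator: Optional[str] = None
    parent: Optional["Node"] = None


def effective_discriminator(tag: Node) -> Optional[str]:
    """Return the discriminator a Tag Node would end up using."""
    if tag.discriminator is not None:
        return tag.discriminator  # set on the Tag Node itself: this wins
    ancestor = tag.parent
    while ancestor is not None:
        # Control Nodes are skipped, as are carrier nodes with no discriminator.
        if ancestor.kind in CARRIER_KINDS and ancestor.discriminator is not None:
            return ancestor.discriminator
        ancestor = ancestor.parent
    return None


attr1 = Node("Attribute", "attr1", discriminator="27")
tag1 = Node("Tag", "tag", parent=attr1)
attr2 = Node("Attribute", "attr2", discriminator="99")
tag2 = Node("Tag", "tag", parent=attr2)
print(effective_discriminator(tag1), effective_discriminator(tag2))  # 27 99
```

The two Tag Nodes are identical copies of the template child, yet each resolves to the discriminator of the Attribute Node that uses the template.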

Tag Node types

If a Tag Node is given the type BERTag then the attributes tagType and berConstruct must be set correctly, because the bytes stored in the file for an ASN.1 tag will depend on the values given for tag type and BER construct in the ASN.1 file definition. 

A full ASN.1 file definition will specify these values for each tag. 

For example, an ASN.1 file definition may specify a tag as context-specific and constructed, with a value of 5. The grammar for this would be:

<Tag name="asn1Tag" type="BERTag" discriminator="5" tagType="Context" berConstruct="T"/>


If a full ASN.1 file definition is not available then you must assume that the tag value in the file is the full hex or decimal value of the bytes for this tag. With the same example tag as above, the documentation will give the full value of the bytes representing the tag as A5 in hex or 165 in decimal. In this case, the grammar can be written either with:

  • a Tag Node with type set to BERTag; follow the steps in ASN.1 Tag Encoding to determine the appropriate discriminator, tagType and berConstruct settings (this will result in the grammar above)
  • a Tag Node with type set to Integer, discriminator set to a hex string (in the example, A5) and the hexDiscriminator set to T, as shown below:
<Tag name="hexTag" type="Integer" discriminator="A5" hexDiscriminator="T"/>
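
The relationship between the two forms (tag number 5 with tagType Context and berConstruct T, versus the raw byte A5) follows from the BER identifier byte layout: two bits for the tag class, one bit for the constructed flag, and five bits for the tag number. The Python sketch below shows that arithmetic. It only covers tag numbers up to 30, which fit in a single byte, and the function name ber_tag_byte is hypothetical.

```python
TAG_CLASS_BITS = {
    "universal": 0b00,
    "application": 0b01,
    "context": 0b10,
    "private": 0b11,
}


def ber_tag_byte(tag_number: int, tag_type: str = "Context", constructed: bool = False) -> int:
    """Build the single identifier byte for a BER tag with number 0..30."""
    if not 0 <= tag_number <= 30:
        raise ValueError("this sketch only covers the single-byte (low tag number) form")
    class_bits = TAG_CLASS_BITS[tag_type.lower()]  # top two bits: tag class
    constructed_bit = 1 if constructed else 0      # next bit: constructed flag
    return (class_bits << 6) | (constructed_bit << 5) | tag_number


tag = ber_tag_byte(5, tag_type="Context", constructed=True)
print(f"{tag:02X}", tag)  # A5 165
```

Reading the calculation in reverse gives the tagType, berConstruct and discriminator values from a documented byte such as A5.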

Length Node 

The Length Node is a sub type of the Attribute Node. It has the same attributes as the Attribute Node, and will read bytes from the file to obtain a length value.

The only difference between a Length Node and an Attribute Node is that once the Length Node has read a value from the file, it will use that value to set the length attribute on its most immediate Attribute, Sequence or Record Node ancestor (any intervening Control Nodes being ignored).

How this length is used depends on the type of ancestor node on which it is set.

If a Length Node sets the length on an ancestor Attribute Node, then the length will be the default bytes value of a child Value or Bytes Node of the Attribute Node. This default can be overridden by setting the bits or bytes on the child node. For example:

<Attr name="ASN1Attribute" type="Integer">
  <Length name="attrLength" type="Integer"/>
  <Value name="contents"/>
</Attr>


In this example the bytes attribute is not needed on the Value Node, since this is automatically inherited from the parent Attribute Node, which in turn receives it from the Length Node.

The following grammar is equivalent:

<Attr name="ASN1Attribute" type="Integer" variables="attrLength">
  <Attr name="attrLength" type="Integer"/>
  <Value name="contents" bytes="attrLength"/>
</Attr> 


If a Length Node sets the length on a Sequence or Record Node, then the length determines how many bytes are expected across all remaining child Attribute Nodes of that Record or Sequence Node after the Length Node (i.e. not including any child Attribute Nodes before the Length Node). See Control Nodes Attributes.
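
For the Attribute Node case, the effect of a Length Node followed by a Value Node can be pictured as reading a length-prefixed field. The Python sketch below is an illustration under stated assumptions, not PhixFlow code: it assumes a one-byte length field and a big endian integer value, and the function name read_length_prefixed is hypothetical.

```python
import io


def read_length_prefixed(stream: io.BytesIO, length_bytes: int = 1) -> int:
    """Read a length field, then read that many bytes as a big-endian integer."""
    raw_length = stream.read(length_bytes)
    if len(raw_length) != length_bytes:
        raise EOFError("file ended inside the length field")
    # The length just read decides how many bytes the value occupies.
    value_length = int.from_bytes(raw_length, "big")
    raw_value = stream.read(value_length)
    if len(raw_value) != value_length:
        raise EOFError("file ended inside the value")
    return int.from_bytes(raw_value, "big")


data = io.BytesIO(bytes([0x02, 0x01, 0x2C, 0x01, 0x07]))
print(read_length_prefixed(data), read_length_prefixed(data))  # 300 7
```

The first field announces two value bytes and the second announces one, so no fixed bytes setting is needed for the value.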

Name Node 

The Name Node is a sub type of the Attribute Node. It has the same attributes as the Attribute Node, and will read bytes from the file to obtain a name. 

The only difference between a Name Node and an Attribute Node is that the Name Node will modify the name attribute of the most immediate Attribute Node ancestor (all intervening Control Nodes will be ignored). If the Attribute Node being modified has a prefix attribute set, the new name of the Attribute Node will be the prefix followed by an underscore, followed by the value of the Name Node.

If more than one Name Node is specified then the second and subsequent Name Nodes will append an underscore, followed by the value of the Name Node, to the end of the name created by the first Name Node.
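
The naming rule amounts to joining the optional prefix and the Name Node values with underscores. A minimal Python sketch, with a hypothetical function name and made-up sample values:

```python
from typing import Iterable, Optional


def build_attribute_name(name_values: Iterable[str], prefix: Optional[str] = None) -> str:
    """Join an optional prefix and one or more Name Node values with underscores."""
    parts = [prefix] if prefix else []
    parts.extend(name_values)
    return "_".join(parts)


print(build_attribute_name(["3gpp"], prefix="f"))  # f_3gpp
print(build_attribute_name(["call", "duration"]))  # call_duration
```

The first call shows why a prefix helps: a name read from the file that starts with a digit becomes usable once a prefix is placed in front of it.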

Value Node 

The Value Node is a sub type of the Attribute Node. It has the same attributes as the Attribute Node, and will read bytes from the file to calculate a value.

The only difference between a Value Node and an Attribute Node is that the Value Node will use the bytes read to set the value of its most immediate Attribute Node ancestor (all intervening Control Nodes will be ignored).

If the bytes, bits, type, stringType, byteOrder and nibbleOrder attributes are not set on a Value Node then the corresponding attributes on the most immediate Attribute Node ancestor will apply. Just as for Tag Nodes inheriting discriminators, this is useful when creating reusable templates (see Tag Node). Rather than specifying these attributes in the template, they can be set on the Attribute Nodes that use the template. These attribute values will be inherited by the child nodes copied from the template. In this way the same template can be used for a range of Integer, String or BCD attributes with different bytes, nibbleOrder, etc. attributes.
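
To show why these settings matter, the Python sketch below decodes the same two bytes as an Integer and as BCD with different byteOrder and nibbleOrder values. It is an illustration only: the function names are hypothetical, and the mapping of nibbleOrder B to "high nibble first" and L to "low nibble first" is an assumption rather than documented PhixFlow behaviour.

```python
def decode_integer(raw: bytes, byte_order: str = "B") -> int:
    """Interpret bytes as an unsigned integer; 'B' is big endian, 'L' little endian."""
    return int.from_bytes(raw, "big" if byte_order.upper() == "B" else "little")


def decode_bcd(raw: bytes, nibble_order: str = "B") -> str:
    """Decode binary coded decimal: two numerals per byte, returned as a string."""
    digits = []
    for byte in raw:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("nibble is not a decimal digit")
        # Assumption: 'B' takes the high nibble first, 'L' the low nibble first.
        pair = (high, low) if nibble_order.upper() == "B" else (low, high)
        digits.extend(str(nibble) for nibble in pair)
    return "".join(digits)


raw = bytes([0x12, 0x34])
print(decode_integer(raw, "B"), decode_integer(raw, "L"))  # 4660 13330
print(decode_bcd(raw, "B"), decode_bcd(raw, "L"))          # 1234 2143
```

The same bytes yield four different results, so a template that leaves these attributes unset can serve all four cases when the using Attribute Node supplies them.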

If a Value Node has the type attribute set to String, and the parent node has a different type, e.g. Integer, then the Value Node will first read the bytes from the file as a string, but when the value is passed to the parent Attribute Node it will be converted to an integer. This can be useful when using the binary file reader to read an ASCII file (e.g. a CSV or JSON file), converting string representations of integers into actual integers.

For example:

<Attribute name="csvDate" type="Date">
  <Value name="csvDateAsString" type="String">
    <Repeat name="csvValueRepeat">
      <Bytes name="csvValueBytes" bytes="1"/>
    </Repeat>
  </Value>
</Attribute>
<Attribute name="comma" type="String" discriminator=","/>

The grammar above will keep reading bytes until a comma is reached. This collection of bytes will be passed to the Value Node, which will convert them to a string. This string will then be passed to the outer Attribute Node, which will convert the string into a date.

The same technique could be used to convert strings to Integers, Floats, Dates or DateTimes.
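
The two conversion steps (bytes to string in the Value Node, then string to the parent's type) can be sketched in Python as follows. This is an illustration, not PhixFlow code: the function names are hypothetical, and the format string uses Python's strptime codes, which are not the PhixFlow date format syntax.

```python
from datetime import date, datetime


def bytes_to_string(raw: bytes, string_type: str = "UTF-8") -> str:
    """Inner step: what a String-typed Value Node would produce."""
    return raw.decode(string_type)


def string_to_date(text: str, date_format: str = "%Y-%m-%d") -> date:
    """Outer step: the Date-typed parent converts the string."""
    return datetime.strptime(text, date_format).date()


# Split a sample CSV line at the first comma, as the grammar above would.
field, _, rest = b"2024-03-15,42,next".partition(b",")
print(string_to_date(bytes_to_string(field)))         # 2024-03-15
print(int(bytes_to_string(rest.partition(b",")[0])))  # 42
```

The second print applies the same pattern to an integer field, where the outer conversion is int rather than a date parse.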

Bytes Node 

The Bytes Node is a sub type of the Attribute Node. It has the same attributes as the Attribute Node, and will read bytes from the file to obtain a value. 

As with a Value Node, it will use the values from the bytes, bits, byteOrder and nibbleOrder attributes of the most immediate Attribute Node ancestor, if these attributes are not set on the Bytes Node itself.

A Bytes Node does not need the type or stringType attributes set, since it does not convert any bytes read. Instead it adds bytes read into a buffer on its most immediate Attribute Node ancestor. These bytes will then be converted into a value according to the type and stringType attributes of the ancestor Attribute Node.

Bytes Nodes can be used to construct grammars that will read an arbitrary number of bytes until a particular byte string is encountered, and then convert those bytes into a value. This can be useful when building grammars to read CSV or similar files. 

For example:

<Attribute name="csvValue" type="String">
  <Repeat name="csvValueRepeat">
    <Bytes name="csvValueBytes" bytes="1"/>
  </Repeat>
</Attribute>
<Attribute name="comma" type="String" discriminator=","/>

In the snippet above the repeat loop will keep reading 1 byte at a time, appending to the buffer of the ancestor Attribute Node csvValue. Because there is no times attribute set on the Repeat Node, it will only stop once a set of bytes is reached that matches the discriminator of the first attribute outside of the repeat loop. See Control Nodes Attributes. In the example grammar, it will stop at the next comma. The Attribute Node will then complete and convert all of the bytes added to its buffer into a string value.
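
The loop behaviour described above can be sketched in Python as a read-until-match routine. This is a simplified model rather than PhixFlow's implementation; the function name read_until_discriminator is hypothetical, and stopping at end of file is an added assumption.

```python
import io


def read_until_discriminator(stream: io.BytesIO, discriminator: bytes) -> bytes:
    """Read one byte at a time until the upcoming bytes match the discriminator."""
    buffer = bytearray()
    while True:
        position = stream.tell()
        lookahead = stream.read(len(discriminator))
        stream.seek(position)  # peek only; do not consume the discriminator
        if lookahead == discriminator or not lookahead:
            break  # match found, or end of data (assumed stop condition)
        buffer += stream.read(1)
    return bytes(buffer)


stream = io.BytesIO(b"hello,world")
value = read_until_discriminator(stream, b",").decode("ascii")
comma = stream.read(1)  # the following Attribute Node consumes the comma
print(value, comma)     # hello b','
```

The buffer plays the role of the csvValue Attribute Node's buffer, and the final decode corresponds to converting the collected bytes according to the type attribute.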