Grammar Node attributes
The Grammar Node is the outer containing node for the grammar. It allows you to set default attribute values that are inherited by child Attribute Nodes, unless overridden by those child nodes. The Grammar Node may have any of the following attributes:
...
The child nodes of a Repeat Node may repeat 0, 1, or more times. In addition to common attributes (see 3.1) Repeat Nodes may have the following attributes:
Attribute name | Description | Possible Values | Mandatory | Default |
---|---|---|---|---|
times | The number of times that the child nodes will repeat | Any of: | No | No limit |
Snippet 1
For example, the following grammar snippet:
```xml
<Repeat times="3">
    <Attr name="text" bytes="2" type="String"/>
    <Attr name="separator" discriminator="," type="String"/>
</Repeat>
<Attr name="nextAttribute" bytes="4" type="String"/>
```
could be used to read the following file:
```
aa,bb,cc,John
```
The grammar reads exactly three two-character strings, each followed by a comma, before moving on to the next attribute. If the number of repeats were not specified, however, the grammar would have no way of knowing when to stop: it would read the bytes Jo as the next two-character string, then report an error when it did not find the expected comma in the following byte.
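The difference between a bounded and an unbounded repeat can be modelled in a few lines of Python. This is only an illustrative sketch of the behaviour described above, not the reader's implementation; the function `read_pairs` and its error message are hypothetical.

```python
def read_pairs(data, times=None):
    """Read two-character strings, each followed by a comma.

    `times` is the repeat count; None models a Repeat with no limit
    and no other way of detecting the end of the loop.
    """
    pos, values = 0, []
    while times is None or len(values) < times:
        text = data[pos:pos + 2]
        separator = data[pos + 2:pos + 3]
        if separator != ",":
            raise ValueError(f"expected ',' after {text!r} but found {separator!r}")
        values.append(text)
        pos += 3
    return values, pos


data = "aa,bb,cc,John"

# times="3": stop after three iterations, then read the 4-byte attribute.
values, pos = read_pairs(data, times=3)
next_attribute = data[pos:pos + 4]

# No repeat count: the loop runs on into "John" and fails on the missing comma.
try:
    read_pairs(data)
    error = None
except ValueError as exc:
    error = str(exc)
```

With the count set, `values` holds the three strings and `next_attribute` is `John`; without it, the model fails after consuming `Jo`.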
Snippet 2
This next snippet is the same as the first, except that the number of times to repeat is read from a variable defined in an ancestor node.
```xml
<Sequence name="commaSeparatedBlock" variables="repeatCount">
    <Attr name="repeatCount" bytes="1" type="Integer"/>
    <Repeat times="repeatCount">
        <Attr name="text" bytes="2" type="String"/>
        <Attr name="separator" discriminator="," type="String"/>
    </Repeat>
</Sequence>
<Attr name="nextAttribute" bytes="4" type="String"/>
```
To use the attribute repeatCount as a variable in the times parameter of the Repeat block, it must first be declared on an ancestor node in a variables attribute. The scope of the variable is the hierarchy of nodes inside the node that it is declared on. The value in the variable will be set to null when the scope finishes. The variable may only be used within the defined scope.
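As a rough model of the variable's lifetime, the sketch below reads the count, uses it to bound the loop, and resets it when the enclosing Sequence ends. It assumes, purely for readability, that the one-byte Integer is stored as an ASCII digit; the function name and the sample data are hypothetical.

```python
def read_comma_separated_block(data):
    """Model the Sequence in Snippet 2; returns (values, variables, pos)."""
    variables = {"repeatCount": None}   # declared by variables="repeatCount"
    pos = 0

    # <Attr name="repeatCount" bytes="1" type="Integer"/>
    # Assumption: the integer is stored as an ASCII digit.
    variables["repeatCount"] = int(data[pos:pos + 1])
    pos += 1

    # <Repeat times="repeatCount">
    values = []
    for _ in range(variables["repeatCount"]):
        if data[pos + 2:pos + 3] != ",":
            raise ValueError(f"expected ',' at offset {pos + 2}")
        values.append(data[pos:pos + 2])
        pos += 3

    # </Sequence>: the scope ends, so the variable is reset to null.
    variables["repeatCount"] = None
    return values, variables, pos


data = "2aa,bb,John"
values, variables, pos = read_comma_separated_block(data)
next_attribute = data[pos:pos + 4]
```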
Snippet 3
The next snippet is the same as the previous, except that the number of times to repeat is calculated by an Expression using a script variable declared on an ancestor node.
```xml
<Sequence name="commaSeparatedBlock" scriptVariables="repeatCount">
    <Attr name="repeatCount" bytes="1" type="Integer"/>
    <Repeat times="{$commaSeparatedBlock_repeatCount + 2}">
        <Attr name="text" bytes="2" type="String"/>
        <Attr name="separator" discriminator="," type="String"/>
    </Repeat>
</Sequence>
<Attr name="nextAttribute" bytes="4" type="String"/>
```
To use the attribute repeatCount as a script variable in the times parameter of the Repeat block, it must first be declared on an ancestor node in a scriptVariables attribute. Note that in Snippet 2 the variable can be used simply as repeatCount, but in the snippet above the script variable must be used with the format $scope_variableName, i.e. $commaSeparatedBlock_repeatCount.
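The naming rule and the arithmetic in the Expression can be illustrated with a small sketch. The helper `script_variable_name` is hypothetical, and the value 1 for repeatCount is an assumed input rather than data from this page.

```python
def script_variable_name(scope, variable):
    """Build the $scope_variableName form used inside an Expression."""
    return f"${scope}_{variable}"


name = script_variable_name("commaSeparatedBlock", "repeatCount")

# Suppose the repeatCount attribute read the value 1 from the file.
script_variables = {name: 1}

# {$commaSeparatedBlock_repeatCount + 2}
times = script_variables[name] + 2
```

Under that assumption the Repeat block would run three times.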
Snippet 4
If there is no way to determine the number of repeats expected, the grammar must include a way of identifying the end of a repeat loop. Typically this is done with a sequence of terminating characters, specified in the grammar using the discriminator attribute, as shown in the snippet below.
```xml
<Repeat>
    <Attr name="text" bytes="2" type="String"/>
    <Attr name="separator" discriminator="," type="String"/>
</Repeat>
<Attr name="terminator" discriminator="||" type="String"/>
```
In the above example the repeat loop continues until the string "||" is found, at which point it stops.
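A terminator-driven loop can be sketched as follows. This is a simplified model that checks for the terminator before each iteration; whether the real reader tests in that order is an assumption, and the function name and sample data are hypothetical.

```python
def read_until_terminator(data, terminator="||"):
    """Read 'xx,' items until the terminator string is found."""
    pos, values = 0, []
    while not data.startswith(terminator, pos):
        text = data[pos:pos + 2]
        if len(text) < 2 or data[pos + 2:pos + 3] != ",":
            raise ValueError(f"no separator or terminator at offset {pos}")
        values.append(text)
        pos += 3
    # Step over the terminator so the next attribute starts after it.
    return values, pos + len(terminator)


data = "aa,bb,cc,||rest"
values, pos = read_until_terminator(data)
remainder = data[pos:]
```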
Snippet 5
The bytes that indicate the end of the repeat loop can also be specified as a hex string, using the hexDiscriminator attribute. The snippet below shows an example of this.
```xml
<Repeat>
    <Attr name="text" bytes="2" type="String"/>
    <Attr name="separator" discriminator="," type="String"/>
</Repeat>
<Attr name="terminator" discriminator="A0FF" type="Integer" hexDiscriminator="T"/>
```
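To make the hex form concrete, the sketch below converts the hex string A0FF into the two bytes 0xA0 0xFF and uses them as the loop terminator. It is a model under the same assumptions as a string terminator (the terminator is tested before each iteration); the function name and sample bytes are hypothetical.

```python
def read_until_hex_terminator(data, hex_discriminator):
    """Read b'xx,' items until the bytes named by the hex string are found."""
    terminator = bytes.fromhex(hex_discriminator)   # "A0FF" -> b"\xa0\xff"
    pos, values = 0, []
    while not data.startswith(terminator, pos):
        text = data[pos:pos + 2]
        if len(text) < 2 or data[pos + 2:pos + 3] != b",":
            raise ValueError(f"no separator or terminator at offset {pos}")
        values.append(text.decode("ascii"))
        pos += 3
    return values, pos + len(terminator)


data = b"aa,bb," + b"\xa0\xff" + b"zz"
values, pos = read_until_hex_terminator(data, "A0FF")
remainder = data[pos:]
```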
Sequence Node
Sequence Nodes simply define a collection of child nodes. These can be used to define a set of child nodes as a template; or to define the start and end of a sequence of Attribute Nodes that have an overall length specified by another Attribute Node (for example, a Length Node - see 3.4.4) somewhere within the same node hierarchy.
In addition to common attributes (see 3.1), Sequence Nodes may have the following attributes:
...
Consider a file which includes the following sequence of bytes:
```
6aabbcc8
```
The following grammar could be used to read this portion of the file:
```xml
<Attr name="bodyLength" bytes="1" type="Integer"/>
<Sequence name="body">
    <Attr name="attribute1" bytes="2" type="String"/>
    <Attr name="attribute2" bytes="2" type="String"/>
    <Attr name="attribute3" bytes="2" type="String"/>
    <Attr name="attribute4" bytes="1" type="Integer" optional="T"/>
</Sequence>
<Attr name="fileLength" bytes="1" type="Integer"/>
```
However, this grammar is ambiguous. It cannot determine whether the value 8 at the end of the data is the optional attribute4, or fileLength. To resolve this, the grammar below sets the length of the Sequence Node body, using the value in the Attribute Node bodyLength:
```xml
<Sequence name="dataBlock" variables="bodyLength">
    ...
    <Attr name="bodyLength" bytes="1" type="Integer"/>
    <Sequence name="body" length="bodyLength">
        <Attr name="attribute1" bytes="2" type="String"/>
        <Attr name="attribute2" bytes="2" type="String"/>
        <Attr name="attribute3" bytes="2" type="String"/>
        <Attr name="attribute4" bytes="1" type="Integer" optional="T"/>
    </Sequence>
    <Attr name="fileLength" bytes="1" type="Integer"/>
    ...
</Sequence>
```
In this grammar bodyLength is declared as a variable on an ancestor node (in this case, another Sequence Node, dataBlock). In the example data above the Attribute Node bodyLength has the value 6. Because we have specified the length of the Sequence Node body, the grammar can now tell, when it reaches the value 8, that it must be fileLength because it has already used 6 bytes for body.
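The way a known body length removes the ambiguity can be sketched in Python. The model assumes, for readability, that one-byte Integer values are stored as ASCII digits, as the sample data is printed; `read_data_block` and the second sample input are hypothetical.

```python
def read_data_block(data):
    """Model bodyLength followed by a length-bounded body and fileLength."""
    body_length = int(data[0:1])
    body_start = 1
    body_end = body_start + body_length
    body = data[body_start:body_end]
    return {
        "bodyLength": body_length,
        "attribute1": body[0:2],
        "attribute2": body[2:4],
        "attribute3": body[4:6],
        # attribute4 is optional: present only if the body has a byte left.
        "attribute4": int(body[6:7]) if len(body) > 6 else None,
        # Whatever follows the body must be fileLength.
        "fileLength": int(data[body_end:body_end + 1]),
    }


without_optional = read_data_block("6aabbcc8")
with_optional = read_data_block("7aabbcc58")
```

In the first call the body is used up after attribute3, so 8 is read as fileLength; in the second, the extra body byte is read as attribute4.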
The snippet above could also be written using a Length Node (see 3.4.4) as:
```xml
<Sequence name="body">
    <Length name="bodyLength" bytes="1" type="Integer"/>
    <Attr name="attribute1" bytes="2" type="String"/>
    <Attr name="attribute2" bytes="2" type="String"/>
    <Attr name="attribute3" bytes="2" type="String"/>
    <Attr name="attribute4" bytes="1" type="Integer" optional="T"/>
</Sequence>
<Attr name="fileLength" bytes="1" type="Integer"/>
```
The Length Node sets the length of its parent. This length is exclusive of the Length Node itself.
If the first attribute of the data indicated the length of the Sequence Node body inclusive of the Length Node itself, i.e. the data was:
```
7aabbcc8
```
then the snippet could be written as:
```xml
<Sequence name="body" variables="bodyLength" length="bodyLength">
    <Attr name="bodyLength" bytes="1" type="Integer"/>
    <Attr name="attribute1" bytes="2" type="String"/>
    <Attr name="attribute2" bytes="2" type="String"/>
    <Attr name="attribute3" bytes="2" type="String"/>
    <Attr name="attribute4" bytes="1" type="Integer" optional="T"/>
</Sequence>
<Attr name="fileLength" bytes="1" type="Integer"/>
```
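The only difference between the exclusive and inclusive cases is whether the length field's own bytes count towards the length. The arithmetic can be sketched as below; `offset_after_body` is a hypothetical helper, and the ASCII-digit encoding of the length is an assumption made for readability.

```python
def offset_after_body(length_offset, length_bytes, length_value, inclusive):
    """Return the offset of the first byte after the body.

    length_offset: where the length field starts
    length_bytes:  size of the length field itself
    inclusive:     True if length_value also counts the length field
    """
    if inclusive:
        return length_offset + length_value
    return length_offset + length_bytes + length_value


exclusive_data = "6aabbcc8"   # length excludes the length field
inclusive_data = "7aabbcc8"   # length includes the length field

exclusive_end = offset_after_body(0, 1, int(exclusive_data[0]), inclusive=False)
inclusive_end = offset_after_body(0, 1, int(inclusive_data[0]), inclusive=True)
```

Both calls give the same offset, so in each case the trailing 8 falls outside the body and is read as fileLength.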
Choice Node
A Choice Node specifies that at most one of its child nodes (with all its descendant nodes) may be present in the file. Each child node is therefore optional, and setting the optional attribute to F will have no effect.
For the binary file reader to determine which of the child nodes under a Choice Node are in a file, the discriminator attribute must be set on:
- the child node itself, or
- the first child Attribute or Record Node of the child node.
See 3.4.2 Attribute Node for details of discriminators. A Choice Node has no attributes other than the common attributes (see 3.1).
```xml
<Choice>
    <Sequence name="option1">
        <Attr name="type1" discriminator="1" type="String"/>
        <Attr name="numberValue" bytes="2" type="Integer"/>
    </Sequence>
    <Sequence name="option2">
        <Attr name="type" discriminator="2" type="String"/>
        <Attr name="stringValue" bytes="8" type="String"/>
    </Sequence>
</Choice>
<Attr name="final" bytes="2" type="Integer"/>
```
In the snippet above, if the first byte encountered is 1 then the next section of the file will be recognised as a Sequence Node option1, the next 2 bytes will be interpreted as a number, before moving on to the Attribute Node final.
If the first byte encountered is 2 then the next section of the file will be recognised as a Sequence Node option2, the next 8 bytes will be interpreted as a string, before moving on to the Attribute Node final.
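The dispatch on the first byte can be modelled as below. This is a sketch under stated assumptions: Integer values are taken to be ASCII digits for readability, and the function name, record layout, and sample inputs are hypothetical.

```python
def read_choice(data):
    """Model the Choice Node above, followed by the 2-byte attribute final."""
    record = {}
    if data.startswith("1"):
        # option1: discriminator "1", then a 2-byte number.
        record["option"] = "option1"
        record["numberValue"] = int(data[1:3])
        pos = 3
    elif data.startswith("2"):
        # option2: discriminator "2", then an 8-byte string.
        record["option"] = "option2"
        record["stringValue"] = data[1:9]
        pos = 9
    else:
        raise ValueError(f"no Choice option matches {data[0:1]!r}")
    record["final"] = int(data[pos:pos + 2])
    return record


first = read_choice("14299")
second = read_choice("2abcdefgh07")

try:
    read_choice("30000")
    error = None
except ValueError as exc:
    error = str(exc)
```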
If the first byte encountered is a 3 then the reader will fail with an error since the snippet only handles 1 or 2 in the first byte. If a 3 is a valid possibility there are three possible solutions:
Solution 1
Add an additional Attribute Node to the Choice Node with a discriminator of 3, as below:
```xml
<Choice>
    <Sequence name="option1">
        <Attr name="type1" discriminator="1" type="String"/>
        <Attr name="numberValue" bytes="2" type="Integer"/>
    </Sequence>
    <Sequence name="option2">
        <Attr name="type2" discriminator="2" type="String"/>
        <Attr name="stringValue" bytes="8" type="String"/>
    </Sequence>
    <Attr name="type3" discriminator="3" type="String"/>
</Choice>
<Attr name="final" bytes="2" type="Integer"/>
```
Solution 2
Add an Attribute Node to the Choice Node to pick up any value that is not 1 or 2, as below:
```xml
<Choice>
    <Sequence name="option1">
        <Attr name="type1" discriminator="1" type="String"/>
        <Attr name="numberValue" bytes="2" type="Integer"/>
    </Sequence>
    <Sequence name="option2">
        <Attr name="type2" discriminator="2" type="String"/>
        <Attr name="stringValue" bytes="8" type="String"/>
    </Sequence>
    <Attr name="type3" bytes="1" type="String"/>
</Choice>
<Attr name="final" bytes="2" type="Integer"/>
```
Note that in the snippet above, if no discriminator is set on the Attribute Node type3 then the number of bytes or bits must be set. You cannot have more than one generic option like this in a Choice Node, since this will make the grammar ambiguous.
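One way to see why only a single generic option is allowed is to model option selection directly. In the sketch below, options with a discriminator are matched first and the generic option acts as a fallback; that matching order, the function `pick_option`, and the extra option `type4` are assumptions for illustration only.

```python
def pick_option(options, first_byte):
    """options: list of (name, discriminator); a None discriminator is generic."""
    generic = [name for name, discriminator in options if discriminator is None]
    if len(generic) > 1:
        # Two catch-all options would both match any byte.
        raise ValueError(f"ambiguous grammar: generic options {generic}")
    for name, discriminator in options:
        if discriminator is not None and first_byte == discriminator:
            return name
    return generic[0] if generic else None


options = [("option1", "1"), ("option2", "2"), ("type3", None)]
picked_1 = pick_option(options, "1")
picked_3 = pick_option(options, "3")
picked_x = pick_option(options, "x")

try:
    pick_option(options + [("type4", None)], "3")
    ambiguous_error = None
except ValueError as exc:
    ambiguous_error = str(exc)
```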
Solution 3
Make the whole Choice Node optional, as below:
```xml
<Choice optional="T">
    <Sequence name="option1">
        <Attr name="type1" discriminator="1" type="String"/>
        <Attr name="numberValue" bytes="2" type="Integer"/>
    </Sequence>
    <Sequence name="option2">
        <Attr name="type2" discriminator="2" type="String"/>
        <Attr name="stringValue" bytes="8" type="String"/>
    </Sequence>
</Choice>
<Attr name="final" bytes="2" type="Integer"/>
```
In the snippet above, if the first byte is not 1 or 2, the Choice Node will be skipped and reading will continue from that same byte, which will be recognised as the start of the Attribute Node final.
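Skipping an optional Choice Node means that no bytes are consumed by it. The sketch below models this, again assuming ASCII-digit Integer values for readability; the function name and sample inputs are hypothetical.

```python
def read_optional_choice(data):
    """Model <Choice optional="T"> followed by the 2-byte attribute final."""
    pos = 0
    record = {"option": None}
    if data.startswith("1"):
        record.update(option="option1", numberValue=int(data[1:3]))
        pos = 3
    elif data.startswith("2"):
        record.update(option="option2", stringValue=data[1:9])
        pos = 9
    # Otherwise the Choice Node is skipped and pos stays at 0,
    # so final is read starting from the very first byte.
    record["final"] = int(data[pos:pos + 2])
    return record


skipped = read_optional_choice("99")
taken = read_optional_choice("14207")
```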