XFA Specification
Chapter 23, FormCalc Specification
Grammar and Syntax
806
Lexical Grammar
This section describes the lexical grammar of the FormCalc language. It defines a set of productions,
starting from the nonterminal symbol Input (production 1), to describe how sequences of Unicode
characters are translated into a sequence of input elements.
The grammar has as its terminal symbols the characters of the Basic Multilingual Plane (BMP) of the
[Unicode-2.1] character set. This limitation preserves, for a little longer, the
"one character = one storage unit" paradigm that the original Unicode standard promised.
Input elements other than white space, line terminators, and comments form the terminal symbols for the
syntactic grammar of FormCalc and are called tokens. These tokens are the literals, identifiers,
keywords, separators, and operators of the FormCalc language.
1 Input ::= WhiteSpace | LineTerminator | Comment | Token
The source text for a FormCalc calculation is a sequence of characters using the Unicode character
encoding. These Unicode characters are scanned from left to right, repeatedly taking the longest possible
sequence of characters as the next input element.
2 Character ::= [#x9-#xD] | [#x20-#xD7FF] | [#xE000-#xFFFD]
Note: Not all FormCalc hosting environments recognize these characters; for example, XML does not allow
the vertical tab (#xB) and form feed (#xC) characters as input.
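As an illustration, production 2 can be read as a simple range check on code points. The sketch below is not part of the specification; the function names `is_formcalc_character` and `is_accepted_by_xml_host` are hypothetical, and the second function only encodes the restriction stated in the note above.

```python
def is_formcalc_character(ch: str) -> bool:
    """Return True if ch is one code point matched by production 2 (Character)."""
    if len(ch) != 1:
        return False
    cp = ord(ch)
    return (0x9 <= cp <= 0xD) or (0x20 <= cp <= 0xD7FF) or (0xE000 <= cp <= 0xFFFD)


def is_accepted_by_xml_host(ch: str) -> bool:
    """Hypothetical helper: a Character that an XML host also accepts.

    Per the note above, XML rejects vertical tab (#xB) and form feed (#xC).
    """
    return is_formcalc_character(ch) and ord(ch) not in (0xB, 0xC)
```

Code points above #xFFFD, including everything outside the BMP, fall outside all three ranges and are therefore rejected by this check.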
White Space
White space characters are used to separate tokens from each other and improve readability but are
otherwise insignificant.
3 WhiteSpace ::= #x9 | #xB | #xC | #x20
These are the horizontal tab (#x9), vertical tab (#xB), form feed (#xC), and space (#x20) characters.
Line Terminators
Line terminators, like white space, are used to separate tokens and improve readability but are
otherwise insignificant.
4 LineTerminator ::= #xA | #xD
These are the linefeed (#xA) and carriage return (#xD) characters.
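Productions 3 and 4 each name a small, fixed set of characters, so a scanner can classify a character by set membership. The following is a minimal sketch, not part of the specification; the names `WHITE_SPACE`, `LINE_TERMINATORS`, and `classify` are hypothetical.

```python
# Production 3 (WhiteSpace): #x9, #xB, #xC, #x20
WHITE_SPACE = frozenset("\t\x0b\x0c ")
# Production 4 (LineTerminator): #xA, #xD
LINE_TERMINATORS = frozenset("\n\r")


def classify(ch: str) -> str:
    """Name the production that a single character matches, if either applies."""
    if ch in WHITE_SPACE:
        return "WhiteSpace"
    if ch in LINE_TERMINATORS:
        return "LineTerminator"
    return "Other"
```

Note that the two sets are disjoint: linefeed and carriage return lie inside the #x9-#xD range of production 2 but are line terminators, not white space.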
Comments
Comments are used to improve readability but are otherwise insignificant.
A comment is introduced with a semicolon (;) character, or a pair of slash (/) characters, and continues
until a line terminator is encountered.
5 Comment ::= ';' ( Character - LineTerminator )*
            | '/' '/' ( Character - LineTerminator )*
Note: “Notational Conventions” on page 805 explains the significance of the * and ? symbols.
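Production 5 maps naturally onto a regular expression: an introducer (`;` or `//`) followed by any run of characters that are not line terminators. The sketch below is an illustration only, with two stated simplifications: it accepts any non-terminator character rather than checking each one against production 2, and it ignores string literals (defined elsewhere in the specification), so a `;` or `//` inside a literal would be misread as a comment. The names `COMMENT` and `find_comments` are hypothetical.

```python
import re

# ';' or '//' followed by zero or more characters other than #xA and #xD.
COMMENT = re.compile(r"(?:;|//)[^\n\r]*")


def find_comments(source: str) -> list[str]:
    """Return the text of every comment found in source, in order of appearance."""
    return COMMENT.findall(source)
```

A single slash does not start a comment, and the line terminator that ends a comment is not part of it.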