XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Authors: Michael Kay
Although the production rules in XPath define the operator precedence, they do not impose any type checking. This follows the practice of most modern language specifications, where rules for type checking are regarded as being enforced in a second phase of processing, after the raw parsing of the syntax. It would be hard to define all the type checking rules in the grammar, because many of them operate at a distance. Since the type-checking rules can't all be defined in the grammar, the language designers decided to go to the other extreme, and define none of them in the grammar.
This means that the grammar allows many kinds of expression that are completely nonsensical, such as 3|'bread' (where | is the set union operator). It's left to the type-checking rules to throw this out: the rules for the | operator say that its operands must be of type node()*, that is, sequences of nodes. Think of an analogy with English: there are sentences that are perfectly correct grammatically but still nonsense, such as “An easy apple only trumpets yesterday.”
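The split between parsing and type checking can be sketched in a few lines of Python. This is only an illustration, not how any real XPath processor is built: the mini-grammar, the class names (Literal, Path, Union), and the functions parse and type_check are all hypothetical, and paths are assumed to start with / or //.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-grammar (NOT the real XPath grammar):
#   UnionExpr ::= Primary ("|" Primary)*
#   Primary   ::= integer | 'string' | path starting with / or //
TOKEN = re.compile(r"\s*(\d+|'[^']*'|//?\w+(?:/\w+)*|\|)")

@dataclass
class Literal:
    value: object   # an int or a str: an atomic value, not a node

@dataclass
class Path:
    text: str       # a path expression, which selects a sequence of nodes

@dataclass
class Union:
    left: object
    right: object

def parse(source):
    """Phase 1: check syntax only and build a tree. No types are examined."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()

    def primary(tok):
        if tok == "|":
            raise SyntaxError("operand expected")
        if tok.isdigit():
            return Literal(int(tok))
        if tok.startswith("'"):
            return Literal(tok[1:-1])
        return Path(tok)

    if not tokens:
        raise SyntaxError("empty expression")
    tree = primary(tokens[0])
    i = 1
    while i < len(tokens):
        if tokens[i] != "|" or i + 1 >= len(tokens):
            raise SyntaxError("expected '|' followed by an operand")
        tree = Union(tree, primary(tokens[i + 1]))
        i += 2
    return tree

def type_check(node):
    """Phase 2: the operands of | must be sequences of nodes (node()*)."""
    if isinstance(node, Union):
        for operand in (node.left, node.right):
            type_check(operand)
            if isinstance(operand, Literal):
                raise TypeError(
                    f"operand of | must be node()*, got {operand.value!r}")
    return node

if __name__ == "__main__":
    tree = parse("3|'bread'")       # accepted by the grammar
    try:
        type_check(tree)            # rejected by the type rules
    except TypeError as err:
        print("type error:", err)
```

Here parse accepts 3|'bread' without complaint, because the grammar says nothing about operand types; only type_check rejects it. An expression such as //a | //b passes both phases.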
Where to Start
Some people prefer to present the syntax of a language bottom-up, starting with the simplest constructs such as numbers and names, while others prefer to start at the top, with a construct like Program or Expression.
What I've chosen to do is to start at the top, with the section Expressions, which is really just an opportunity to provide an overview of the grammar, and then work bottom-up, starting with the basic building blocks of the language in this chapter and progressing through the other operators in the next four chapters. Each of these chapters describes a reasonably self-contained set of expressions that you can write in XPath. There's no obviously logical order to these, but I decided to present the simpler operators and expressions first, to make life as easy as possible if you decide to read the chapters sequentially. This also corresponds broadly with the order in which material is presented in the XPath specification itself.
If you want to find where in the book a particular construct is described, you might find the syntax summary in Appendix A helpful.
Many languages distinguish the lexical rules, which define the format of basic tokens such as names, numbers, and operators, from the syntactic rules, which define how these tokens are combined to form expressions and other higher-level constructs. The XPath specification includes both syntactic and lexical production rules, but they are not quite as cleanly separated as in some languages. I will try to distinguish carefully between syntax rules and lexical rules as we come across them in the grammar. The main difference is that whitespace and comments can be used freely between the symbols of a syntax rule, but not within a lexical token.
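To make the whitespace rule concrete, here is a minimal tokenizer sketch in Python. The token set is a hypothetical, heavily simplified stand-in for the real XPath lexical rules (for example, it handles only non-nested comments and a handful of operators), and the name tokenize is invented for this illustration.

```python
import re

# Hypothetical, heavily simplified token set; the real XPath lexical
# rules are richer than this.
TOKEN = re.compile(r"""
    (?P<comment>\(:.*?:\))          # XPath comment, non-nested for simplicity
  | (?P<space>\s+)                  # whitespace between tokens
  | (?P<number>\d+(?:\.\d+)?)       # 10, 3.5
  | (?P<name>[A-Za-z_][\w.-]*)      # names may contain hyphens and dots
  | (?P<op><=|>=|!=|[-+*<>=|,()])   # a few operators and delimiters
""", re.VERBOSE | re.DOTALL)

def tokenize(source):
    """Return the lexical tokens, discarding whitespace and comments."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup not in ("comment", "space"):
            tokens.append(m.group())
        pos = m.end()
    return tokens

if __name__ == "__main__":
    # Whitespace and comments between tokens make no difference ...
    print(tokenize("price*10"))
    print(tokenize("price (: unit price :) * 10"))
    # ... but whitespace inside a token changes the token sequence.
    print(tokenize("a >= b"))
    print(tokenize("a > = b"))
```

The first two calls yield the same three tokens, while a >= b and a > = b do not: splitting >= with a space turns one operator token into two. The same effect applies to names, which is why a-b is a single name in this sketch but a - b is a subtraction.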
Expressions
The top-level construct in XPath (the entry point to the list of productions) is called Expr. This is described with the following syntax:
Expression | Syntax |
Expr | ExprSingle ("," ExprSingle)* |
ExprSingle | ForExpr |
| | QuantifiedExpr |
| | IfExpr |
| | OrExpr |
These rules indicate that an Expr is a list of ExprSingle expressions separated by commas, and an ExprSingle is either a ForExpr, a QuantifiedExpr, an IfExpr, or an OrExpr.
Here are some examples of the constructs mentioned in these rules:
Construct | Example |
Expr | 1 to 3, 5, 7, 11, 13 |
ExprSingle | any of the examples below |
ForExpr | for $i in 1 to 10 return $i * $i |
QuantifiedExpr | some $i in //item satisfies exists($i/*) |
IfExpr | if (exists(@price)) then @price else 0 |
OrExpr | @price > 3 or @cost < 2 |
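As a rough illustration of how these productions divide up an expression, the following Python sketch splits an Expr at its top-level commas and then guesses which kind of ExprSingle each piece is. It is a heuristic, not a parser: the function names split_expr and classify are hypothetical, the classification looks only at the leading keyword and the symbol after it (for or some/every followed by $, if followed by an opening parenthesis), and the splitter does not recognize the comma that separates several range variables within a single for, some, or every clause.

```python
import re

def split_expr(expr):
    """Split an Expr into ExprSingle strings at top-level commas.

    Simplification: commas inside parentheses, square brackets, or string
    literals are skipped, but a comma separating several range variables
    in one for/some/every clause is NOT recognized.
    """
    parts, depth, quote, start = [], 0, None, 0
    for i, ch in enumerate(expr):
        if quote:                      # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(expr[start:i].strip())
            start = i + 1
    parts.append(expr[start:].strip())
    return parts

def classify(expr_single):
    """Guess the production from the leading keyword and the next symbol."""
    s = expr_single.lstrip()
    if re.match(r"for\s*\$", s):
        return "ForExpr"
    if re.match(r"(some|every)\s*\$", s):
        return "QuantifiedExpr"
    if re.match(r"if\s*\(", s):
        return "IfExpr"
    return "OrExpr"                    # everything else is reached via OrExpr

if __name__ == "__main__":
    for piece in split_expr("1 to 3, 5, 7, 11, 13"):
        print(piece, "->", classify(piece))
    print(classify("if (exists(@price)) then @price else 0"))
```

Applied to the examples in the table, the Expr example splits into five ExprSingle pieces, each of which falls through to OrExpr, while the other three examples are recognized by their leading keywords.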