Read XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition Online
Authors: Michael Kay
treat as |
As with other programming languages, the syntax is defined in a set of production rules. Each rule defines the structure of a particular construct as a set of choices, sequences, or repetitions.
I took the formal production rules directly from the XPath specification document (http://www.w3.org/TR/xpath20), but reordered them for ease of explanation, and I made minor changes to the typography and to some of the production names for ease of reading. I also pulled in those rules from the XML and XML Namespaces standards that the XPath syntax references. I've tried to do this in a way that leaves the original rule clearly recognizable, so you can relate it to the original specification if you need to. However, I have tried in this book to include all the information you need from the XPath specification, so this should only be necessary if you need to see the precise wording of the standard.
Notation
The XPath specification, by and large, uses the same syntax notation as the rest of the family of XML specifications. This is often referred to as extended BNF, though the number of variations you find on the BNF theme can be a little bewildering. I have stuck fairly closely to the notation used in the XPath 2.0 specification, though I have allowed myself a little typographic license in the hope that this adds clarity.
As in the rest of the book, I used French quotation marks «thus» (also known as chevrons or guillemets) to surround pieces of XPath text that you write. I chose this convention partly because these marks stand out more clearly, but more importantly to distinguish these quotation marks unambiguously from quotation marks that are actually part of the expression. So if I say, for example, that literals can be enclosed either in «"» or «'» marks, then it's clear that you don't actually write the chevrons. XPath syntax doesn't use chevrons with any special meaning (though like any other Unicode character, you can use them in string literals and comments), so you can be sure that any chevron you see is not to be included in the expression.
The notations used in production rules are as follows:
Construct | Meaning |
«abc» | The literal characters abc |
xyz | A construct that matches the production rule named xyz |
P|Q | A choice of P or Q |
P ? | Either P, or nothing |
P * | Zero or more repetitions of P |
P + | One or more repetitions of P |
[i-n] | One of the characters in the range i to n inclusive |
( P ) | A subexpression |
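To make the notation concrete, here is a minimal sketch in Python that transcribes a few production rules into regular expressions, since each construct in the table has a direct regular-expression counterpart. Digits and DecimalLiteral are productions from the XPath 2.0 grammar; SignedDigits is a hypothetical rule invented here only to illustrate the ? construct, and the helper name matches is likewise an assumption of this sketch. This only works for rules that are not recursive, so treat it as an aid to reading the notation rather than as a way to parse XPath.

```python
import re

# Digits ::= [0-9]+
# A character range followed by "+" (one or more repetitions).
DIGITS = r"[0-9]+"

# DecimalLiteral ::= («.» Digits) | (Digits «.» [0-9]*)
# Two parenthesized subexpressions joined by "|" (a choice); the second
# ends in "*" (zero or more repetitions).
DECIMAL_LITERAL = rf"(?:\.{DIGITS})|(?:{DIGITS}\.[0-9]*)"

# SignedDigits ::= («-» | «+»)? Digits
# HYPOTHETICAL rule, not part of the XPath grammar: it is here only to
# show "?" (either the construct, or nothing).
SIGNED_DIGITS = rf"(?:-|\+)?{DIGITS}"


def matches(production: str, text: str) -> bool:
    """Return True if the whole of text matches the given production."""
    return re.fullmatch(production, text) is not None


if __name__ == "__main__":
    for sample in [".5", "12.", "12.5", "12", "."]:
        print(repr(sample), matches(DECIMAL_LITERAL, sample))
```

Under these definitions, `.5`, `12.` and `12.5` all satisfy DecimalLiteral, while `12` (no decimal point) and a lone `.` (no digits on either side) do not.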