XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (66 page)


Why is all this relevant? As we've seen, the <xsl:strip-space> element allows you to control what happens to whitespace nodes (those shown in the immediately preceding example), but it doesn't let you do anything special with whitespace characters that appear in ordinary text nodes (those shown as ordinary spaces).

Most of the whitespace nodes in this example are immediate children of a single element, so they could be stripped by naming that element in the elements attribute of <xsl:strip-space>.

This would leave the remaining whitespace node intact, because its parent is a different element. Whitespace nodes are retained on the source tree unless you ask for them to be stripped, either by using <xsl:strip-space>, or by using some option provided by the XML parser or schema processor during the building of the tree.
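As a minimal sketch of element-level stripping, the stylesheet below assumes a hypothetical input document, shown in the leading comment. The names order, customer, item, note, and ref are assumptions chosen for illustration and are not the names used in the earlier example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Hypothetical input (element names are assumptions):

  <order>
    <customer>Ann</customer>
    <item>Widget</item>
    <note>Deliver before noon <ref>N7</ref>
    </note>
  </order>
-->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Removes whitespace-only text nodes whose parent is <order>.
       The whitespace-only node after </ref> has <note> as its parent,
       so it stays on the tree. -->
  <xsl:strip-space elements="order"/>

  <!-- Identity copy, so the effect is visible in the output. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The text "Deliver before noon " is untouched in either case, because it is not a whitespace-only node; only its surrounding whitespace-only siblings are candidates for stripping.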

The Effect of Stripping Whitespace Nodes

There are two main effects of stripping whitespace nodes, as was done in the earlier example:

  • When you use <xsl:apply-templates/> to process all the children of the stripped element, the whitespace nodes aren't there, so they don't get selected, which means they don't get copied to the result tree. If they had been left on the source tree, then by default they would be copied to the result tree.
  • When you use <xsl:number> or the position() or count() functions to count nodes, the whitespace nodes aren't there, so they aren't counted. If you had left the whitespace nodes on the tree, then the three child elements would be nodes 2, 4, and 6 instead of 1, 2, and 3.
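Both effects can be observed with a small stylesheet. This is a sketch only: it assumes a hypothetical <order> element whose three element children (customer, item, note) each sit on their own line, and it assumes the XML parser has not already discarded the whitespace.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Delete this declaration to see the numbering change. -->
  <xsl:strip-space elements="order"/>

  <xsl:template match="order">
    <!-- Selects all child nodes, including any whitespace text nodes
         that are still on the tree. -->
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="order/*">
    <!-- position() is relative to the node list selected above. -->
    <xsl:value-of select="name(), position()"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the declaration in place, the children should be reported at positions 1, 2, and 3. Without it, the whitespace nodes are selected too, so the same elements should appear at positions 2, 4, and 6, and the whitespace itself is copied to the output by the built-in template rule for text nodes.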

There are cases where it's important to keep the whitespace nodes. Consider the following.


Edited by James Clark◊[email protected]


The diamond represents a space character that needs to be preserved, but because it is not adjacent to any other text, it would be eligible for stripping. In fact, whitespace is nearly always significant in elements that have mixed content (that is, elements that have both element and text nodes as children). Figure 18-3 on page 1005 shows a live example of what goes wrong when such spaces are stripped.

If you want to strip all the whitespace nodes from the source tree, you can write:

<xsl:strip-space elements="*"/>

If you want to strip all the whitespace nodes except those within certain named elements, you can write:
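A minimal sketch of that combination follows. The element names para and heading are assumptions standing in for whichever mixed-content elements your own documents use.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default rule: strip whitespace-only text nodes everywhere. -->
  <xsl:strip-space elements="*"/>

  <!-- Exception: keep them when the parent is one of these
       (hypothetical) mixed-content elements. -->
  <xsl:preserve-space elements="para heading"/>

</xsl:stylesheet>
```

The order of the two declarations does not matter here. A specific element name takes priority over the wildcard, in the same way that a more specific match pattern wins over a less specific one.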



If any elements in the document (either the source document or the stylesheet) use the XML-defined attribute xml:space="preserve", this takes precedence over these rules: whitespace nodes in that element, and in all its descendants, will be kept on the tree unless the attribute is canceled on a descendant element by specifying xml:space="default". This allows you to control on a per-instance basis whether whitespace is kept, whereas <xsl:strip-space> and <xsl:preserve-space> control it at the element-type level.
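The per-instance behavior can be sketched with a hypothetical source fragment. All element names and values below are assumptions, and the description assumes a stylesheet that specifies <xsl:strip-space elements="*"/>.

```xml
<!-- Hypothetical source fragment; names and values are illustrative. -->
<entry>
  <address xml:space="preserve">
    <line>1 High Street</line>
    <line>Oxford</line>
    <geo xml:space="default">
      <lat>51.75</lat>
      <long>-1.25</long>
    </geo>
  </address>
</entry>
```

Here the whitespace-only nodes that are children of <address> are kept, because the nearest xml:space setting is "preserve". Those that are children of <geo> fall back to the stylesheet's rules and are stripped, as are those that are children of <entry>.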

Whitespace Nodes in the Stylesheet

For the stylesheet itself, whitespace nodes are all stripped, with two exceptions, namely whitespace within an <xsl:text> element, and whitespace controlled by the attribute xml:space="preserve". If you explicitly want to copy a whitespace text node from the stylesheet to the result tree, write it within an <xsl:text> element, like this:

<xsl:text>&#xA;</xsl:text>

The only reason for using &#xA; here rather than an actual newline is that it's more clearly visible to the reader; it's also less likely to be accidentally turned into a newline followed by tabs or spaces. Writing the whitespace as a character reference doesn't stop it being treated as whitespace by XSLT, because the character references will have been expanded by the XML parser before the XSLT processor gets to see them.

Another way of coding the previous fragment in XSLT 2.0 is to write:

<xsl:value-of select="…" separator="&#xA;"/>
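The two forms can be compared side by side in one template. This is a sketch under stated assumptions: the <address> element and its <line> children are hypothetical names, and the select expressions are illustrative rather than those of the original fragment.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumes a hypothetical <address> element with <line> children. -->
  <xsl:template match="address">

    <!-- Form 1: an explicit whitespace text node between instructions. -->
    <xsl:value-of select="line[1]"/>
    <xsl:text>&#xA;</xsl:text>
    <xsl:value-of select="line[2]"/>
    <xsl:text>&#xA;</xsl:text>

    <!-- Form 2 (XSLT 2.0 only): the same two lines, joined by the
         separator attribute. -->
    <xsl:value-of select="line[position() le 2]" separator="&#xA;"/>

  </xsl:template>

</xsl:stylesheet>
```

In the second form the character reference matters for a different reason: a literal newline typed inside an attribute value is normalized to a space by the XML parser, whereas a newline written as &#xA; survives as a newline.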

You can also cause whitespace text nodes in the stylesheet to be retained by using the option xml:space="preserve". Although this is defined in the XML specification, its defined effect is to advise the application that whitespace is significant, and XSLT (which is the application in this case) will respect this. In XSLT 1.0, this sometimes caused problems because certain elements, such as <xsl:choose> and <xsl:apply-templates>, do not allow text nodes as children, even whitespace-only text nodes. Many processors, however, were forgiving on this. XSLT 2.0 has clarified that in situations where text nodes are not allowed, a whitespace-only text node is now stripped, despite the xml:space attribute. (However, an element that must always be empty must be completely empty; whitespace-only text nodes are not allowed within these elements.)
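The following sketch illustrates the XSLT 2.0 behavior. The <item> element and its urgent attribute are hypothetical names used only for illustration.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- xml:space="preserve" applies to this template and its descendants. -->
  <xsl:template match="item" xml:space="preserve">
    <xsl:choose>
      <xsl:when test="@urgent = 'yes'">urgent</xsl:when>
      <xsl:otherwise>normal</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The whitespace-only nodes directly inside <xsl:template>, before and after <xsl:choose>, are retained and written to the output. Those between <xsl:choose> and its <xsl:when> and <xsl:otherwise> children are stripped in XSLT 2.0, because <xsl:choose> does not allow text node children. An element that is required to be empty, of which <xsl:output> is one instance, gets no such help: placing it inside the scope of xml:space="preserve" with whitespace between its start and end tags is an error.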
