XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

To take an example of how this is a problem in practice, suppose that the stylesheet defined a variable named dummy whose sequence constructor contains an <xsl:result-document> instruction.

Now suppose that this variable is never referenced, or is referenced only as $dummy[1]. Is the result document produced, or not? Normally, an XSLT optimizer will avoid evaluating variables or parts of variables that aren't used, but this strategy causes problems if the evaluation of a variable has a side effect.
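As a rough sketch of the kind of declaration in question: only the variable name dummy and the reference $dummy[1] come from the text, while the element names a and b and the file name out.xml are invented for illustration. Under the XSLT 2.0 rules a conformant processor is expected to reject this at runtime if it evaluates the instruction (error XTDE1480, as far as I can tell from the specification), which is exactly the outcome the restriction is designed to force.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- A variable whose evaluation would have a side effect:
       it tries to write a result document (out.xml is hypothetical). -->
  <xsl:variable name="dummy">
    <a/>
    <xsl:result-document href="out.xml">
      <b/>
    </xsl:result-document>
  </xsl:variable>

  <xsl:template match="/">
    <!-- The variable is referenced only in the way the text mentions. -->
    <xsl:copy-of select="$dummy[1]"/>
  </xsl:template>

</xsl:stylesheet>
```

If writing out.xml were permitted here, whether the file appeared would depend on how lazily the processor evaluates the variable.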

The way that the XSLT specification has dealt with this problem is essentially to say that you can only use <xsl:result-document> when the sequence constructor you are evaluating is destined to form the content of a final result tree. When the stylesheet starts executing, this condition is true for the sequence constructor contained in the first template to be evaluated, and it remains true except when you evaluate <xsl:variable> or similar elements such as <xsl:param>, <xsl:with-param>, and <xsl:message>. You also can't use <xsl:result-document> while evaluating the result of a stylesheet function defined using <xsl:function>, or while computing the content of <xsl:attribute>, <xsl:comment>, <xsl:namespace>, <xsl:processing-instruction>, <xsl:value-of>, <xsl:key>, or <xsl:sort>. This restriction is a runtime rule rather than a compile-time rule; for example, you can use <xsl:result-document> within <xsl:template> if the template is called from within <xsl:for-each>, but not if it's called from within <xsl:variable>.
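The runtime nature of the rule can be sketched as follows. The template name write-report, its href parameter, and the file names are all hypothetical. The same named template is called twice: the first call is made while the final result tree is being written and should succeed, whereas the second is made while an <xsl:variable> is being evaluated and should fail with a dynamic error. The stylesheet is therefore expected to compile but not to run to completion.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Hypothetical named template that writes a secondary result document. -->
  <xsl:template name="write-report">
    <xsl:param name="href"/>
    <xsl:result-document href="{$href}">
      <report/>
    </xsl:result-document>
  </xsl:template>

  <xsl:template match="/">
    <index>
      <!-- Allowed: the call is made while constructing the final result tree. -->
      <xsl:for-each select="*">
        <xsl:call-template name="write-report">
          <xsl:with-param name="href" select="'report.xml'"/>
        </xsl:call-template>
      </xsl:for-each>

      <!-- Not allowed: the same template is now called while the content
           of a variable (a temporary tree) is being constructed. -->
      <xsl:variable name="temp">
        <xsl:call-template name="write-report">
          <xsl:with-param name="href" select="'report2.xml'"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:copy-of select="$temp"/>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

Nothing in the write-report template itself is wrong, which is why the error cannot in general be detected at compile time.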

XSLT processors are allowed to evaluate instructions in any order. This means that you can't reliably predict the order in which final result trees get written. There is a rule preventing a stylesheet from writing two different result trees with the same URI, because if overwriting were allowed, the results would be nondeterministic. There is also a rule saying that it's an error to attempt to write a result tree and then read it back again using the document() function: this would be a sneaky way of exploiting side effects and making your stylesheet dependent on the order of execution. In practice, processors may have difficulty detecting this error, and you might get away with it, especially if you use different spellings of the URI, for example by writing to file:///c:/temp.xml and then reading from FILE:///c:/temp.xml.

The fact that order of execution is unpredictable has another consequence: if a transformation doesn't run to completion, because a runtime error occurred (or perhaps because <xsl:message> was used with terminate="yes"), then it's unpredictable whether a particular final result tree was output before the termination. Most processors exploit the freedom to change the order of execution only when evaluating variables or functions, so you are unlikely to run into this problem in practice.

Usage

There are two main reasons for using <xsl:result-document>. One reason is to generate multiple output files; the other is to exercise control over the validation or serialization of the principal output file.
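The second use can be sketched as follows, assuming that leaving out the href attribute sends the output to the principal output destination, which is how I read the XSLT 2.0 specification. The HTML content is made up for illustration; method, indent, and encoding are serialization attributes that <xsl:result-document> accepts directly.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:template match="/">
    <!-- No href attribute: the content goes to the principal output file,
         serialized as indented HTML in UTF-8. -->
    <xsl:result-document method="html" indent="yes" encoding="UTF-8">
      <html>
        <head>
          <title>Summary</title>
        </head>
        <body>
          <p>
            <xsl:value-of select="count(//*)"/>
            <xsl:text> elements in the input</xsl:text>
          </p>
        </body>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

The same attributes could instead be placed on an <xsl:output> declaration; putting them on the instruction is mainly useful when they need to be computed at runtime, since they are attribute value templates.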

Generating multiple output files is very common in publishing applications. The product documentation for Saxon, for example, consists of around 450 HTML files, which are generated by a single transformation from 20 input XML files. Sometimes it's better to do the transformation in two stages: First, split a large XML document into several small XML documents, and then convert each of these into HTML independently.

One common approach is to generate one principal output file and a whole family of secondary output files. The principal output file can then serve as an index. Usually a key part of the process will be the generation of hyperlinks that allow the user to navigate within the document family. This means you will need some mechanism for generating the filenames of the output files. Exactly how you do this depends on what's available in your input: one approach is to use the generate-id() function, which allocates a unique identifier to every node in your input documents.
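A minimal sketch of this approach, assuming a hypothetical input document with a book element containing chapter elements, each of which has a title and some p children. Because generate-id() returns the same value for the same node throughout one transformation, the hyperlink in the index and the filename of the secondary output file agree.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Principal output: an index page linking to one file per chapter. -->
  <xsl:template match="/book">
    <html>
      <body>
        <ul>
          <xsl:for-each select="chapter">
            <li>
              <a href="{generate-id()}.html">
                <xsl:value-of select="title"/>
              </a>
            </li>
          </xsl:for-each>
        </ul>
        <!-- Writes the secondary files; adds nothing to the index itself. -->
        <xsl:apply-templates select="chapter"/>
      </body>
    </html>
  </xsl:template>

  <!-- Secondary output: one HTML file per chapter, named after the node's id. -->
  <xsl:template match="chapter">
    <xsl:result-document href="{generate-id()}.html" method="html">
      <html>
        <body>
          <h1><xsl:value-of select="title"/></h1>
          <xsl:copy-of select="p"/>
        </body>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

One design caveat: the identifiers are only guaranteed to be consistent within a single run, so filenames produced this way may change from one transformation to the next. Where stable names matter, something derived from the input, such as a chapter number, is a safer choice.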

Examples

The <xsl:result-document> instruction is often used to break up large documents into manageable chunks. On page 433 there is an example of a stylesheet that breaks up one of Shakespeare's plays to produce a cover page together with one page per scene. But here we'll illustrate the principle with a much smaller document.

Example: Creating Multiple Output Files

This example takes a poem as input, and outputs each stanza to a separate file. A more realistic example would be to split a book into its chapters, but I wanted to keep the files small.

Source

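The source file is poem.xml. It starts:

<poem>
<author>Rupert Brooke</author>
<date>1912</date>
<title>Song</title>
<stanza>
<line>And suddenly the wind comes soft,</line>
<line>And Spring is here again;</line>
<line>And the hawthorn quickens with buds of green</line>
<line>And my heart with buds of pain.</line>
</stanza>
<stanza>
<line>My heart all Winter lay so numb,</line>
<line>The earth so dead and frore,</line>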

Stylesheet

The stylesheet is split.xsl.

We want to start a new output document for each stanza, so we use the <xsl:result-document> instruction in the template rule for the <stanza> element. Its effect is to switch all output produced by its sequence constructor to a different output file. In fact, it's very similar to the effect of an <xsl:variable> element that creates a tree, except that the tree, instead of being a temporary tree, is serialized directly to an output file of its own:


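<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:template match="poem">
   <poem>
      <xsl:copy-of select="title, author, date"/>
      <xsl:apply-templates select="stanza"/>
   </poem>
</xsl:template>

<xsl:template match="stanza">
   <xsl:variable name="file"
                  select="concat('verse', position(), '.xml')"/>
   <verse number="{position()}" href="{$file}"/>
   <xsl:result-document href="{$file}">
        <xsl:copy-of select="."/>
   </xsl:result-document>
</xsl:template>

</xsl:stylesheet>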

To run this example under Saxon, you need to make sure that an output file is supplied for the principal output document. This determines the base output URI, and the other output documents will be written to locations that are relative to this base URI. For example:

java -jar c:\saxon\saxon9.jar -t -o:c:\temp\index.xml -s:poem.xml -xsl:split.xsl

This will write the index document to c:\temp\index.xml, and the verses to files such as c:\temp\verse2.xml. The -t option is useful because it tells you exactly where the files have been written.

Output

The principal output file contains the skeletal poem below (indented for legibility):



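<?xml version="1.0" encoding="UTF-8"?>
<poem>
  <title>Song</title>
  <author>Rupert Brooke</author>
  <date>1912</date>
  <verse number="1" href="verse1.xml"/>
  <verse number="2" href="verse2.xml"/>
  <verse number="3" href="verse3.xml"/>
</poem>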


Three further output files, verse1.xml, verse2.xml, and verse3.xml, are created in the same directory as the principal output file. Here is verse1.xml:



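<?xml version="1.0" encoding="UTF-8"?>
<stanza>
  <line>And suddenly the wind comes soft,</line>
  <line>And Spring is here again;</line>
  <line>And the hawthorn quickens with buds of green</line>
  <line>And my heart with buds of pain.</line>
</stanza>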

