XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition


The output looks like this, adding newlines for clarity.



Take your bow, O Hiawatha,

Take your arrows, jasper-headed,

Take your war-club, Puggawaugun,

And your mittens, Minjekahwan,

And your birch-canoe for sailing,

And the oil of Mishe-Nama.


In this example, the function call does have side effects, because the reader variable is an external Java object that holds information about the current position in the file being read, and advances this position each time a line is read from the file. In general, function calls with side effects are dangerous, because XSLT does not define the order in which statements are executed. But in this case, the logic of the stylesheet is such that an XSLT processor would have to be very devious indeed to execute the statements in any order other than the obvious one. The fact that the recursive call on the read-lines template is within an <xsl:if> instruction that tests the $line variable means that the processor is forced to read a line, test the result, and then, if necessary, make the recursive call to read further lines.

The next example uses side effects in a much less controlled way, and in this case causes results that will vary from one XSLT processor to another.

Functions with Uncontrolled Side Effects

Just to illustrate the dangers of using functions with side effects, we'll include an example where the effects are not predictable.

Example: A Function with Uncontrolled Side Effects

This example shows how a processor can call extension functions in an unpredictable order, causing incorrect results if the functions have side effects. This can apply even when the extension function is apparently read-only.

Source

Like the previous example, this stylesheet doesn't use a source document.

In this example we'll read an input file containing names and addresses, for example addresses.txt. We'll assume this file is created by a legacy application and consists of groups of five lines. Each group contains a customer number on the first line, the customer's name on the second, an address on lines three and four, and a telephone number on line five. Because that's the way legacy data files often work, we'll assume that the last line of the file contains the string ****.

15668
Mary Cousens
15 Birch Drive
Wigan
01367-844355
17796
John Templeton
17 Spring Gardens
Wolverhampton
01666-932865
19433
Jane Arbuthnot
92 Mountain Avenue
Swansea
01775-952266
****

Stylesheet

We might be tempted to write the stylesheet as follows (addresses.xsl), modifying the previous example:

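<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:FileReader="java:java.io.FileReader"
    xmlns:BufferedReader="java:java.io.BufferedReader"
    exclude-result-prefixes="FileReader BufferedReader">

<xsl:output indent="yes"/>
<xsl:param name="filename"/>

<xsl:template name="main">
    <xsl:variable name="reader"
               select="BufferedReader:new(FileReader:new($filename))"/>
    <xsl:call-template name="read-addresses">
        <xsl:with-param name="reader" select="$reader"/>
    </xsl:call-template>
</xsl:template>

<xsl:template name="read-addresses">
   <xsl:param name="reader"/>
   <xsl:variable name="line1"
                  select="BufferedReader:readLine($reader)"/>
   <xsl:if test="$line1 != '****'">
     <xsl:variable name="line2"
                    select="BufferedReader:readLine($reader)"/>
     <xsl:variable name="line3"
                    select="BufferedReader:readLine($reader)"/>
     <xsl:variable name="line4"
                    select="BufferedReader:readLine($reader)"/>
     <xsl:variable name="line5"
                    select="BufferedReader:readLine($reader)"/>
     <label>
       <address>
         <xsl:value-of select="$line3"/><xsl:text>&#xa;</xsl:text>
         <xsl:value-of select="$line4"/><xsl:text>&#xa;</xsl:text>
       </address>
       <recipient>Attn: <xsl:value-of select="$line2"/></recipient>
     </label>
     <xsl:call-template name="read-addresses">
         <xsl:with-param name="reader" select="$reader"/>
     </xsl:call-template>
   </xsl:if>
</xsl:template>

</xsl:stylesheet>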

What's the difference? This time we are making an assumption that the four variables $line2, $line3, $line4, and $line5 will be evaluated in the order we've written them. There is no guarantee of this. The processor is quite at liberty not to evaluate a variable until it is used, and if this happens then $line3 will be evaluated before $line2, and worse still, $line5 (because it is never used) might not be evaluated at all, meaning that instead of reading a group of five lines from the file, the template will only read four lines each time it is invoked.

Output

The result, in the case of Saxon, is a disaster.


   

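<label>
   <address>15 Birch Drive
Wigan
</address>
   <recipient>Attn: Mary Cousens</recipient>
</label>
<label>
   <address>John Templeton
17 Spring Gardens
</address>
   <recipient>Attn: 17796</recipient>
</label>
<label>
   <address>19433
Jane Arbuthnot
</address>
   <recipient>Attn: 01666-932865</recipient>
</label>
<label>
   <address>01775-952266
****
</address>
   <recipient>Attn: Swansea</recipient>
</label>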


Saxon doesn't evaluate a variable until you refer to it, and it doesn't evaluate the variable at all if you never refer to it. This becomes painfully visible in the output, which reveals that it's simply not safe for an XSLT stylesheet to make assumptions about the order of execution of different instructions.

This stylesheet might work on some XSLT processors, but it certainly won't work on all.

The correct way to tackle this stylesheet in XSLT 2.0 is to read the whole text using the unparsed-text() function, then to split it into lines using either <xsl:analyze-string> or the tokenize() function, and then to use grouping facilities to split it into groups of five lines each. There is no need for extension functions at all.
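As a minimal sketch of that approach, the stylesheet below reads the file with unparsed-text(), splits it with tokenize(), and groups the lines in fives with <xsl:for-each-group>. Several details are assumptions rather than part of the original example: the default file name, the wrapper element labels, the output element names label, address, and recipient, and the decision to discard empty lines and the **** terminator before grouping.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- Assumption: the file name is resolved relative to the stylesheet -->
  <xsl:param name="filename" select="'addresses.txt'"/>

  <xsl:template name="main">
    <!-- Read the whole file, split it on line endings, then drop
         empty lines and the **** terminator (an assumed clean-up step) -->
    <xsl:variable name="lines"
        select="tokenize(unparsed-text($filename), '\r?\n')
                [normalize-space(.) != ''][. != '****']"/>
    <!-- labels, label, address, recipient are illustrative names -->
    <labels>
      <!-- Lines 1-5 get key 0, lines 6-10 get key 1, and so on -->
      <xsl:for-each-group select="$lines"
          group-adjacent="(position() - 1) idiv 5">
        <label>
          <address>
            <xsl:value-of select="current-group()[3]"/>
            <xsl:text>&#xa;</xsl:text>
            <xsl:value-of select="current-group()[4]"/>
          </address>
          <recipient>
            <xsl:text>Attn: </xsl:text>
            <xsl:value-of select="current-group()[2]"/>
          </recipient>
        </label>
      </xsl:for-each-group>
    </labels>
  </xsl:template>

</xsl:stylesheet>
```

Because every line is obtained from a single pure function call and then selected by position within its group, the result no longer depends on the order in which the processor chooses to evaluate anything. The stylesheet is meant to be invoked with main as the initial template.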

This example raises the question of whether there is any way you can write a call to an extension function and be sure that the call will actually be executed, given that the function is one that returns no result. It's hard to give a categorical answer to this because there is no limit on the ingenuity of optimizers to avoid doing work that makes no contribution to the result tree. However, with Saxon today a function that returns no result is treated in the same way as one that returns null, which is interpreted in XPath as an empty sequence. So you can call a void method (written here as ext:voidMethod()) using:

<xsl:sequence select="ext:voidMethod()"/>

and provided the <xsl:sequence> instruction itself is evaluated, the method will always be called.

Keeping Extensions Portable

As soon as your stylesheet uses extension functions, or other permitted extensions such as extension instructions or extension attributes, keeping it portable across different XSLT processors becomes a challenge. Fortunately, the design of the XSLT language anticipated this problem, and offers some help.

There are a number of interrogative functions that you can use to find out about the environment that your stylesheet is running in. The most important are as follows:

  • The system-property() function, which allows you to determine the XSLT version supported and the name and version of the XSLT processor itself.
  • The function-available() function, which allows you to determine whether a particular extension function is available. This is particularly useful when you are using a third-party library such as EXSLT, where the same functions may be available under a number of different XSLT processors.
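The two functions can be tried out with a small report stylesheet such as the sketch below. The system property names used are those defined by XSLT 2.0. The output element names, and the choice of the EXSLT random:random-sequence() function as the one to probe for, are assumptions made for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:random="http://exslt.org/random"
    exclude-result-prefixes="random">

  <xsl:output indent="yes"/>

  <xsl:template name="main">
    <!-- Element names below are illustrative only -->
    <environment>
      <xslt-version>
        <xsl:value-of select="system-property('xsl:version')"/>
      </xslt-version>
      <vendor>
        <xsl:value-of select="system-property('xsl:vendor')"/>
      </vendor>
      <product-name>
        <xsl:value-of select="system-property('xsl:product-name')"/>
      </product-name>
      <product-version>
        <xsl:value-of select="system-property('xsl:product-version')"/>
      </product-version>
      <!-- true or false, depending on the processor in use -->
      <has-random-sequence>
        <xsl:value-of
            select="function-available('random:random-sequence')"/>
      </has-random-sequence>
    </environment>
  </xsl:template>

</xsl:stylesheet>
```

The values reported will differ from one processor, and one release, to the next, which is exactly why a stylesheet should test for a capability rather than assume it.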

Use these functions to test whether particular vendor extensions are available before calling them. The best way to do this is using the new use-when attribute described in Chapter 3, which allows a section of the stylesheet (perhaps a whole template, perhaps a single instruction) to be conditionally included or excluded from the stylesheet at compile time. For example, the following code sets a variable to the result of the random:random-sequence() function (defined in EXSLT) if it is available, or to the fractional seconds value from the current time if not.
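A minimal sketch of code along these lines is shown here. The variable name random-number, the surrounding stylesheet, and the random element are assumptions. So is the exact way random:random-sequence() is called: the sketch assumes it accepts a count as its first argument and returns items that can be converted to numbers.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:random="http://exslt.org/random"
    exclude-result-prefixes="random">

  <!-- Compiled only if the processor offers the EXSLT function.
       Assumption: the first argument is the number of values wanted. -->
  <xsl:variable name="random-number"
      select="number(random:random-sequence(1)[1])"
      use-when="function-available('random:random-sequence')"/>

  <!-- Fallback: the fractional part of the seconds in the current time -->
  <xsl:variable name="random-number"
      select="seconds-from-time(current-time()) mod 1"
      use-when="not(function-available('random:random-sequence'))"/>

  <xsl:template name="main">
    <!-- random is an illustrative element name -->
    <random>
      <xsl:value-of select="$random-number"/>
    </random>
  </xsl:template>

</xsl:stylesheet>
```

The two use-when conditions are mutually exclusive, so exactly one of the two declarations survives compilation and there is no clash between the identically named variables. The excluded declaration is never compiled, so a processor that lacks the extension function does not report an error for the call to it.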
