Read XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition Online
Authors: Michael Kay
The output looks like this, adding newlines for clarity.
Take your bow, O Hiawatha,
Take your arrows, jasper-headed,
Take your war-club, Puggawaugun,
And your mittens, Minjekahwan,
And your birch-canoe for sailing,
And the oil of Mishe-Nama.
In this example, the function call does have side effects, because the reader variable is an external Java object that holds information about the current position in the file being read, and advances this position each time a line is read from the file. In general, function calls with side effects are dangerous, because XSLT does not define the order in which statements are executed. But in this case, the logic of the stylesheet is such that an XSLT processor would have to be very devious indeed to execute the statements in any order other than the obvious one. Because the recursive call on the read-lines template sits inside a conditional that tests the $line variable, the processor is forced to read a line, test the result, and then, if necessary, make the recursive call to read further lines.
The next example uses side effects in a much less controlled way, and in this case causes results that will vary from one XSLT processor to another.
Functions with Uncontrolled Side Effects
Just to illustrate the dangers of using functions with side effects, we'll include an example where the effects are not predictable.
Example: A Function with Uncontrolled Side Effects
This example shows how a processor can call extension functions in an unpredictable order, causing incorrect results if the functions have side effects. This can apply even when the extension function is apparently read-only.
Source
Like the previous example, this stylesheet doesn't use a source document.
In this example we'll read an input file containing names and addresses; for example, addresses.txt. We'll assume this file is created by a legacy application and consists of groups of five lines. Each group contains a customer number on the first line, the customer's name on the second, an address on lines three and four, and a telephone number on line five. Because that's the way legacy data files often work, we'll assume that the last line of the file contains the string ****.
15668
Mary Cousens
15 Birch Drive
Wigan
01367-844355
17796
John Templeton
17 Spring Gardens
Wolverhampton
01666-932865
19433
Jane Arbuthnot
92 Mountain Avenue
Swansea
01775-952266
****
Stylesheet
We might be tempted to write the stylesheet as follows (addresses.xsl), modifying the previous example:
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:FileReader="java:java.io.FileReader"
xmlns:BufferedReader="java:java.io.BufferedReader"
exclude-result-prefixes="FileReader BufferedReader">
select="BufferedReader:new(FileReader:new($filename))"/>
select="BufferedReader:readLine($reader)"/>
select="BufferedReader:readLine($reader)"/>
select="BufferedReader:readLine($reader)"/>
select="BufferedReader:readLine($reader)"/>
select="BufferedReader:readLine($reader)"/>
What's the difference? This time we are making an assumption that the four variables $line2, $line3, $line4, and $line5 will be evaluated in the order we've written them. There is no guarantee of this. The processor is quite at liberty not to evaluate a variable until it is used, and if this happens then $line3 will be evaluated before $line2, and worse still, $line5 (because it is never used) might not be evaluated at all, meaning that instead of reading a group of five lines from the file, the template will only read four lines each time it is invoked.
Output
The result, in the case of Saxon, is a disaster.
15 Birch Drive
Wigan
John Templeton
17 Spring Gardens
19433
Jane Arbuthnot
01775-952266
****
Saxon doesn't evaluate a variable until you refer to it, and it doesn't evaluate the variable at all if you never refer to it. This becomes painfully visible in the output, which reveals that it's simply not safe for an XSLT stylesheet to make assumptions about the order of execution of different instructions.
This stylesheet might work on some XSLT processors, but it certainly won't work on all.
The correct way to tackle this stylesheet in XSLT 2.0 is to read the whole text using the unparsed-text() function, then to split it into lines using either the xsl:analyze-string instruction or the tokenize() function, and then to use grouping facilities to split it into groups of five lines each. There is no need for extension functions at all.
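The approach just described can be sketched roughly as follows. This is a minimal illustration and not the book's own listing: the filename parameter default, the main entry template, and the result element names (customers, customer, name, address, line, phone) are all assumptions made for the example. It takes the tokenize() route and groups adjacent lines by position.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumed parameter: a relative URI is resolved against the stylesheet's base URI -->
  <xsl:param name="filename" as="xs:string" select="'addresses.txt'"/>

  <xsl:template name="main">
    <!-- Read the whole file as one string, then split it at line endings -->
    <xsl:variable name="lines" as="xs:string*"
        select="tokenize(unparsed-text($filename), '\r?\n')"/>

    <!-- Keep only the lines before the first '****' terminator, if there is one -->
    <xsl:variable name="end" as="xs:integer?"
        select="index-of($lines, '****')[1]"/>
    <xsl:variable name="data" as="xs:string*"
        select="if (exists($end)) then subsequence($lines, 1, $end - 1) else $lines"/>

    <!-- Hypothetical result vocabulary: one customer element per group of five lines -->
    <customers>
      <xsl:for-each-group select="$data"
          group-adjacent="(position() - 1) idiv 5">
        <customer number="{current-group()[1]}">
          <name><xsl:value-of select="current-group()[2]"/></name>
          <address>
            <line><xsl:value-of select="current-group()[3]"/></line>
            <line><xsl:value-of select="current-group()[4]"/></line>
          </address>
          <phone><xsl:value-of select="current-group()[5]"/></phone>
        </customer>
      </xsl:for-each-group>
    </customers>
  </xsl:template>

</xsl:stylesheet>
```

Because every value is picked out of the group by its position in the sequence of lines, the result no longer depends on the order in which the processor chooses to evaluate anything. With Saxon, a stylesheet like this would be started at the named template rather than with a source document.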
This example raises the question of whether there is any way you can write a call to an extension function and be sure that the call will actually be executed, given that the function is one that returns no result. It's hard to give a categorical answer to this because there is no limit on the ingenuity of optimizers to avoid doing work that makes no contribution to the result tree. However, with Saxon today a function that returns no result is treated in the same way as one that returns null, which is interpreted in XPath as an empty sequence. So you can call a void method using:
and provided the
Keeping Extensions Portable
As soon as your stylesheet uses extension functions, or other permitted extensions such as extension instructions or extension attributes, keeping it portable across different XSLT processors becomes a challenge. Fortunately, the design of the XSLT language anticipated this problem, and offers some help.
There are a number of interrogative functions that you can use to find out about the environment that your stylesheet is running in. The most important are as follows:
Use these functions to test whether particular vendor extensions are available before calling them. The best way to do this is using the new
use-when
attribute described in Chapter 3, which allows a section of the stylesheet (perhaps a whole template, perhaps a single
random:random-sequence()
function (defined in EXSLT) if it is available, or to the fractional seconds value from the current time if not.
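A sketch of how such a conditional definition might look is given below. It is an illustration under stated assumptions, not the original listing: the variable name random-number, the main template, and the random result element are hypothetical, and the argument passed to random:random-sequence() follows the EXSLT definition (first argument is the number of values), which a given processor may implement differently.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:random="http://exslt.org/random"
    exclude-result-prefixes="xs random">

  <!-- Compiled only when the processor reports that the EXSLT function exists -->
  <xsl:variable name="random-number" as="xs:double"
      select="number(random:random-sequence(1)[1])"
      use-when="function-available('random:random-sequence')"/>

  <!-- Fallback: the fractional part of the seconds in the current time -->
  <xsl:variable name="random-number" as="xs:double"
      select="xs:double(seconds-from-time(current-time()) mod 1)"
      use-when="not(function-available('random:random-sequence'))"/>

  <!-- Hypothetical entry point that simply outputs the chosen value -->
  <xsl:template name="main">
    <random><xsl:value-of select="$random-number"/></random>
  </xsl:template>

</xsl:stylesheet>
```

Since use-when is evaluated at compile time, exactly one of the two declarations survives, so there is no duplicate-variable error and no static error from a call to a function the processor does not know.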