XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Authors: Michael Kay
In my experience, the most pervasive argument is the last one: it's surprising how often complex applications construct or modify stylesheets on the fly. But like it or not, the XML-based syntax is now an intrinsic feature of the language that has both benefits and drawbacks. It does make the language verbose, but in the end, the number of keystrokes has very little bearing on the ease or difficulty of solving particular transformation problems.
In XSLT 2.0, the long-windedness of the language has been reduced considerably by increasing the expressiveness of the non-XML part of the syntax, namely XPath expressions. Many computations that required five lines of XSLT code in 1.0 can now be expressed in a single XPath expression. Two constructs in particular led to this simplification: the conditional expression (if..then..else) in XPath 2.0; and the ability to define a function in XSLT (using <xsl:function>) […] f by a user-written function f, you can replace the five lines in the example with:
The decision to base the XSLT syntax on XML has proved its worth in several ways that I would not have predicted in advance:
No Side Effects
The idea that XSL should be a declarative language free of side effects appears repeatedly in the early statements about the goals and design principles of the language, but no one ever seems to explain why: what would be the user benefit?
A function or procedure in a programming language is said to have side effects if it makes changes to its environment; for example, if it can update a global variable that another function or procedure can read, or if it can write messages to a log file, or prompt the user. If functions have side effects, it becomes important to call them the right number of times and in the correct order. Functions that have no side effects (sometimes called pure functions) can be called any number of times and in any order. It doesn't matter how many times you evaluate the area of a triangle, you will always get the same answer; but if the function to calculate the area has a side effect such as changing the size of the triangle, or if you don't know whether it has side effects or not, then it becomes important to call it once only.
I expand further on this concept in the section on Computational Stylesheets in Chapter 17, page 985.
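To make the triangle example concrete, here is a minimal sketch of a side-effect-free function written in XSLT 2.0. The namespace URI, the prefix geo, and the function name geo:triangle-area are hypothetical names chosen for illustration; they do not come from the book.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:geo="http://example.com/geometry"
    exclude-result-prefixes="xs geo">

  <!-- Hypothetical pure function: the result depends only on the
       two arguments, and nothing outside the function is changed. -->
  <xsl:function name="geo:triangle-area" as="xs:double">
    <xsl:param name="base" as="xs:double"/>
    <xsl:param name="height" as="xs:double"/>
    <xsl:sequence select="$base * $height div 2"/>
  </xsl:function>

  <!-- The two calls can be evaluated in either order, or the processor
       can evaluate one and reuse the result: the output is the same. -->
  <xsl:template match="/">
    <areas>
      <first><xsl:value-of select="geo:triangle-area(3, 4)"/></first>
      <second><xsl:value-of select="geo:triangle-area(3, 4)"/></second>
    </areas>
  </xsl:template>
</xsl:stylesheet>
```

Because the function cannot modify anything, a processor is free to call it once, twice, or lazily without the stylesheet author being able to tell the difference.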
It is possible to find hints at the reason why this was considered desirable in the statements that the language should be equally suitable for batch or interactive use, and that it should be capable of progressive rendering. There is a concern that when you download a large XML document, you won't be able to see anything on your screen until the last byte has been received from the server. Equally, if a small change were made to the XML document, it would be nice to be able to determine the change needed to the screen display, without recalculating the whole thing from scratch. If a language has side effects, then the order of execution of the statements in the language has to be defined, or the final result becomes unpredictable. Without side effects, the statements can be executed in any order, which means it is possible, in principle, to process the parts of a stylesheet selectively and independently.
What it means in practice to be free of side effects is that you cannot update the value of a variable. This restriction is something many users find very frustrating at first, and a big price to pay for these rather remote benefits. But as you get the feel of the language and learn to think about using it the way it was designed to be used, rather than the way you are familiar with from other languages, you will find you stop thinking about this as a restriction. In fact, one of the benefits is that it eliminates a whole class of bugs from your code. I shall come back to this subject in Chapter 17, where I outline some of the common design patterns for XSLT stylesheets and, in particular, describe how to use recursive code to handle situations where in the past you would probably have used updateable variables to keep track of the current state.
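As a minimal sketch of that recursive style, the stylesheet below computes a total that a procedural program would accumulate in an updateable variable. The template name running-total, its parameters, and the order/item/@price input structure are all hypothetical, chosen only for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input:
       <order><item price="2.50"/><item price="4"/></order> -->
  <xsl:template match="/">
    <xsl:call-template name="running-total">
      <xsl:with-param name="items" select="order/item"/>
    </xsl:call-template>
  </xsl:template>

  <!-- No variable is ever updated: each call passes the new total
       and the remaining items to the next call as parameters. -->
  <xsl:template name="running-total">
    <xsl:param name="items"/>
    <xsl:param name="total" select="0"/>
    <xsl:choose>
      <xsl:when test="$items">
        <xsl:call-template name="running-total">
          <xsl:with-param name="items" select="$items[position() &gt; 1]"/>
          <xsl:with-param name="total" select="$total + $items[1]/@price"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$total"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

For a plain total, the built-in sum() function would do the job in one expression; the recursion is shown here only because the same pattern carries over to state that no built-in function tracks.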
Rule-Based
The dominant feature of a typical XSLT stylesheet is that it consists of a set of template rules, each of which describes how a particular element type or other construct should be processed. The rules are not arranged in any particular order; they don't have to match the order of the input or the order of the output, and in fact there are very few clues as to what ordering or nesting of elements the stylesheet author expects to encounter in the source document. It is this that makes XSLT a declarative language, because you specify what output should be produced when particular patterns occur in the input, as distinct from a procedural program where you have to say what tasks to perform in what order.
This rule-based structure is very like CSS, but with the major difference that both the patterns (the description of which nodes a rule applies to), and the actions (the description of what happens when the rule is matched) are much richer in functionality.
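The following sketch shows what a set of such rules can look like. The element names note, para, and emph are hypothetical, and the rules are deliberately listed in an order unrelated to how the elements nest in the input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rule for an inline element, listed first although it is
       nested deepest in the (hypothetical) source document. -->
  <xsl:template match="emph">
    <i><xsl:apply-templates/></i>
  </xsl:template>

  <!-- Rule for the outermost element. -->
  <xsl:template match="note">
    <html><body><xsl:apply-templates/></body></html>
  </xsl:template>

  <!-- Rule for paragraphs, wherever they occur. -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Each rule states only what to produce when its pattern matches; the xsl:apply-templates instruction hands the children back to the processor, which picks the matching rule for each one.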
Example: Displaying a Poem
Let's see how we can use the rule-based approach to format a poem. Again, we haven't introduced all the concepts yet, and so I won't try to explain every detail of how this works, but it's useful to see what the template rules actually look like in practice.
Input
Let's take this poem as our XML source. The source file is called poem.xml, and the stylesheet is poem.xsl.
Output
Let's write a stylesheet such that this document appears in the browser, as shown in Figure 1-8.
Stylesheet
It starts with the standard header.
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">
Now we write one template rule for each element type in the source document. The rule for the
In XSLT 2.0 we could replace the four
This takes advantage of the fact that the type system for the language now supports ordered sequences. The , operator performs list concatenation and is used here to form a list containing the […]
The template rules for the […] (select="."), and surround it within appropriate HTML tags to define its display style.
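A minimal sketch of such a rule, assuming a hypothetical title element rendered as an HTML h1 heading (neither name is confirmed by the surviving text):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- select="." selects the element being processed, so xsl:value-of
       copies its text content into the surrounding HTML tag. -->
  <xsl:template match="title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>
</xsl:stylesheet>
```

Rules for other simple elements would follow the same shape, differing only in the match pattern and the HTML tags used.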
By
The template rule for the
The rule for […]). The […] position() function to determine the relative position of the current line. It then outputs the contents of the line, followed by an empty HTML <br> element to end the line.
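A rule along these lines might look like the following sketch. The element names stanza and line, the indent of two non-breaking spaces, and the choice to indent even-numbered lines are assumptions made for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Selecting only the line children means position() in the rule
       below counts lines, not the whitespace text nodes between them. -->
  <xsl:template match="stanza">
    <p><xsl:apply-templates select="line"/></p>
  </xsl:template>

  <xsl:template match="line">
    <!-- Assumed layout: indent every second line of the stanza. -->
    <xsl:if test="position() mod 2 = 0">&#160;&#160;</xsl:if>
    <xsl:value-of select="."/>
    <br/>
  </xsl:template>
</xsl:stylesheet>
```

The value of position() depends on the node list selected by the calling xsl:apply-templates instruction, which is why the stanza rule selects line elements explicitly.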
And to finish off, we close the <xsl:stylesheet> element.