XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Author: Michael Kay
In this example, the value of the variable is a document node, which contains the
One popular way to use a temporary document is as a lookup table. The following stylesheet fragment uses data held in a temporary document to get the name of the month, given its number held in a variable $mm.
…
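As a minimal sketch of the lookup-table idea (not the book's own listing), the stylesheet below holds the month names in a temporary document and selects one by position. The variable name months, the element name name, and the default value of $mm are assumptions made for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Assumed for illustration: the month number arrives as a parameter -->
  <xsl:param name="mm" as="xs:integer" select="3"/>

  <!-- Temporary document acting as the lookup table -->
  <xsl:variable name="months">
    <name>January</name>
    <name>February</name>
    <name>March</name>
    <name>April</name>
    <name>May</name>
    <name>June</name>
    <name>July</name>
    <name>August</name>
    <name>September</name>
    <name>October</name>
    <name>November</name>
    <name>December</name>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Select the $mm-th <name> child of the temporary document node -->
    <month><xsl:value-of select="$months/name[position() = $mm]"/></month>
  </xsl:template>
</xsl:stylesheet>
```

With the default value of 3, the selected name is March. The explicit position() = $mm comparison makes it clear that the predicate is positional.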
Of course, the sequence constructor does not have to contain constant values as in these two examples; it can also contain instructions such as
This creates the tree illustrated in Figure 2-10. Each box shows a node; the three layers are respectively the node kind, the node name, and the string value of the node. Once again, an asterisk indicates that the string value is the concatenation of the string values of the child nodes.
In XSLT 1.0, temporary documents went under the name of result tree fragments. I introduced the term temporary tree in an earlier edition of this book, because I felt that the phrase result tree fragment undervalued the range of purposes to which these structures can be applied. In fact, result tree fragments in XSLT 1.0 were very limited in their capability because of a quite artificial restriction that prevented them being accessed using path expressions. Most vendors ended up circumventing this restriction using an extension function generally named xx:node-set(), where xx refers to the vendor's particular namespace. In XSLT 2.0, the restriction is gone for good, and temporary documents can now be used in exactly the same way as any source document: they can be used as the result tree for one phase of transformation, and the source tree for the next.
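To make the XSLT 1.0 workaround concrete, here is a hedged sketch that uses the EXSLT common module, one widely implemented form of the node-set() extension. It assumes an XSLT 1.0 processor that supports the http://exslt.org/common namespace; the variable and element names are invented for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- In XSLT 1.0 this variable holds a result tree fragment -->
  <xsl:variable name="colors">
    <color>red</color>
    <color>green</color>
  </xsl:variable>

  <xsl:template match="/">
    <first>
      <!-- XSLT 1.0: convert the fragment to a node-set before using a path -->
      <xsl:value-of select="exsl:node-set($colors)/color[1]"/>
    </first>
  </xsl:template>
</xsl:stylesheet>
```

In an XSLT 2.0 stylesheet the conversion is unnecessary, and the path $colors/color[1] can be applied to the variable directly.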
The restrictions in XSLT 1.0 were defined by making result tree fragments a separate data type, with a restricted range of operations available. In XSLT 2.0, a temporary document is a tree rooted at a document node just like any other, and is manipulated using variables or expressions that refer to its root node. (In XSLT 2.0 you can also have trees rooted at elements, or even at attributes or text nodes, though in that case there will only be one node in the tree. In fact, you might sometimes prefer to use a sequence of parentless elements rather than a document. But because of the XSLT 1.0 legacy, a temporary document is what you get when you declare an <xsl:variable> element that has content but no as attribute.)
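The difference between a temporary document and a sequence of parentless elements can be sketched as follows. The variable names doc and seq and the item elements are assumptions for illustration only.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No "as" attribute: the value is a document node with two <item> children -->
  <xsl:variable name="doc">
    <item>a</item>
    <item>b</item>
  </xsl:variable>

  <!-- as="element()*": the value is a sequence of two parentless <item> elements -->
  <xsl:variable name="seq" as="element()*">
    <item>a</item>
    <item>b</item>
  </xsl:variable>

  <xsl:template match="/">
    <result>
      <!-- The items are children of the document node: count is 2 -->
      <from-doc><xsl:value-of select="count($doc/item)"/></from-doc>
      <!-- The sequence holds the elements themselves: count is 2 -->
      <from-seq><xsl:value-of select="count($seq)"/></from-seq>
      <!-- The parentless items have no <item> children: count is 0 -->
      <no-children><xsl:value-of select="count($seq/item)"/></no-children>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

The practical consequence is that paths written against a temporary document need one extra child step compared with paths written against the sequence.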
A temporary document does not necessarily correspond to a well-formed XML document: for example, the document node can own text nodes directly, and it can have more than one element node among its children. However, it must conform to the same rules as an XML external parsed entity; for example, all the attributes belonging to an element node must have distinct names.
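A small sketch of such a document, with invented content: the temporary document below has two text nodes and two element nodes as direct children of its document node, which would not be allowed at the top level of a well-formed XML document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The document node owns text nodes and more than one element -->
  <xsl:variable name="fragment">
    <xsl:text>Dear </xsl:text>
    <b>reader</b>
    <xsl:text>, welcome to </xsl:text>
    <i>XSLT</i>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Attribute value templates report the number of text and element children -->
    <summary text-children="{count($fragment/text())}"
             element-children="{count($fragment/*)}">
      <!-- The string value concatenates all descendant text nodes -->
      <xsl:value-of select="$fragment"/>
    </summary>
  </xsl:template>
</xsl:stylesheet>
```

Both counts are 2 here, and the string value of the document node is "Dear reader, welcome to XSLT".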
The ability to use temporary documents as intermediate results in a multiphase transformation greatly increases the options available to the stylesheet designer (which is why the xx:node-set() extension function was so popular in XSLT 1.0). The general structure of such a stylesheet follows the pattern:
Some people prefer to use local variables for the intermediate results, some use global variables; it makes little difference.
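A two-phase transformation might be sketched like this. The sketch assumes a global variable for the intermediate result and a separate mode for each phase; the mode names and the term and entry vocabulary are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Phase 1: transform the source document into a temporary document -->
  <xsl:variable name="phase-1-output">
    <xsl:apply-templates select="/" mode="phase-1"/>
  </xsl:variable>

  <!-- Phase 2: the temporary document from phase 1 becomes the input -->
  <xsl:template match="/">
    <xsl:apply-templates select="$phase-1-output" mode="phase-2"/>
  </xsl:template>

  <!-- Phase 1 rule (hypothetical vocabulary): turn <term> into <entry> -->
  <xsl:template match="term" mode="phase-1">
    <entry><xsl:apply-templates mode="phase-1"/></entry>
  </xsl:template>

  <!-- Phase 2 rule: render each <entry> as an HTML list item -->
  <xsl:template match="entry" mode="phase-2">
    <li><xsl:apply-templates mode="phase-2"/></li>
  </xsl:template>
</xsl:stylesheet>
```

Nodes that match no explicit rule fall through to the built-in template rules, which continue processing in the same mode, so each phase sees only its own rules.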
One way that I often use multiphase transformations is to write a preprocessor for some specialized data source, to convert it into the format expected by an existing stylesheet module that renders it into HTML. For example, to create a glossary as an appendix in a document, you may want to write some code that searches the document for terms and their definitions. Rather than generating HTML directly from this code, you can generate the XML vocabulary used in the rest of the document, and then reuse the existing stylesheet code to render this as phase two of your transformation. This coding style is sometimes referred to as a micro-pipeline.
Because multiphase transformations are often used to keep stylesheets modular, some discipline is required to keep the template rules for each phase separate. I generally do this in two ways:
Summary
In this chapter we explored the important concepts needed to understand what an XSLT processor does, including the following:
The next chapter looks at the structure of an XSLT stylesheet in more detail.
Chapter 3
Stylesheet Structure
This chapter describes the overall structure of a stylesheet. In the previous chapter we looked at the processing model for XSLT and the data model for its source and result trees. In this chapter we will look in more detail at the different kinds of construct found in a stylesheet such as declarations and instructions, literal result elements, and attribute value templates.
Some of the concepts explained in this chapter are tricky; they are areas that often cause confusion, which is why I have tried to explain them in some detail. However, it's not necessary to master everything in this chapter before you can write your first stylesheet—so use it as a reference, coming back to topics as and when you need to understand them more deeply.
The topics covered in this chapter are as follows:
Changes in XSLT 2.0
The important concepts in this chapter are largely unchanged from XSLT 1.0. The most significant changes are as follows:
The Modular Structure of a Stylesheet
In the previous chapter, I described the XSLT processing model, in which a stylesheet defines the rules by which a source tree is transformed into a result tree.
Stylesheets, like programs in other languages, can become quite long and complex, and so there is a need to allow them to be divided into separate modules. This allows modules to be reused, and to be combined in different ways for different purposes: for example, we might want to use two different stylesheets to display press releases on-screen and on paper, but there might be components that both of these stylesheets share. These shared components can go in a separate module that is used in both cases.
We touched on another way of using multiple stylesheet modules in the previous chapter, where each module corresponds to one phase of processing in a multiphase transformation.
One can regard the complete collection of modules as a stylesheet program and refer to its components as stylesheet modules.
One of the stylesheet modules is the principal stylesheet module. This is in effect the main program, the module that is identified to the stylesheet processor by the use of an <?xml-stylesheet?> processing instruction in the source document, or whatever command-line parameters or application programming interface (API) the vendor chooses to provide. The principal stylesheet module may fetch other stylesheet modules, using <xsl:include> and <xsl:import> declarations.
The following example illustrates a stylesheet written as three modules: a principal module to do the bulk of the work, with two supporting stylesheet modules, one to obtain the current date, and one to construct a copyright statement.
Example: Using
Source
The input document,
sample.xml
, looks like this:
recommendation published by the World Wide Web Consortium
Stylesheets
The stylesheet uses
There are three modules in this stylesheet program: principal.xsl, date.xsl, and copyright.xsl. The date.xsl module uses the XSLT 2.0 function current-date(); the other modules will work equally well with XSLT 1.0 or 2.0.
When you run the transformation, you only need to name the principal stylesheet module on the command line—the other modules will be fetched automatically. The way this stylesheet is written, all the modules must be in the same directory.
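The following is a hedged sketch of how a principal module of this kind could be put together; it is not the book's listing. It assumes the supporting modules are brought in with xsl:include, and that the source document contains date and copyright elements; those element names, the output markup, and the contents of the copyright module are invented for illustration.

```xml
<!-- principal.xsl (sketch) -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Relative URIs: the modules are looked up next to this one -->
  <xsl:include href="date.xsl"/>
  <xsl:include href="copyright.xsl"/>

  <!-- Assumed source element: output the global variable from date.xsl -->
  <xsl:template match="date">
    <date><xsl:value-of select="$date"/></date>
  </xsl:template>

  <!-- Assumed source element: call the named template from copyright.xsl -->
  <xsl:template match="copyright">
    <copyright><xsl:call-template name="copyright"/></copyright>
  </xsl:template>

  <!-- All other elements are copied to the output unchanged -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

```xml
<!-- copyright.xsl (sketch with placeholder content) -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder value; a real module would name the actual owner -->
  <xsl:variable name="owner">(owner name)</xsl:variable>

  <xsl:template name="copyright">
    <xsl:text>Copyright (c) </xsl:text>
    <xsl:value-of select="$owner"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the href values are relative, only principal.xsl needs to be named when the transformation is run, provided the other modules sit in the same directory.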
When an XSLT 2.0 processor sees a module that specifies version="1.0", it must either run that module in backward-compatibility mode or reject the stylesheet. The latest versions of Saxon and AltovaXML both support backward-compatibility mode, so this is not a problem. Saxon displays a health warning, required by the W3C specifications, which you can safely ignore in this instance.
If you try to run the stylesheet in XMLSpy, it will fail, reporting an error in the date.xsl module. This is because XMLSpy uses version="1.0" as a signal to invoke its XSLT 1.0 processor, but the module date.xsl uses XSLT 2.0 features.
principal.xsl
The first module, principal.xsl, contains the main logic of the stylesheet.
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
>
It starts with two
The template rule for … $date. This variable isn't defined in this stylesheet module, but it is present in the module date.xsl, so it can be accessed from here.
The template rule for … copyright. Again, there is no template of this name in this module, but there is one in the module copyright.xsl, so it can be called from here.
Finally, the template rule that matches all other elements (match="*") has the effect of copying the element unchanged from the source document to the output. The
date.xsl
The next module, date.xsl, declares a global variable containing today's date. This calls the current-date() function in the standard XPath 2.0 function library, and the XSLT 2.0 format-date() function, both of which are described in Chapter 13.
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
<xsl:variable name="date"
  select="format-date(current-date(), '[MNn] [D1o], [Y]')"/>
</xsl:stylesheet>
Although this is a rather minimal module, there's a good reason why you might want to separate this code into its own module: it's dependent on XSLT 2.0, and you might want to write an alternative version of the function that doesn't have this dependency. Note that we've set version="2.0" on the … version="1.0".
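One possible shape for such an alternative module, offered only as a sketch: instead of computing the date with XSLT 2.0 functions, it declares a stylesheet parameter named date that the caller is expected to supply. The parameter-based approach and the fallback text are assumptions, not the book's solution.

```xml
<!-- date.xsl, alternative sketch with no XSLT 2.0 dependency -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The caller supplies the date; the default marks a missing value -->
  <xsl:param name="date" select="'(date not supplied)'"/>

</xsl:stylesheet>
```

Because a global parameter is referenced in the same way as a global variable, the principal module can continue to use $date without change.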