XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

As with extension functions, the term extension instruction covers both nonstandard instructions provided by the vendor and nonstandard instructions implemented by a user or third party. There is no requirement that an XSLT implementation must allow users to define new extension instructions, only that it should behave in a particular way when it encounters extension instructions that it cannot process.

Where a product does allow users to implement extension instructions (two products that do so are Saxon and Xalan), the mechanisms and APIs involved are likely to be rather more complex than those for extension functions, and the task is not one to be undertaken lightly. However, extension instructions can offer capabilities that would be very hard to provide with extension functions alone.

If there is an extension instruction in a stylesheet, then all XSLT processors will recognize it as such, but in general some will be able to handle it and others won't (because it is defined by a different vendor). As with extension functions, the rule is that a processor mustn't fail merely because an extension instruction is present; it should fail only if an attempt is made to evaluate it.

There are two mechanisms to allow stylesheet authors to test whether a particular extension instruction is available: the element-available() function and the <xsl:fallback> instruction.

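The element-available() function works in a very similar way to function-available(). You can use it in a use-when attribute to include stylesheet code conditionally. In this case, however, you can also do the test at evaluation time if you prefer, because calls to unknown extension instructions don't generate an error unless they are executed. For example:

<xsl:choose xmlns:acme="http://acme.co.jp/xslt">
   <xsl:when test="element-available('acme:moonshine')">
      <acme:moonshine xsl:extension-element-prefixes="acme"/>
   </xsl:when>
   <xsl:otherwise>
      <xsl:text>*** Sorry, moonshine is off today ***</xsl:text>
   </xsl:otherwise>
</xsl:choose>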

Note that at the time element-available() is called, the prefix for the extension element (here acme) must have been declared in a namespace declaration, but it does not need to have been designated as an extension element.
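
For the compile-time alternative, a minimal sketch using use-when might look like the following. The template name distil is hypothetical, and the acme namespace is the illustrative one used in this section. The attribute is written xsl:use-when on the extension instruction, because that element is not in the XSLT namespace, and plain use-when on the XSLT element.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:acme="http://acme.co.jp/xslt"
   extension-element-prefixes="acme">

   <!-- "distil" is a hypothetical template name used only for illustration -->
   <xsl:template name="distil">
      <!-- Kept only if the processor implements acme:moonshine -->
      <acme:moonshine xsl:use-when="element-available('acme:moonshine')"/>
      <!-- Kept only if it does not -->
      <xsl:text use-when="not(element-available('acme:moonshine'))">*** Sorry, moonshine is off today ***</xsl:text>
   </xsl:template>

</xsl:stylesheet>
```

Because the unwanted branch is discarded before the stylesheet is evaluated, a processor that does not implement the instruction never attempts to run it.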

The <xsl:fallback> instruction (which is fully described on page 316, in Chapter 6) provides an alternative way of specifying what should happen when an extension instruction is not available. The following example is equivalent to the previous one.

<acme:moonshine
   xmlns:acme="http://acme.co.jp/xslt"
   xsl:extension-element-prefixes="acme">
   <xsl:fallback>
      <xsl:text>*** Sorry, moonshine is off today ***</xsl:text>
   </xsl:fallback>
</acme:moonshine>

When an extension instruction is evaluated, and the XSLT processor does not know what to do with it, it should evaluate any child <xsl:fallback> element. If there are several <xsl:fallback> children, it should evaluate them all. Only if there is no <xsl:fallback> element should it report an error. Conversely, if the XSLT processor can evaluate the instruction, it should ignore any child <xsl:fallback> element.
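
The rule for several <xsl:fallback> children can be sketched as follows, assuming a processor that does not implement acme:moonshine. The message wording is invented for illustration. Both fallbacks would be evaluated, in document order: the first writes a diagnostic message, and the second contributes the text to the result.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:acme="http://acme.co.jp/xslt"
   extension-element-prefixes="acme">

   <xsl:template match="/">
      <acme:moonshine>
         <!-- Evaluated first if acme:moonshine is not available -->
         <xsl:fallback>
            <xsl:message>acme:moonshine is not available; using fallback output</xsl:message>
         </xsl:fallback>
         <!-- Evaluated second; a processor that implements acme:moonshine ignores both -->
         <xsl:fallback>
            <xsl:text>*** Sorry, moonshine is off today ***</xsl:text>
         </xsl:fallback>
      </acme:moonshine>
   </xsl:template>

</xsl:stylesheet>
```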

The specification doesn't actually say that an extension instruction must allow an <xsl:fallback> child to be present, and there are plenty of XSLT instructions that do not allow <xsl:fallback> as a child. However, an extension instruction that didn't allow <xsl:fallback> would certainly be against the spirit of the standard.

Vendor-defined or user-defined elements at the top level of the stylesheet are not technically extension instructions, because they don't appear within a sequence constructor; therefore, the namespace they appear in does not need to be designated as an extension namespace.
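
As a sketch of a user-defined top-level element, the stylesheet below carries a small piece of data in the acme namespace without listing acme in extension-element-prefixes. The element acme:settings, its still attribute, and the report element are hypothetical names. The lookup through document('') assumes the processor knows the base URI of the stylesheet module.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:acme="http://acme.co.jp/xslt"
   exclude-result-prefixes="acme">

   <!-- Top-level data element: acme is NOT designated as an extension namespace -->
   <acme:settings still="copper"/>

   <xsl:template match="/">
      <!-- document('') reads the stylesheet module itself -->
      <report still="{document('')/*/acme:settings/@still}"/>
   </xsl:template>

</xsl:stylesheet>
```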

Whitespace

Whitespace handling can be a considerable source of confusion. When the output of a stylesheet is HTML, you can get away without worrying too much about it, because except in some very specific contexts HTML generally treats any sequence of spaces and newlines in the same way as a single space. But with other output formats, getting spaces and newlines where you want them, and avoiding them where you don't, can be crucial.

There are two issues:

  • Controlling which whitespace in the source document is significant and therefore visible to the stylesheet.
  • Controlling which whitespace in the stylesheet is significant, because significant whitespace in the stylesheet is likely to get copied to the output.

Whitespace is defined as any sequence of the following four characters.

Character          Unicode Codepoint
Tab                x09
Newline            x0A
Carriage Return    x0D
Space              x20

The definition in XSLT is exactly the same as in XML itself. Other characters such as non-breaking-space (xA0), which is familiar to HTML authors as the entity reference &nbsp;, may use just as little black ink as these four, but they are not included in the definition.
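
One way to see the difference is with normalize-space(), which removes only the four whitespace characters. The sketch below is illustrative rather than taken from the text. Run against any source document, it should report true for a string made of the four whitespace characters and false for a non-breaking space.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="text"/>

   <xsl:template match="/">
      <!-- Tab, newline, carriage return, space: all removed by normalize-space() -->
      <xsl:value-of select="normalize-space('&#x9;&#xA;&#xD;&#x20;') = ''"/>
      <xsl:text>&#xA;</xsl:text>
      <!-- Non-breaking space is an ordinary character, so it survives -->
      <xsl:value-of select="normalize-space('&#xA0;') = ''"/>
   </xsl:template>

</xsl:stylesheet>
```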

There are some additional complications about the definition. Writing a character reference &#x20; is in many ways exactly the same as hitting the space bar on the keyboard, but in some circumstances it behaves differently. The character reference &#x20; will be treated as whitespace by the XSLT processor, but not by the XML parser, so you need to understand which rules are applied at which stage of processing.
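
The parser-level difference shows up most clearly in attribute values, where the XML parser replaces a literal line break with a space but leaves a character reference such as &#xA; alone. The following sketch illustrates this with the separator attribute of <xsl:value-of>. It is an example chosen for this purpose, not one from the text.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="text"/>

   <xsl:template match="/">
      <!-- Character reference: the separator reaches XSLT as a real newline -->
      <xsl:value-of select="1 to 3" separator="&#xA;"/>
      <xsl:text>&#xA;</xsl:text>
      <!-- Literal line break: the XML parser normalizes it to a single space -->
      <xsl:value-of select="1 to 3" separator="
"/>
   </xsl:template>

</xsl:stylesheet>
```

The first instruction should write the three numbers on separate lines, and the second should write them on one line separated by spaces.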

The XML standard makes some attempt to distinguish between significant and insignificant whitespace. Whitespace in elements with element-only content is considered insignificant, whereas whitespace in elements that allow #PCDATA content is significant. However, the distinction depends on whether a validating parser is used or not, and in any case, the standard requires both kinds of whitespace to be reported to the application. XSLT 2.0 (unlike 1.0) says that by default, if the source document is validated against a DTD or schema, insignificant whitespace (that is, whitespace in elements with element-only content) is ignored. In other cases, handling of whitespace can be controlled from the stylesheet (using the <xsl:strip-space> and <xsl:preserve-space> declarations, which are fully described in Chapter 6).
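
A minimal sketch of stylesheet-controlled stripping with <xsl:strip-space> and <xsl:preserve-space> might look like this. The element name para is hypothetical. Whitespace-only text nodes are removed from every element of the source document except para, and the result is then copied to the output.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Strip whitespace-only text nodes from all source elements... -->
   <xsl:strip-space elements="*"/>
   <!-- ...except those directly inside para elements (hypothetical name) -->
   <xsl:preserve-space elements="para"/>

   <xsl:template match="/">
      <xsl:copy-of select="."/>
   </xsl:template>

</xsl:stylesheet>
```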
