Read XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition Online
Authors: Michael Kay
Warning: on line 23 of …/code/ch04/poem-to-html.xsl:
The complex type of element poem does not allow a child element named poet
When you're dealing with a complex schema it's very easy to make this kind of mistake in your path expressions, and allowing the XSLT processor to check your code against the schema can make a big difference to the ease of diagnosing such errors. Note that it's only a warning, not an error: the expression is actually legal, but Saxon-SA is warning you that it selects nothing. AltovaXML, at the time of writing, doesn't do this level of checking.
Validating the Result Document
We've seen what you can achieve by using knowledge of the schema for the source document. You can also request validation of the output of the transformation. For example, if you have written the stylesheet to generate XHTML, you can ask for it to be validated by writing your first template rule as follows:
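A minimal sketch of such a stylesheet is shown here. It assumes a schema-aware processor, an output in the XHTML namespace, and the strict XHTML schema imported from the W3C location used later in this section; the title text and the single paragraph are placeholders rather than part of the original example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Assumption: strict validation needs an imported schema that
       declares the outermost result element (here, html). -->
  <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
      schema-location="http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd"/>

  <!-- First template rule: everything written to the result document
       is constructed inside xsl:result-document, which requests
       strict validation of the finished tree. -->
  <xsl:template match="/">
    <xsl:result-document validation="strict">
      <html>
        <head><title>Placeholder title</title></head>
        <body>
          <p><xsl:value-of select="."/></p>
        </body>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```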
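In this example, there is nothing that says what the expected type of the output is. What validation="strict" means is that the outermost element of the result document (for example, <html>) must have an element declaration in some schema, and the result document must conform to that declaration.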
You could argue that validating the output from within the stylesheet is no different from running the transformation and then putting the output through a schema processor to check that it's valid. However, once you try developing a stylesheet this way, you will find that the experience is very different. If you put the output file through a freestanding schema processor once the transformation is complete, the schema processor will give you error messages in terms of a position within the result document. You will then have to open the result document in a text editor, find out what's wrong with it, find the instruction in the stylesheet that generated the incorrect output, and then correct the stylesheet. Working with a schema processor that's integrated into your XSLT processor is much more direct: in most cases the error message will tell you directly which instruction in the stylesheet needs to be changed. This makes for a much more rapid development cycle.
There is another advantage—in many cases it should be possible for a schema-aware XSLT processor to tell you that the output will be invalid before you even try running the stylesheet against a source document. That is, it should be able to report some of your errors at compile time. This gives you an even quicker turnaround in fixing errors, and more importantly, it means that the ability to detect bugs in your code is less dependent on the completeness of your test suite. Stylesheet programming is often done without much regard to the traditional disciplines of software engineering—testing tends to be less than thorough. So anything that reduces the risk of failures once the stylesheet is in live use is to be welcomed.
The following example shows how this works.
Example: Validating the Result Document
This example shows how validation of a result document can be invoked from within the stylesheet.
Source
The source document is a poem such as theHill.xml, which is listed (in part) on page 167.
Stylesheet
The following stylesheet poem-to-xhtml.xsl is designed to format this poem into XHTML, checking as it does so that the output is valid XHTML. It contains a deliberate error: see if you can spot it.
<xsl:stylesheet …
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://poetry.org/ns"
    xmlns="http://www.w3.org/1999/xhtml">
<xsl:import-schema …
    schema-location="http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd"/>
…
by <xsl:value-of select="…name"/>
(<xsl:value-of select="…" separator="-"/>)
…
Note that this stylesheet validates the output, but doesn't require validating the input. The only reason for this is to demonstrate one feature at a time, and to show that input validation and output validation are quite independent of each other.
This stylesheet fetches the XHTML schema from the W3C web site. That's not a practical thing to do for something that you run frequently. If you run behind a proxy server then it will probably be cached automatically, but in other cases you may prefer to make a local copy.
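As a sketch of the local-copy approach, the schema import can point at a file stored alongside the stylesheet. The relative path schemas/xhtml1-strict.xsd is a hypothetical location chosen for illustration; a relative schema-location is resolved against the base URI of the stylesheet. If the copied schema document itself refers to other schema documents by URL, those may need to be copied as well.

```xml
<!-- Hypothetical local copy of the strict XHTML schema, stored in a
     schemas/ directory next to the stylesheet. -->
<xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
    schema-location="schemas/xhtml1-strict.xsd"/>
```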
Output
When this stylesheet is run using Saxon-SA 9.0, the output is:
Error on line 17 of …/code/ch04/poem-to-xhtml.xsl:
  XTTE1510: Attribute align is not permitted in the content model of the complex type of element h1.
Failed to compile stylesheet. 1 error detected.
The error message should be clear enough: the align attribute is not permitted in strict XHTML. You could fix it by using the schema for transitional XHTML rather than strict XHTML, or better, by replacing the align attribute with style="text-align: center".
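The second fix might look like the following sketch. It is not the book's listing: the path poem/title is an assumption about the source document, and only the heading is generated, to keep the example short. It needs a schema-aware processor to run.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://poetry.org/ns"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
      schema-location="http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd"/>

  <xsl:template match="/">
    <xsl:result-document validation="strict">
      <html>
        <head>
          <!-- poem/title is an assumed path into the source document -->
          <title><xsl:value-of select="poem/title"/></title>
        </head>
        <body>
          <!-- Previously <h1 align="center">, which strict XHTML rejects;
               the style attribute is permitted on h1. -->
          <h1 style="text-align: center">
            <xsl:value-of select="poem/title"/>
          </h1>
        </body>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```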
Previous releases of Saxon reported this error as a runtime validation error, while still pinpointing the line in the stylesheet where the problem occurred. The XSLT specification isn't prescriptive about this: it allows implementations to do the validation either at compile-time or at runtime.
AltovaXML reports the error at runtime, like this:
Validation Error
Attribute ‘align’ is not allowed in element
…\code\ch04\poem-to-xhtml2.xsl Line 11, Character 4
Like Saxon-SA, it gives a clear message about what is wrong, though it's not quite so precise in identifying the location in the stylesheet of the offending instruction (line 11 is the
Validation of a result document can be controlled using either the validation attribute or the type attribute of the <xsl:result-document> element. You can use only one of these: they can't be mixed. The validation attribute allows four values, whose meanings are explained in the table below.
Attribute value | Meaning |
strict | The result document is subjected to strict validation. This means that there must be an element declaration for the outermost element of the result document in some schema, and the structure of the result document must conform to that element declaration. |
lax | The result document is subjected to lax validation. This means that the outermost element is validated against a schema if a declaration for that element name can be located; if not, the system assumes the existence of an element declaration that allows any content for that element. The children of the element are also subjected to lax validation, and so on recursively. So any elements in the tree that are declared in a schema must conform to their declaration, but for other elements, there are no constraints. |
preserve | This option means that no validation is applied at the document level, but if any elements or attributes within the result tree have been constructed using node-level validation (as described in the next section), then the type annotations resulting from that node-level validation will be preserved in the result tree. These node annotations are only relevant, of course, if the result tree is passed to another process that understands them. If the result tree is simply serialized, it makes no difference whether type annotations are preserved or not. |
strip | This option means that no validation is applied at the document level, and moreover, if any elements or attributes within the result tree have been constructed using node-level validation (as described in the next section), then the type annotations resulting from that node-level validation will be removed from the result tree. Instead, all elements will be given a type annotation of xs:untyped, and attributes will have the type annotation xs:untypedAtomic. |
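To make the table concrete, here is a small sketch that writes two result documents with different validation settings. The output file names and the report wrapper element are hypothetical, and a schema-aware processor is assumed, since a basic processor is not required to accept validation="lax".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- lax: elements with a declaration in an imported schema are
         validated against it; all other elements are unconstrained. -->
    <xsl:result-document href="lax-output.xml" validation="lax">
      <report><xsl:copy-of select="*"/></report>
    </xsl:result-document>

    <!-- strip: no document-level validation, and any type annotations
         on the constructed nodes are removed. -->
    <xsl:result-document href="untyped-output.xml" validation="strip">
      <report><xsl:copy-of select="*"/></report>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```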
The other way of requesting validation of the result tree is through the type attribute. If the type attribute is specified, its value must be a QName, which must match the name of a global type definition in an imported schema. In practice this will almost invariably be a complex type definition. The rules to pass validation are as follows: