XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (78 page)

This shows more explicitly how the sequence constructor contained in the <xsl:variable> element creates a single document node, whose content is populated by the enclosed instruction.

If you expand the variable definition in this way, you can also use the attributes validation and type on the <xsl:document> instruction to invoke validation of the temporary document. These attributes work exactly the same way on <xsl:document> as they do on <xsl:result-document>, which produces a final result tree.
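
A minimal sketch of such an expanded variable definition with validation requested, assuming a schema-aware XSLT 2.0 processor. The schema location orders.xsd, the variable name, and the orders and order elements are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical schema, assumed to declare a global element "orders". -->
  <xsl:import-schema schema-location="orders.xsd"/>

  <xsl:template match="/">
    <!-- The sequence constructor holds a single xsl:document instruction,
         so the variable is bound to one document node. validation="strict"
         asks for that document to be validated as it is built. -->
    <xsl:variable name="temp" as="document-node()">
      <xsl:document validation="strict">
        <orders>
          <xsl:copy-of select="//order"/>
        </orders>
      </xsl:document>
    </xsl:variable>

    <xsl:copy-of select="$temp"/>
  </xsl:template>

</xsl:stylesheet>
```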

Another way of creating temporary documents is by copying an existing document node using either of the instructions <xsl:copy> (which makes a shallow copy) or <xsl:copy-of> (which makes a deep copy). These instructions both have validation and type attributes, and when the instructions are used to copy a document node, these attributes work the same way as the corresponding attributes on the <xsl:document> and <xsl:result-document> instructions.
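
The two copying instructions might be used along the following lines. This is only a sketch: the schema location orders.xsd and the variable names are hypothetical, and a schema-aware processor is assumed.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical schema describing the source document. -->
  <xsl:import-schema schema-location="orders.xsd"/>

  <xsl:template match="/">
    <!-- Deep copy: the whole source document is copied and validated. -->
    <xsl:variable name="validated" as="document-node()">
      <xsl:copy-of select="/" validation="strict"/>
    </xsl:variable>

    <!-- Shallow copy: xsl:copy creates a new document node, and its
         sequence constructor supplies the content to be validated. -->
    <xsl:variable name="rebuilt" as="document-node()">
      <xsl:copy validation="strict">
        <xsl:copy-of select="*"/>
      </xsl:copy>
    </xsl:variable>

    <xsl:copy-of select="$validated"/>
  </xsl:template>

</xsl:stylesheet>
```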

Because a temporary document exists only locally within your stylesheet, it sometimes makes sense to validate it using a schema that is also local to the stylesheet. To this end, XSLT allows you to write a schema document inline as the content of the <xsl:import-schema> element. For example, if the temporary document has the form:

  …

then the <xsl:import-schema> declaration might be:

             xmlns:m="http://www.acme.com/ns/local/months">


For the complete stylesheet, see inline.xsl in the download files for this chapter. If you use inline schema documents, it's good practice to use a unique namespace, to ensure that the schema definitions don't conflict with any other schema definitions that might be loaded.
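
The following sketch illustrates the general shape of an inline schema used to validate a temporary document. It is not the content of inline.xsl: only the namespace URI comes from the text above, while the element names, the type name, the attributes, and the variable name are assumptions made for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:m="http://www.acme.com/ns/local/months"
    exclude-result-prefixes="xs m">

  <!-- Inline schema in a namespace used only by this stylesheet.
       Element and type names are illustrative. -->
  <xsl:import-schema>
    <xs:schema targetNamespace="http://www.acme.com/ns/local/months"
               elementFormDefault="qualified">
      <xs:element name="months">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="month" type="m:month-type" maxOccurs="12"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <xs:complexType name="month-type">
        <xs:attribute name="name" type="xs:string" use="required"/>
        <xs:attribute name="days" type="xs:integer" use="required"/>
      </xs:complexType>
    </xs:schema>
  </xsl:import-schema>

  <!-- Temporary document validated against the inline schema. -->
  <xsl:variable name="months" as="document-node()">
    <xsl:document validation="strict">
      <m:months>
        <m:month name="January" days="31"/>
        <m:month name="February" days="28"/>
      </m:months>
    </xsl:document>
  </xsl:variable>

</xsl:stylesheet>
```

Because the schema's target namespace is private to the stylesheet, its definitions cannot clash with those of any other schema that happens to be loaded.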

Validating Individual Elements

Rather than applying validation at the document level, it is possible to invoke validation of specific elements as they are constructed. This can be useful in a number of circumstances:

  • Sometimes you do not have a schema definition for the result document as a whole, but you do have schema definitions for individual elements within it. For example, you may be creating a data file in which the contents of some elements are expected to be valid XHTML.
  • If you are running a transformation whose purpose is to extract parts of the source document, then you may actually know that the result document as a whole is deliberately invalid: the schema for source documents may require the presence of elements or attributes that you want to exclude from the result document, perhaps for privacy or security reasons. The fact that the result document as a whole has no schema should not stop you from validating those parts that do have one.
  • You may be creating elements in temporary working storage (that is, in variables) that are to be copied or processed before incorporating them into a final result document. It can be useful to validate these elements in their own right, to make sure that they have the type annotations that enable them to be used as input to functions and templates that will only work on elements of a particular type.
  • You may have templates or functions that declare their return types using sequence type descriptors such as element(*, us-postal-address). The only way to generate new content that satisfies such a return type is to put it through schema validation.
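
The last case in the list above can be sketched as follows. The type name us-postal-address is taken from the text; the schema location postal.xsd, the function name, its namespace, and the child elements are hypothetical, and the type is assumed to be defined in no namespace.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/ns/functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical no-namespace schema defining the type us-postal-address. -->
  <xsl:import-schema schema-location="postal.xsd"/>

  <xsl:function name="f:make-address" as="element(*, us-postal-address)">
    <xsl:param name="street" as="xs:string"/>
    <xsl:param name="zip" as="xs:string"/>
    <!-- Validating against the named type gives the new element the
         type annotation that the declared return type demands. -->
    <xsl:element name="address" type="us-postal-address">
      <street><xsl:value-of select="$street"/></street>
      <zip><xsl:value-of select="$zip"/></zip>
    </xsl:element>
  </xsl:function>

</xsl:stylesheet>
```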

The usual way of creating a new element node in XSLT is either by using a literal result element or by using the <xsl:element> instruction.

The <xsl:element> instruction has attributes validation and type, which work in a very similar way to the corresponding attributes of <xsl:document> and <xsl:result-document>; however, in this case it is only element-level validation that is invoked, not document-level validation.

The same facilities are available with literal result elements. In this case, however, the attributes are named xsl:validation and xsl:type. This is to avoid any possible conflict with attributes that you want copied to the result document as attributes of the element you are creating.

For example, suppose you want to validate an address. If there is a global element declaration with the name address, you might write:

  39
  Lombard Street
  London
  EC1 3CX


If the element matches the global element declaration for address, validation succeeds, and the resulting element will be annotated as an address or, more strictly, as an instance of the type associated with the address element, which might be either a named type in the schema or an anonymous type. In addition, the child elements will also have type annotations based on the way they are defined in the schema; for example, the element holding the house number might (perhaps) be annotated as type xs:integer. If validation fails, the whole transformation is aborted.
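
With its markup in place, an address validated in this way might look like the sketch below. The element name address, the xsl:validation attribute, and the four values come from the text; the child element names and the schema location address.xsd are assumptions.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical schema with a global element declaration for "address". -->
  <xsl:import-schema schema-location="address.xsd"/>

  <xsl:template name="make-address">
    <!-- strict: the element must be valid against the global declaration,
         otherwise the transformation fails. -->
    <address xsl:validation="strict">
      <number>39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </address>
  </xsl:template>

</xsl:stylesheet>
```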

What if there is no global element declaration for the <address> element (typically because it is defined in the schema as a local element declaration within some larger element)? You can still request validation if the element is defined in the schema to have a named type. For example, if the local declaration of the element refers to a named type, then you can cause the element to be validated against that type by means of the xsl:type attribute:

  39
  Lombard Street
  London
  EC1 3CX
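
A sketch of validation by type name. The type name address-type, the schema location, and the child element names are all hypothetical; the assumed schema declaration is shown in a comment.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical schema, assumed to define a top-level type named
       address-type that a local element declaration refers to, such as:
         <xs:element name="address" type="address-type"/>             -->
  <xsl:import-schema schema-location="address.xsd"/>

  <xsl:template name="make-address">
    <!-- xsl:type names the type to validate against, so no global
         element declaration is needed. -->
    <address xsl:type="address-type">
      <number>39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </address>
  </xsl:template>

</xsl:stylesheet>
```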


If neither a top-level element declaration nor a top-level type definition is available, you can't invoke validation at this level. The only thing you can do is either:

  • change the schema so that some of the elements and/or types are promoted to be globally defined, or
  • invoke validation at a higher level, where a global element declaration or type definition is available.

You don't need to invoke validation at more than one level, and it may be inefficient to do so. Asking for validation of <address> in the above example will automatically invoke validation of its child elements. If you also invoked validation of the child elements by writing, say:

  39
  …

then it's possible that the system would do the validation twice over. If you're lucky the optimizer will spot that this is unnecessary, but you could be incurring extra costs for no good reason.

If you ask for validation of a child element, but don't validate its parent element, then the child element will be checked for correctness, but the type annotation will probably not survive the process of tree construction. For example, suppose you write the following:


  39
  Lombard Street
  London
  EC1 3CX


Specifying the xsl:type attribute on the first child element causes the system to check that the value of the element is numeric, and to construct an element that is annotated as an integer. The result of evaluating the sequence constructor contained in the <address> element is thus a sequence of four elements, of which the first has a type annotation of xs:integer. Evaluating the literal result element <address> creates a new <address> element, and forms children of this element from the result of evaluating the contained sequence constructor. The formal model is that the elements in this sequence are copied to form these children. The xsl:validation attribute on the <address> element determines what happens to the type annotations on these child elements. This defaults to either preserve or strip, depending on the default-validation attribute of the containing <xsl:stylesheet> (which in turn defaults to strip). If the value is preserve, the type annotation on the child element is preserved, and if the value is strip, then the type annotation on the child element is replaced by xs:untyped.
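
The sketch below shows one way to keep the child's annotation, by requesting preserve on the parent explicitly. The child element names are assumptions; a schema-aware processor is assumed, but no imported schema is needed because xs:integer is a built-in type.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template name="make-address">
    <!-- preserve: annotations on the copied children are kept.
         With strip (the default unless default-validation says
         otherwise) the first child would become xs:untyped. -->
    <address xsl:validation="preserve">
      <number xsl:type="xs:integer">39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </address>
  </xsl:template>

</xsl:stylesheet>
```

Setting default-validation="preserve" on the <xsl:stylesheet> element would have the same effect for every element that does not specify its own xsl:validation attribute.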

The type of an element never depends on the types of the items used to form its children. For example, suppose that the variable $i holds an integer value. Then you might suppose that an element constructed with $i as its content would have the type annotation xs:integer. It doesn't: the type annotation will be xs:untyped. Atomic values in the sequence produced by evaluating the sequence constructor are always (at least conceptually) converted to strings, and any type annotation in the new element is obtained by validating the resulting string values against the desired type.
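
A sketch contrasting the two outcomes. The element names a and b, the template name, and the value 42 are illustrative; the default validation mode of strip and a schema-aware processor are assumed.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template name="demo">
    <xsl:variable name="i" as="xs:integer" select="42"/>

    <!-- No validation requested: the integer becomes the text "42"
         and the element is annotated xs:untyped. -->
    <a><xsl:sequence select="$i"/></a>

    <!-- Validation requested: the string "42" is validated against
         xs:integer, which gives the element that annotation. -->
    <b xsl:type="xs:integer"><xsl:sequence select="$i"/></b>
  </xsl:template>

</xsl:stylesheet>
```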

This might not seem a very satisfactory design—why discard the type information? The working groups agonized over this question for months. The problem is that there are some cases like this one where retaining the type annotation obviously makes sense; there are many other cases, such as a sequence involving mixed content, where it obviously doesn't make sense; and there are further cases such as a sequence containing a mixture of integers and dates where it could make sense, but the definition would be very difficult. Because the working group found it difficult to devise a clear rule that separated the simple cases from the difficult or impossible ones, they eventually decided on this rather blunt rule: everything is reduced to a string before constructing the new node and validating it.

Note that when you use the xsl:type attribute to validate an element, the actual element name can be anything you like. There is no requirement that it should be an element name declared in the schema. It can even be an element name that is declared in the schema, but with a different type (though I can't see any justification for doing something quite so confusing, unless the types are closely related).

All the same considerations apply when creating a new element using the <xsl:element> or <xsl:copy> instruction rather than a literal result element. The only difference is that the attributes are now called validation and type instead of xsl:validation and xsl:type.
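
As a sketch, the same validation by type can be written with <xsl:element>, here with a computed element name to show that the name need not be declared in the schema. The type name address-type, the schema location, the parameter, and the child element names are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical schema defining a top-level type named address-type. -->
  <xsl:import-schema schema-location="address.xsd"/>

  <xsl:template name="make-address">
    <xsl:param name="element-name" select="'delivery-address'"/>
    <!-- On xsl:element the attribute is called type, not xsl:type. -->
    <xsl:element name="{$element-name}" type="address-type">
      <number>39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```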

The value of the type or xsl:type attribute is always a lexical QName, and this must always be the name of a top-level complex type or simple type defined in an imported schema. This isn't the same as the as attribute used in declaring the type of variables or functions. Note the following differences:
