Read XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition Online
Authors: Michael Kay
The string value of a node can be obtained by using the string() function described in Chapter 13. This should not be confused with the xs:string() constructor, which works differently when applied to a node: it extracts the typed value of the node, as described in the next section, and then converts the typed value to a string. This might not give precisely the same result. For example, if the attribute is declared as an xs:boolean, and the actual attribute is written as ok="1", then the result of string(@ok) will be the string 1, while the result of xs:string(@ok) will be the string true.
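The difference can be sketched in a small stylesheet. This is a minimal illustration rather than a definitive test: it assumes a schema-aware XSLT 2.0 processor (the type attribute on xsl:attribute is not available otherwise), and the variable name ok and the template name main are hypothetical names chosen for the example. Instead of validating a source document, it constructs a parentless attribute that is annotated as xs:boolean.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumption: a schema-aware processor. The type attribute validates
       the lexical form "1" against xs:boolean and annotates the new node. -->
  <xsl:variable name="ok" as="attribute()">
    <xsl:attribute name="ok" type="xs:boolean">1</xsl:attribute>
  </xsl:variable>

  <!-- Hypothetical entry point, invoked as the initial named template. -->
  <xsl:template name="main">
    <!-- string() returns the string value of the node. -->
    <xsl:value-of select="string($ok)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- xs:string() atomizes the node first, then casts the xs:boolean
         typed value to a string. -->
    <xsl:value-of select="xs:string($ok)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Following the description above, one would expect the first line of output to reflect the attribute as written and the second to show the canonical form of the boolean, although a processor has some freedom in how it stores the string value of a validated node.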
In XPath 2.0, most operations use the typed value of a node, which is discussed in the next section. The only time the string value is used directly is when you explicitly call the string() function, or one or two other functions such as string-length() or normalize-space(). But the result differs from the typed value only if the node has been validated using a schema.
The Typed Value of a Node
The typed value of a node reflects the content of the node as it appears after schema validation. The typed value is available using the data() function described in Chapter 13; it is also obtained implicitly as the result of the process of atomization, described on page 165.
Schema validation only applies to element and attribute nodes, so let's get the other kinds of nodes out of the way first. For every other kind of node, the typed value is the same as the string value, which is defined in the previous section. However, for document nodes, namespace nodes, and text nodes, the value is labeled as xs:untypedAtomic, while for comments and processing instructions it is labeled as xs:string. There is, as one might expect, some tortuous logic behind this apparently arbitrary distinction: labeling a value as xs:untypedAtomic enables the value to be used in contexts where a value other than a string is required, whereas a value labeled as xs:string can only be used where that is the type expected. There are plausible scenarios where one might want to use the content of document nodes, namespace nodes, and text nodes in non-string contexts, but it's hard to think of similar justifications for comments and processing instructions.
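The two labels can be inspected with data() and the instance of operator. The stylesheet below is a minimal sketch that should work on any XSLT 2.0 processor, schema-aware or not; the variable name sample, the element name note, and the template name main are hypothetical names introduced for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical sample: one text node and one comment node with the
       same content. Comments written literally in a stylesheet are
       ignored, so xsl:comment is used to construct the node. -->
  <xsl:variable name="sample" as="element()">
    <note>
      <xsl:text>42</xsl:text>
      <xsl:comment>42</xsl:comment>
    </note>
  </xsl:variable>

  <!-- Hypothetical entry point, invoked as the initial named template. -->
  <xsl:template name="main">
    <!-- Typed value of a text node: labeled xs:untypedAtomic. -->
    <xsl:value-of
        select="data($sample/text()) instance of xs:untypedAtomic"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Typed value of a comment node: labeled xs:string. -->
    <xsl:value-of
        select="data($sample/comment()) instance of xs:string"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Because the text node's value is untyped, it can be used in
         arithmetic, where it is converted to a number. -->
    <xsl:value-of select="$sample/text() + 1"/>
    <xsl:text>&#10;</xsl:text>
    <!-- The same arithmetic on $sample/comment() would be a type error,
         since an xs:string is not accepted where a number is required. -->
  </xsl:template>

</xsl:stylesheet>
```

The last expression shows the practical consequence of the labeling: the untyped content of the text node is accepted in a numeric context, which is exactly what the xs:string label on the comment rules out.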
Let's return to elements and attributes, which are the cases where the typed value comes into its own.
First of all, if you're working on a document that has no schema, or that has not been validated against a schema, or if you're using an XSLT processor that doesn't support schema processing, then the typed value of an element or attribute is the same as the string value, and is labeled with the type xs:untypedAtomic. This is very close to the situation with XPath 1.0, which didn't support schema processing at all. It means that when you use an expression that returns an element or attribute node (for example, a path expression like title or @price), the value takes on the type expected by the context where you use it. For example, you can use @price as a number by writing @price * 0.8, or you can use it as a string by writing substring-after(@price, '$'). The typed value of the attribute, which is simply the string value as written in the source document, will be converted to a number or to a string as required by the context. If the conversion fails, for example, if you try to use the value as an integer when it isn't a valid integer, then you get a runtime error.
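The two uses of an untyped attribute can be tried out as follows. This is a sketch under stated assumptions: the book element, its price and tag attributes, and the template name main are hypothetical, and the data is built in a variable so that it is never validated against a schema.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical sample data. No schema is involved, so both attributes
       have typed values labeled xs:untypedAtomic. -->
  <xsl:variable name="book" as="element()">
    <book price="12.50" tag="$12.50"/>
  </xsl:variable>

  <!-- Hypothetical entry point, invoked as the initial named template. -->
  <xsl:template name="main">
    <!-- Numeric context: the untyped value is converted to a number
         before the multiplication takes place. -->
    <xsl:value-of select="$book/@price * 0.8"/>
    <xsl:text>&#10;</xsl:text>
    <!-- String context: the untyped value is converted to a string. -->
    <xsl:value-of select="substring-after($book/@tag, '$')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- $book/@tag * 0.8 would fail at runtime, because "$12.50" cannot
         be converted to a number. -->
  </xsl:template>

</xsl:stylesheet>
```

The same attribute syntax serves both purposes; it is the surrounding operator or function signature that decides which conversion is applied.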