Read XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition Online
Authors: Michael Kay
.*,\s*[A-Z]{2},\s*USA\s*$')">
regex="^(.*),\s*([A-Z]{{2}}),\s*USA\s*$">
does not match regex
The effect of these rules is that we end up with event records of the form:
The names “Long Island” and “Southampton” are classified as levels 6 and 7 because we don't know enough about them to classify them more accurately: levels up to 5 have reserved meanings, whereas 6 and above are available for arbitrary purposes. The ordering of levels is significant: higher levels are intended to represent a finer granularity of place name, which is why we have reversed the order of the original components of the name.
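The splitting and reversal described above can be sketched with xsl:analyze-string. This is an illustrative reconstruction, not the stylesheet's actual code: the input element PLAC, the output element Part, and the choice of levels 1 and 2 for the country and state are all assumptions made for the example. Only the regular expression and the use of levels 6 and above for unclassified components come from the text.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <PLAC>Southampton, Long Island, NY, USA</PLAC> -->
  <xsl:template match="PLAC">
    <!-- The regex attribute is an attribute value template, so the
         quantifier {2} has to be written with doubled braces: {{2}} -->
    <xsl:analyze-string select="."
        regex="^(.*),\s*([A-Z]{{2}}),\s*USA\s*$">
      <xsl:matching-substring>
        <!-- Element names and levels 1 and 2 are assumptions -->
        <Part Level="1">USA</Part>
        <Part Level="2"><xsl:value-of select="regex-group(2)"/></Part>
        <!-- Remaining components, most general first, numbered from 6 -->
        <xsl:for-each
            select="reverse(tokenize(regex-group(1), '\s*,\s*'))">
          <Part Level="{5 + position()}"><xsl:value-of select="."/></Part>
        </xsl:for-each>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:message>does not match regex</xsl:message>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input, the first captured group is "Southampton, Long Island", so after tokenizing and reversing, "Long Island" receives level 6 and "Southampton" level 7.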
Debugging the Stylesheet
This completes the presentation of the stylesheet used to convert the data from GEDCOM 5.5 to 6.0 format. I'd like to add some notes, however, from my experience of developing this stylesheet. The vast majority of my errors in coding this stylesheet, unless they were basic XSLT or XPath errors, were detected as a result of the on-the-fly validation of the result document against its schema. These errors included:
In the case of Saxon, a few of these errors are detected at stylesheet compile time, but most are reported while executing the stylesheet, and in nearly all cases the error message identifies exactly where the stylesheet is wrong. For example, if the code in the initial template is changed to read:
then the transformation fails with the message:
Validation error on line 27 of ged55-to-6.xsl:
XTTE1510: Required attribute @Target is missing
(See http://www.w3.org/TR/xmlschema-1/#cvc-complex-type clause 4)
This process caught quite a few basic XSLT coding errors. For example, I originally wrote:
in which the curly braces around @REF have been omitted. This resulted in the error message:
Validation error on line 64 of ged55-to-6.xsl:
The value ‘@REF’ is not a valid NCName
The error message arises because, in the absence of curly braces, the system has tried to use @REF as the literal value of the Ref attribute, and this is not allowed because the attribute is defined in the schema to have type IDREF, which is a subtype of NCName. An NCName cannot contain an @ character.
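The difference between the two forms can be seen in a minimal sketch. The element names HUSB and Link and the sample value I1 are assumptions made for illustration, not taken from the stylesheet itself; only the Ref attribute and the @REF expression come from the text.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <HUSB REF="I1"/> -->
  <xsl:template match="HUSB">
    <!-- Wrong: without curly braces the attribute receives the
         literal string "@REF", which is not a valid NCName:
         <Link Ref="@REF"/> -->

    <!-- Right: the braces make this an attribute value template, so
         @REF is evaluated as an XPath expression and Ref receives
         the value of the REF attribute (here "I1") -->
    <Link Ref="{@REF}"/>
  </xsl:template>

</xsl:stylesheet>
```

Without validation, the first form would pass silently and produce an output file full of dangling references; the schema's IDREF type is what turns the slip into an immediate error.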
Similarly, errors in the picture of the format-date() function call were picked up because they resulted in a string that did not match the pattern defined in the schema for the StandardDate type.
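As a rough illustration of what such a picture looks like, the sketch below formats a fixed date as a day, an upper-case month name truncated to three letters, and a four-digit year. The picture string, the Date element, and the sample date are assumptions chosen for the example. The text does not show the picture actually used in the stylesheet or the exact pattern that the StandardDate type requires.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Run with "main" as the initial template; no source document needed -->
  <xsl:template name="main">
    <!-- [D01]     two-digit day of the month
         [MN,*-3]  month name in upper case, at most three characters
         [Y0001]   four-digit year
         The language used for month names is the processor's default -->
    <Date>
      <xsl:value-of
          select="format-date(xs:date('1917-05-29'),
                              '[D01] [MN,*-3] [Y0001]')"/>
    </Date>
  </xsl:template>

</xsl:stylesheet>
```

A slip in the picture, such as writing [M01] where a month name is expected, still yields a well-formed string, which is why only validation against the schema type reveals the mistake.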
However, schema validation of the result tree will not pick up all errors. I had some trouble, for example, getting the regular expression for matching place names right, but the errors simply resulted in the output file containing an empty
Displaying the Family Tree Data
What we want to do now is to write a stylesheet that displays the data in a GEDCOM file in HTML format. We want the display to look something like the following screenshot (see Figure 19-1).