xml-dev - Re: [xml-dev] XSLT 2.0 Processor = XSLT Processor + W3 XML Schema Valida

To: "Roger L. Costello" <costello@mitre.org>
Subject: Re: [xml-dev] XSLT 2.0 Processor = XSLT Processor + W3 XML Schema Validator
From: Jeni Tennison <jeni@jenitennison.com>
Date: Wed, 20 Aug 2003 13:40:10 +0100
Cc: xml-dev@lists.xml.org
Envelope-to: xml-dev@lists.xml.org
In-reply-to: <3F43553B.D0BE046E@mitre.org>
Organization: Jeni Tennison Consulting Ltd
References: <3F43553B.D0BE046E@mitre.org>
Reply-to: Jeni Tennison <jeni@jenitennison.com>

Hi Roger,

> What do you see to be the pros and cons of this new validation
> capability in XSLT?

I view it as an absolute necessity, given the new data model that we
have in XSLT 2.0.

The most important addition to the data model between XPath 1.0 and
XPath 2.0 is that every element and attribute has a type and a typed
value. I think it would be really peculiar if XSLT 2.0 couldn't
generate a data model in which the types of elements and attributes
were anything aside from xs:anyType and xs:anySimpleType. For example,
it would mean:

  - you couldn't do an identity transformation
  
  - you would have to add validation steps between transformations in
    a pipeline in order to add types back into the data model

  - if you had multi-step transformations in a single stylesheet, none
    of the steps aside from the first would be able to take advantage
    of typing information

Note that I'm stressing that the aim here is to enable *type
annotation*. The fact that you can only get type annotation through
validation is, in a way, by-the-by. The point isn't the checking of
the content of an element or attribute, it's the fact that after
validating the node, you can stamp it with a type annotation, which
you can then use to inform further processing.

On the other hand, the fact that you have validation in the stylesheet
means that you will get an error if the stylesheet accidentally
generates an element or attribute that isn't valid according to the
type that you've declared for it. This ensures that you don't generate
invalid documents, and can help, during authoring, in pinpointing
where the problems are in your stylesheet that lead to invalid
documents being generated.
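
As a rough sketch of what that looks like (the 'item' element and its
'qty' attribute are names made up for illustration, and the syntax
assumes the xsl:type attribute on literal result elements and a
schema-aware processor):

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

  <!-- 'item' and '@qty' are hypothetical source names. -->
  <xsl:template match="item">
    <!-- xsl:type asks for the new element to be validated against
         xs:integer and annotated with that type. If @qty were
         "twelve", a schema-aware processor should report an error
         here rather than emit an invalid <quantity>. -->
    <quantity xsl:type="xs:integer">
      <xsl:value-of select="@qty"/>
    </quantity>
  </xsl:template>

</xsl:stylesheet>
```

The error is raised at the instruction that built the bad element,
which is what makes it useful for tracking down the problem.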

One of the advantages that you cite is:

> 4. Powerful: on the plus side, this can bring a lot of interesting
> new capabilities to XSLT. I could even see it eliminating the need
> for a Schema validator.

One of the ways in which I see the type annotation mechanism being
used is by users who don't have a schema but nevertheless want to
ensure that the elements and attributes in the input document they're
processing are interpreted correctly. You can do this using the 'type'
attribute. Because the built-in types are always available, they can
do this by having one step that copies and annotates the source
document with the built-in types, and another step that then
transforms the result of this process.

Thus, enabling type annotation via XSLT means that XSLT authors don't
have to rely on a schema or DTD being available in order to use
type-aware processing, at least if they're happy only using the
built-in types.
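
A minimal sketch of the two-step approach described above, assuming a
schema-aware processor. The 'order', 'date' and 'quantity' elements
are hypothetical, and the use of validation="preserve" reflects my
assumption that annotations on child nodes might otherwise be
stripped by default:

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

  <!-- Step 1: copy the source into a temporary tree, annotating
       selected elements with built-in types. No schema is imported. -->
  <xsl:variable name="typed">
    <xsl:apply-templates select="/order" mode="annotate"/>
  </xsl:variable>

  <!-- Default copy; "preserve" is meant to keep the annotations
       that the templates below put on descendant elements. -->
  <xsl:template match="*" mode="annotate">
    <xsl:copy validation="preserve">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="annotate"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="date" mode="annotate">
    <xsl:copy type="xs:date">
      <xsl:value-of select="."/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="quantity" mode="annotate">
    <xsl:copy type="xs:integer">
      <xsl:value-of select="."/>
    </xsl:copy>
  </xsl:template>

  <!-- Step 2: transform the annotated copy. The comparison works on
       the typed value of <date>, not on its string value. -->
  <xsl:template match="/">
    <early-orders>
      <xsl:copy-of
          select="$typed/order[date lt xs:date('2003-08-01')]"/>
    </early-orders>
  </xsl:template>

</xsl:stylesheet>
```

Here the first step does nothing but stamp types onto the copy; all
the real work happens in the second step, which can treat <date> as
an xs:date and <quantity> as an xs:integer.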

One of the disadvantages that you cite is:

> 3. Exclusive party: note that validation is against a W3 XML Schema.
> RelaxNG, Schematron, etc are excluded from this party.

This is something that I think we need to work on. The Data Model and
the rest of XPath/XSLT are fairly neutral about which schema language
is used to validate/annotate a document and to supply type
definitions, but you're right that the Validation section in XSLT 2.0
is very much oriented around validation against an XML Schema schema.
At the very least, it needs to address validation against a DTD.

Another disadvantage you give is:

> 1. Increases complexity: it increases the complexity of stylesheets
> since now you have transformation concerns as well as validation
> concerns.

I think that this is an inevitable side-effect of the fact that you're
now creating a more complex data model. On the other hand, if you're
not interested in validation/type annotation at all, then you can just
ignore it; indeed, if you want to create a stylesheet that's
guaranteed to run on all XSLT 2.0 processors then you *should* ignore
it, since a Basic XSLT processor will reject stylesheets that specify
that validation/type annotation should take place.

Unlike many of the other new features in XSLT 2.0, we've not really
had the ability to generate and use type annotations and schema
information before. (MSXML 4.0 has some support for using type
information, but I haven't seen any reports on how that gets used.) I
suspect that we'll see it used in far more weird and wonderful ways
than we can anticipate now...

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

References:
- XSLT 2.0 Processor = XSLT Processor + W3 XML Schema Validator
  - From: "Roger L. Costello" <costello@mitre.org>
