> I need to be able to determine what the input tree will be for a
> given document. That means I must be able to turn off schema
> processing from the stylesheet so that I can get the tree as it is,
> not the schema adulterated version. I don't always have access to
> the document to remove schemalocation attributes, and even if I did
> it wouldn't be enough as a schema aware parser is allowed to apply a
> schema even if one is not specified, if it thinks it knows something
> about the namespaces being used, or any other reason.
Do you think it would make sense for the *only* validation done on a
document to be the validation dictated by the stylesheet?
I think I do. We don't trust documents to come with the DTDs they
should come with, but we're quite happy for stylesheets to
include/import other stylesheet modules, and don't worry about the
fact that if they were missing the stylesheet as a whole wouldn't run
properly -- they're all part of the same application, so naturally
they will be kept together. If DTDs and schemas were likewise seen as
part of the stylesheet application, we could relax and let the DTD or
schema supply default values, typing information and so on.
Perhaps somewhere we could have an attribute to indicate the level of
validation that you require, like the wildcards in XML Schema --
'skip' if you just want to use the type definitions from the schema,
'strict' if you want it to be an error if the instance document
doesn't comply with the schema or DTD, or 'lax' if you want the
document validated, but are happy with a partial validation.
It would make a lot more sense to have this dictated within the
stylesheet, which then governs the way in which the source document is
parsed (whether validation is carried out; what happens as a result of
validation errors) rather than by the source document.
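
A rough sketch of how such a switch might look on a stylesheet. The
val:validation attribute and its namespace URI are made up purely for
illustration; nothing like this is defined in XSLT 1.0. I've written it
as a namespaced attribute because XSLT 1.0 permits those on XSLT
elements, so existing processors should simply ignore it:

```xml
<?xml version="1.0"?>
<!-- Hypothetical: val:validation and its namespace are invented here;
     the values mirror the skip/lax/strict levels suggested above. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:val="http://example.org/hypothetical/validation"
                val:validation="lax">

  <!-- Identity template, so the stylesheet does something on its own -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```
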
On the other hand, if you have a generic stylesheet, where you don't
know what the schema for a particular instance document should be,
then you might want to try to validate using whatever information is
supplied by the instance document itself. I think that this should be
an option; maybe even the default behaviour, for compatibility with
XSLT 1.0.
The issues as I see them are:
  - how you get stylesheets/queries to support DTDs, XML Schema
    schemas and other types of schemas that might be used to provide a
    PSVI from an instance document, such that implementations are free
    to support whatever schema languages they want but it's still
    possible to write portable stylesheets/queries

  - how you manage which schemas and DTDs are used against which
    (parts of) the instance document

  - whether you ever need to get hold of the PSVI for temporary trees
    within the stylesheet

  - how you cope with a situation in which neither the source nor
    result has a namespace, they are different markup languages, but
    have a simple type of the same name with different semantics
Perhaps an XSLT version of the 'validate' expression *would* be
useful... an xsl:apply-schemas instruction that returns validated
copies of a sequence of nodes. Perhaps you could associate names with
different (groups of) schemas so that you can point to particular ones
to indicate the type that you're after from within them, with the
default (unnamed) schema(s) being used against the source document.
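
Something along these lines, perhaps. Every name here other than
xsl:apply-schemas itself is invented for the sake of the sketch (the
xsl:schema declaration, its name and href attributes, the select and
schema attributes, the file names), and as far as I know no processor
implements any of it:

```xml
<?xml version="1.0"?>
<!-- Hypothetical syntax throughout; a future XSLT version is assumed.
     This is not valid XSLT 1.0. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Unnamed schema: the default, used against the source document -->
  <xsl:schema href="source.xsd"/>

  <!-- Named schema, so that it can be pointed to from within the stylesheet -->
  <xsl:schema name="result" href="result.xsd"/>

  <xsl:template match="/">
    <!-- Build a temporary tree in the usual way -->
    <xsl:variable name="tmp">
      <xsl:apply-templates/>
    </xsl:variable>
    <!-- Return validated copies of the nodes in $tmp,
         checked against the schema named "result" -->
    <xsl:apply-schemas select="$tmp" schema="result"/>
  </xsl:template>

</xsl:stylesheet>
```

Naming the schemas would also give a handle on the last issue in the
list above: two simple types with the same name could be told apart by
which named schema they come from.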