> The basic reason is that the DOM abstract schemas thingie really
> didn't meet anyone's requirements. The original idea was to find the
> intersection of what DTDs, XSD schemas, and other schemas (We assumed
> at the time that the "other" would be XDR, but Microsoft seems to
> have been pretty diligent at stomping that out of the world's
> collective mindshare. As it turns out, the obvious "other" would be
> RELAX NG). The trouble is, nobody really wants such an API, or at
> least nobody made themselves known. We tried adding features to try
> to hit the 80/20 point in W3C XSDL capabilities, but that led to
> something that was too complex and ugly for the DTD and RELAX NG
> users, and nowhere near adequate for the XSDL power users who really
> wanted it supported in the DOM.
That's really interesting to me, because I think that XPath 2.0 has
precisely this problem as well. And I think that's part of why in some
ways I feel quite conflicted about it.
As an XSLT 1.0 user, I like the simplicity and elegance that
XPath/XSLT 1.0 already has, and I see XPath 2.0 as complex and ugly.
As a W3C XML Schema user, I'm frustrated by the inability to actually
get at all the "interesting" information that validation against a
schema adds to a document, and want XPath 2.0 to give me that
mechanism. But it's nowhere near adequate; it lacks:
- recognition of substitution groups
- a means of doing anything useful with identity constraints
- mechanisms for accessing meta-information, such as the default
value of an attribute or the enumerated values of a type
- the ability to access appinfo or documentation from the schema
and so on.
These requirements are just completely off the map for most XSLT users
(although I'd argue that they're actually much more useful for doing
the things that XSLT users want to do than all this data typing
stuff). It's quite strange having this split personality where they
both seem like reasonable requirements.
My fantasy XPath would be generic, layered and modular, so that the
basics are simple, elegant and easy to learn, but that extra
functionality could be plugged in for the implementations that want to
do more. The kinds of things that I'm thinking about are:
- modular function libraries
- modular axis libraries
- modular type libraries
- modular operation libraries
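To make the first of those a bit more concrete, here's a minimal Python
sketch of what "modular function libraries" might look like. Everything
in it is an assumption for illustration: Engine, plug_in, the "core" and
"schema" prefixes and the default-value function are invented names, not
part of XPath or of any real implementation.

```python
from typing import Callable, Dict

# Hypothetical names throughout: nothing here is a real XPath API.
FunctionLibrary = Dict[str, Callable[..., object]]


class Engine:
    """A toy expression engine whose function set is built from pluggable libraries."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., object]] = {}

    def plug_in(self, prefix: str, library: FunctionLibrary) -> None:
        # Register each function as "prefix:name" so libraries cannot clash.
        for name, fn in library.items():
            self._functions[f"{prefix}:{name}"] = fn

    def call(self, qname: str, *args: object) -> object:
        if qname not in self._functions:
            raise LookupError(f"no such function: {qname}")
        return self._functions[qname](*args)


# The small core that every implementation would ship.
CORE: FunctionLibrary = {
    "concat": lambda *parts: "".join(str(p) for p in parts),
    "string-length": lambda s: len(str(s)),
}

# An optional library that only schema-aware implementations would plug in.
# "default-value" is invented; a real one would consult a schema, not a dict.
SCHEMA: FunctionLibrary = {
    "default-value": lambda defaults, attr: defaults.get(attr),
}

basic = Engine()
basic.plug_in("core", CORE)

schema_aware = Engine()
schema_aware.plug_in("core", CORE)
schema_aware.plug_in("schema", SCHEMA)
```

Here the basic engine only knows the core functions, while the
schema-aware one answers the same calls plus the extra library. Note
that it is the implementation that calls plug_in, not the user.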
The problems with taking that approach are implementation overhead
(though just because these things are modular doesn't necessarily mean
that *users* would get to add things to them) and the issue of getting
the fundamentals flexible enough to be able to adapt to the range of
things that they'd need to adapt to, but concrete enough that the
whole approach doesn't spiral off into some abstract space where
nothing is known.
I think that the latter is the big problem. To give flexibility about
typing, for example, we could say that a value is a lexical
representation (a string) plus a label that says what type the value
is. But then RELAX NG, for example, often can't say whether a value is
of a particular type -- rather it says "it matches these types" -- so
that would imply that a value should be labelled with multiple types
(of equal importance). Resolving these different outlooks on what
typing *is* is a difficult thing. And this is just one example of the
problem.
So I'm not entirely convinced that this kind of approach is practical.
It kinda seems like it might be, when I dream of my fantasy XPath, but
then I read things like the DOM problems with supporting different
schema languages, and I think that probably I'm just falling into the
usual trap of making things so generic they end up being useless.
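The two outlooks on typing that I described above (one label per value
versus "it matches these types") can be sketched in a few lines of
Python. This is only a sketch under stated assumptions: the class names
are invented, and TOY_TYPES is a deliberately crude stand-in for real
datatype checking, not the actual rules of W3C XML Schema or RELAX NG.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class SingleLabelValue:
    """A lexical representation plus exactly one type label."""
    lexical: str
    type_name: str


@dataclass(frozen=True)
class MultiLabelValue:
    """A lexical representation plus every type it matches, none privileged."""
    lexical: str
    type_names: FrozenSet[str]


# Toy lexical checks (an assumption for illustration, far simpler than any
# real datatype library).
TOY_TYPES: Dict[str, Callable[[str], bool]] = {
    "string": lambda s: True,
    "integer": lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
    "boolean": lambda s: s in {"true", "false", "1", "0"},
}


def label_with_all_matches(lexical: str) -> MultiLabelValue:
    # Collect every toy type whose check accepts the string.
    names = frozenset(n for n, accepts in TOY_TYPES.items() if accepts(lexical))
    return MultiLabelValue(lexical, names)


# One outlook: validation assigns the value a single type.
declared = SingleLabelValue("1", "integer")

# The other outlook: the value simply matches several types at once.
matched = label_with_all_matches("1")
```

With these toy checks the string "1" ends up labelled as string, integer
and boolean all at once, and anything layered on top has to decide what
a comparison or a sort means for such a value.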
Mike: do you have any feeling about what the difficult areas were --
the real sticking points between the different views of documents that
validation against different schema languages gave?
My other thought is that if these two different requirements -- for
simplicity and for access to schema information -- really are
unresolvable, then we need two different languages. We need one simple
language for the XSLT community, for people who really don't care
about schema information, and another, more complex but more powerful
language for people who *do* care about schema information.
The way things are shaping up, we're going to have three languages:
XSLT 1.0 for people who really don't care about schema information,
and XSLT 2.0 and XQuery for people who do care about schema
information (well, about some schema information).
I think that XQuery has been held back from really addressing
the community of schema users by the requirement to be simple, and
that XSLT 2.0 has been vastly complicated by the requirement to supply
schema information. I wonder whether it wouldn't be worthwhile making
this the dividing line between XSLT and XQuery, so that XSLT becomes
the language for those who don't care about schemas, and XQuery
becomes the language for those who do.