RE: Datatypes vs anarchy

> -----Original Message-----

> From: David E. Cleary [mailto:davec@progress.com]
> Sent: Tuesday, March 13, 2001 5:04 PM
> To: Michael Champion; xml-dev
> Subject: RE: Datatypes vs anarchy

> If you are writing a DOM level 3 parser, then you must worry
> about this stuff. Otherwise, stick to level 2. If you are a consumer of
> XML and do not care about schemas, don't use them, just like people who do
> not care about DTDs do not use them.

In my very humble but biased opinion, DOM Level 3 is doing the "right thing"
in that the content models and validation component is a separate, optional
module. Implementers can choose to ignore it, and consumers can test for its
support and either use it, or not use it. Nothing in the rest of the DOM
depends on the CM module. It is not easy to carve out independent API
modules from the alleged "layers" of XML-related specs, but the DOM WG tries
very hard to do so.
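
To make the "test for its support" point concrete, here is a minimal sketch using Python's standard-library xml.dom.minidom, whose DOMImplementation exposes the DOM hasFeature() call. The feature string "CM" is only a placeholder I am assuming for the optional content-model module (the Level 3 drafts may name it differently), and the helper names are mine.

```python
from xml.dom.minidom import getDOMImplementation

# Hypothetical feature string for the optional content-model/validation
# module; a placeholder for illustration, not taken from the DOM drafts.
CM_FEATURE = ("CM", "3.0")


def supports(feature, version):
    """Ask the DOM implementation whether it claims support for a module."""
    impl = getDOMImplementation()
    return impl.hasFeature(feature, version)


def plan_for(feature, version):
    """A consumer decides up front whether to rely on an optional module."""
    if supports(feature, version):
        return "use " + feature
    return "skip " + feature


if __name__ == "__main__":
    print(plan_for("Core", "2.0"))   # the core module every DOM provides
    print(plan_for(*CM_FEATURE))     # the optional module may be absent
```

Nothing in the Core path changes when the optional module is missing, which is the whole point of keeping it separate.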

In the W3C Schema spec, data types are an integral part of the overall spec
(the "part 1" and "part 2" distinction does not reflect any explicit
modularization). The PSVI is not (as far as I can find) explicitly, much
less modularly, defined. This means that XPath 2.0, XSLT 2.0 and XQuery
depend on the Schema spec as a whole rather than being layered on the
PSVI/datatypes. It is true that a consumer of XPath 2.0/XSLT 2.0 need not
actually define a schema to write queries or stylesheets, but (as near as
one can tell from the Requirements so far), implementers and explainers of
XPath 2.0 will have to incorporate implementations/explanations of XML,
namespaces, the PSVI, and the W3C Schema datatypes before even getting
around to XPath. The opportunities to confuse the uninitiated here are
enormous, exponentially more so than in the good ol' days of XML 1.0's
well-formed/validating distinction.

> How are they entangled in XML? From my vantage point, they are layered.
> You have well formed, valid, and schema valid.

While I'm on a rant ... we don't simply have
"well-formed/valid/schema-valid" layers of XML processing. We have (and this
is not at all an exhaustive list) well-formed, well-formed-DTD-aware,
DTD-validating, namespace-aware, namespace-aware-DTD-aware, (but NOT
namespace-aware-DTD-validating), schema-validating (implies
namespace-aware), schema-validating-DTD-aware, and (God help us)
DTD-validating-schema-validating ... Then we have the fact that the InfoSet
is really just a common vocabulary for implementers, and that the DOM,
XPath/XSLT, and XQuery all implement their own flavor of the thing. I'm
sorry, but I'll bet it would be hard to find a rational person coming in
from the outside, trying to make sense out of this, who wouldn't use some
synonym for "entangled" to describe it. And then there are the (ahem)
"colorful" descriptions of all this elegance over at xmlbastard.com, if you
think that *I* am being a bit unfair :~)
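
As a small illustration of just two of those modes, here is a sketch using Python's standard-library xml.sax, assuming the bundled expat driver is the one make_parser() returns. The sample document, the urn:example:po namespace, and the helper names are invented for the example. The same bytes report different element names depending on whether the parser is namespace-aware, and validation is a separate switch that a given parser may simply refuse.

```python
import io
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import (
    ContentHandler,
    feature_namespaces,
    feature_validation,
)

# Invented sample document for illustration.
DOC = b'<p:order xmlns:p="urn:example:po"/>'


class NameCollector(ContentHandler):
    """Records element names exactly as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: name is the raw qname.
        self.names.append(name)

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: name is (uri, localname).
        self.names.append(name)


def element_names(namespace_aware):
    """Parse DOC with namespace processing switched on or off."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespace_aware)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(DOC))
    return handler.names


def can_validate():
    """Check whether the default parser will accept DTD validation."""
    parser = xml.sax.make_parser()
    try:
        parser.setFeature(feature_validation, True)
    except (SAXNotSupportedException, SAXNotRecognizedException):
        return False
    return True


if __name__ == "__main__":
    print(element_names(False))  # raw qualified names
    print(element_names(True))   # (namespace URI, local name) pairs
    print(can_validate())
```

That is two switches on one parser; the list above is what you get once DTDs and schemas are multiplied in as well.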

The DOM is far from perfect, but uses two principles that the rest of the
W3C doesn't seem to take very seriously: Levels of the Recommendation come
out when there is "minimal progress to declare victory" and the spec is
built out of independent modules.

Needless to say, this is my own rant; don't hold it against my DOM
colleagues or my employer.