Re: Are we losing out because of grammars?
- From: Rick Jelliffe <firstname.lastname@example.org>
- To: email@example.com
- Date: Thu, 01 Feb 2001 22:08:31 +0800
From: James Clark <firstname.lastname@example.org>
> If I'm in an "x" element and I get a "y" element with a "z" attribute
> that is a legal lexical representation of an integer, I can't tell
> whether to type that attribute as an "xsd:integer" or an "xsd:string"
> unless I look ahead and see whether it's the last "y" element in
> the "x". The TREX implementation works on a stream of SAX events, so
> this is a big complication.
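
To make the lookahead problem concrete, here is a minimal sketch of typing the "z" attributes in a stream. It assumes a hypothetical grammar in which every "y" in an "x" carries a string "z" except the last one, whose "z" is an integer; the quoted message does not say which type goes with the last position, so that choice, the function name assign_types, and the simplified integer pattern are illustrative only. The point is that no value can be typed until the next "y" (or the end of "x") has been seen.

```python
import re

# Simplified lexical form for an integer (assumption: sign plus ASCII digits).
_INTEGER = re.compile(r"[+-]?[0-9]+")


def assign_types(z_values):
    """Yield (value, type) pairs for the z attributes of the y children of one x.

    Hypothetical rule: every y except the last is typed xsd:string,
    and the last is typed xsd:integer.
    """
    pending = None  # one-element lookahead buffer
    for value in z_values:
        if pending is not None:
            # Another y followed, so the buffered one was not the last.
            yield pending, "xsd:string"
        pending = value
    if pending is not None:
        # End of x reached: the buffered value was the last y.
        if not _INTEGER.fullmatch(pending):
            raise ValueError(f"last z is not an integer: {pending!r}")
        yield pending, "xsd:integer"
```

Note that a value such as "1" is reported as xsd:string when another "y" follows and as xsd:integer when none does, which is exactly why a handler driven by SAX events has to delay its answer.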
So can we say that in TREX a type is a path through the grammar?
> It depends how you restrict the grammar. If you restrict the grammar as
> much as W3C's schemas, type assignment is significantly simpler than
> validation (since I believe I am correct in saying that for W3C schemas
> the type of an element depends only on its name and the names of its
For XML Schemas, it depends on what you call type. "Nullability" (or
whatever it is called, gawd don't get me started) is spoken of as a "property"
of elements not of "types". And xsi:type can override the type to a
> There are many
> applications for which type-assignment is not necessary; I think
> dispatching on the "FQGI" (ie on the name of the element and the names
> of its ancestor elements) is sufficient for many applications.
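
As a rough illustration of dispatching on the FQGI, the sketch below keeps a stack of open element names while consuming SAX events and looks up a handler keyed by the joined ancestor path. The class name FqgiDispatcher, the dispatch helper and the "/"-joined key format are assumptions made for this example, not part of TREX or any schema language.

```python
import xml.sax


class FqgiDispatcher(xml.sax.ContentHandler):
    """Call a handler chosen by the element name plus its ancestors' names."""

    def __init__(self, handlers):
        super().__init__()
        self.handlers = handlers  # maps "x/y"-style keys to callables
        self.stack = []           # names of the currently open elements
        self.results = []

    def startElement(self, name, attrs):
        self.stack.append(name)
        fqgi = "/".join(self.stack)
        handler = self.handlers.get(fqgi)
        if handler is not None:
            self.results.append(handler(fqgi, dict(attrs.items())))

    def endElement(self, name):
        self.stack.pop()


def dispatch(xml_text, handlers):
    """Parse xml_text and return the values produced by matching handlers."""
    dispatcher = FqgiDispatcher(handlers)
    xml.sax.parseString(xml_text.encode("utf-8"), dispatcher)
    return dispatcher.results
```

For instance, registering separate handlers under "x/y" and "x/w/y" treats the two "y" elements differently without any lookahead or type assignment, since the key is known as soon as the start tag arrives.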
I think there are three issues: one is how specifically we can identify
elements or attributes in context, the second is what abstractions we use to
express them, the third is what side-effects the abstractions have. For
example, DTDs have just parent, sibling, group identification; these are
encoded using the grammar abstraction; the grammar abstraction forces us to
decide issues of order, belonging and parent-to-child occurrence even when
they are not relevant.
I wonder, given the existence, deployment and suitability of James' XPath,
why we need to settle for "sufficient for many applications". Smart readers
of XML-DEV will of course say "oh, but probably you can express things in
content models that you cannot express in paths and rules" but I have my
doubts: a really complex content model is IMHO often (always) either the
sign of struggling against the grammar or a kind of tag omission: if there
is some complex structure there, why isn't it explicitly labelled for all
the world to see?
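
A small sketch of what "paths and rules" could look like in practice: each rule pairs a context path with a test that must hold for every node the path selects. It uses the limited XPath subset in Python's xml.etree.ElementTree rather than full XPath, and the rule format, the check function and the sample rules are all hypothetical, chosen only to echo the "x", "y", "z" example earlier in the thread.

```python
import re
import xml.etree.ElementTree as ET

# Simplified lexical form for an integer (assumption: sign plus ASCII digits).
_INTEGER = re.compile(r"[+-]?[0-9]+")


def _last_y_is_integer(x):
    ys = x.findall("y")
    return not ys or _INTEGER.fullmatch(ys[-1].get("z", "")) is not None


# Each rule: (context path, test on the context node, message on failure).
RULES = [
    (".//x",
     lambda x: all(y.get("z") is not None for y in x.findall("y")),
     "every y in x needs a z attribute"),
    (".//x", _last_y_is_integer, "the last y in x needs an integer z"),
]


def check(xml_text, rules=RULES):
    """Return the messages of all rules that fail somewhere in the document."""
    root = ET.fromstring(xml_text)
    failures = []
    for context_path, test, message in rules:
        # Note: ".//x" selects descendants of the root, not the root itself.
        for node in root.findall(context_path):
            if not test(node):
                failures.append(message)
    return failures
```

Nothing here says anything about order or occurrence beyond what each rule states, which is the contrast with a grammar that forces those decisions everywhere.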