Re: Are we losing out because of grammars?
- From: Rick Jelliffe <ricko@allette.com.au>
- To: xml-dev@lists.xml.org
- Date: Thu, 01 Feb 2001 22:08:31 +0800
From: James Clark <jjc@jclark.com>
> If I'm in an "x" element and I get a "y" element with a "z" attribute
> that is a legal lexical representation of an integer, I can't tell
> whether to type that attribute as an "xsd:integer" or an "xsd:string"
> unless I look ahead and see whether it's the last "y" element in
> the "x". The TREX implementation works on a stream of SAX events, so
> this is a big complication.
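
As a rough illustration of the lookahead problem described above, here is a minimal Python sketch over SAX events. The rule it encodes (within an "x", the "z" of the last "y" is an xsd:integer and every earlier one is an xsd:string) is a hypothetical grammar made up for this example, and the names DeferredTyper and assign_types are likewise assumptions; none of this is TREX code. The point is only that no type can be reported when a "y" starts, so the values have to be buffered until the "x" closes.

```python
import xml.sax


class DeferredTyper(xml.sax.ContentHandler):
    """Types each y/@z only after the enclosing x has closed.

    Hypothetical rule (an assumption for illustration): the last y in an x
    has an integer z, every earlier y has a string z. Nested x elements are
    not handled in this sketch.
    """

    def __init__(self):
        super().__init__()
        self.pending = []  # z values seen in the current x, not yet typed
        self.typed = []    # (value, type) pairs, filled in at the end of x

    def startElement(self, name, attrs):
        if name == "x":
            self.pending = []
        elif name == "y" and "z" in attrs:
            # We cannot decide here: we do not yet know whether this
            # y is the last one in its x.
            self.pending.append(attrs["z"])

    def endElement(self, name):
        if name == "x":
            last = len(self.pending) - 1
            for index, value in enumerate(self.pending):
                kind = "xsd:integer" if index == last else "xsd:string"
                self.typed.append((value, kind))
            self.pending = []


def assign_types(document):
    """Parse a document string and return the deferred type assignments."""
    handler = DeferredTyper()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.typed
```

The buffer grows with the number of "y" elements in an "x", which is the cost a purely streaming implementation would prefer to avoid.
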
So can we say that in TREX a type is a path through the grammar?
> It depends how you restrict the grammar. If you restrict the grammar as
> much as W3C's schemas, type assignment is significantly simpler than
> validation (since I believe I am correct in saying that for W3C schemas
> the type of an element depends only on its name and the names of its
> parents).
For XML Schemas, it depends on what you call a type. "Nullability" (or
whatever it is called, gawd don't get me started) is spoken of as a "property"
of elements, not of "types". And xsi:type can override the type with a
compatible one.
> There are many
> applications for which type-assignment is not necessary; I think
> dispatching on the "FQGI" (ie on the name of the element and the names
> of its ancestor elements) is sufficient for many applications.
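
Dispatching on the "FQGI" as described above needs no lookahead at all, only a stack of open element names. The following is a minimal Python sketch under that reading; the class name FqgiDispatcher, the dispatch helper and the "a/b/c" key format are assumptions made for this example, not part of any existing tool.

```python
import xml.sax


class FqgiDispatcher(xml.sax.ContentHandler):
    """Calls a handler chosen by the element name plus its ancestors' names."""

    def __init__(self, handlers):
        super().__init__()
        self.handlers = handlers  # maps "root/child/leaf" to a callable
        self.stack = []           # names of the currently open elements
        self.results = []         # whatever the handlers return

    def startElement(self, name, attrs):
        self.stack.append(name)
        fqgi = "/".join(self.stack)
        handler = self.handlers.get(fqgi)
        if handler is not None:
            # The decision is made from the stack alone, as soon as the
            # start tag arrives.
            self.results.append(handler(dict(attrs.items())))

    def endElement(self, name):
        self.stack.pop()


def dispatch(document, handlers):
    """Parse a document string and return the collected handler results."""
    dispatcher = FqgiDispatcher(handlers)
    xml.sax.parseString(document.encode("utf-8"), dispatcher)
    return dispatcher.results
```

Here the same element name, say "title", can be routed differently under "book" and under "book/chapter", which is the kind of context-sensitivity the quoted paragraph has in mind.
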
I think there are three issues: one is how specifically we can identify
elements or attributes in context, the second is what abstractions we use to
express them, the third is what side-effects the abstractions have. For
example, DTDs have just parent, sibling, group identification; these are
encoded using the grammar abstraction; the grammar abstraction forces us to
decide issues of order, belonging and parent-to-child occurrence even when
they are not relevant.
I wonder, given the existence, deployment and suitability of James' XPath,
why we need to settle for "sufficient for many applications". Smart readers
of XML-DEV will of course say "oh, but probably you can express things in
content models that you cannot express in paths and rules" but I have my
doubts: a really complex content model is IMHO often (always) either the
sign of struggling against the grammar or a kind of tag omission: if there
is some complex structure there, why isn't it explicitly labelled for all
the world to see?
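
To make "paths and rules" a little more concrete, here is a small Python sketch that states constraints as (path, test, message) triples rather than as a content model. It uses the limited XPath subset in Python's xml.etree.ElementTree, and the element names (section, title, list, item) and the RULES table are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Each rule: a path selecting the elements in context, a test applied to
# each selected element, and a message reported when the test fails.
# The vocabulary here is made up for the example.
RULES = [
    (".//section", lambda e: e.find("title") is not None,
     "section needs a title"),
    (".//list", lambda e: len(e.findall("item")) >= 1,
     "list needs at least one item"),
]


def check(document, rules=RULES):
    """Return the messages of all rules that fail somewhere in the document."""
    root = ET.fromstring(document)
    failures = []
    for path, test, message in rules:
        # ".//name" matches descendants of the root, not the root itself.
        for element in root.findall(path):
            if not test(element):
                failures.append(message)
    return failures
```

Note that neither rule says anything about order or about what else a section or list may contain; those questions simply do not arise unless a rule asks them.
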
Cheers
Rick Jelliffe