xml-dev - Re: [xml-dev] Names As Types

To: "XML Developers List" <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] Names As Types
From: "Rick Jelliffe" <rjelliffe@allette.com.au>
Date: Thu, 1 Sep 2005 14:23:40 +1000 (EST)
Importance: Normal
In-reply-to: <4315FAEC.8020909@objfac.com>
References: <15725CF6AFE2F34DB8A5B4770B7334EE07207337@hq1.pcmail.ingr.com> <4315FAEC.8020909@objfac.com>
User-agent: SquirrelMail/1.4.2
Bob Foster said:
> I prefer to view the stack as an analogy to the OSI networking model,
> i.e., not just a conceptual model but also a service model. This makes
> it meaningful to speak in terms of a "layer 4 service", "layer 7
> service", etc. When you look at it this way, the obvious questions about
> each layer are:
>
> - does this layer provide a useful service to the layer it serves?
> - does this layer provide a meaningful application service (omitting the
> higher layers)?

I am not against the conceptual layers being directly useful. But what
I want to avoid is a situation where external text entities and
xml:include are handled by different layers. Merely describing the
upcoming status quo (IYKWIM), where all the layers implicit in XML are
duplicated by external "DTD replacement" specifications (e.g. a layer
for XML entities, then later a layer for xml:include, and so on), gives
little benefit.

Indeed, you might say that my stack (at the low levels) is a conceptual
stack when DTDs are in use but more like an operational stack when DTDs
are not used, and is a conceptual stack (at the high levels) when XSD
validation is being performed, but more like an operational stack when
XSD validation is not being performed.

> In that light, layers 5 and 6 need to be switched. XLink depends on
> namespaces, not just special names like xml:space.

I didn't have any special layer for doing namespace binding: it could
be part of any stage that needs it. Instead, I think the stack should
be more concerned with unifying and exposing the layers that make
*use* of standard W3C infrastructure namespaces (e.g. xlink, xsi).

Layer 6 mentions "non-standard namespaces" because the standard ones
are dealt with by other layers. I have no objection to swapping
layer 5 and 6, except that I tend to think of property lists on
data as somehow being more primitive than property lists on container
objects. (The sequential numbers do not necessarily imply that
one layer is always built on its previous neighbour, but obviously
it would be gratuitous to ignore that frequently they are.)

Another view may be that property lists on names (e.g. namespaces)
represent a distinct thing to property lists on
elements/attributes/PIs/comments, which in turn are distinct from
property lists (e.g. datatypes) on values. I.e. 3 rather than 2 layers
for those properties.


> ID should be in the data augmentation layer.

IDs have three aspects. They use name syntax (property-augmented data
layer), they are unique (belonging to the status layer??), and they are
link ends (belonging to the link layer). Because the layers represent
responsibilities rather than technologies, a full explanation of the
stack would have xml:id in multiple places.
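
As a rough illustration of those three responsibilities, here is a
minimal Python sketch that checks each one separately. The function
name check_ids, the "ref" attribute used for link ends, and the
ASCII-only name pattern are assumptions made for the example; the real
NCName production allows many more characters, and nothing here is
taken from any spec's processing model.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree reports xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Assumption: a simplified, ASCII-only stand-in for the NCName production.
SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def check_ids(xml_text, ref_attr="ref"):
    """Report ID problems grouped by the responsibility they belong to."""
    root = ET.fromstring(xml_text)
    ids = [el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None]
    refs = [el.get(ref_attr) for el in root.iter() if el.get(ref_attr) is not None]
    return {
        # Name syntax: a property of the value itself.
        "bad_syntax": [i for i in ids if not SIMPLE_NAME.fullmatch(i)],
        # Uniqueness: a document-wide status/validity concern.
        "duplicates": sorted({i for i in ids if ids.count(i) > 1}),
        # Link ends: every reference should resolve to some ID.
        "dangling": [r for r in refs if r not in ids],
    }
```

Each key in the result corresponds to a different layer, which is the
point: one attribute, three places in the stack.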

> I question whether an uncomposed tree is useful to any layer or that the
> composition layer should conceptually operate in terms of the tree
> layer, so I would switch layers 3 and 4, as well.

Syntax-aware editors operate at the text=data+markup level. My company's
editor for example.

As for switching layers 3 & 4: because of XML's WF rule we can indeed
think of entity resolution as an operation linking trees, i.e. as a form
of linking rather than as macro expansion on text. I have long held that
XML should build in the standard W3C/ISO entity sets, even if only as
default declarations; they belong at the parsing level (text ->
data+markup).

> Then, I question whether the tree layer with raw xmlns declarations is a
> desirable service. So I would move the augmented names layer before the
> tree layer. I can't see any reason not to put it there, as augmented
> names depend only on composed markup.



> xml:space="strip" seems to me to be a markup layer operation.

It could be. I would see the markup layer as being the raw parse:
acting at the delimiter and token level. Determining that a
piece of non-markup whitespace was significant belongs to a
subsequent layer.  But obviously there are many ways of cutting the
cake: I think we (err, I suppose I mean W3C in the first instance)
need a standard well thought out XML stack to allow all the little
bits they are developing to be glued onto mainstream usage: the
W3C Core WG has or had some kind of XML Processing Model task
in response to this problem.  (Liam, if you are reading this:
what is the current status of that work?)

I think a responsibility-organized stack model would be a really
useful way to move forward.

For example, there is a lot of confusion about character encodings
in XML. You often hear "XML is broken because it barfs when encodings
are not properly labelled or used", which is like saying the human
digestive system is broken because people die when fed enough arsenic.
In my stack, the issue becomes much clearer: if you are dealing in
bytes then you must have an implementation of layer 1; if you
are dealing in Unicode characters then layer 1 issues are irrelevant.
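
To make the point concrete, here is a small Python sketch, assuming
layer 1 is nothing more than "bytes plus an encoding label in,
characters out". The function names layer1_decode and text_of are
invented for the example.

```python
import codecs
import xml.etree.ElementTree as ET


def layer1_decode(raw: bytes, label: str) -> str:
    """Layer 1: bytes plus an encoding label become Unicode characters."""
    codecs.lookup(label)      # LookupError if the label is unknown
    return raw.decode(label)  # UnicodeDecodeError if bytes and label disagree


def text_of(chars: str) -> str:
    """A consumer that starts from characters; no layer 1 work is needed."""
    return ET.fromstring(chars).text or ""
```

A document that arrives as mislabelled bytes fails in layer1_decode,
which is the layer doing its job; a document that already exists as
characters goes straight to text_of and never meets the problem.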

That is a difference between an operational model and a responsibility
stack, I suppose: if you view the stack as an operational model then
you think of starting at layer 1 and then proceeding through all the
other layers in sequence until the end, perhaps memoizing useful
things. In a responsibility kind of model, a document may only exist
at a certain level (or at some levels): to get a view of the document
set at the Status Layer may involve work (e.g. validation); similarly,
to get a view of the document at the byte level may also involve work
(e.g. serialization to a particular encoding). The layers are not just
passive or operational views; some of them may not be available at all,
IYSWIM.
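
One way to picture this is a document object that holds whichever views
happen to exist and does work only when another level is asked for. The
Python below is a sketch under that assumption; the class name Document
and the three level names are made up for the example, and a "status"
view is deliberately left unavailable.

```python
import xml.etree.ElementTree as ET


class Document:
    """Holds a document at the levels that exist; other levels cost work."""

    def __init__(self, *, raw=None, chars=None, tree=None):
        given = {"bytes": raw, "chars": chars, "tree": tree}
        self._views = {k: v for k, v in given.items() if v is not None}

    def available(self):
        """Levels that can be handed over without doing any work."""
        return sorted(self._views)

    def view(self, level, encoding="utf-8"):
        if level in self._views:
            return self._views[level]
        if level == "chars":
            if "bytes" in self._views:
                result = self._views["bytes"].decode(encoding)
            elif "tree" in self._views:
                result = ET.tostring(self._views["tree"], encoding="unicode")
            else:
                raise LookupError("document exists at no level")
        elif level == "bytes":
            # Going "down" the stack is work too: serialization.
            result = self.view("chars", encoding).encode(encoding)
        elif level == "tree":
            result = ET.fromstring(self.view("chars", encoding))
        else:
            raise LookupError(f"no view available at level {level!r}")
        self._views[level] = result  # keep it in case it is useful later
        return result
```

A document born in a database might start with only the tree view; the
byte view is not "earlier", it is just another view that takes work to
produce.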

Cheers
Rick Jelliffe

> BTW, no matter how in bed the implementations of TCP and IP are, IP is a
> distinct service that is used, for example, by UDP.

Sure, but as Internet pioneer Carl Malamud pointed out in his early 90s
book on layers and stacks (in which he posited that the internet was
really based on a 4 layer stack), TCP is traditionally tightly coupled to
IP to the extent that speaking of TCP as if it could be grafted on top
of some other layer is not productive. XML has this issue too, I think:
the standard recommendations are couched in terms of an operational model
that is only used sometimes. Bytes->characters->infoset->DOM->PSVI or
whatever. Whereas, e.g. with databases, we might start with a PSVI view.

At a certain point, the sequential parsing/validation operational model
implicit in the recommendations becomes a stumbling block. Now I am
not saying that this model should be abandoned for specs: it provides the
simplifying assumptions which have allowed the whole XML phenomenon to
flourish where so many other efforts have failed. (In the same way, the
database community's exclusion of data exchange and representation has
provided simplifying assumptions for them to make real standards progress.)

I think we have reached that point: something like XQuery (or its data
model) is utterly unrelated to the world of data exchange, encapsulated
best/minimal practice from the markup community and "SGML on the Web"
which informed XML's development. But both XML and XQuery can be well
understood using the kind of responsibility-stack model I suggest, and I
think other specs would fit in better with XML and XQuery by being
developed or recast in terms of this kind of stack.