RE: Enlightenment via avoiding the T-word

From: "Fuchs, Matthew" <matthew.fuchs@commerceone.com>
To: 'Tim Bray' <tbray@textuality.com>, xml-dev@lists.xml.org
Date: Mon, 27 Aug 2001 18:48:40 -0700
Yes, we seem to be getting closer.  When we were trying to get the formal
description (XSFD) [1] started, there was simply no way the T-word could
bear all the overloading it got, so we ended up calling our things "sorts",
as per the logic community rather than T#&$s.  By the way, Tim, did you
reread the definitions of local and global in XSDL, so we're on the same
page as to what they are?

While I know that XSFD is not for everyone, an examination of the discussion
of normalized names near the beginning might be worthwhile.

[1] http://www.w3.org/TR/xmlschema-formal/

Many more comments below....
> -----Original Message-----
> From: Tim Bray [mailto:tbray@textuality.com]
> Sent: Monday, August 27, 2001 5:22 PM
> To: xml-dev@lists.xml.org
> Subject: RE: Enlightenment via avoiding the T-word
> 
> 
> At 03:34 PM 27/08/01 -0700, Fuchs, Matthew wrote:
> 
> Hey, I think Matt and I are actually communicating now!  See what
> happens when you banish the T-word?
> 
> At this point, a review of 
>  http://www.enm.bris.ac.uk/teaching/enbwp/emat107slides15.pdf
> may be helpful to those who've forgotten about "bijective"
> and "injective" and so on.  A bijective mapping between sets A 
> and B is one-to-one and onto, i.e. you can pair the elements 
> off and none of them are in more than one pair, and every element 
> of both is in a pair.  An injective mapping is the same, only 
> you don't have to use all the elements of B.
> 
> >XSDL, however, is supposed to be a validation mechanism that supports
> >Namespaces, as DTDs don't.  However, just as DTDs provide a mapping between
> >labels and definitions, one would hope that XSDL would provide a bijection
> >between ulabels and definitions - at least for elements.  Of course, this
> >wouldn't exhaust the semantics that could be applied to structures, but at
> >least within the context of working with XSDL, and for the benefit of people
> >trying to use XSDL for work, it is highly desirable.  In other words, once
> >a document has been validated, every element should have a ulabel, and from
> >the ulabel alone you should be able to establish which definition applies to
> >that element within the context of XSDL validation.
> 
> Now here I *think* we have the heart of the disagreement.  Matt's core
> point is that in schemas, unlike DTDs, the label or the ulabel as it 
> appears in the doc doesn't of itself give you the definition. 

Ulabels don't appear in the (source) doc.  They are synthesized at one level
or other of mapping.  But that's a quibble.

>  Question: 
> why is this a problem?  Over to Matt:
> 
> >  It just means that if people go through the (currently
> >not insignificant) effort to use XSDL, that's one of the important pieces of
> >information XSDL should give them.  And, ta da!, it doesn't do that.  There
> >is no injective mapping from ulabels to definitions.  A ulabel maps to a set
> >of definitions, and if you want to know which is the "true" definition, you
> >need to either reparse the surrounding element(s) or hope there's a PSVI
> >available (not insignificant overhead for many processors) so you can do
> >pointer comparisons, or whatnot.
> 
> So the problem isn't that you can't find the appropriate schema
> definition, it's that doing so can be expensive (I agree).  Since 
> nobody is arguing that it's a bad idea to have context-sensitive 
> content models, the problem is ensuring that you only have to look 
> up the context - the computationally expensive part - once.  Here's 
> a strawman "Plan B".  Suppose that as a result of XSDL validation,
> you attached to each element in the input document an XPath 
> expression that points you to the applicable rule in the schema 
> document.  <pedantry>It's not an injection on the element labels, 
> it's an injection on the XPath expressions.</pedantry>
> 

_Very_ close!  My what a good student! ;-)  However,
1) XSDL does not require the existence of a single schema document, or even
of any schema document.  The schema is actually the set of components
describing the constructs of the schema, which doesn't necessarily ever
exist as a document.
2) The paths should be "canonical" in that we should be able to compare two
paths and determine they indicate the same thing, without actually needing
to check the document.

So, what I've actually been advocating instead are the normalized names (now
generally referred to as NUNs [normalized universal names]), which follow
the logical structure of the Schema, rather than the syntax.  Naturally they
represent a path, but a path in the logical tree.  XSDL currently has no
means to support these.
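
To make point 2 concrete, here is a rough sketch (Python, purely for
illustration) of what "canonical" buys you.  The notation it prints is made up
for this example and is NOT the normalized-name syntax from XSFD; the namespace
URI and the element names are likewise hypothetical.  The only claim is that a
name built from the logical path of components can be compared as a plain
string, with no schema document in sight.

```python
def normalized_name(namespace, steps):
    """Build a canonical name from a namespace and a path through the
    logical component tree.

    steps is a sequence of (kind, name) pairs, e.g. ("element", "order").
    The '#', '/' and '::' separators are an invented notation, not XSFD's.
    """
    return namespace + "#" + "/".join(f"{kind}::{name}" for kind, name in steps)


NS = "urn:example:po"  # hypothetical target namespace

# Two different local declarations that share the label "name".
order_name = normalized_name(NS, [("element", "order"), ("element", "name")])
customer_name = normalized_name(NS, [("element", "customer"), ("element", "name")])

# The same logical path, built a second time (say, by another tool).
order_name_again = normalized_name(NS, [("element", "order"), ("element", "name")])

# Same path, same string; different path, different string.
print(order_name == order_name_again)
print(order_name == customer_name)
```

Because the name follows the logical structure rather than the syntax of any
particular schema document, it does not matter whether such a document exists.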

> >However, if local elements (as described
> >in XSDL) are not given ulabels (are unqualified), then you still get an
> >injective mapping, and the hope of fixing things later (by retrofitting in
> >the least disruptive way).
> 
> Huh?  An injective mapping on the combination of the label and 
> its parent (or some ancestor) ulabel, right?  But (pardon me being
> dense) I just don't see that the problem is made either worse or
> better by whether or not the local elements are in a namespace.
> Boy, this is making me feel stupid, my only consolation is the
> feeling that others are equally puzzled.  Do it in words of
> one syllable, Matt, and we'll eventually get it.
> 

Sorry.  In the DTD world and the Schema world w/o local elements, for each
valid element there is one and only one definition to which it corresponds.
Once it's ulabeled, the ulabel is like a pointer - I can use it to locate
any additional information I need.  And in addition, even if I don't
validate, the ulabels tell me what they should have been, so you can process
my valid document according to the labels I give even if you don't also
validate.  (That, of course, is the source of Murata Makoto's objection to
attribute defaults, etc.)  If I keep local elements out of a namespace (for
now), I still have this one-to-one map for everything that has a ulabel, and
if something doesn't have a ulabel, at least I know it's an exception as
soon as I see it and handle it accordingly.  If I put local elements in the
schema namespace, I lose the one-to-one map for _everything_, so I always
need to check context.  What do I gain?
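
The same argument as a small sketch, in Python for illustration only.  The
namespace, the element names and the "def:..." identifiers are all invented;
a ulabel is modelled as a (namespace, local name) pair, and each declaration
simply records which definition a ulabel was declared with.

```python
from collections import defaultdict


def definitions_by_ulabel(declarations):
    """Group definition ids by the ulabel they are declared under."""
    table = defaultdict(set)
    for ulabel, definition in declarations:
        table[ulabel].add(definition)
    return dict(table)


def ambiguous_ulabels(declarations):
    """Return only the ulabels that map to more than one definition."""
    grouped = definitions_by_ulabel(declarations)
    return {u: defs for u, defs in grouped.items() if len(defs) > 1}


NS = "urn:example:po"  # hypothetical schema namespace

# Local elements left unqualified: only global elements carry a ulabel.
unqualified = [
    ((NS, "order"), "def:order"),
    ((NS, "customer"), "def:customer"),
]

# Local elements pushed into the schema namespace: two different local
# "name" declarations now share one ulabel.
qualified = unqualified + [
    ((NS, "name"), "def:order/name"),
    ((NS, "name"), "def:customer/name"),
]

print(ambiguous_ulabels(unqualified))  # nothing ambiguous
print(ambiguous_ulabels(qualified))    # (NS, "name") has two definitions
```

In the first case a ulabel works as a pointer.  In the second, seeing
{urn:example:po}name tells you nothing until you look at the parent.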

Now me, I want a ulabeling mechanism for XSDL that does the same thing,
including for local elements.  My objection to just using the path
expressions (or my NUNs) is that they require validation to appear, so it's
hard for you to process my document unless you validate.  I want a ulabeling
mechanism that allows me to tell you, as with namespaces, what the unique
ulabels are.  You can munge them all you want after that, as Simon wants to,
but at least you know what I meant.

Suppose we do build the right mechanism for XSDL.  What do we do with all
the existing schemas?  If we've gone and shoved all the local elements into
the schema namespace, updating to a better mechanism will be very difficult.
However, if we don't, we may be able to upgrade existing instances by
"reinterpreting" with upgraded semantics.

> >An appropriate ulabeling mechanism for XSDL would provide an injective
> >mapping for its own purposes. 
> 
> Would the XPath trick qualify?  Because there are a *lot* of 
> philosophical problems with actually changing the element labels 
> to accomplish this objective.  Keepa yo hands offa my labels, geekboy!
> 

Remember, this is only in the context of XSDL and how XSDL users assign
labels.  And, of course, the label is just the NUN of the element from the
schema, so if you use schema, it would be the label _you_ created.  If we
consider XSDL complexT%^$s as creating namespaces, then they're namespaces
_you_ created using the facilities of XSDL.

> >   The lack of an injective
> >mapping is, in my opinion, a major issue.  
> 
> It doesn't bother me at all... maybe a virtual show of hands
> around here would be useful at this point?  Do most people care?
> 

To start, people might actually try working with XSDL local types before
voting.  I realize that's not necessarily the American Way, but it might
help.  Arguments I have in favor include:

1) Otherwise there is no way to create a standalone document valid against a
schema with local names - i.e., it would extend well-formedness to handle
XSDL
2) From my experience, it will significantly simplify building tools that
operate on schemata, such as class generators, schema repositories, editors,
etc.  While most readers may never build such things, they'll probably want
to use some of them at some point, so there's no point in their going out of
the way to make those tools harder to build and use.

> >  And in the meantime, if someone tells you, as an XSDL user, to just
> >put all your elements in the schema namespace, just say no.
> 
> Put them all in *some* namespace, I say.  And don't rely on a
> schema processor to do it for you. -T

Knee jerk reactionary ;-)  It's not true that any namespace is better than
no namespace.  The right namespace is better than no namespace.

Matthew