RE: Namespaces, schemas, Simon's filters.

From: "Fuchs, Matthew" <matthew.fuchs@commerceone.com>
To: 'Tim Bray' <tbray@textuality.com>, xml-dev@lists.xml.org
Date: Fri, 24 Aug 2001 12:59:03 -0700
Tim,

I generally agree with your argument here wholeheartedly, and hope we can
get to a situation that is as you describe.  We are not there with local
elements, but we can get there.  Just as namespaces required a little extra
machinery, so do local elements, and until we get there, I think it is
unwise to aggressively apply our current machinery in ways that may be both
deceptive and probably not easily backwards compatible with an appropriate
solution.  So, for example, the XML 1.0 spec was changed at the last minute
to allow colons in names because we knew namespaces were coming, and there
was a general consensus (opposed by many, some of whom have since recanted :))
that it would need a new mechanism, and could not simply be shoe-horned into
what was there (you undoubtedly remember all the arguments for adapting
architectural forms).  

So another aspect of my position is what I'd call the Hippocratic oath of
the specification designer - when in doubt, try to pick the solution which
leaves the most possibilities for later, better informed, evolution (this
oath was obviously never administered to the Schema WG).  The current
namespace mechanism, as provided by the XSDL and the NS rec, is, it seems
quite clear to me, insufficient for supporting local elements.  There will
need to be a resolution to this, either in a future version of XSDL, or in a
revision of the NS rec, or both.  In the meantime, in order to do the least
harm to future solutions, I advocate completely excising local elements from
the namespace mechanism by applying Andrew Layman's Wittgensteinian
interpretation of unqualified as meaning out of the scope of the namespace
rec. entirely.  This has two beneficial aspects:

1) Given an instance document with local elements that "makes sense" under
the default rules (which covers any instance that doesn't use
noNamespaceSchemaLocation) we should be able to retrofit any eventual
solution by tweaking the default namespaces in the document - which might
not even require touching the document.  If we aggressively put them in the
namespace of the surrounding schema and then decide that wasn't a great
idea, the changes are much greater, as we'd need to munge all the prefixes
or add new, overriding namespace attributes everywhere.

2) As I agree with A1 and A2, I'd hate to contaminate those things that are
"unambiguously and completely labeled with minimal reliance on context" with
those things that are not.  Under the default rules, only those things which
can be "unambiguously and completely labeled with minimal reliance on
context" by the current namespace mechanism are in a namespace.  Those
things which cannot be so labeled are not in a namespace.  So, in the second
example below (and I'd note that your examples are also excellent arguments
against ever using default namespaces, as those also rely on context and can
get totally screwed up if someone slips in an element that changes the
default namespace), let us distinguish the two cases of:

  i) <myNS:x>  <y id="1340975"> .. </y>   </myNS:x>
 ii) <myNS:x>  <myNS:y id="1340975"> .. </myNS:y>   </myNS:x>

where someone references the y by id.  

You claim, for case i, that "Software processing such a [link] is not in
general going to have any way of knowing that this element really needs to
be processed in the context of the containing <myNS:x>."  However, under the
default rules for local elements, it is clear that y cannot be a global
element because it is not qualified, and therefore some contextual
information is required to uniquely identify it (which the software may or
may not be able to do - but at least it knows).  For case ii, it is clear
that myNS:y is a global element - the one element named "y" in the myNS
namespace and therefore can be unambiguously located without context
information.  So if you have an existing system which relies on namespaces
to identify elements, it already knows when it is out of its league - it
breaks reliably.  If you have a system which knows about local elements,
then it can immediately distinguish the two cases.
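
To make the distinction concrete, here is a minimal sketch in Python of how
a local-element-aware consumer might tell the two cases apart.  The
namespace URI and the function name are made up for illustration, and "id"
is matched as a plain attribute rather than as a DTD- or schema-typed ID.

```python
import xml.etree.ElementTree as ET

MY_NS = "urn:example:myNS"  # hypothetical URI bound to the myNS prefix

CASE_I = f'<myNS:x xmlns:myNS="{MY_NS}"><y id="1340975">..</y></myNS:x>'
CASE_II = (
    f'<myNS:x xmlns:myNS="{MY_NS}">'
    '<myNS:y id="1340975">..</myNS:y></myNS:x>'
)


def classify_by_id(xml_text, id_value):
    """Locate the element carrying id_value and report whether its name
    is namespace-qualified (treated here as global) or not (local)."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.get("id") == id_value:
            # ElementTree spells qualified names as "{uri}localname".
            if elem.tag.startswith("{"):
                uri, local = elem.tag[1:].split("}", 1)
                return ("global", uri, local)
            # Unqualified: the name alone does not identify the element,
            # so the caller knows it must consult the context.
            return ("local", None, elem.tag)
    return None


print(classify_by_id(CASE_I, "1340975"))
print(classify_by_id(CASE_II, "1340975"))
```

Under the default rules the answer falls out of the name itself: the
unqualified y is flagged as needing context, and the qualified myNS:y is
not.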

If, instead, we aggressively put local elements in the schema namespace,
then only case ii occurs.  Existing software that doesn't know about local
elements will see myNS:y and will assume it must be global, because that is
the only thing it knows about, and will break if it turns out to be local,
or will give you the wrong element if the same local name is reused in the
schema, or just give the wrong error - it breaks _unreliably_.  Software
that is aware of local elements, however, would always need to look at
context anyway, because just seeing myNS:y doesn't tell you which "y"
element it is, so you need to assume the more complex case, and would always
need to look at the enclosing "myNS:x".  
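
A rough sketch of that situation, again in Python: the document below is a
hypothetical instance of a schema that reuses the local name "y" under two
different parents, with the locals qualified.  The namespace URI, element
names, and helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

MY_NS = "urn:example:myNS"  # hypothetical URI bound to the myNS prefix

# Two different local "y" declarations, both placed in the schema namespace.
DOC = (
    f'<myNS:root xmlns:myNS="{MY_NS}">'
    '<myNS:x><myNS:y id="a">..</myNS:y></myNS:x>'
    '<myNS:z><myNS:y id="b">..</myNS:y></myNS:z>'
    '</myNS:root>'
)


def names_with_context(xml_text):
    """Map each id value to (parent tag, element tag)."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = {}
    for elem in root.iter():
        ident = elem.get("id")
        if ident is not None:
            result[ident] = (parent_of[elem].tag, elem.tag)
    return result


for ident, (parent, name) in names_with_context(DOC).items():
    print(ident, parent, name)
```

Both y elements come back with the identical expanded name, so the name
alone cannot say which declaration applies; only the parent tag tells them
apart.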

If you look at one of my early emails on this topic [1], I advocate, at the
very least, removing this issue from the schema and allowing the instance
author tools to "unambiguously" identify what is happening.

So, my general advice to developers is:
1) Understand what local elements really are
2) Use them wisely with the defaults
3) If you like them, loudly demand that the W3C provide a better solution

Matthew

[1] http://lists.xml.org/archives/xml-dev/200108/msg00661.html

> -----Original Message-----
> From: Tim Bray [mailto:tbray@textuality.com]
> Sent: Thursday, August 23, 2001 1:46 PM
> To: xml-dev@lists.xml.org
> Subject: RE: Namespaces, schemas, Simon's filters.
> 
> 
> At 10:56 AM 23/08/01 -0700, Fuchs, Matthew wrote:
> 
> >Actually, while I've argued as to why making local elements
> >unqualified is a good thing from the point of view of what local
> >elements are, no one has given a similar argument for why local
> >elements should be qualified.
> 
> Let's ask two questions:
> 
> Q1: Why would you use XML?  
> A1: One of the important reasons is so that you can re-use data for
> purposes other than those envisioned by its creator.  This is why,
> in the document space, XML is an unqualifiedly better storage 
> format than MS Word, Frame, PDF, or any other proprietary binary 
> display-oriented format.  A lot of the XML-as-serialized-objects
> people probably don't care that much about this, but I think 
> they're missing an important boat.  Computers are important 
> because they are *general-purpose* machines, and to the extent 
> that you can make data general-purpose as well, you win.
> 
> Q2: Why would you use namespaces?
> A2: One of the important reasons is so you can pull together data 
> objects from multiple sources without losing track of where the
> pieces come from.
> 
> If you believe A1 and A2, then it seems to me like you get 
> maximum re-usability and ability to mix-n-match if you've got
> everything unambiguously and completely labeled with minimal
> reliance on context.  
> 
> Let's make this concrete.  Suppose you have
> 
> <myNS:x>  <y> .. </y>   </myNS:x>
> 
> and you're counting on the software that processes <myNS:x> to 
> know how to deal with a child <y>.  If someone else comes
> along and slips in 
> 
>  <myNS:x> <html:a href="somewhere"> <y>...</y> </html:a> </myNS:x>
> 
> then the <y> looks a little lost, and 
> 
>  <myNS:x><html:a href="somewhere"><myNS:y>...</myNS:y>
>         </html:a></myNS:x>
> 
> has a better chance of being processed by myNS-sensitive software
> in a reasonable way downstream.
> 
> Also, suppose you have 
> 
> <myNS:x>  <y id="1340975"> .. </y>   </myNS:x>
> 
> and that a hyperlink from somewhere entirely outside points at 
> <y> by the ID value.  Software processing such a link is not 
> in general going to have any way of knowing that this element really 
> needs to be processed in the context of the containing <myNS:x>.
> 
> I think that the above are good and reasonable things to do, and
> one of the Really Good Things about XML is that it opens the
> door to these kinds of practice.   I think that markup designed to be 
> robust in the face of general enrichment, manipulation, and
> hypertext is better than markup that isn't.   At a deep level.
> 
>  -Tim
> 
> PS: And since the title line mentions Simon's filters, they look
> to me like well-done software, but I continue to believe that
> they should never be used, since they (a) fly in the face of
> the author's intent as regards namespaces, and (b) they break
> schema validation.