OASIS Mailing List Archives

RE: Namespaces, schemas, Simon's filters.



Tim,

> -----Original Message-----
> From: Tim Bray [mailto:tbray@textuality.com]
> Sent: Sunday, August 19, 2001 12:48 PM
> To: Xml-Dev
> Subject: Namespaces, schemas, Simon's filters.
> 
> 
> Having spent some time reading this thread, I realized I 
> didn't understand either local types or Simon's filters.  
> As Ron Bourret said:
> 
> >> I think the real culprit here isn't whether local element 
> >> type names are
> >> qualified or not. It's that local element types exist at all. 
> 
> Upon further study, I think Ron's right.  For those who haven't 
> already, some study of 
> 
>   http://www.w3.org/TR/xmlschema-0/#NS
> 
> is pretty essential.   Pardon me for offering an explanation of
> something that will be painfully obvious to the students and 
> authors of XSD around this list, but *if* I understand the 
> motivation, it's this: people want to use schema X to validate 
> element Y, but they don't want element Y to be in a namespace 
> (even a defaulted namespace), they want the connection picked up 
> from the namespace of an ancestor element, and "local element 
> types" allow this.
> 

Uh, no, although that might be a tortured consequence.  The motivation is
the same as for any data definition language that's arisen in the last 40
years that has locally scoped names - the use of a particular name in a
particular context should not pollute the global context, but allow the same
name to be used for different things in different contexts.  Which, at the
semantic level, is exactly what namespaces were introduced for - let's not
confuse the goal with the particular syntax introduced to handle it.

So, to contrive an example beyond foo and bar (with the obvious danger that
people will spend their time trying to pick apart the example, rather than
use it to understand the goal), suppose I want to create a Schema to
coordinate graphics, music, and text.  So I create a <graphic> element, a
<music> element, and a <text> element and I have three different people
working on designing each of these elements.  Now, my graphics designer
decides he wants to have a <line> element to describe lines that will be
drawn.  Similarly, my musician decides he wants a <line> element to describe
a line of music (perhaps strophe would be better, but then perhaps I've
chosen one as ignorant of the subject matter as myself).  And, of course, my
writer also wants a <line> element for the line of text to be spoken at a
particular point.  With locally scoped elements (as in any modern
programming language), they can each do this without affecting their two
compatriots in any way.  In premodern languages, like Fortran and SGML (or
XML 1.0 pre namespaces), if anyone wants to create a <line> element it
impacts everyone else working on the schema.  There could even be a totally
unrelated <line> element at the global level which neither affects, nor is
affected by, the locally scoped ones.
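
To make the example concrete, here is a rough sketch of an instance in which
all three <line> elements coexist.  The namespace URI urn:example:foo, the
<foo:show> wrapper, and the content of each <line> are assumptions made up
for illustration; only <graphic>, <music>, <text>, and <line> come from the
example above.  It also assumes the default (unqualified) form for local
elements.

```xml
<?xml version="1.0"?>
<!-- Hypothetical wrapper element and namespace, for illustration only. -->
<foo:show xmlns:foo="urn:example:foo">
  <foo:graphic>
    <!-- A drawn line: the coordinates are an assumed content model. -->
    <line x1="0" y1="0" x2="10" y2="10"/>
  </foo:graphic>
  <foo:music>
    <!-- A line of music: a sequence of notes (assumed). -->
    <line><note>C4</note><note>E4</note></line>
  </foo:music>
  <foo:text>
    <!-- A line of spoken text. -->
    <line>To be, or not to be</line>
  </foo:text>
</foo:show>
```

Each <line> is interpreted only in the context of its enclosing element, so
the three designers never have to agree on a single definition.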

Part of what motivated this was endless declarations in DTDs for
<graphic.line>, <music.line>, <text.line>, and so on.  It got really
annoying when it was something like
<purchaseorder.lineitem.description.graphic.alternativetext>, or worse, as
the crude mechanisms available were trying to be used to enforce something
really simple.  (Note that we could have solved the namespace problem the
same way by requiring everyone to use names like
<hypertext_markup_language.p> to distinguish it from <dublin_core.p> or
whatever.)

Namespaces are simply the only syntactic mechanism we have to handle this.
One solution would be as follows:

- suppose the current schema looks like:

  <schema targetNamespace = "foo">
    <element name="music">
     <complexType>
      <sequence>
        <element name="line"...
     ...
    <element name="text">
     <complexType>
      <sequence>
        <element name="line"...

with locally scoped elements.

- then we could also break this into lots of pieces:

  <schema targetNamespace = "foo#music">
    <element name="line" .... final="#all"/>
    <complexType name="music">
      <sequence>
        <element ref="line"...
    ...
  </schema>

  <schema targetNamespace = "foo#text">
    <element name="line" .... final="#all"/>
    <complexType name="text">
      <sequence>
        <element ref="line"...
    ...
  </schema>
  <schema targetNamespace = "foo"
          xmlns:music="foo#music"
          xmlns:text="foo#text">
    <import namespace="foo#music"/>
    <import namespace="foo#text"/>
    <element name="music" type="music:music"/>
    <element name="text" type="text:text"/>
    ...
  </schema>

with elements in different namespaces, but the ability to have multiple
<line> elements.  

However, the instances would need to look like
<music><music:line>...</music:line></music>.  The _goal_ of local elements
is to support this degree of differentiation among <line> elements without
needing to explicitly break a schema into a huge number of little schemas,
each establishing individual namespaces.  Yes it can be misconstrued as
"syntactic sugar", but everything in programming is syntactic sugar on
machine language, anyway.
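
For comparison, a filled-in version of the single-schema approach might look
roughly like the sketch below.  The namespace URI urn:example:foo stands in
for "foo", and the content models of the two <line> elements (a list of
<note> elements, and a plain string) are assumptions, not something the
example above specifies.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:foo"
           elementFormDefault="unqualified">

  <!-- Global element: in the target namespace. -->
  <xs:element name="music">
    <xs:complexType>
      <xs:sequence>
        <!-- Local element: scoped to music, unqualified in instances. -->
        <xs:element name="line" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="note" type="xs:string"
                          maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Global element: in the target namespace. -->
  <xs:element name="text">
    <xs:complexType>
      <xs:sequence>
        <!-- A different local line, with a different type. -->
        <xs:element name="line" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance of <music> under this sketch needs no extra namespaces or
prefixes for its children:

```xml
<foo:music xmlns:foo="urn:example:foo">
  <line><note>C4</note><note>E4</note></line>
</foo:music>
```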

> At one level this seems like a reasonable thing to want to do: 
> "please use the following rules to validate no-namespace 
> elements whose type is Y and whose ancestor is myNS:Z."

As I hope you can see by now, the goal is "please recognize which elements
are locally scoped and process them within the semantics provided by their
enclosing global element".

> On the other hand, it does contravene the achingly-simple
> procedure for linking markup to software provided by 
> XML+namespaces: identify markup to software by putting it
> in a namespace.  Which feels pretty serious.  The default, 
> simple, obvious way of arranging for software module X to 
> process element Y is to advertise that X processes elements 
> which are in the NSx namespace.  

I hope you can see by now that this doesn't contravene this procedure at all,
which we've been using at Commerce One, in the presence of local names,
since the last millennium.  

> It's kind of troubling that 
> schema supports a non-interoperable, more complex, less 
> robust way of tying markup to software.  That this is its 
> default behavior is simply outrageous.  As it stands with 
> XSD, by default the way that schemas are tied to certain 
> classes of elements is non-interoperable with the way that 
> other software does it.  

Actually, the default behavior is _more_ interoperable than the alternative.
With the default behavior (assuming your schemas have namespaces), _all_
globally declared elements are in a namespace - only locally declared names
are not.  In my example above, <music>, <text>, and <graphic> are all in the
foo namespace, and if software module X says it processes elements in the
foo namespace, each of those will be handed to X when it shows up.  Now
suppose X is handed 
<foo:graphic><line>..</line></foo:graphic>.  Now how will X behave?

1) it understands the structure of <foo:graphic> and therefore knows how to
process its local elements - which it recognizes (in the default) by their
being unqualified.
2) if done by someone who understands principles of OO, X will dispatch the
<foo:graphic> element to some object that knows how to process it and
actually expects an unqualified <line>.
3) it doesn't necessarily understand <foo:graphic> separate from anything
else, but it knows how to handle locally scoped elements for elements in the
foo namespace and hence has no problem.
4) it doesn't understand how to process locally scoped elements, so when it
sees the unqualified line, it knows to throw an error or perform some other
appropriate action.

The alternative setting - qualifying the local elements - would make the
instance look like <foo:graphic><foo:line>..</foo:line></foo:graphic> and
would actually _require_ X to deeply understand the structure of
<foo:graphic>: when it sees <foo:line>, either it _knows_ this is a local
element, or it has to assume it is the global foo:line in the foo namespace
(if such exists) and then barf.
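
A sketch of that alternative, to show where the ambiguity comes from.  Again
urn:example:foo stands in for "foo", and the coordinate attributes and the
unrelated global <line> are assumptions added for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:foo"
           elementFormDefault="qualified">

  <!-- A global line, unrelated to the one inside graphic. -->
  <xs:element name="line" type="xs:string"/>

  <xs:element name="graphic">
    <xs:complexType>
      <xs:sequence>
        <!-- Local line: with form "qualified" it also appears in
             instances as foo:line, despite having a different type. -->
        <xs:element name="line">
          <xs:complexType>
            <xs:attribute name="x1" type="xs:decimal"/>
            <xs:attribute name="y1" type="xs:decimal"/>
            <xs:attribute name="x2" type="xs:decimal"/>
            <xs:attribute name="y2" type="xs:decimal"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance then looks like this, and nothing in the name foo:line says
whether the local or the global declaration is meant:

```xml
<foo:graphic xmlns:foo="urn:example:foo">
  <foo:line x1="0" y1="0" x2="10" y2="10"/>
</foo:graphic>
```

Only the position inside <foo:graphic> tells a processor which declaration
applies.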

So you see that the default setting is, to the greatest degree possible,
upwardly compatible with current practice.  Another approach is to compare
local elements with local attributes.  Local attributes are not in a
namespace, so local elements should not be either.

> I REALLY HOPE that my understanding
> of the motivation and effect is wrong and ask that someone
> more schema-savvy explain why this isn't as awful as it 
> looks.
> 

I hope this has improved your understanding of this feature.  I hope the
debate on local names will be less acrimonious than the debate on namespaces
was.

> On the other hand, I (and a lot of other people) declined
> to take part in the schema effort, and the default 
> assumption is that we have to live with it as specified.
> 

You, as someone who could perhaps have helped us develop a better product
faster, should probably be so punished.  However, more generally, you need
to learn what's going on so your (possibly correct) criticism is useful.

> Now, suppose I'm right.  If so, why are Simon's filters 
> ever a good idea?  The XML Schema Rec allows me to write
> rules so that in the following
> 
> <apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
>                    orderDate="1999-10-20">
>  <shipTo country="US">
>   <name>Alice Smith</name>
>   <street>123 Maple Street</street>
>   <!-- etc. -->
> 
> the "apo" schema is used to validate the <shipTo> element.
> Nowhere does it say that the "shipTo" element is or should
> be in the "apo" namespace.  Applying Simon's filter will
> put "shipTo" in the apo namespace.  This behavior totally
> flies in the face of XML+Namespaces as specified.  Also, if
> I read schemas right, it also won't schema-validate any more.  
> So why would this ever be a good idea?  -Tim

The truth is that W3C still needs to revisit namespaces, especially in the
context of XML Schema.  Part of the problem surrounding local names is that
the mechanisms currently provided don't support them very well.  But that
can be fixed.

By the way, take a look at the Schema Formal Description
(http://www.w3.org/TR/xmlschema-formal/) for a sufficient naming mechanism.

Matthew