Re: [xml-dev] My report on experiments with unused namespaces

On Tue, 21 Sep 2010 19:49:01 -0700, Ramkumar Menon wrote:
> for e.g. [
> <orderInfo xmlns:poid="1234" xmlns:description="sampleOrder"
> xmlns:numberOfItems="3"/>
> 
> How Lovely!
> ]

<snicker />

Beautiful!  Not only abuse of namespace declarations (but perfectly 
legal, of course), but also abuse of the laxity of definition of URI 
within W3C.

I can't imagine anyone actually doing this, can you?  A parse/write 
sequence could result in:

<orderInfo xmlns:ns0="sampleOrder" xmlns:ns1="3" xmlns:ns2="1234" />

or even:

<ns0:orderInfo xmlns:ns0="" xmlns="3" xmlns:ns1="1234" 
xmlns:ns2="sampleOrder" />

which are all informationally equivalent, per Namespaces in XML 
(because the prefix doesn't matter, in theory).
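As a rough illustration of "the prefix doesn't matter", here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The two documents and the urn:example:orders URI are made up for the example; the point is only that the parser keeps expanded names, and that a parse/write cycle is free to pick prefixes of its own.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that differ only in the prefix they chose.
doc_a = '<a:orderInfo xmlns:a="urn:example:orders"><a:item/></a:orderInfo>'
doc_b = '<zz:orderInfo xmlns:zz="urn:example:orders"><zz:item/></zz:orderInfo>'

root_a = ET.fromstring(doc_a)
root_b = ET.fromstring(doc_b)

# ElementTree stores expanded names ("{uri}local"), so the prefix is gone
# and the two trees carry identical element names.
same_names = [e.tag for e in root_a.iter()] == [e.tag for e in root_b.iter()]

# Serializing does not restore the original prefix; the writer invents one.
rewritten = ET.tostring(root_a, encoding="unicode")

print(same_names)
print(rewritten)
```

In CPython's implementation the rewritten document comes back with a generated prefix (ns0) in place of "a", which is exactly the parse/write behaviour described above.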

But it would have been more fun to have:

<orderInfo xmlns:poid="1234" xmlns:description="sampleOrder" 
xmlns:noi="3">
  <item noi:number="1">poid:23A12</item>
  <item noi:number="2">poid:45B23</item>
  <item noi:number="3">poid:67C98</item>
</orderInfo>

More abuse, more fun.  This is nearly incomprehensible, even though I 
just created it, but the idea is that the noi:number attribute 
identifies a line item, and the QName in each element content (if it is 
a QName ...) represents the item ID, which is thus strongly associated 
with the purchase order ID previously defined.

To respond to your question, asking what the conclusion is: it seems to 
me that Roger has created a term, "in-use namespace", which he wants to 
define precisely.  I probably ought to have stayed out of the 
discussion; it isn't likely to matter to me how this neologism is 
defined on this mailing list (it's unlikely to gain much currency, I 
imagine).  Having joined the discussion, I find myself unwilling to 
accept too facile a definition ... until, at last, in my previous post, 
I find that I want to know the purpose of defining this term, "in-use 
namespace."

We already have a number of terms defined around namespaces, and 
they're sufficiently distressingly difficult to communicate to folks 
with little experience that I find myself questioning whether "in-use" 
is a concept that has utility.  If the term is defined such that it 
includes use in element and attribute names and in element and 
attribute content, what have we gained?  If the term is defined such 
that it only includes use in element and attribute names, but not in 
content, what have we gained?
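To make the two candidate definitions concrete, the sketch below (Python, standard library only) computes the "unused" namespaces of the order example above under each one. The function name unused_namespaces is hypothetical; it treats any text or attribute value of the form prefix:rest as a QName, and it assumes no prefix is rebound deeper in the document. None of that comes from the thread.

```python
import io
import xml.etree.ElementTree as ET

ORDER = """<orderInfo xmlns:poid="1234" xmlns:description="sampleOrder"
xmlns:noi="3">
  <item noi:number="1">poid:23A12</item>
  <item noi:number="2">poid:45B23</item>
  <item noi:number="3">poid:67C98</item>
</orderInfo>"""

def unused_namespaces(xml_text):
    """Return (unused by names only, unused by names and content)."""
    declared = {}        # prefix -> URI (assumes prefixes are never rebound)
    in_names = set()     # URIs seen in element or attribute names
    in_content = set()   # URIs whose prefix shows up in text or attribute values
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "end")):
        if event == "start-ns":
            prefix, uri = payload
            declared[prefix] = uri
            continue
        elem = payload
        for name in [elem.tag, *elem.attrib]:
            if name.startswith("{"):
                in_names.add(name[1:].partition("}")[0])
        for value in [elem.text or "", *elem.attrib.values()]:
            prefix, colon, _rest = value.strip().partition(":")
            if colon and prefix in declared:
                in_content.add(declared[prefix])
    all_uris = set(declared.values())
    return all_uris - in_names, all_uris - in_names - in_content

unused_by_names, unused_by_names_and_content = unused_namespaces(ORDER)
print(sorted(unused_by_names))
print(sorted(unused_by_names_and_content))
```

The two answers differ ("1234" is unused under one definition and used under the other), and neither tells a tool whether a declaration is safe to drop.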

If the broad definition is used, and we then encounter XML like this:

<doc xmlns:pi="http://joe.random.uri/">
<?importantPI pi:xyzzy=42 pi:plugh=2A ?>
 [...]
</doc>

do we need a new, broader definition?  One that includes 
QNames-in-PI-content?  What about QNames-in-comment-content?

The namespaces specification makes it clear how namespace-to-prefix 
mappings must be declared and how they can then be used in element and 
attribute names, and if an undeclared prefix is used as part of an 
element or attribute QName, the specification makes this a namespace 
well-formedness error (the document is not namespace-well-formed).
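A quick way to see where that line is drawn, sketched in Python with the standard library parser (expat underneath). The helper name parses is hypothetical, and the inputs are adapted from the examples in this message with the declaration for the pi prefix deliberately left out.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Undeclared prefix in an element or attribute name: rejected by the parser.
in_element_name = parses('<pi:doc/>')
in_attribute_name = parses('<doc pi:xyzzy="42"/>')

# The same undeclared prefix in PI content or text: the parser does not care.
in_pi_content = parses('<doc><?importantPI pi:xyzzy=42 pi:plugh=2A ?></doc>')
in_text_content = parses('<doc>pi:xyzzy</doc>')

print(in_element_name, in_attribute_name, in_pi_content, in_text_content)
```

Only the first two cases are errors; anything prefix-shaped in content is opaque to the parser, declared or not.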

XSLT and XPath 1.0 (the latter because of the former) and W3C XML 
Schema make use of QNames in content.  So does WSDL.  Schema defines a 
"QName" primitive type (unrelated to string, uri, Name, or NMTOKEN ... 
sorry, different rant (hi, Michael!)), which in a sense "blesses" the 
use of QNames in content--but really, since the original XML 
specification, and before it SGML, made use of NAME (or "NCName" once 
you retcon it) as an attribute type, the blessing had already been 
given, and QName simply transfers its efficacy to the updated target.
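For what it's worth, here is roughly what a consumer has to do to interpret a QName found in content: track the bindings in scope and expand the prefix itself, because the parser will not. A minimal Python sketch; the function name, the sample document, and the urn:example URIs are all assumptions made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

DOC = """<order xmlns:po="urn:example:po">
  <ref>po:widget</ref>
  <inner xmlns:po="urn:example:other"><ref>po:widget</ref></inner>
  <ref>po:gadget</ref>
</order>"""

def expand_qnames_in_text(xml_text):
    """Expand prefix:local element text to {uri}local using in-scope bindings."""
    bindings = []   # stack of (prefix, uri); innermost declaration is last
    expanded = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "end-ns", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            bindings.append(payload)      # payload is a (prefix, uri) pair
        elif event == "end-ns":
            bindings.pop()                # declaration went out of scope
        else:
            text = (payload.text or "").strip()
            prefix, colon, local = text.partition(":")
            if not colon:
                continue
            # Search innermost first so a rebinding shadows the outer one.
            for bound_prefix, uri in reversed(bindings):
                if bound_prefix == prefix:
                    expanded.append("{%s}%s" % (uri, local))
                    break
    return expanded

expanded = expand_qnames_in_text(DOC)
print(expanded)
```

The same string, po:widget, expands to two different names depending on where it sits, which is the scoping that any tool handling QNames in content has to reproduce for itself.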

Notable in these use cases: QNames in content are used to identify 
structure (for schema, it's hard to imagine a different pattern; one 
can imagine using ID instead, perhaps, but _nom d'un nom d'un nom d'un 
nom_...), for linking or manipulation.  In schema, the initial use is 
to allow references.  XSLT, via embedded XPath, has little choice other 
than to use QNames in paths; it must at least *permit* them as parts of 
an XPath expression (in fact it mandates them, because an unprefixed 
name in XPath 1.0 always refers to no namespace, so elements and 
attributes in a namespace *must* be specified using a prefix).  XSLT 
did not *have* to use the scope of namespace declarations--but it does 
seem reasonable to do so, doesn't it?  Some of the problems with QNames 
in content could have been avoided by creating an XSLT xpath:bindings 
attribute, perhaps, or something similar (list of pairs of prefix, uri, 
to use in XPath expressions in this scope).  Or, if Clark's "expanded 
name" notation had been adopted for XPath, then there would have been 
little need for QNames in content in XSLT ... but that would have been 
awful in a different way; {uri}ncname is fine once or twice, but ever 
more line-noise-like the more you use it.  WSDL justifies its use in 
the same way that Schema does; it's defining patterns for the exchange 
of XML chunks, so wants to point at the chunks.
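The alternatives mentioned here can be seen side by side in Python's xml.etree.ElementTree, whose path language is only a small subset of XPath but happens to support both the expanded-name notation and a caller-supplied list of prefix/URI pairs. This is a sketch under those assumptions, not a claim about how XSLT works; the document and the urn:example:doc URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the query author never needs to know its prefix.
root = ET.fromstring(
    '<d:doc xmlns:d="urn:example:doc"><d:item>one</d:item></d:doc>'
)

# 1. Unprefixed name: only matches elements in no namespace, so no match.
unprefixed = root.find("item")

# 2. Expanded-name ("Clark") notation: no prefix needed, but noisy.
clark = root.find("{urn:example:doc}item")

# 3. Prefix/URI pairs supplied by the caller, independent of the
#    declarations that happen to be in the document.
bindings = {"x": "urn:example:doc"}
prefixed = root.find("x:item", bindings)

print(unprefixed, clark.text, prefixed.text)
```

The third form is close in spirit to the bindings-attribute idea: the prefixes used in the path are resolved against a separate list, so the document's own declarations never have to survive.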

Back to Roger, if I may: Roger, I'm afraid that I don't see any value 
in the "in-use" term for namespaces, which is why I'm being such a 
pain.  I can say that there are some best practices for namespaces in 
XML:

1) preserve prefixes (that is: though it is permitted by the 
specification, do not discard namespace declarations and plan to 
"fix-up" namespaces on write; use the prefix-to-namespace mappings 
provided with the original document as received)

2) avoid QNames in content (that is: if you are defining a schema for a 
new XML dialect, do the extra work to avoid using QNames in element and 
attribute content if at all possible and reasonable; then conforming 
instance documents are immune to violations of best practice #1, above)

3) (as Ram implicitly points out, above) use namespaces as designed, 
not as an out-of-band information carrier
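The interplay of the first two practices can be sketched with a tool that does not preserve prefixes. Python's xml.etree.ElementTree is used here only because it is a convenient example of "fix-up on write"; the documents and the urn:example:po URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# A QName in content: the only use of the "po" prefix is inside text.
received = (
    '<order xmlns:po="urn:example:po">'
    '<item>po:widget</item>'
    '</order>'
)
rewritten = ET.tostring(ET.fromstring(received), encoding="unicode")
print(rewritten)

# Names only: the prefix may change on write, but the names survive.
names_only = '<po:order xmlns:po="urn:example:po"><po:item/></po:order>'
again = ET.fromstring(ET.tostring(ET.fromstring(names_only), encoding="unicode"))
names_survive = again.tag == "{urn:example:po}order"
print(names_survive)
```

After the round trip the first document still says po:widget, but the declaration that gave "po" a meaning is gone; the second document, which follows practice 2, is unharmed by the same tool.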

It also seems to me that a definition of "in-use namespace" is likely 
to be intended to permit "namespace declaration minimization", but I 
think that goal is a dangerous one.  Rather than attempting to find 
"unused" namespaces, preserve prefixes.  Rather than attempting to 
define what "unused" means (unused in element names, in element and 
attribute names, or in content as well), avoid QNames in content.  When best practice 2 (avoid QNames 
in content) is irrelevant (XPath, XSLT, Schema, WSDL, others I haven't 
mentioned either due to laziness or because I haven't encountered them 
myself, that are already defined using QNames in content), then: 
preserve prefixes, and use namespaces as designed.

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.com
"...Tests are a gift.  And great tests are a great gift.  To fail the
test is a misfortune.  But to refuse the test is to refuse the gift,
and something worse, more irrevocable, than misfortune."
        -- Cordelia Naismith Vorkosigan 
           [Lois McMaster Bujold, "Shards of Honor"]

