Re: [xml-dev] My report on experiments with unused namespaces
- From: Amelia A Lewis <amyzing@talsever.com>
- To: Ramkumar Menon <ramkumar.menon@gmail.com>
- Date: Wed, 22 Sep 2010 00:00:44 -0400
On Tue, 21 Sep 2010 19:49:01 -0700, Ramkumar Menon wrote:
> for e.g. [
> <orderInfo xmlns:poid="1234" xmlns:description="sampleOrder"
> xmlns:numberOfItems="3"/>
>
> How Lovely!
> ]
<snicker />
Beautiful! Not only abuse of namespace declarations (but perfectly
legal, of course), but also abuse of the laxity of definition of URI
within W3C.
I can't imagine anyone actually doing this, can you? A parse/write
sequence could result in:
<orderInfo xmlns:ns0="sampleOrder" xmlns:ns1="3" xmlns:ns2="1234" />
or even:
<ns0:orderInfo xmlns:ns0="" xmlns="3" xmlns:ns1="1234"
xmlns:ns2="sampleOrder" />
which are all informationally equivalent, per Namespaces in XML
(because the prefix doesn't matter, in theory).
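The equivalence can be checked mechanically. A minimal sketch, assuming
Python's standard xml.etree.ElementTree, which reports names in expanded
"{uri}local" form; the urn:example:po namespace and the element names
are made up for illustration and do not come from this thread:

```python
import xml.etree.ElementTree as ET

# Same structure and namespace name, different prefixes (hypothetical data).
doc_a = '<po:order xmlns:po="urn:example:po"><po:item/></po:order>'
doc_b = '<ns0:order xmlns:ns0="urn:example:po"><ns0:item/></ns0:order>'

def expanded_names(xml_text):
    """Return every element name in document order as {uri}local."""
    return [elem.tag for elem in ET.fromstring(xml_text).iter()]

names_a = expanded_names(doc_a)
names_b = expanded_names(doc_b)
print(names_a)
print(names_a == names_b)
```

Once the names are expanded, the prefix spelling is no longer visible,
which is the sense in which it "doesn't matter".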
But it would have been more fun to have:
<orderInfo xmlns:poid="1234" xmlns:description="sampleOrder"
xmlns:noi="3">
<item noi:number="1">poid:23A12</item>
<item noi:number="2">poid:45B23</item>
<item noi:number="3">poid:67C98</item>
</orderInfo>
More abuse, more fun. This is nearly incomprehensible, even though I
just created it, but the idea is that the noi:number attribute
identifies a line item, and the QName in each element content (if it is
a QName ...) represents the item ID, which is thus strongly associated
with the purchase order ID previously defined.
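What a prefix-discarding parse/write cycle does to such a document can
be sketched as follows, assuming Python's xml.etree.ElementTree as the
tool (one item is kept for brevity). It is only an illustration of one
library's behaviour, not a statement about every toolkit:

```python
import xml.etree.ElementTree as ET

source = (
    '<orderInfo xmlns:poid="1234" xmlns:description="sampleOrder" xmlns:noi="3">'
    '<item noi:number="1">poid:23A12</item>'
    '</orderInfo>'
)

# ElementTree keeps expanded names only; the original prefixes, and any
# declarations not needed for element or attribute names, are not retained.
rewritten = ET.tostring(ET.fromstring(source), encoding="unicode")
print(rewritten)

content_survives = "poid:23A12" in rewritten
declaration_survives = "xmlns:poid" in rewritten
print(content_survives, declaration_survives)
```

The text "poid:23A12" comes through untouched, but the declaration that
gave "poid" its meaning does not, so the association is lost.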
To respond to your question, asking what the conclusion is: it seems to
me that Roger has created a term, "in-use namespace", which he wants to
define precisely. I probably ought to have stayed out of the
discussion; it isn't likely to matter to me how this neologism is
defined on this mailing list (it's unlikely to gain much currency, I
imagine). Having joined the discussion, I find myself unwilling to
accept too facile a definition ... until, at last, in my previous post,
I find that I want to know the purpose of defining this term, "in-use
namespace."
We already have a number of terms defined around namespaces, and
they're distressingly difficult to communicate to folks with little
experience, so I find myself questioning whether "in-use" is a concept
that has utility. If the term is defined such that it
includes use in element and attribute names and in element and
attribute content, what have we gained? If the term is defined such
that it only includes use in element and attribute names, but not in
content, what have we gained?
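To make the two candidate definitions concrete, here is a rough sketch
in Python. The function name in_use_namespaces, the include_content
flag, and the "prefix:anything" pattern are all assumptions made for
this illustration; in particular the pattern is a heuristic, looser
than a real QName check, and nobody in the thread proposed it:

```python
import io
import re
import xml.etree.ElementTree as ET

# Heuristic only: "prefix:anything". Deliberately looser than a real QName,
# since a value such as poid:23A12 has a local part starting with a digit.
PREFIXED = re.compile(r"^([A-Za-z_][\w.-]*):\S+$")

def in_use_namespaces(xml_text, include_content=False):
    """Collect namespace names used in names, optionally also in content."""
    used = set()
    bindings = []  # stack of (prefix, uri) pairs currently in scope
    events = ("end", "start-ns", "end-ns")
    stream = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(stream, events):
        if event == "start-ns":
            bindings.append(payload)
        elif event == "end-ns":
            bindings.pop()
        else:
            elem = payload
            # Narrow definition: element and attribute names.
            for name in [elem.tag, *elem.attrib]:
                if name.startswith("{"):
                    used.add(name[1:].split("}", 1)[0])
            if include_content:
                # Broad definition: attribute values, leading text, and the
                # tails of children all sit in this element's scope.
                in_scope = dict(bindings)  # later (inner) bindings win
                texts = [elem.text, *elem.attrib.values(),
                         *(child.tail for child in elem)]
                for text in texts:
                    match = PREFIXED.match((text or "").strip())
                    if match and match.group(1) in in_scope:
                        used.add(in_scope[match.group(1)])
    return used

sample = (
    '<orderInfo xmlns:poid="1234" xmlns:description="sampleOrder" xmlns:noi="3">'
    '<item noi:number="1">poid:23A12</item>'
    '<item noi:number="2">poid:45B23</item>'
    '</orderInfo>'
)
narrow = in_use_namespaces(sample)
broad = in_use_namespaces(sample, include_content=True)
print(sorted(narrow), sorted(broad))
```

The narrow reading finds only the namespace behind noi:number; the
broad reading also picks up the one behind "poid", but only by guessing
at what the content means.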
If the broad definition is used, and we then encounter XML like this:
<doc xmlns:pi="http://joe.random.uri/">
<?importantPI pi:xyzzy=42 pi:plugh=2A ?>
[...]
</doc>
do we need a new, broader definition? One that includes
QNames-in-PI-content? What about QNames-in-comment-content?
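For what it's worth, a parser gives no help with the PI case. A small
sketch, assuming Python's xml.dom.minidom, showing that the processing
instruction arrives as a target plus one opaque string, and that any
prefix lookup is left to the application:

```python
from xml.dom import Node, minidom

source = (
    '<doc xmlns:pi="http://joe.random.uri/">'
    '<?importantPI pi:xyzzy=42 pi:plugh=2A ?>'
    '</doc>'
)

root = minidom.parseString(source).documentElement
pi_node = root.firstChild

# The parser reports a target and one uninterpreted string; it attaches
# no namespace meaning to "pi:" inside the instruction's data.
is_pi = pi_node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
print(pi_node.target)
print(repr(pi_node.data))

# An application that wants to treat "pi:" as a prefix has to look the
# binding up itself, for example from the parent's declaration attribute.
bound_uri = root.getAttribute("xmlns:pi")
print(bound_uri)
```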
The namespaces specification makes it clear how namespace-to-prefix
mappings must be declared and how they can then be used in element and
attribute names, and if an undeclared prefix is used as part of an
element or attribute QName, the specification makes this a namespace
constraint violation: the document is not namespace-well-formed, and a
namespace-aware parser is expected to report it.
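How a namespace-aware parser treats an undeclared prefix in a name is
easy to observe. A minimal sketch, assuming Python's
xml.etree.ElementTree; the helper name parses and the urn:example:a
namespace are invented for the example:

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError as err:
        print("rejected:", err)
        return False

declared = parses('<a:doc xmlns:a="urn:example:a"/>')
undeclared = parses('<a:doc/>')
print(declared, undeclared)
```

Nothing comparable happens for a prefix that appears only in content,
which is the asymmetry the rest of this discussion turns on.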
XSLT and XPath 1.0 (the latter because of the former) and W3C XML
Schema make use of QNames in content. So does WSDL. Schema defines a
"QName" primitive type (unrelated to string, anyURI, Name, or NMTOKEN ...
sorry, different rant (hi, Michael!)), which in a sense "blesses" the
use of QNames in content--but really, since the original XML
specification, and before it SGML, made use of NAME (or "NCName" once
you retcon it) as an attribute type, the blessing had already been
given, and QName simply transfers its efficacy to the updated target.
Notable in these use cases: QNames in content are used to identify
structure (for schema, it's hard to imagine a different pattern; one
can imagine using ID instead, perhaps, but _nom d'un nom d'un nom d'un
nom_...), for linking or manipulation. In schema, the initial use is
to allow references. XSLT, via embedded XPath, has little choice other
than to use QNames in paths; it must at least *permit* them as parts of
an XPath expression (in fact it mandates them, because in XPath 1.0 an
unprefixed name always means "no namespace", whatever default namespace
is declared, so elements and attributes in a namespace *must* be
specified using a prefix). XSLT
did not *have* to use the scope of namespace declarations--but it does
seem reasonable to do so, doesn't it? Some of the problems with QNames
in content could have been avoided by creating an XSLT xpath:bindings
attribute, perhaps, or something similar (list of pairs of prefix, uri,
to use in XPath expressions in this scope). Or, if Clark's "expanded
name" notation had been adopted for XPath, then there would have been
little need for QNames in content in XSLT ... but that would have been
awful in a different way; {uri}ncname is fine once or twice, but ever
more line-noise-like the more you use it. WSDL justifies its use in
the same way that Schema does; it's defining patterns for the exchange
of XML chunks, so wants to point at the chunks.
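The alternatives mentioned in this paragraph can be tried side by side.
A sketch assuming Python's xml.etree.ElementTree, whose path language
is only a limited subset of XPath; the urn:example:po namespace and the
element names are hypothetical. It happens to support both
caller-supplied prefix bindings and the {uri}ncname notation:

```python
import xml.etree.ElementTree as ET

URI = "urn:example:po"  # hypothetical namespace name
root = ET.fromstring(
    f'<order xmlns="{URI}"><item>first</item><item>second</item></order>'
)

# An unprefixed step does not pick up the document's default namespace.
unprefixed = root.findall("item")

# Prefix bindings supplied by the caller, independent of the prefixes
# (or lack of them) used in the document itself.
with_bindings = root.findall("p:item", {"p": URI})

# Expanded-name notation: no prefix at all, but noisier to read.
expanded = root.findall(f"{{{URI}}}item")

print(len(unprefixed), len(with_bindings), len(expanded))
```

Neither of the last two depends on the namespace declarations in scope
where the expression is written, which is the trade-off described
above: less fragile, more line noise.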
Back to Roger, if I may: Roger, I'm afraid that I don't see any value
in the "in-use" term for namespaces, which is why I'm being such a
pain. I can say that there are some best practices for namespaces in
XML:
1) preserve prefixes (that is: though it is permitted by the
specification, do not discard namespace declarations and plan to
"fix-up" namespaces on write; use the prefix-to-namespace mappings
provided with the original document as received)
2) avoid QNames in content (that is: if you are defining a schema for a
new XML dialect, do the extra work to avoid using QNames in element and
attribute content if at all possible and reasonable; then conforming
instance documents are immune to violations of best practice #1, above)
3) (as Ram implicitly points out, above) use namespaces as designed,
not as an out-of-band information carrier
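As one illustration of practice 1, a sketch assuming Python's
xml.dom.minidom, which keeps namespace declarations as ordinary
attributes and keeps prefixed names as written. Other toolkits differ,
so treat this as an example of the behaviour rather than a
recommendation of a particular library:

```python
from xml.dom import minidom

source = (
    '<orderInfo xmlns:poid="1234" xmlns:description="sampleOrder" xmlns:noi="3">'
    '<item noi:number="1">poid:23A12</item>'
    '</orderInfo>'
)

# The declarations and the original prefixes survive a read/write cycle,
# so a prefix used in content still has its binding afterwards.
preserved = minidom.parseString(source).documentElement.toxml()
print(preserved)
```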
It also seems to me that a definition of "in-use namespace" is likely
to be intended to permit "namespace declaration minimization", but I
think that goal is a dangerous one. Rather than attempting to find
"unused" namespaces, preserve prefixes. Rather than attempting to
define what "unused" means (element names, element and attribute names,
content), avoid QNames in content. When best practice 2 (avoid QNames
in content) is irrelevant (XPath, XSLT, Schema, WSDL, others I haven't
mentioned either due to laziness or because I haven't encountered them
myself, that are already defined using QNames in content), then:
preserve prefixes, and use namespaces as designed.
Amy!
--
Amelia A. Lewis amyzing {at} talsever.com
"...Tests are a gift. And great tests are a great gift. To fail the
test is a misfortune. But to refuse the test is to refuse the gift,
and something worse, more irrevocable, than misfortune."
-- Cordelia Naismith Vorkosigan
[Lois McMaster Bujold, "Shards of Honor"]