Re: [xml-dev] Pragmatic namespaces

From: Amelia A Lewis <amyzing@talsever.com>
To: XML Developers List <xml-dev@lists.xml.org>
Date: Sun, 2 Aug 2009 21:44:51 -0400
Well, I'll offer some remarks in response, but I hope that others will 
join the conversation.

On Sun, 2 Aug 2009 12:30:58 -0700, Kurt Cagle wrote:
> However, concerning your post, I agree strongly with you about the need to
> avoid namespace registries, which is the danger that I see in any "default"
> mechanism. Its potential to fragment the web is disturbing, especially as
> it effectively puts the decision about what technologies to keep or avoid
> solely in the hands of the browser vendors.

Strongly agreed.  In that regard, I'm deeply reluctant to accept the 
"shortcuts" (registry) that Micah proposed, because it seems to me that 
these would soon become the only things supported.

Now ... even something like Flash, now so widespread, would have had no 
chance of adoption and uptake without *extensibility* (and I will 
perhaps be excused for emphasizing the word, since I am prouder of 
having worked for the company of that name than of any other).  I will 
grant that IE offers a poor platform for namespace-based (or 
equivalent) extensibility, but it seems to me that in order to enable 
the future of the web, to make it a place where small, dedicated groups 
can introduce something game-changing, that extensibility paradigm is 
of paramount importance.

> Overall, I'm going to raise this question again - what exactly is it about
> namespaces that the HTML crowd doesn't like? If it's the use of complex

I think it's verbosity as much as complexity.  You will note, I hope, 
the "namespace minimization" that I mentioned in my post; why should I 
have to tell the processor the *namespace prefix* of the element that 
I'm closing any more than I should have to repeat the attributes of 
that element?  I'm persuaded that permitting those to be dropped would 
have no impact on well-formedness (although I admit that discussions of 
minimization are likely to lead into a swamp, because well formedness 
and minimization are clearly at odds, in a large number of cases that 
can't be dismissed as "corner").  Arguably, XML's verbosity 
(effectively requiring the equivalent of a comment on every equivalent 
of a closing brace in C } /* if */ } /* while */) is precisely what 
makes it robust enough to have achieved the levels of adoption that it 
has seen.
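
A minimal sketch of the status quo, assuming Python's standard
xml.etree.ElementTree parser as a stand-in for "the processor": an end
tag that drops the prefix is rejected as not well-formed today.  The
helper name `parses` is hypothetical; the namespace name is the MathML
one.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

MATHML = "http://www.w3.org/1998/Math/MathML"

# End tag repeats the prefix, as XML currently requires.
full = f'<m:math xmlns:m="{MATHML}"></m:math>'
# End tag drops the prefix: the "minimized" form discussed above.
minimized = f'<m:math xmlns:m="{MATHML}"></math>'

print(parses(full))       # True
print(parses(minimized))  # False
```

The second form fails only because the end-tag name must match the
start-tag name character for character; nothing about the element is
ambiguous to a reader.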

In terms of verbosity, the idea of using something like XLink rather 
than "href" in attributes (XLink is, in my opinion, vastly 
overspecified/overengineered for the common case, leading to dismissal 
of the opportunities that it provides for more sophisticated usage) is 
equally damning.  And neither DTD nor W3C XML Schema makes it easy or 
convenient to say "oh, you can use any attribute from a different 
namespace on any element here".  It's painful in DTD; it's exceedingly 
tedious (and consequently likely to not happen, for at least some 
elements) in WXS.

> namespace URIs, then frankly the ideal solution to that is to provide
> guidance on what constitutes a good web URI. If it's the requirement of

Okay, you've triggered a rant.  Those of you who are partisans of the 
Namespaces in XML Specification are hereby warned: the following will 
annoy you.  I'm going to be offensive (although more offensive to the 
W3C folks who forced URIs onto the Working Group than to the Working 
Group members themselves).

The Namespaces in XML specification claims that an XML Namespace 
identifier "is a URI."  However, if you read the rules in the spec, you 
find that "" (the empty string) is permissible (with a special 
meaning), even though it is not legal in any URI specification BNF you 
care to present, and the spec does not say (as it would need to, to be 
correct) "union of URI and the empty string".  And you find that one 
namespace identifier is compared with another via lexical comparison of 
the strings, which is not how one determines URI "identity" (an area 
admittedly underspecified, in my opinion, but the idea that 
"www.ibm.com" != "www.IBM.com" or "www.w3.org" != "www.W3.org", when DNS 
is explicitly case-insensitive, is clearly problematic).
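
To make the mismatch concrete, here is a small sketch, assuming
Python's urllib.parse as a stand-in for "URI-aware" handling.  The
function names `same_namespace` and `same_host` are hypothetical;
the first mirrors the string comparison the spec prescribes, the
second compares only the (case-insensitive) host.

```python
from urllib.parse import urlsplit

def same_namespace(a: str, b: str) -> bool:
    # Namespace names match only if the strings are identical.
    return a == b

def same_host(a: str, b: str) -> bool:
    # urlsplit(...).hostname returns the host in lower case,
    # reflecting the case-insensitivity of DNS names.
    return urlsplit(a).hostname == urlsplit(b).hostname

lower = "http://www.w3.org/1999/xlink"
upper = "http://www.W3.org/1999/xlink"

print(same_namespace(lower, upper))  # False
print(same_host(lower, upper))       # True
```

The two strings name the same server, yet as namespace identifiers
they are unrelated.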

Consequently, in an earlier rant to XML-dev, I said that the Namespaces 
in XML specification might be improved by the addition of one word: "is 
a URI" becomes "is not a URI".  Yes, namespace identifiers mimic the 
syntax of URIs, but much of the information carried is discarded.

Can anyone provide an example of a namespace differentiated by scheme?  
I mean, for example:

http://www.example.com/namespace/x
ftp://www.example.com/namespace/y

... where two different namespaces are indicated and distinguished via 
the scheme portion of the URL.  I'd be surprised to see such a thing; 
although it's certainly possible, as soon as I thought of it, I 
labelled it, in my mind, as "exemplar of worst practice," and I suspect 
others would do so as well.  Similarly, distinguishing between 
hierarchical and non-hierarchical schemes is important for URIs, but 
not for namespaces; the indicators that distinguish authority from path 
from query are significant for URIs, but not for distinguishing among 
namespace identifiers produced by a single authority.  URIs contain 
lots of characters that aren't legal as NCName or QName; there's no 
reason that namespace identifiers, carrying equal information about 
authority and distinction of namespace, need do so.

Granted, though http:// (hierarchic HTTP) is the most common form of 
namespace identifier, there are others; the one I've most commonly 
encountered is urn:uuid: (non-hierarchic, but by-design unique, and not 
requiring that someone acquire a domain in order to define a 
namespace).  The latter form of URI does not lend itself to the pattern 
of automatic conversion that I suggested in my previous email, and 
arguably the requirement to have a domain in order to define a 
namespace is establishing a threshold (based on available capital) that 
should not be made.  *shrug*  I don't buy it (but I own several 
domains, personally, and I don't think that <$20/year is a hardship).

The insistence of the W3C on URIs (that aren't, in fact, URIs) as 
namespace identifiers is, to my mind, the worst thing that could have 
happened to XML.  Because the URI specifications are not in the control 
of W3C, and the BNF for URI (however widely ignored in detail) cannot 
drop multiple characters otherwise illegal in XML element names, the 
Namespaces in XML specification was forced to introduce 
namespace-to-prefix mappings, and the subsequent use of prefixes in 
element and attribute content poisoned the well completely.  James 
Clark's (brilliant) suggestion for expanded names, {uri}localname, 
simply never saw adequate adoption (in part, perhaps, because W3C XML 
Schema defined anyURI and NCName and QName, but not ExpName (or JCName 
:-)).
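
For readers who haven't met the notation: Python's standard
xml.etree.ElementTree happens to report element names in this
{uri}localname form, so it makes a convenient illustration.  A minimal
sketch, reusing the example.com namespace from earlier in this message;
the prefixes `a` and `b` are arbitrary.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.com/namespace/x"

# Same namespace, two different prefixes.
first = ET.fromstring(f'<a:item xmlns:a="{NS}"/>')
second = ET.fromstring(f'<b:item xmlns:b="{NS}"/>')

print(first.tag)                # {http://www.example.com/namespace/x}item
print(first.tag == second.tag)  # True
```

The prefix disappears from the expanded name entirely, so nothing that
carries the name around depends on the prefix mapping in scope.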

While the XML 1.0 specification is (a thing of beauty and a joy 
forever) one that I point out to others (whenever I am engaged in that 
horrible perversion, specificating, in company :-), the best that I can 
say for Namespaces in XML is "well, yes, that's clear enough; it can be 
implemented."  Whoever forced URIs on the working group--very likely 
TimBL or Roy Fielding or their disciples--did them a disservice.  XML 
namespace identifiers do not need that generality, and should have 
chosen a representation that allowed compact (but unique, with 
distributed authority) indication within an element name.  Perhaps, 
rather than the dotted-on pattern that Micah has proposed, they could 
have made "qualified names" use colons in place of dots in domain names 
and in place of slashes in paths (com:w3:org:1999:xlink:link).  Still 
verbose (Micah's "using" syntax is rather ... nice), but not as painful 
as NiXML.
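
A rough sketch of that substitution, taken literally: drop the scheme,
then put colons in place of the dots in the host and the slashes in the
path.  The helper `colon_name` is hypothetical, and keeping the host
labels in their written order is an assumption of the sketch, so its
output need not match the illustrative name above label for label.

```python
from urllib.parse import urlsplit

def colon_name(namespace: str, local: str) -> str:
    """Build a colon-separated name from an http-style namespace name."""
    parts = urlsplit(namespace)
    host_labels = (parts.hostname or "").split(".")
    path_segments = [seg for seg in parts.path.split("/") if seg]
    # The scheme is deliberately ignored.
    return ":".join(host_labels + path_segments + [local])

print(colon_name("http://www.w3.org/1999/xlink", "link"))
# www:w3:org:1999:xlink:link
```

A urn:uuid: identifier has no host or path to split this way, which is
the limitation already noted for that form.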

The fundamental point there: the issue is one of making a distributed 
mechanism in which independent authorities can establish names (and 
namespaces) without the possibility of name clashes.  Consequently: 
scheme is unnecessary.  Authority *is* necessary, and it's best to 
leverage an *existing* registry, preferably one that anyone using 
computers at this level of proficiency can easily join.  Finally, make 
it possible to reference things in a foreign namespace *without pain*.  
Micah's proposal does that better than I had thought of doing (because 
of his "threshold" elements, which work like the human brain: once I 
start talking MathML, I'm not done talking MathML until I close the 
element, so *leave me alone and don't make me repeat myself*).

> seems to be a good start, so long as there is a formal mechanism for
> ensuring that ANY namespace can be introduced in this manner.

I agree that this is fundamental, and also worry (though I'm not 
familiar with HTML 5 or WHATWG people sufficiently to judge) that the 
intent of HTML 5's restriction of permitted namespaces is for the 
purpose of controlling (that is, limiting) extensibility.

> The language NEEDS an extension mechanism.
> The language NEEDS an extension mechanism.
> The language NEEDS an extension mechanism.

You won't mind if I quote this multiple times, will you?  I can't think 
of a better way to indicate how much importance I accord to it.

> The language NEEDS an extension mechanism.
> The language NEEDS an extension mechanism.
> The language NEEDS an extension mechanism.

See?

I really, really wish that someone from the HTML 5 working group would 
come forward with an indication of what, in the WG's opinion, the fatal 
flaws of XML namespaces are.  I can guess at a number of them (the fact 
that many, many people new to XML cannot understand that elements are 
scoped by the schema (and namespace-qualified) while attributes are 
scoped by the element (and hence unqualified) by default is one; 
verbosity, incomprehensibility ... well, I'm not a big fan of 
Namespaces in XML apart from the rather insipid encomium, "yes, that 
can be implemented"), but ... I'd like to see HTML 5's "non-XML" syntax 
permit a lossless transformation into the XML syntax and back.  It 
doesn't need *XML* namespaces to do that, but it does need ... 
something with the good qualities of Namespaces in XML.
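
The element/attribute asymmetry is easy to see in practice.  A minimal
sketch, again assuming Python's xml.etree.ElementTree and the
example.com namespace used earlier; the element and attribute names
are made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<doc xmlns="http://www.example.com/namespace/x" id="d1">'
    '<child name="c1"/>'
    '</doc>'
)
child = doc[0]

# The default namespace declaration qualifies both elements...
print(doc.tag)     # {http://www.example.com/namespace/x}doc
print(child.tag)   # {http://www.example.com/namespace/x}child

# ...but leaves the unprefixed attributes in no namespace.
print(doc.attrib)    # {'id': 'd1'}
print(child.attrib)  # {'name': 'c1'}
```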

> chooses to cede this point, then for all intents and purposes the XML
> movement is dead.

Oh ... I can't really agree with that.  I mean, I saw that the HTML 5 
working group was defined *in terms of the DOM* and Boggled and Fell 
Down.  Does anyone who does XML for a living have any respect for the 
DOM APIs?  And yet ... it's clear that those are core to the browser 
experience (which is why they suck so hard for any other usage, in my 
opinion), so it's really perfectly reasonable that the browser folks 
should start from the DOM.  In any other application of XML, a mutable 
tree API is at best a dead weight, but in the browser, it has 
utility--utility to the point of necessity.

Bifurcation?  Certainly.  HTML could become a non-XML dialect.  I'd 
hate that, but it looks as though there's at least a part of the HTML 5 
working group who would welcome it.  Killing XML?  Nah.  The HTML 5 
working group can miss an opportunity (and it seems likely that they 
will), but distancing HTML from XML won't kill either one, it will just 
annoy the folks who have to develop the techniques to reconcile them.

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.com
Do you ever feel like putting your fist through a window just so you
can feel something?