Re: [xml-dev] Pragmatic namespaces

From: Henri Sivonen <hsivonen@iki.fi>
To: XML Developers List <xml-dev@lists.xml.org>
Date: Thu, 13 Aug 2009 21:47:39 +0300
On Aug 1, 2009, at 02:06, Micah Dubinko wrote:

> Literally for years, people have been talking about how great it  
> would be to use something like Java-style namespaces in XML instead  
> of the current xmlns regime. For example
> <http://www.xml.com/pub/a/2005/04/13/namespace-uris.html>.

I'm glad to see that here over in the XML land, people who've worked  
with Namespaces show appropriate discontent with them. I wish the RDFa  
land took note.

> Requirement: this solution must not interfere with existing HTML  
> elements or attributes
>
> Point 1:
> Any element name with no dots in it is treated as HTML (including  
> HTML rules on handling unrecognized elements)

I'd go further and say that for processing purposes, any element with  
dots needs to be treated per HTML rules (where "HTML rules" means the  
HTML5 parsing algorithm).

> Requirement: this solution must allow for distributed creation of  
> globally-unique namespace names (including those outside of a  
> consensus process)

This works as long as it is purely a naming convention and the HTML
parser & DOM don't do any novel processing based on it.

(It follows that ASCII letters A to Z get folded into a to z by the  
tokenizer and ASCII letters a to z get folded into A to Z by DOM Level  
1 getters when the owner document has its HTMLness bit set, so you  
can't make com.example.foo and com.example.FOO be distinct.)
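
A minimal sketch of the folding described above, in plain JavaScript with no DOM involved. The function names are hypothetical; they only imitate the ASCII-only folding that the tokenizer and the DOM Level 1 getters perform, and are not taken from any spec or browser.

```javascript
// ASCII-only folding, imitating (not reproducing) what the HTML tokenizer
// does to tag names: A-Z become a-z, everything else is left alone.
function tokenizerFold(name) {
  return name.replace(/[A-Z]/g, (c) => String.fromCharCode(c.charCodeAt(0) + 32));
}

// ASCII-only folding in the other direction, imitating what DOM Level 1
// getters such as tagName report when the document's HTMLness bit is set.
function domLevel1Fold(name) {
  return name.replace(/[a-z]/g, (c) => String.fromCharCode(c.charCodeAt(0) - 32));
}

const a = tokenizerFold("com.example.foo");
const b = tokenizerFold("com.example.FOO");
console.log(a === b);          // do the two spellings collapse to one name?
console.log(domLevel1Fold(a)); // what a DOM Level 1 getter would report
```

Whichever spelling the author types, both the parser-side and the getter-side views end up identical for the two names.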

> Point 2:
> Any element with one or more dots in it is treated as an extension  
> element.

As long as "treated" is a social convention and not something software
acts on, so far so good.

I think syntax-wise this is the best "distributed extensibility"  
proposal I've seen for HTML5. (It's similar to the microdata section  
in HTML5.) Thank you!

> The portion after the last dot is considered the localname, and the
> portion up to but not including the last dot is parsed as the
> pragmatic namespace name (or pname for short). Interfaces with  
> existing namespace-aware APIs must treat the pname as the namespace  
> URI. With the exception noted below, to prevent clashes pnames must  
> be based on reversed DNS names.
>
> Example:
> <head>
>  <title>Document title</title>
>  <com.example.project>
>    <com.example.id>123521123</com.example.id>
>  </com.example.project>
> </head>
>
> In this example document.getElementsByTagName("id") would return the  
> innermost element.
> So would document.getElementsByTagNameNS("com.example", "id")

I think here your proposal goes into the weeds.

The #1 flaw with Namespaces & DOM Level 2 is that the identifiers that  
are fundamental to the operation of software were different from the  
identifiers in plain XML 1.0 or DOM Level 1. Your proposal repeats  
this mistake by making the platform behave radically differently if  
you have a JS program running on a browser that doesn't implement your  
proposal and if you have the same JS program running on a browser that  
implements your proposal.

In your example, the local name of the innermost element MUST be  
"com.example.id" for compatibility with existing behavior. Changing  
what document.getElementsByTagName() returns here is not something  
that's open for discussion. (As in, the probability of a browser  
vendor shipping with the API behavior change is virtually zero.)

The namespace of the innermost element as reported by the DOM isn't
really open for discussion, either. In an HTML5-compliant UA it is
"http://www.w3.org/1999/xhtml", because this unifies the DOM with the
XHTML5 side, where the namespace is constrained by the XHTML legacy to
be "http://www.w3.org/1999/xhtml". In legacy UAs, the namespace is null.

It would be OK to use the naming convention you propose in markup and  
deliver a helper JS library along with your JS application code and let your
own helper library expand "id" to "com.example.id" before passing it  
to document.getElementsByTagName(). Such a helper library would  
immediately run on past, present and future browsers without needing  
any DOM or parser infrastructural work.
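
A rough sketch of what such a helper could look like. The name makeFinder and the stand-in document object are hypothetical and exist only so the example runs outside a browser; in a page one would pass the real document. The only DOM call relied on is the existing getElementsByTagName().

```javascript
// Hypothetical helper: binds a reverse-DNS prefix and expands short names
// before delegating to the unmodified DOM Level 1 method.
function makeFinder(doc, pname) {
  return function find(shortName) {
    // "id" becomes "com.example.id"; the DOM only ever sees the full name.
    return doc.getElementsByTagName(pname + "." + shortName);
  };
}

// Stand-in for a document so the sketch runs without a browser. It matches
// on the full lowercase name only, as a simplified model of existing behavior.
const fakeDocument = {
  names: ["title", "com.example.project", "com.example.id"],
  getElementsByTagName(name) {
    return this.names.filter((n) => n === name.toLowerCase());
  },
};

const findExample = makeFinder(fakeDocument, "com.example");
console.log(findExample("id"));
```

The expansion lives entirely in application code, so the platform's own notion of the element name stays "com.example.id" everywhere.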

> Requirement: it is highly desirable to produce a document that will  
> produce the same element names in HTML or XML

Agreed. This is basically the DOM Consistency Design principle of HTML5:
http://www.w3.org/TR/html-design-principles/#dom-consistency

> Point 3:
> Zero or more special attributes of the form using.<pname> may appear  
> on the root element, and ONLY on the root element. The declarations  
> have document-wide scope.

Can't have this, because agents implementing your proposal and legacy  
agents would get radically different DOMs.

> Requirement: widely-known namespaces must parse to an equivalent
> DOM as xmlns

For practical purposes, the Web platform has four markup languages:  
HTML, SVG, MathML and ARIA. HTML5 already covers the namespace  
assignment of HTML, SVG and MathML. ARIA doesn't need special  
treatment, because it consists entirely of no-namespace attributes.

It's plausible that XBL2 joins the markup language family of the  
platform. However, it's more problematic from the text/html point of  
view. More on that below.

> atom http://www.w3.org/2005/Atom

What's the use case for embedding Atom in text/html?

> docbook http://docbook.org/ns/docbook

Browsers don't support Docbook now. Having syntax for it isn't the  
major part. Supporting all the elements in ways appropriate for their  
semantics would be non-trivial. I think this doesn't belong in HTML5.

> html http://www.w3.org/1999/xhtml

Already covered by HTML5 without new syntax.

> math http://www.w3.org/1998/Math/MathML/

Already covered by HTML5 with syntax that is compatible with copying  
MathML markup from XML and pasting into text/html.

> svg http://www.w3.org/2000/svg

Already covered by HTML5 with syntax that is compatible with copying  
SVG markup from XML and pasting into text/html.

> xbl http://www.mozilla.org/xbl

This is being replaced with XBL2. As far as I'm aware, other vendors  
haven't shown interest in implementing the original Mozilla XBL.

> xbl2 http://www.w3.org/ns/xbl

XBL2 markup can embed XHTML subtrees in rather arbitrary ways. This  
kind of nesting wouldn't work in a backwards-compatible way in text/html
when the nested HTML elements interfere with the element within which the
XBL2 subtree has been embedded. In particular, one would want to put  
the XBL2 subtree inside <head>, but having e.g. <div> as a descendant  
of <head> is not a viable option.

> xforms http://www.w3.org/2002/xforms

XForms hasn't been implemented as a native feature in any of the top 4  
browser engines. Having namespace syntax for XForms for text/html  
would be unlikely to change this. In fact, the whole HTML5 effort got  
started as an alternative to the XForms vision.

However, I'd welcome JS libraries, such as Ubiquity XForms, to  
implement XForms behavior using syntax like <xforms.input>, because  
the dotted syntax results in a consistent DOM in HTML5 and XHTML5  
unlike the colonified syntax.
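
A small sketch of how a library might recognize its own elements by name alone. The function name is hypothetical, and the code assumes the caller passes an element's localName and that the markup is written in lowercase, so the same string shows up in both serializations.

```javascript
// Hypothetical helper: returns the part after "xforms." when the given
// localName follows the dotted convention, or null otherwise.
function xformsControlName(localName) {
  const prefix = "xforms.";
  if (localName.startsWith(prefix) && localName.length > prefix.length) {
    return localName.slice(prefix.length);
  }
  return null;
}

console.log(xformsControlName("xforms.input")); // an XForms control under the convention
console.log(xformsControlName("input"));        // a plain HTML input, left alone
```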

> xlink http://www.w3.org/1999/xlink

The XLink 1.0 names are already covered in HTML5 when they appear on  
SVG or MathML elements. Generic XLink itself is pretty dead. SVG  
implementations have to implement SVG-specific semantics for the XLink  
names instead of being able to use generic XLink code.

> xml http://www.w3.org/XML/1998/namespace

HTML5 already assigns xml:lang, xml:space and xml:base to the
http://www.w3.org/XML/1998/namespace namespace when used on SVG or
MathML elements.

xml:id is not supported, because HTML, MathML and SVG all already have  
an id attribute that works just fine. xml:id just adds complexity.

As for HTML elements, there's already the lang attribute and <pre> has  
built-in whitespace significance. xml:base is not supported on HTML  
elements in the text/html serialization.

> Example:
>
> <html using.math="math">...
> <p>
> E.g. <math><msqrt><mi>π</mi></msqrt></math>
> </p>
> ...</html>

This already works in HTML5 without even having to use the using.math  
stuff. I invite you to try it in a trunk nightly build of Firefox  
after you've set the preference html5.enable to true in about:config.

See http://hsivonen.iki.fi/test-html5-parsing/

> In this example document.getElementsByTagName("mi") would return the  
> innermost element.
> So would
> document.getElementsByTagNameNS("http://www.w3.org/1998/Math/MathML/", "mi")

Already works. You can try this with a nightly build of Firefox with  
html5.enable set to true.

> Requirement: must support HTML nested inside an extension vocabulary.
>
> Point 5:
> Unless overridden, HTML documents are treated as if all localnames  
> explicitly listed in the specification are HTML boundary elements.
>
> Example:
> <html using.svg="svg">
>  <body>
>    <svg version="1.1"  viewBox="0 0 100 100"  
> preserveAspectRatio="xMidYMid slice">
>      <rect x="10" y="10" width="100" height="150" fill="gray"/>
>      <foreignObject x="10" y="10" width="100" height="150">
>        <body>
>          <div>Here is a <strong>paragraph</strong>.</div>
>        </body>
>      </foreignObject>
>    </svg>
>  </body>
> </html>
>
> Here the inner body element and its children are still treated as  
> HTML.

Already works in HTML5 without having to use "using.svg". You can try  
this with a nightly build of Firefox with html5.enable set to true.

> Another example:
> <html using.xforms="model select1 range secret">
>  <head>
>    <model>...</model>
>  </head>
>  <body>
>    <xforms.input>...
>  </body>
> </html>
>
> In this case, "input" is already used as an HTML element name, so  
> uses of it--even with the using statement at the top--need to be  
> explicitly spelled out. Of course, the author could have overridden  
> this by including "input" in the using statement, but then any  
> regular HTML input controls would need to be spelled <html.input>.  
> Just like in Java.


This would be highly backwards-incompatible. HTML5 extends HTML forms  
so that the new form features together with JavaScript cover the use  
case space that XForms covers.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Follow-Ups:
- Re: [xml-dev] Pragmatic namespaces
  - From: Micah Dubinko <micah.dubinko@marklogic.com>