Re: [xml-dev] Pragmatic namespaces
- From: Henri Sivonen <hsivonen@iki.fi>
- To: XML Developers List <xml-dev@lists.xml.org>
- Date: Thu, 13 Aug 2009 21:47:39 +0300
On Aug 1, 2009, at 02:06, Micah Dubinko wrote:
> Literally for years, people have been talking about how great it
> would be to use something like Java-style namespaces in XML instead
> of the current xmlns regime. For example <http://www.xml.com/pub/a/2005/04/13/namespace-uris.html>.
I'm glad to see that here over in the XML land, people who've worked
with Namespaces show appropriate discontent with them. I wish the RDFa
land took note.
> Requirement: this solution must not interfere with existing HTML
> elements or attributes
>
> Point 1:
> Any element name with no dots in it is treated as HTML (including
> HTML rules on handling unrecognized elements)
I'd go further and say that for processing purposes, any element with
dots needs to be treated per HTML rules (where "HTML rules" means the
HTML5 parsing algorithm).
> Requirement: this solution must allow for distributed creation of
> globally-unique namespace names (including those outside of a
> consensus process)
This works if it is a naming convention but the HTML parser & DOM
don't do any novel processing based on this convention.
(It follows that ASCII letters A to Z get folded into a to z by the
tokenizer and ASCII letters a to z get folded into A to Z by DOM Level
1 getters when the owner document has its HTMLness bit set, so you
can't make com.example.foo and com.example.FOO be distinct.)
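To make the folding concrete, here is a minimal sketch in plain JavaScript. The two functions are hypothetical stand-ins for what the tokenizer and a DOM Level 1 getter do to ASCII letters; they are illustrations only, not browser APIs.

```javascript
// Hypothetical stand-in for the HTML tokenizer: ASCII A-Z in a tag name
// are folded to a-z. Non-ASCII characters are left alone.
function tokenizerFold(name) {
  return name.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// Hypothetical stand-in for a DOM Level 1 getter (such as tagName) on an
// element whose owner document has its HTMLness bit set: ASCII a-z are
// folded to A-Z.
function domLevel1Fold(name) {
  return name.replace(/[a-z]/g, (c) => c.toUpperCase());
}

// Two names that differ only in ASCII case.
const foldedLower = tokenizerFold("com.example.foo");
const foldedUpper = tokenizerFold("com.example.FOO");

// After tokenization the two names are the same string, so nothing
// downstream can tell them apart.
const namesCollide = foldedLower === foldedUpper;
const reportedByGetter = domLevel1Fold(foldedLower);
```

Under these assumptions both spellings end up as the same element name in the DOM, which is why case cannot carry any meaning in a dotted name.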
> Point 2:
> Any element with one or more dots in it is treated as an extension
> element.
As long as "treated" is a social thing and not something that happens in
software operation, so far so good.
I think syntax-wise this is the best "distributed extensibility"
proposal I've seen for HTML5. (It's similar to the microdata section
in HTML5.) Thank you!
> The portion after the last dot is considered the localname, and the
> portion up to but not including the last dot is parsed as the
> pragmatic namespace name (or pname for short). Interfaces with
> existing namespace-aware APIs must treat the pname as the namespace
> URI. With the exception noted below, to prevent clashes pnames must
> be based on reversed DNS names.
>
> Example:
> <head>
> <title>Document title</title>
> <com.example.project>
> <com.example.id>123521123</com.example.id>
> </com.example.project>
> </head>
>
> In this example document.getElementsByTagName("id") would return the
> innermost element.
> So would document.getElementsByTagNameNS("com.example", "id")
I think here your proposal goes into the weeds.
The #1 flaw with Namespaces & DOM Level 2 is that the identifiers that
are fundamental to the operation of software were different from the
identifiers in plain XML 1.0 or DOM Level 1. Your proposal repeats
this mistake by making the platform behave radically differently if
you have a JS program running on a browser that doesn't implement your
proposal and if you have the same JS program running on a browser that
implements your proposal.
In your example, the local name of the innermost element MUST be
"com.example.id" for compatibility with existing behavior. Changing
what document.getElementsByTagName() returns here is not something
that's open for discussion. (As in, the probability of a browser
vendor shipping with the API behavior change is virtually zero.)
The namespace of the innermost element as reported by the DOM isn't
really open for discussion, either. In an HTML5-compliant UA it is
"http://www.w3.org/1999/xhtml", because this unifies the DOM with the
XHTML5 side, where the namespace is constrained by the XHTML legacy to be
"http://www.w3.org/1999/xhtml". In legacy UAs, the namespace is null.
It would be OK to use the naming convention you propose in markup and
deliver a helper JS library along with your JS application code and let your
own helper library expand "id" to "com.example.id" before passing it
to document.getElementsByTagName(). Such a helper library would
immediately run on past, present and future browsers without needing
any DOM or parser infrastructural work.
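Such a helper could be very small. The following is only a sketch: the name makePnameHelper and its shape are made up for illustration, and the only DOM method it assumes is the ordinary getElementsByTagName() on whatever document object is passed in.

```javascript
// Hypothetical helper: binds a reverse-DNS prefix (the "pname") to a
// document and expands short names before calling the plain DOM method.
function makePnameHelper(doc, pname) {
  // "id" with pname "com.example" becomes "com.example.id".
  function expand(shortName) {
    return pname + "." + shortName;
  }
  return {
    expand: expand,
    // Delegates to the unmodified DOM method with the full dotted name,
    // so no parser or DOM changes are needed.
    getElementsByTagName: function (shortName) {
      return doc.getElementsByTagName(expand(shortName));
    },
  };
}
```

In a page script one would call makePnameHelper(document, "com.example").getElementsByTagName("id"), which asks the browser for "com.example.id" elements. The expansion lives entirely in the application's own code, so the platform's behavior stays the same everywhere.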
> Requirement: it is highly desirable to produce a document that will
> produce the same element names in HTML or XML
Agreed. This is basically the DOM Consistency Design principle of HTML5:
http://www.w3.org/TR/html-design-principles/#dom-consistency
> Point 3:
> Zero or more special attributes of the form using.<pname> may appear
> on the root element, and ONLY on the root element. The declarations
> have document-wide scope.
Can't have this, because agents implementing your proposal and legacy
agents would get radically different DOMs.
> Requirement: widely-known namespaces must parse to an equivalent
> DOM as xmlns
For practical purposes, the Web platform has four markup languages:
HTML, SVG, MathML and ARIA. HTML5 already covers the namespace
assignment of HTML, SVG and MathML. ARIA doesn't need special
treatment, because it consists entirely of no-namespace attributes.
It's plausible that XBL2 joins the markup language family of the
platform. However, it's more problematic from the text/html point of
view. More on that below.
> atom http://www.w3.org/2005/Atom
What's the use case for embedding Atom in text/html?
> docbook http://docbook.org/ns/docbook
Browsers don't support Docbook now. Having syntax for it isn't the
major part. Supporting all the elements in ways appropriate for their
semantics would be non-trivial. I think this doesn't belong in HTML5.
> html http://www.w3.org/1999/xhtml
Already covered by HTML5 without new syntax.
> math http://www.w3.org/1998/Math/MathML/
Already covered by HTML5 with syntax that is compatible with copying
MathML markup from XML and pasting into text/html.
> svg http://www.w3.org/2000/svg
Already covered by HTML5 with syntax that is compatible with copying
SVG markup from XML and pasting into text/html.
> xbl http://www.mozilla.org/xbl
This is being replaced with XBL2. As far as I'm aware, other vendors
haven't shown interest in implementing the original Mozilla XBL.
> xbl2 http://www.w3.org/ns/xbl
XBL2 markup can embed XHTML subtrees in rather arbitrary ways. This
kind of nesting wouldn't work in a backwards-compatible way in text/html
when the nested HTML elements interfere with the element within which the
XBL2 subtree has been embedded. In particular, one would want to put
the XBL2 subtree inside <head>, but having e.g. <div> as a descendant
of <head> is not a viable option.
> xforms http://www.w3.org/2002/xforms
XForms hasn't been implemented as a native feature in any of the top 4
browser engines. Having namespace syntax for XForms for text/html
would be unlikely to change this. In fact, the whole HTML5 effort got
started as an alternative to the XForms vision.
However, I'd welcome JS libraries, such as Ubiquity XForms, to
implement XForms behavior using syntax like <xforms.input>, because
the dotted syntax results in a consistent DOM in HTML5 and XHTML5
unlike the colonified syntax.
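As a rough sketch of how such a library might recognize its own elements, the function below strips an "xforms." prefix from an element's name. The function name and the prefix constant are hypothetical and not taken from Ubiquity XForms; the code assumes only that the element exposes localName or tagName.

```javascript
// Hypothetical prefix a script library might claim by convention.
const XFORMS_PREFIX = "xforms.";

// Returns the control name ("input" for <xforms.input>) or null when the
// element is not one of the library's dotted extension elements.
function xformsControlName(el) {
  // Prefer localName; fall back to tagName, which HTML documents report
  // in uppercase, so normalize the case before comparing.
  const name = String(el.localName || el.tagName || "").toLowerCase();
  return name.startsWith(XFORMS_PREFIX)
    ? name.slice(XFORMS_PREFIX.length)
    : null;
}
```

Because the dot is just part of the element name, the same check should behave the same way whether the document was parsed as text/html or as XHTML5, provided the markup uses lowercase names.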
> xlink http://www.w3.org/1999/xlink
The XLink 1.0 names are already covered in HTML5 when they appear on
SVG or MathML elements. Generic XLink itself is pretty dead. SVG
implementations have to implement SVG-specific semantics for the XLink
names instead of being able to use generic XLink code.
> xml http://www.w3.org/XML/1998/namespace
HTML5 already assigns xml:lang, xml:space and xml:base to the
http://www.w3.org/XML/1998/namespace namespace when used on SVG or MathML
elements.
xml:id is not supported, because HTML, MathML and SVG all already have
an id attribute that works just fine. xml:id just adds complexity.
As for HTML elements, there's already the lang attribute and <pre> has
built-in whitespace significance. xml:base is not supported on HTML
elements in the text/html serialization.
> Example:
>
> <html using.math="math">...
> <p>
> E.g. <math><msqrt><mi>π</mi></msqrt></math>
> </p>
> ...</html>
This already works in HTML5 without even having to use the using.math
stuff. I invite you to try it in a trunk nightly build of Firefox
after you've set the preference html5.enable to true in about:config.
See http://hsivonen.iki.fi/test-html5-parsing/
> In this example document.getElementsByTagName("mi") would return the
> innermost element.
> So would document.getElementsByTagNameNS("http://www.w3.org/1998/Math/MathML/", "mi")
Already works. You can try this with a nightly build of Firefox with
html5.enable set to true.
> Requirement: must support HTML nested inside an extension vocabulary.
>
> Point 5:
> Unless overridden, HTML documents are treated as if all localnames
> explicitly listed in the specification are HTML boundary elements.
>
> Example:
> <html using.svg="svg">
> <body>
> <svg version="1.1" viewBox="0 0 100 100"
> preserveAspectRatio="xMidYMid slice">
> <rect x="10" y="10" width="100" height="150" fill="gray"/>
> <foreignObject x="10" y="10" width="100" height="150">
> <body>
> <div>Here is a <strong>paragraph</strong>.</div>
> </body>
> </foreignObject>
> </svg>
> </body>
> </html>
>
> Here the inner body element and its children are still treated as
> HTML.
Already works in HTML5 without having to use "using.svg". You can try
this with a nightly build of Firefox with html5.enable set to true.
> Another example:
> <html using.xforms="model select1 range secret">
> <head>
> <model>...</model>
> </head>
> <body>
> <xforms.input>...
> </body>
> </html>
>
> In this case, "input" is already used as an HTML element name, so
> uses of it--even with the using statement at the top--need to be
> explicitly spelled out. Of course, the author could have overridden
> this by including "input" in the using statement, but then any
> regular HTML input controls would need to be spelled <html.input>.
> Just like in Java.
This would be highly backwards-incompatible. HTML5 extends HTML forms
so that the new form features together with JavaScript cover the use
case space that XForms covers.
--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/