Re: [xml-dev] Pragmatic namespaces
- From: Henri Sivonen <hsivonen@iki.fi>
- To: Micah Dubinko <Micah.Dubinko@marklogic.com>
- Date: Mon, 24 Aug 2009 14:03:31 +0300
On Aug 24, 2009, at 01:05, Micah Dubinko wrote:
> On Aug 13, 2009, at 11:47 AM, Henri Sivonen wrote:
>
>>> Example:
>>> <head>
>>> <title>Document title</title>
>>> <com.example.project>
>>> <com.example.id>123521123</com.example.id>
>>> </com.example.project>
>>> </head>
>>>
>>> In this example document.getElementsByTagName("id") would return the
>>> innermost element.
>>> So would document.getElementsByTagNameNS("com.example", "id")
>>
>> I think here your proposal goes into the weeds.
>>
>> The #1 flaw with Namespaces & DOM Level 2 is that the identifiers that
>> are fundamental to the operation of software were different from the
>> identifiers in plain XML 1.0 or DOM Level 1. Your proposal repeats
>> this mistake by making the platform behave radically differently if
>> you have a JS program running on a browser that doesn't implement your
>> proposal and if you have the same JS program running on a browser that
>> implements your proposal.
>
> It's already the case that older browsers will interpret things
> differently. Old browsers won't treat <svg> as something in the SVG
> namespace, but newer ones will.
I thought the point of "extensions" was that they are mainly hooks for
scripts. As such, they could work on existing browsers, too, if they
were a mere naming convention. In contrast, SVG requires quite
distinctive native 2D rendering support that can't be achieved with
mere scripting and styling macros on top of the HTML4+CSS functionality.
As for unilateralist browser-sensitive extensions like <blink> and
<marquee>, it would probably have been better to make them less
attractive than the elements minted through a peer review. Thus, it
would have been better to make them <com.netscape.blink> and
<com.microsoft.marquee> without a mechanism to hide the prefixes.
> Since there are (presumably) far fewer HTML documents with multiple
> dots in element names than with <svg> elements, one could argue that
> this doesn't cause significant backwards compatibility problems
> either.
Actually, SVG-in-text/html parsing takes special steps to deal with
legacy content that contains bits of SVG due to cargo cult copying and
pasting. Specifically, the parser breaks out of SVG at the slightest
hint of such content:
http://www.whatwg.org/specs/web-apps/current-work/#parsing-main-inforeign
(The cases that say: 'A start tag whose tag name is one of: "b",
"big", "blockquote", "body", "br", "center", "code", "dd", "div",
"dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5", "h6", "head",
"hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol", "p",
"pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup",
"table", "tt", "u", "ul", "var"' and 'A start tag whose tag name is
"font", if the token has any attributes named "color", "face", or
"size"'.)
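To make the quoted rule concrete, here is a minimal Python sketch of the
check. It covers only the two cases quoted above, not the rest of the
spec's foreign-content algorithm. The names
breaks_out_of_foreign_content, BREAKOUT_TAGS and FONT_BREAKOUT_ATTRS are
made up for illustration and are not spec terminology. It assumes tag
and attribute names have already been lowercased by the tokenizer.

```python
# Tag names copied from the quoted spec text. A start tag with one of
# these names is treated as a sign that the parser should leave SVG or
# MathML and go back to HTML parsing.
BREAKOUT_TAGS = frozenset({
    "b", "big", "blockquote", "body", "br", "center", "code", "dd", "div",
    "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5", "h6", "head",
    "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol", "p",
    "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup",
    "table", "tt", "u", "ul", "var",
})

# <font> only counts when it carries one of these presentational attributes.
FONT_BREAKOUT_ATTRS = frozenset({"color", "face", "size"})


def breaks_out_of_foreign_content(tag_name, attr_names=()):
    """Return True if a start tag matches one of the two quoted cases.

    Hypothetical helper: assumes lowercase tag and attribute names.
    """
    if tag_name in BREAKOUT_TAGS:
        return True
    return tag_name == "font" and any(
        name in FONT_BREAKOUT_ATTRS for name in attr_names
    )
```

For example, a stray <div> or <font color="red"> inside <svg> would
match, while <circle> or a bare <font> would not.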
> By its very nature, recent HTML standardization work has been about
> getting browsers to change how their parsers operate. I can flip a
> switch in my Firefox and enable HTML5 mode, which has slight
> differences in how my browser would otherwise work.
Except for the SVG and MathML stuff, the changed behaviors (compared
to the old HTML parser in Gecko) fall mainly into one of two buckets:
1) Changes from old Gecko behavior to align with IE or WebKit behavior
2) Changes from old behavior of any browser in order to make the
parser never read back from the DOM in order to allow the DOM and the
HTML parser to live in different threads
The SVG and MathML changes are sometimes generalized to mean that
browsers are now doing HTML extensions or namespaces. That's not
what's happening. The SVG and MathML support is about taking the
browser-native functionality investment that has already been made but
that has been tied to XML parsing and enabling it in the text/html
world.
It is incorrect to extrapolate that it's now OK to use Namespaces in
text/html for purposes other than salvaging investments in
functionality previously implemented but tied to XML.
> The other issue that came up in discussions is what the proper scope
> of a proposal like this should be. Does it affect only the HTML
> syntax rules, or could something be dreamt up that would
> supplant/replace xmlns in XML documents as well? That's a wide open
> issue still.
I think it's a problem in terms of the DOM Consistency design
principle if the same syntax doesn't produce the same DOM on both
sides of the fence. The simplest way to make progress is to stick to
things that don't require any changes to the XML 1.0 4th ed. +
Namespaces 1.0 layers on the XML side and that don't trigger any
Namespaces layer processing (i.e. don't use the colon).
Using reverse DNS identifiers as a naming convention without affecting
how the DOM/Infoset is constructed meets these requirements.
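As a rough illustration of what "without affecting how the DOM is
constructed" means, the sketch below feeds the example markup from
earlier in this thread to Python's xml.dom.minidom. That is an XML DOM
from the Python standard library, used here only as a stand-in for a
browser DOM; it is an assumption that it is representative, and it is
not a text/html parser. The variable names are illustrative.

```python
from xml.dom.minidom import parseString

# The example document quoted earlier in the thread, parsed as plain XML.
doc = parseString(
    "<head>"
    "<title>Document title</title>"
    "<com.example.project>"
    "<com.example.id>123521123</com.example.id>"
    "</com.example.project>"
    "</head>"
)

# Under a pure naming convention the dotted name is just an ordinary
# element name, so only a lookup by the full name should find it.
by_full_name = doc.getElementsByTagName("com.example.id")
by_short_name = doc.getElementsByTagName("id")
by_namespace = doc.getElementsByTagNameNS("com.example", "id")

print(len(by_full_name), len(by_short_name), len(by_namespace))
print(by_full_name[0].tagName, by_full_name[0].namespaceURI)
```

A script that wants to treat "com.example." as a prefix can still do so
itself by inspecting the element names; nothing in the parser or the DOM
has to know about the convention.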
>> In your example, the local name of the innermost element MUST be
>> "com.example.id" for compatibility with existing behavior.
>
> Based on your following comment, it's not clear if you mean existing
> behavior of parsers or of the DOM API...
Parsers and DOM Level 2 are rather heavily coupled. I meant in terms
of the DOM API, in terms of the Infoset, in terms of the XPath data
model, in terms of Selectors and in terms of the browser-internal APIs.
>> Changing
>> what document.getElementsByTagName() returns here is not something
>> that's open for discussion. (As in, the probability of a browser
>> vendor shipping with the API behavior change is virtually zero.)
>
> Right, I wouldn't expect any DOM functions to change w.r.t.
> returning already-parsed DOM information.
OK. Your proposal specifically suggested changes to what specific DOM
methods return, though.
>> The namespace of the innermost element as reported by the DOM isn't
>> really open for discussion, either. In an HTML5-compliant UA it is
>> "http://www.w3.org/1999/xhtml", because this unifies the DOM with the
>> XHTML5 side, where the namespace is constrained by the XHTML legacy to
>> be "http://www.w3.org/1999/xhtml". In legacy UAs, the namespace is null.
>
> There are already special cases for SVG, MathML, etc., and already
> differences with legacy browsers. Is one more class of special cases
> beyond consideration? If so, why?
SVG and MathML are the only vocabularies that have already been
implemented in multiple browsers and that need bringing into the
text/html world. New things can simply be added to the
http://www.w3.org/1999/xhtml namespace. That's what was done with
<video>, for example.
I think it is a bug that the organization of the W3C into Working
Groups leaks to Web authors in the form of multiple namespaces.
(Consider Conway's Law.) Changing the namespace of SVG and MathML or
withdrawing Namespaces from the XML stack would be too late at this
point, so the namespaces for SVG and MathML need to be grandfathered
in, but there's no reason to keep minting more namespaces.
>>> Requirement: widely-known namespaces must be parsed to an equivalent
>>> DOM as xmlns
>
> Think of this: it's entirely possible that the arrival of a
> distributed extensibility mechanism in HTML (not just XHTML) might
> forever change some respects of how HTML gets written, and how XML
> vocabularies are defined.
>
> For example, say Tim's proposal mentioned earlier takes off. Then 1)
> many more HTML documents will be around using one-off <x.y.z>
> element names, and 2) future XML vocabularies might use element
> names like <foo.bar.baz> (possibly without namespaces) so that they
> could be readily used in HTML.
This is possible. (The XML vocabularies that wish to use dotted names
and be mixed readily into text/html should probably use the
http://www.w3.org/1999/xhtml namespace, though.)
> But even if this happens, there still exists *some* list of older
> namespaces/vocabularies that people will want to use in HTML. I
> don't have a strong opinion of what that list might be, so I put
> together some typical examples in the initial proposal in an attempt
> to smooth over the inevitable transition process.
I think it's pretty simple to form the list of privileged namespaces:
Take the set of namespaces supported by at least two browsers out of
IE, Firefox, Safari and Opera in subtrees of application/xhtml+xml
documents.
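The selection rule is mechanical enough to sketch in a few lines of
Python. The function name privileged_namespaces and its parameters are
hypothetical, and the input below is placeholder data with made-up
browser labels and urn:example: namespaces. It is not a survey of what
any real browser supports.

```python
from collections import Counter


def privileged_namespaces(support_by_browser, minimum=2):
    """Return the namespaces supported by at least `minimum` browsers.

    `support_by_browser` maps a browser label to the namespaces that
    browser supports in subtrees of application/xhtml+xml documents.
    """
    counts = Counter(
        namespace
        for namespaces in support_by_browser.values()
        for namespace in set(namespaces)  # count each browser once
    )
    return {namespace for namespace, n in counts.items() if n >= minimum}


# Placeholder input only: these are NOT real browser support data.
example_support = {
    "browser-a": {"urn:example:vocab-1", "urn:example:vocab-2"},
    "browser-b": {"urn:example:vocab-1"},
    "browser-c": {"urn:example:vocab-3"},
    "browser-d": set(),
}

print(sorted(privileged_namespaces(example_support)))
```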
>>> Example:
>>>
>>> <html using.math="math">...
>>> <p>
>>> E.g. <math><msqrt><mi>π</mi></msqrt></math>
>>> </p>
>>> ...</html>
>>
>> This already works in HTML5 without even having to use the using.math
>> stuff. I invite you to try it in a trunk nightly build of Firefox
>> after you've set the preference html5.enable to true in about:config.
>
> What if these namespace assignments could happen in a less magical
> fashion?
Why shouldn't SVG and MathML work just as easily as HTML? What benefit
is there for Web authors in having to use incantations like xmlns or
using.math when implementations have now shown that this stuff can
work without such incantations? I think the HTML5 way of incorporating
SVG and MathML is less magic in the sense that there are no spells for
the author to cast.
--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/