OASIS Mailing List Archives


Re: [xml-dev] MicroXML

On Dec 13, 2010, at 01:01, James Clark wrote:

>    http://blog.jclark.com/2010/12/microxml.html

> 	• MicroXML - by this I mean a subset of XML 1.0 that is not intended to replace XML 1.0, but is intended for contexts where XML 1.0 is, or is perceived as, too heavyweight.

Who would implement MicroXML instead of implementing XML 1.0? That is, what problem is being solved and for whom?

It's hard to evaluate the idea without knowing its intended goals.

> For example, IE doesn't support XHTML;

This is no longer true as of IE9.

> Mozilla doesn't incrementally render XHTML.

I fixed this back in 2006.

> HTML5 makes it possible to have "polyglot" documents that are simultaneously well-formed XML and valid HTML5.  I think this is potentially a superb format for documents: it's rich enough to represent a wide range of documents, it's much simpler than full HTML5, and it can be processed using XML tools.

I think making the author jump through hoops in order for the consumer to be able to use XML tools is the wrong solution. I think the right solution is that the consumer uses an HTML5 parser that exposes an XML infoset instead of using an XML parser at the start of the pipeline.

The HTML parser I've written exposes the SAX, DOM and XOM APIs, so the application code can be written the same way it would be written when using an XML parser to parse XHTML documents. Unlike the polyglot approach, this solution works without the content author's participation.
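
To make the shape of that pipeline concrete, here is a rough sketch in Python. It is not the parser described above: it uses the standard library's html.parser, which is tolerant but does not implement the HTML5 parsing algorithm, and the names TreeSink and parse_html are made up for this illustration. The point is only that the consumer gets an XML-style tree from tag soup, with no cooperation from the author.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Elements that take no content and no end tag in the HTML syntax.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


def qname(tag):
    """Return the ElementTree name for an element in the XHTML namespace."""
    return "{%s}%s" % (XHTML_NS, tag)


class TreeSink(HTMLParser):
    """Hypothetical helper: feed tolerant HTML events into an XML tree builder."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        attributes = {}
        for name, value in attrs:
            # Valueless attributes arrive as None; the first duplicate wins.
            attributes.setdefault(name, value or "")
        self.builder.start(qname(tag), attributes)
        if tag in VOID_ELEMENTS:
            self.builder.end(qname(tag))
        else:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Treat "<x/>" like "<x>": the trailing slash is ignored.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: ignore it
        while self.open_tags:
            closed = self.open_tags.pop()
            self.builder.end(qname(closed))
            if closed == tag:
                break

    def handle_data(self, data):
        self.builder.data(data)

    def finish(self):
        self.close()  # flush any text the tokenizer is still buffering
        while self.open_tags:
            self.builder.end(qname(self.open_tags.pop()))
        return self.builder.close()


def parse_html(text):
    """Parse an HTML fragment with a single root into an ElementTree element."""
    sink = TreeSink()
    sink.feed(text)
    return sink.finish()
```

The element returned by parse_html can be handed to ordinary ElementTree code, e.g. root.iter("{http://www.w3.org/1999/xhtml}img"), even when the input has unquoted attributes or missing end tags. A real HTML5 parser would additionally handle implied end tags, foster parenting and so on, which this toy does not attempt.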

> It would be great if HTML5 provided an alternate way (using attributes or elements) to declare that an HTML document be parsed in standards mode. Perhaps a boolean "standard" attribute on the <meta> element?

That would fail to enable the standards mode in browsers that are already out there, so I can say with confidence that HTML5 isn't going to change like this.

> I believe MicroXML should not impose any specific error handling policy;

This is a sure recipe for an interoperability failure. Well-specified behavior in error situations at least leads to interop even if the results are nonsensical at times. I think the right way to spec any successor of XML is to specify a normative tokenizer state machine in such a way that in every state, any possible input character always has a well-defined transition (like the HTML5 tokenizer has).
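
As an illustration of what "every state has a transition for every input" means, here is a deliberately tiny tokenizer sketch in Python. The states, the token shapes and the recovery choices are all invented for this example (they are neither the HTML5 tokenizer nor anything proposed for MicroXML); what matters is that each branch ends in an else and that end of input is handled in every state, so two implementations of the same table cannot disagree on malformed input.

```python
from enum import Enum, auto


class State(Enum):
    DATA = auto()
    TAG_OPEN = auto()
    END_TAG_OPEN = auto()
    TAG_NAME = auto()


def tokenize(text):
    """Return (tokens, errors); never raises, whatever the input is."""
    tokens, errors = [], []
    buf = []      # pending character data
    name = []     # characters of the tag name being read
    is_end = False
    state = State.DATA

    def flush_text():
        if buf:
            tokens.append(("text", "".join(buf)))
            buf.clear()

    def in_data(ch):
        # The DATA state's transition, reused when a character is reconsumed.
        if ch == "<":
            return State.TAG_OPEN
        buf.append(ch)
        return State.DATA

    for pos, ch in enumerate(text):
        if state is State.DATA:
            state = in_data(ch)
        elif state is State.TAG_OPEN:
            if ch == "/":
                state = State.END_TAG_OPEN
            elif ch.isalpha():
                flush_text()
                name, is_end, state = [ch], False, State.TAG_NAME
            else:
                # Defined recovery: the "<" was literal text; reconsume ch.
                errors.append((pos, "stray '<'"))
                buf.append("<")
                state = in_data(ch)
        elif state is State.END_TAG_OPEN:
            if ch.isalpha():
                flush_text()
                name, is_end, state = [ch], True, State.TAG_NAME
            else:
                errors.append((pos, "stray '</'"))
                buf.append("</")
                state = in_data(ch)
        else:  # State.TAG_NAME
            if ch == ">":
                tokens.append(("end" if is_end else "start", "".join(name)))
                state = State.DATA
            else:
                name.append(ch)  # toy rule: the name runs up to ">"

    # End of input is an input too: every state says what happens.
    if state is State.TAG_OPEN:
        errors.append((len(text), "stray '<'"))
        buf.append("<")
    elif state is State.END_TAG_OPEN:
        errors.append((len(text), "stray '</'"))
        buf.append("</")
    elif state is State.TAG_NAME:
        errors.append((len(text), "end of input inside a tag; tag dropped"))
    flush_text()
    return tokens, errors
```

With this table, tokenize("1 < 2") reports one error and still yields a single text token "1 < 2". The result may or may not be what the author meant, but every conforming implementation would produce the same one.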

> 	• Namespaces. This is probably the hardest and most controversial issue. I think the right answer is to take a deep breath and just say no. One big reason is that the HTML5 does not support namespaces (remember, I am talking about the HTML syntax of HTML5).

HTML5 doesn't support Namespace declarations in the text/html syntax. However, the data model HTML5 uses for the document tree has Namespaces. (The reason for this is reusing the XHTML, MathML and SVG code that had already been written to operate on a Namespaced data model.)

The parsing algorithm can't output trees with arbitrary namespaces. In the parser's output, elements are always in one of these namespaces: http://www.w3.org/1999/xhtml, http://www.w3.org/1998/Math/MathML and http://www.w3.org/2000/svg. Attributes on elements that are in the http://www.w3.org/1999/xhtml namespace are always in no namespace. Attributes on other elements can be in no namespace, in the http://www.w3.org/1999/xlink namespace, in the http://www.w3.org/XML/1998/namespace namespace or in the http://www.w3.org/2000/xmlns/ namespace. There's a finite set of attributes that can be in a namespace other than no namespace.
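
The rules in the previous paragraph amount to a small lookup, sketched here in Python. The function name attribute_name and the table layout are made up for illustration, and the table entries are written from memory of the parsing algorithm's foreign-attribute adjustment, so treat the exact list as an assumption and check the spec for the authoritative set.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# The only namespaces an element can have in the parser's output.
ELEMENT_NAMESPACES = frozenset({HTML_NS, MATHML_NS, SVG_NS})

# Finite set of source attribute names that end up in a namespace, and only
# on MathML/SVG elements. Illustrative list; consult the spec before relying on it.
FOREIGN_ATTRIBUTES = {
    "xlink:actuate": (XLINK_NS, "actuate"),
    "xlink:arcrole": (XLINK_NS, "arcrole"),
    "xlink:href": (XLINK_NS, "href"),
    "xlink:role": (XLINK_NS, "role"),
    "xlink:show": (XLINK_NS, "show"),
    "xlink:title": (XLINK_NS, "title"),
    "xlink:type": (XLINK_NS, "type"),
    "xml:lang": (XML_NS, "lang"),
    "xml:space": (XML_NS, "space"),
    "xmlns": (XMLNS_NS, "xmlns"),
    "xmlns:xlink": (XMLNS_NS, "xlink"),
}


def attribute_name(element_ns, source_name):
    """Map an attribute as written in text/html to (namespace, local name)."""
    if element_ns not in ELEMENT_NAMESPACES:
        raise ValueError("the parser never produces elements in %r" % element_ns)
    if element_ns == HTML_NS:
        # Attributes on HTML elements are always in no namespace.
        return (None, source_name)
    # Anything not in the table stays in no namespace, colon and all.
    return FOREIGN_ATTRIBUTES.get(source_name, (None, source_name))
```

So attribute_name(SVG_NS, "xlink:href") gives the XLink namespace with local name "href", while the same source text on an HTML element, or an unlisted name such as "foo:bar" on an SVG element, stays in no namespace with the colon as part of the local name.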

> 		• I would support the use of the xmlns attribute (not xmlns:x, just bare xmlns). However, as far as the MicroXML data model is concerned, it's just another attribute. It thus works in a very similar way to xml:lang: it would be allowed only where a schema language explicitly permits it; semantically it works as an inherited attribute; it does not magically change the names of elements.

If the data model doesn't support namespaces, how would one distinguish {http://www.w3.org/1999/xhtml}a and {http://www.w3.org/2000/svg}a in a code path that does things with the data model?
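
To show the kind of code path I mean, here is a small Python sketch using ElementTree, whose data model carries the namespace as part of the element name. The sample document and the function link_target are invented for this example. With the namespace in the data model the two kinds of "a" are told apart by a plain comparison; looking at the local name alone, they are indistinguishable.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"
XLINK = "http://www.w3.org/1999/xlink"

# Made-up sample: an XHTML link next to an SVG link.
DOC = """<div xmlns="http://www.w3.org/1999/xhtml">
  <a href="/docs">docs</a>
  <svg xmlns="http://www.w3.org/2000/svg"
       xmlns:xlink="http://www.w3.org/1999/xlink">
    <a xlink:href="/shape"><circle r="1"/></a>
  </svg>
</div>"""


def link_target(elem):
    """Return the link target of an XHTML or SVG 'a' element, else None."""
    if elem.tag == "{%s}a" % XHTML:
        return elem.get("href")
    if elem.tag == "{%s}a" % SVG:
        return elem.get("{%s}href" % XLINK)
    return None


root = ET.fromstring(DOC)
targets = [t for t in map(link_target, root.iter()) if t is not None]

# Without the namespace, both elements collapse to the same local name.
local_names = [e.tag.rpartition("}")[2] for e in root.iter()]
```

Here targets holds one entry per link, each read from the attribute that is right for its vocabulary. A data model that only exposes local_names would need some other mechanism, such as walking up the tree to find an inherited xmlns attribute, to make the same decision.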

> 		• An element probably also needs to have a flag saying whether it's an empty element. This is unfortunate but HTML5 does not treat an empty element as equivalent to a start-tag immediately followed by an end-tag: elements like <br> cannot have end-tag, and elements that can have content such as <a> cannot use the empty element syntax even if they happen to be empty. (It would be really nice if this could be fixed in HTML5.)

It can't be fixed, because existing content depends on the current behavior.

Henri Sivonen

