Re: [xml-dev] Never mind the browser, let's do MicroXML

On 18/12/2010 00:45, Kurt Cagle wrote:
> Interesting (and thanks for the civil reply - I've rather been making a
> stink of myself on this lately).
>
> What I sense that you're saying is that while the parser will attempt to
> parse anything thrown at it, there is still a core set of parse rules
> that are independent of the underlying semantics of the language. Put
> another way, there is a set of well-formedness rules, but the role of
> the parser is to provide a guess, based upon its internal heuristics, as
> to which particular rules apply when it encounters non-well-formed
> content in order to turn it into well-formed content prior to rendering
> it. Or, to state it yet another way, if a creator knows the heuristics
> they could encode any content ... just that there are specific use cases
> in XML that would create a different parse tree in HTML5. Would you say
> this is correct?
>
> Kurt Cagle
> XML Architect
> /Lockheed / US National Archives ERA Project/

I don't think you can (or at least you shouldn't) apply words like
"guess" and "heuristic" to a process that is entirely mechanical and
deterministic.

HTML5 isn't an extensible meta-language like SGML or XML. It has a fixed
set of element names, and any use of any other name is non-conforming
(which is the closest analogue to the XML or SGML concept of validity).
The difference from XML or SGML, however, is that in the non-conforming
case it doesn't just declare the input out of scope as "not well
formed". It defines a parse tree for _every_ input. Essentially, the
conformance rules apply only to authors (and authoring systems); an
HTML5 processor has defined behaviour on any old rubbish.
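
For contrast, here is a minimal Python sketch of the XML side of that
difference: a conforming XML parser simply refuses ill-formed input and
produces no tree at all. The helper name xml_verdict is hypothetical,
chosen only for illustration; xml.etree.ElementTree is the standard
library parser.

```python
import xml.etree.ElementTree as ET


def xml_verdict(text):
    """Report whether an XML parser accepts `text` (hypothetical helper)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # XML defines no tree for ill-formed input: the parser just stops.
        return "not well-formed"
    return "well-formed, root element: " + root.tag


print(xml_verdict("<a><b/></a>"))  # accepted
print(xml_verdict("<a><b></a>"))   # rejected: mismatched end tag
```

An HTML5 processor has no equivalent of the rejecting branch: its
parsing algorithm specifies a result for every input.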

<aaa<bbb</zzz>

has a defined parse tree. I don't actually know what it is, but FF4 will
tell me...

If I read that right, it parses as an element with name aaa<bbb< and a
single attribute with name zzz and value "".
I may have read that wrong (it's late), but it doesn't really matter: the
point is that it has some fixed parse tree.
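
To show how mechanical this is, here is a rough Python sketch of a small
subset of the HTML5 tokenizer states (tag name, self-closing start tag,
before/after attribute name, attribute name), applied to the
<aaa<bbb</zzz> input. It is a simplification under stated assumptions,
not the full algorithm: it handles one start tag only, skips attribute
values, NULL handling and parse-error reporting, and the function name
tokenize_start_tag is made up for this illustration.

```python
import string

WHITESPACE = "\t\n\f "


def _ascii_lower(ch):
    # The tokenizer lowercases ASCII upper-case letters only.
    return ch.lower() if ch in string.ascii_uppercase else ch


def tokenize_start_tag(text):
    """Return (tag_name, attributes, self_closing) for one start tag.

    Simplified sketch: attribute values ('=') are not supported.
    """
    if not (len(text) >= 2 and text[0] == "<" and text[1] in string.ascii_letters):
        raise ValueError("sketch only handles '<' followed by a letter")
    name, attrs, attr = "", {}, ""
    state, i = "tag name", 1
    while i < len(text):
        ch = text[i]
        if ch == "=" and state != "tag name":
            raise NotImplementedError("attribute values are outside this sketch")
        if state == "tag name":
            if ch in WHITESPACE:
                state = "before attribute name"
            elif ch == "/":
                state = "self-closing start tag"
            elif ch == ">":
                return name, attrs, False
            else:
                name += _ascii_lower(ch)  # note: '<' is just another character here
            i += 1
        elif state == "self-closing start tag":
            if ch == ">":
                return name, attrs, True
            state = "before attribute name"  # stray '/': reconsume ch
        elif state == "before attribute name":
            if ch in WHITESPACE:
                i += 1
            elif ch in "/>":
                state = "after attribute name"  # reconsume ch
            else:
                attr, state = "", "attribute name"  # reconsume ch
        elif state == "attribute name":
            if ch in WHITESPACE or ch in "/>":
                attrs.setdefault(attr, "")  # first occurrence of a name wins
                state = "after attribute name"  # reconsume ch
            else:
                attr += _ascii_lower(ch)
                i += 1
        else:  # after attribute name
            if ch in WHITESPACE:
                i += 1
            elif ch == "/":
                state = "self-closing start tag"
                i += 1
            elif ch == ">":
                return name, attrs, False
            else:
                attr, state = "", "attribute name"  # reconsume ch
    raise ValueError("input ended inside the tag")


print(tokenize_start_tag("<aaa<bbb</zzz>"))
```

Under these simplified rules the sketch gives the same reading as above:
tag name aaa<bbb< with one attribute zzz whose value is the empty
string. The second '<' is swallowed into the tag name, and the '/' that
is not followed by '>' just sends the tokenizer on to read an attribute
name.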

You cannot express an arbitrary valid XML tree as conforming HTML5, since
it stops conforming as soon as you use a non-HTML/MathML/SVG element
name. However, so long as you avoid those names (which carry special
parsing rules), you can deterministically produce input that will parse
to give essentially the same tree structure as XML without namespaces
(basically, just avoid using the /> syntax).

David