Re: XML5: Re: [xml-dev] MicroXML
- From: David Carlisle <davidc@nag.co.uk>
- To: Henri Sivonen <hsivonen@iki.fi>
- Date: Thu, 16 Dec 2010 09:30:58 +0000
On 16/12/2010 04:32, Henri Sivonen wrote:
> On Dec 14, 2010, at 05:17, David Carlisle wrote:
>
>> I've no complaint with html5 having defined fixup rules to give
>> consistent error recovery from overlapping markup and other
>> horrors, but I think the fact that it parses well formed XML and
>> produces different trees is just wrong.
>
> The easiest proof why it has to be this way is: There are Web pages
> that rely on the <html> tag getting implied per HTML 4 when not
> present in the source text. Therefore, the HTML5 parsing algorithm
> always outputs a tree whose root element is html. There are XML
> documents whose root element is not html. Therefore, it has to be
> that there are well-formed XML documents that parse into different
> trees using an XML parser and an HTML parser.
Yes, I nearly mentioned those cases as an exception :-) But you give the
example that's almost reasonable (html/head/body/tbody implication)
while not responding to the cases that actually cause the problems,
because they affect the parsing of arbitrarily small fragments: namely
the /> syntax and the different handling of end tags for individual
void elements.
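To make the fragment-level differences concrete, here is a rough sketch in Python. The XML side uses the standard xml.etree.ElementTree parser. The HTML side is a hypothetical toy function, toy_html_tree, written only for this illustration: it is not the HTML5 tree-construction algorithm, it ignores attributes and the html/head/body implication, and its VOID set is a partial list. It models just two behaviours: a trailing "/" is ignored on non-void elements, and "</br>" is treated like a "<br>" start tag while end tags of other void elements are dropped.

```python
import re
import xml.etree.ElementTree as ET

# Partial list of HTML void elements, enough for this illustration.
VOID = {"br", "img", "hr", "input", "meta", "link"}

# Matches a start/end tag without attributes, or a run of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\s*(/?)>|([^<]+)")


def toy_html_tree(source):
    """Return a list of (name, children) nodes built with HTML-style rules.

    Toy model only: no attributes, no implied html/head/body, and none of
    the real algorithm's error recovery beyond the two rules shown here.
    """
    root = ("#fragment", [])
    stack = [root]
    for close, name, _slash, text in TOKEN.findall(source):
        if text:
            stack[-1][1].append(text)
            continue
        name = name.lower()
        if not close:
            node = (name, [])
            stack[-1][1].append(node)
            # The trailing "/" is ignored: only void elements close at once.
            if name not in VOID:
                stack.append(node)
        elif name == "br":
            # "</br>" is handled as if it were a "<br>" start tag.
            stack[-1][1].append(("br", []))
        elif name in VOID:
            pass  # end tags of the other void elements are dropped
        else:
            # Close back to the nearest matching open element, if any.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i][0] == name:
                    del stack[i:]
                    break
    return root[1]


# XML: <p/> is an empty element, so "text" follows it as a sibling.
xml_div = ET.fromstring("<div><p/>text</div>")
print(xml_div[0].text, repr(xml_div[0].tail))

# HTML-style: the "/" is ignored, so "text" ends up inside p.
print(toy_html_tree("<div><p/>text</div>"))

# XML sees one br element; the HTML-style rules produce two.
print(len(ET.fromstring("<div><br></br></div>")))
print(toy_html_tree("<div><br></br></div>"))
```

Each of these inputs is well-formed XML only a few characters long, yet the two sets of rules yield different trees, and the last pair shows that br is handled differently from a void element such as img.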
It would also have been possible to stop implying html start tags if you
had been prepared to have a "more standards mode" implied by (say)
<!doctype html>
There were reasons for not doing that, but it was a choice, not an
absolute rule: it would not have been impossible to have a sensible
grammar for html.
David