Re: [xml-dev] Strict-Mode and Lax-Mode MicroXML

On Sun, 2012-06-03 at 10:00 +0100, Pete Cordell wrote:

> Note that this wouldn't be Postel lax processing because the laxness 
> wouldn't be left up to the parser implementers to define.  It would be part 
> of the spec.  (Or is that too HTML5-ish!). 

Jon Postel's "Law" - be liberal in what you accept and strict in what
you put out - when applied to HTML 5 (for example) would say that Web
servers are in violation of the relevant specs (HTTP, MIME, HTML) when
they emit content labelled as HTML that is not valid, but that Web
browsers should not be overly strict in processing what they receive.

The tradeoff is in whether the receiver can determine appropriate
behaviour for out-of-band input.

For HTML it's "can I display something, even if it's not what the author
intended."

For XML it's "can I use this data correctly even though the input
contains syntax errors" - this is more akin to TCP/IP (and closer to the
original domain of Postel's law) where a network packet with a bad
checksum is rejected and retransmission is requested.
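
To make the XML side of that tradeoff concrete, here is a minimal sketch
using Python's standard xml.etree.ElementTree parser. The helper name
parse_or_reject is made up for illustration; the point is only that a
conforming XML parser refuses input with a syntax error rather than
guessing at what was meant.

```python
import xml.etree.ElementTree as ET


def parse_or_reject(document):
    """Return the root element, or None if the input is not well-formed.

    Hypothetical helper: a caller would treat None like a bad checksum
    and ask the sender for a corrected document.
    """
    try:
        return ET.fromstring(document)
    except ET.ParseError:
        # No attempt at recovery: the data cannot be trusted.
        return None


good = parse_or_reject("<a>1 &lt; 2</a>")
bad = parse_or_reject("<a>1 < 2</a>")
```

Here good is an element whose text is "1 < 2", while bad is None because
the bare < is a syntax error.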

So,

>     If a < character is not followed by a nameStartChar or a ? character
>     or a ! character then it should be treated as a < character that has no
>     special meaning.
or / of course :-)

>     If a & character is not followed by one of the character sequences
>     gt; or lt; or amp; or quot; or apos; then it should be treated as a &
>     character that has no special meaning.

or # presumably.
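
As a rough sketch of what those two lax rules (with / and # added) could
look like in practice, the following Python rewrites lax input into the
strict form by escaping every < and & that the rules say has no special
meaning. The function name lax_to_strict is made up, and NAME_START is an
ASCII-only simplification of nameStartChar, which is an assumption; the
real production also admits many non-ASCII characters.

```python
import re

# ASCII-only approximation of nameStartChar (assumption, see above).
NAME_START = r"[A-Za-z_:]"

# A '<' counts as markup only when followed by a name start character,
# '?', '!' or '/'. Anything else is a stray '<'.
STRAY_LT = re.compile(r"<(?!" + NAME_START + r"|[?!/])")

# A '&' counts as markup only when it begins one of the five predefined
# references, or is followed by '#' (a character reference).
STRAY_AMP = re.compile(r"&(?!(?:gt|lt|amp|quot|apos);|#)")


def lax_to_strict(text):
    """Escape the '<' and '&' characters that the lax rules treat as literal."""
    text = STRAY_AMP.sub("&amp;", text)
    return STRAY_LT.sub("&lt;", text)
```

For example, lax_to_strict("a < b & c") gives "a &lt; b &amp; c", while
"<p>x &amp; y</p>" comes back unchanged.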

If I were designing XML (or µ-XML) from scratch I'd probably just want
an escape character so that I could write \< or \& or <a b="\""> or
whatever.
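
A minimal sketch of that escape-character idea, written as a translator
from the hypothetical backslash syntax to today's XML references. Nothing
here is part of any spec: the function name, the mapping table and the
choice to pass unknown escaped characters through unchanged are all
assumptions.

```python
import re

# Hypothetical mapping from an escaped character to its current XML spelling.
XML_SPELLING = {"<": "&lt;", "&": "&amp;", '"': "&quot;", "\\": "\\"}


def backslash_to_xml(text):
    """Rewrite backslash escapes such as \\< or \\& as XML references."""
    def replace(match):
        ch = match.group(1)
        # Escaped characters with no special meaning are kept as-is (assumption).
        return XML_SPELLING.get(ch, ch)

    # One backslash followed by any single character, newlines included.
    return re.sub(r"\\(.)", replace, text, flags=re.DOTALL)
```

With this, backslash_to_xml(r'<a b="\"">') yields <a b="&quot;">, and the
scanner only ever has one rule to apply: a backslash makes the next
character literal.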

Optimising for hand-authored documents is a mistake - make hand
authoring easy, but not at the expense of harder machine processing.

An example was the CDATA section, included in XML because the spec
authors wanted it for examples despite the fact it made parsing
irregular. Better might have been to include a CDATA element, e.g. by
saying element names "starting with %" (in SGML terms) were literal,
<%foo> ...</%foo>.

> I see this as a migration strategy to get away from some of the SGML baggage 
> that is no longer relevant, and maybe in 10 years time we can safely adopt 
> lax-mode for 99% of what developers want to do and have -- in comments etc.

There's no acceptable value of "10 years" for breaking changes.

XML today is used in consumer devices, in computer boot sequences, in
aircraft and car engines; it's not something that can change. µXML, if
successful and in use a decade from now, would be in a similar
situation.

Best,

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/




Copyright 1993-2007 XML.org. This site is hosted by OASIS