Strict-Mode and Lax-Mode MicroXML

From: "Pete Cordell" <petexmldev@codalogic.com>
To: <xml-dev@lists.xml.org>
Date: Sun, 3 Jun 2012 10:00:08 +0100

This will set the cat among the pigeons...

Did we discuss the option of having Strict-Mode and Lax-Mode for MicroXML?

Strict-Mode would be what John has documented.  Lax-Mode could be some 
defined set of error recoveries, for example: an & not followed by amp;, 
lt; etc. is treated as a literal &.

Note that this wouldn't be Postel lax processing because the laxness 
wouldn't be left up to the parser implementers to define.  It would be part 
of the spec.  (Or is that too HTML5-ish?)  For example you might have:

    2.4.1 Character data - Lax-Mode

    If a < character is not followed by a nameStartChar or a ? character
    or a ! character then it should be treated as a < character that has no
    special meaning.

    If a & character is not followed by one of the character sequences
    gt; or lt; or amp; or quot; or apos; then it should be treated as a &
    character that has no special meaning.
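
As a rough illustration of the two rules above, here is a minimal Python sketch that escapes stray < and & characters in a string.  The names lax_escape and is_name_start are hypothetical, and nameStartChar is approximated as "a letter or _", which is narrower than the real production.  The sketch also lets </ through as markup so that end tags survive; the rule as quoted does not mention /, so that part is an assumption.

```python
from typing import Tuple

# Entity names the quoted rule recognises after "&".
KNOWN_ENTITIES = ("gt;", "lt;", "amp;", "quot;", "apos;")


def is_name_start(ch: str) -> bool:
    """Rough stand-in for nameStartChar (assumption: letters and "_" only)."""
    return ch.isalpha() or ch == "_"


def lax_escape(text: str) -> Tuple[str, int]:
    """Return (text with stray < and & escaped, number of recoveries made)."""
    out = []
    recoveries = 0
    for i, ch in enumerate(text):
        if ch == "<":
            nxt = text[i + 1:i + 2]  # empty string at end of input
            # "/" is an assumption added so that end tags are kept as markup.
            if nxt and (is_name_start(nxt) or nxt in "?!/"):
                out.append(ch)
            else:
                out.append("&lt;")  # stray "<": no special meaning
                recoveries += 1
        elif ch == "&":
            if text.startswith(KNOWN_ENTITIES, i + 1):
                out.append(ch)
            else:
                out.append("&amp;")  # stray "&": no special meaning
                recoveries += 1
        else:
            out.append(ch)
    return "".join(out), recoveries
```

A recovery count of zero means the input needed no lax handling.  Note that, following the rule as written, a character reference such as &#x41; would also be treated as a stray &.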

A MicroXML parser would then return a status of either Strict-Well-Formed, 
Lax-Well-Formed, or Not-Well-Formed.  Perhaps when you called the parser you 
would also be able to specify that you want strict-mode.
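
The three-way status could look something like the Python sketch below.  Everything here is hypothetical: the Status enum, the check function and its strict flag are made-up names, and xml.etree.ElementTree is used only as a stand-in for a real MicroXML parser (it accepts full XML 1.0, so it is more permissive than MicroXML would be).  The lax recoveries are approximated with two regular expressions, with nameStartChar reduced to ASCII letters and _.

```python
import re
import xml.etree.ElementTree as ET
from enum import Enum


class Status(Enum):
    STRICT_WELL_FORMED = "Strict-Well-Formed"
    LAX_WELL_FORMED = "Lax-Well-Formed"
    NOT_WELL_FORMED = "Not-Well-Formed"


# "&" not followed by one of the five entity names.
_STRAY_AMP = re.compile(r"&(?!(?:gt|lt|amp|quot|apos);)")
# "<" not followed by a name start (ASCII approximation), "?", "!" or "/".
# Allowing "/" is an assumption so that end tags are left alone.
_STRAY_LT = re.compile(r"<(?![A-Za-z_?!/])")


def _parses(doc: str) -> bool:
    """Stand-in strict check: does the XML 1.0 parser accept the document?"""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def check(doc: str, strict: bool = False) -> Status:
    """Classify a document; with strict=True no lax recovery is attempted."""
    if _parses(doc):
        return Status.STRICT_WELL_FORMED
    if strict:
        return Status.NOT_WELL_FORMED
    # Apply the defined recoveries, then try again.
    repaired = _STRAY_LT.sub("&lt;", _STRAY_AMP.sub("&amp;", doc))
    if _parses(repaired):
        return Status.LAX_WELL_FORMED
    return Status.NOT_WELL_FORMED
```

For instance, check("<a>x & y</a>") would report Lax-Well-Formed, while the same call with strict=True would report Not-Well-Formed.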

Developers would be encouraged to generate documents that are 
Strict-Well-Formed, but to parse incoming documents in lax-mode, accepting 
anything that is at least Lax-Well-Formed.

I see this as a migration strategy to get away from some of the SGML baggage 
that is no longer relevant, and maybe in 10 years' time we can safely adopt 
lax-mode for 99% of what developers want to do, and allow things like -- in 
comments etc.

Pete Cordell
Codalogic Ltd
Interface XML to C++ the easy way using C++ XML
data binding to convert XSD schemas to C++ classes.
Visit http://codalogic.com/lmx/ or http://www.xml2cpp.com
for more info

Follow-Ups:
- Re: [xml-dev] Strict-Mode and Lax-Mode MicroXML
  - From: Liam R E Quin <liam@w3.org>
- Re: [xml-dev] Strict-Mode and Lax-Mode MicroXML
  - From: John Cowan <cowan@mercury.ccil.org>
- Re: [xml-dev] Strict-Mode and Lax-Mode MicroXML
  - From: Peter Flynn <peter@silmaril.ie>
