   Re: [xml-dev] version numbers and infosets


At 3:17 PM +0100 7/23/02, Richard Tobin wrote:
>>You want to add a fourth level, not well-formed but OK.
>This happens all the time anyway.  Parsers that don't read all
>external entities accept not-well-formed documents when the error
>occurs in an entity they don't read.

That's a separate issue, and the interoperability problems it causes 
today should be more than enough to convince us that we don't want to 
exacerbate them.

>>I think this is a very bad idea. Leave the existing spec
>>alone: no erratum, no change. XML 1.0 is defined by the XML 1.0 spec
>>as originally published. Make any changes in future versions if you
>>really must, but don't touch XML 1.0.
>There are two problems with that:
>  - there will be documents that are both well-formed XML 1.0 documents
>    and well-formed XML 1.1 documents, and they will have different
>    infosets depending on which you consider them as (eg a document with
>    a NEL in a tokenized attribute)

My preferred fix is simply to ditch XML 1.1 completely. Do not issue 
it. Then this problem goes away.

A slightly less radical fix is to ditch NEL from white space, while 
still allowing the other changes in XML 1.1. This would fix the 
conflicting infoset problem nicely.
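
To make the conflicting-infoset case concrete, here is a minimal Python
sketch of normalizing a tokenized attribute value under two white space
sets. It assumes, as this thread describes, that the 1.1 draft counts NEL
(U+0085) as white space. The names XML10_WHITESPACE, DRAFT11_WHITESPACE,
and normalize_tokenized are made up for illustration, and this is not a
full implementation of attribute-value normalization.

```python
# White space per XML 1.0: space, tab, carriage return, line feed.
XML10_WHITESPACE = "\x20\x09\x0D\x0A"
# Assumption from this thread: the 1.1 draft also counts NEL (U+0085).
DRAFT11_WHITESPACE = XML10_WHITESPACE + "\x85"


def normalize_tokenized(value, whitespace):
    """Replace each white space character with a space, then trim
    leading/trailing spaces and collapse runs of spaces to one."""
    replaced = "".join(" " if ch in whitespace else ch for ch in value)
    # split(" ") is used on purpose: a bare split() would apply Python's
    # own notion of white space, which already includes U+0085.
    return " ".join(token for token in replaced.split(" ") if token)


value = "a\x85b"
as_10 = normalize_tokenized(value, XML10_WHITESPACE)    # one token
as_11 = normalize_tokenized(value, DRAFT11_WHITESPACE)  # two tokens
print(repr(as_10), repr(as_11))
```

The same attribute value comes out as one token under the 1.0 rules and
two tokens under the assumed draft rules, which is the difference in
infosets at issue.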

>  - there will be documents labelled 1.1 which are well-formed XML 1.0
>    but not well-formed XML 1.1 (eg a document with unnormalized unicode).

Really?  Is the working group seriously considering requiring Unicode 
normalization in XML 1.1? OK, I just looked in the spec, and I see it 
is there. That's so nasty I must have blocked it out of my memory. Let 
me go on record as stating that this is a very bad idea. It makes 
life significantly more difficult for implementers and users, 
especially those working with non-English text, exactly the community 
XML 1.1 is allegedly being developed to support.
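
As a rough illustration of what such a check costs users, here is a small
Python sketch that tests whether text is already normalized. It assumes
the normalization form in question is NFC, which this thread does not
state, and the helper name is_nfc is invented for the example.

```python
import unicodedata


def is_nfc(text):
    """Return True if text is unchanged by NFC normalization.
    NFC is an assumption here; the thread does not name the form."""
    return unicodedata.normalize("NFC", text) == text


precomposed = "caf\u00e9"   # e-acute as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

print(is_nfc(precomposed), is_nfc(decomposed))
```

Both strings render identically, yet only the first would pass such a
check, so text produced by an editor or keyboard that emits combining
characters would be rejected.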

>You say it "complexifies" the XML story, but it seems much simpler to
>me for a document labelled 1.0 to be only a well-formed 1.0 document
>and a document labelled 1.1 to be only a well-formed 1.1 document.

If you really want this, fix it in 1.1. Add something to XML 1.1 that 
is clearly illegal in XML 1.0 rather than pretending you can change 
XML 1.0 ex post facto. For example, you could replace the version 
attribute in the XML declaration with an edition attribute and make 
the XML declaration required. Thus all well-formed XML 1.1 documents 
would be malformed XML 1.0 documents and vice versa.  For example,

<?xml edition="1.0"?>
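
A hypothetical sketch of how a processor could tell the two apart under
this proposal, in Python. The edition attribute exists only in the
suggestion above, not in any spec, and the function name and return
labels are invented. The sketch looks only at the start of the document
and ignores byte order marks and encoding detection.

```python
import re

# Matches the start of an XML declaration and captures whether its first
# pseudo-attribute is "version" (XML 1.0) or "edition" (proposal above).
_DECL = re.compile(r'^<\?xml\s+(version|edition)\s*=\s*(["\'])([^"\']*)\2')


def declared_kind(document):
    """Classify a document by the first pseudo-attribute of its
    XML declaration, or report that no declaration was found."""
    match = _DECL.match(document)
    if match is None:
        # XML 1.0 allows the declaration to be omitted; under the
        # proposal a declaration with edition would be mandatory.
        return "no-declaration"
    return match.group(1)


print(declared_kind('<?xml version="1.0"?><doc/>'))
print(declared_kind('<?xml edition="1.0"?><doc/>'))
print(declared_kind('<doc/>'))
```

Because XML 1.0 requires version whenever a declaration is present, a
document opening with an edition declaration is already malformed for any
XML 1.0 parser, so no retroactive change to 1.0 is needed.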

The coup that used an erratum to overturn a clear decision of the 
namespaces working group a couple of years after the fact was a 
horrible precedent. It severely shook the belief of many XML 
developers in the stability of W3C specs.
The Schema group's change to the gMonth types was similarly bad. This 
one's even worse.

That this is even being considered seriously proves that developers 
cannot rely on W3C stare decisis. Undoubtedly mistakes will be made 
in specs, and in hindsight we'll see lots of things we would have 
done differently had we known then what we know now. Nonetheless, it 
is more important to remain stable than to fix the problems. If new 
versions are required, they should be genuine new versions, not 
stealth changes in existing specs.

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |

