   Re: [xml-dev] whitespace in 1.1

In article <20030312234326.1ac58fb1.amyzing@talsever.com> you write:

>> It doesn't look like XML 1.1 changes the S production of XML 1.0, so
>> while NEL is permitted in element and attribute content, it isn't
>> considered whitespace inside of markup components like tags and
>> declarations.

>I asked about this

... and you will shortly be receiving an official response to your
CR comment...

>and was told that it's supposed to be normalized to
>LF before whitespace processing happens.

Right.  Or at least, the result of parsing has to be as if it had been
done that way.
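
As a rough illustration of that "as if" behaviour, the XML 1.1 line-end
rules can be sketched as a single pass over the input before any other
processing.  This is a minimal sketch, not code from any real parser;
the name normalize_line_ends_xml11 is made up for the example.

```python
import re

# XML 1.1 line ends that are translated to a single LF (#xA):
#   CR LF, CR NEL, lone NEL (#x85), LSEP (#x2028), and any other lone CR.
# The two-character sequences are listed first so they are consumed whole.
_XML11_LINE_END = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_line_ends_xml11(text: str) -> str:
    """Hypothetical helper: return text with XML 1.1 line ends mapped to LF."""
    return _XML11_LINE_END.sub("\n", text)
```

After this step no literal CR, NEL or LSEP is left in the text, so a NEL
typed between attributes behaves like an LF even though NEL itself is
not in S.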

>At which point I asked why CR
>was part of the S production,

I don't know for sure why it is.  The original wording of the line-end
normalization section in 1.0 was different (it was changed because it
was subtly inconsistent), and including CR may have seemed more
reasonable at the time.  For normal use, it makes no difference whether
CR is in S or not, because any literal CR has already been normalized
away before S is matched.
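
For reference, the S production is S ::= (#x20 | #x9 | #xD | #xA)+.  A
small sketch of a check against it (is_xml_s is a hypothetical name,
used only here) shows where CR and NEL stand:

```python
# The four characters allowed by the S production (space, tab, CR, LF).
XML_S_CHARS = frozenset("\x20\x09\x0D\x0A")


def is_xml_s(text: str) -> bool:
    """Hypothetical helper: True if text is non-empty and consists only of S characters."""
    return len(text) > 0 and all(ch in XML_S_CHARS for ch in text)
```

With this definition is_xml_s("\r") is true and is_xml_s("\x85") is
false.  The CR case can only arise for a CR that escaped line-end
normalization, for example one introduced through a character
reference.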

> and was given this hideous hack, using
>parameter entities, that allows one to force an un-normalized CR into
>attribute content.

You can indeed do this - and you only need internal general entities,
not parameter entities - but I doubt that was actually considered as a
reason when XML 1.0 was written.

So, what should 1.1 do?  It could

(1) add NEL (and LSEP) to S, so you could do the same stupid entity hack;
(2) remove CR from S, because no-one really wants it and it's confusing;
(3) leave S as it is.

The trouble with (1) is that it would mean that there would be lots of
places where parsers would have to apply different rules for 1.0 and
1.1 documents (unlike the conversion to LF, which only happens in one
place for most parsers).  This was not considered worth it for the
non-existent advantage of being able to put NELs in internal entities.

I'd be happy with (2), but it has the same problem: if we took CR out
of S in 1.1, parsers would have to reject it for 1.1 documents but
accept it for 1.0 documents.  So again it was not considered worthwhile.

(3) is ugly if you care about such things, but seemed like the least
bad choice.  Probably no-one will ever construct a document where it
makes a difference, except those of us who write test suites.

-- Richard

