OASIS Mailing List Archives

Re: Gag me with a blunt …



From: Susan Malaika <malaika@us.ibm.com>

>> Well, zenkaku(ideographic) space is kind of the same. I have
>> fought with it in the past and lost face, if nothing else ;-)
>
>Without any change to XML parsers, that leaves zenkaku and OS390 users
> with one option as far as I can see:
>      To clean up zenkaku spaces and OS390 [NEL]s before handing XML
>      documents to parsers
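
The cleanup step described in the quoted text could be sketched roughly as
follows. This is only an illustration under stated assumptions: the helper
name clean_for_xml and the sample document are hypothetical, and the blanket
replacement also rewrites these characters inside text content, which may or
may not be what a given application wants.

```python
import xml.etree.ElementTree as ET

# Map the two troublesome characters onto characters that XML 1.0
# already treats as white space. (Hypothetical mapping for illustration.)
CLEANUP = {
    0x3000: " ",   # IDEOGRAPHIC SPACE (zenkaku space) -> U+0020
    0x0085: "\n",  # NEL (next line, common on OS/390) -> U+000A
}

def clean_for_xml(text: str) -> str:
    """Replace zenkaku spaces and NELs before the text reaches a parser."""
    return text.translate(CLEANUP)

# A zenkaku space separates the element name from its attribute,
# and a NEL separates two lines of content.
raw = "<doc\u3000id='1'>first\u0085second</doc>"
root = ET.fromstring(clean_for_xml(raw))
```

After cleaning, the attribute and the line break are visible to an ordinary
XML 1.0 parser without any change to the parser itself.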

[[[[While I do agree with Gavin that all visual whitespace characters should
be separators in XML, the zenkaku space in particular has quite a good
counter-argument.  (Zenkaku are the East Asian double-width characters,
called "fullwidth" because East Asians take the ideograph as the standard
size.)  If people use full-width spaces they will use full-width alphabetics
too, and because XML does not provide folding, a full-width name will not
match its ordinary-width equivalent, so there will be more non-well-formed
documents.]]]]
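
To make the folding point concrete, here is a small sketch. It assumes that
Unicode NFKC normalization is an acceptable stand-in for "folding" (XML
itself specifies no such step), and the helper is_well_formed and the sample
document are hypothetical.

```python
import unicodedata
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The start tag is typed with FULLWIDTH LATIN SMALL LETTER P (U+FF50),
# the end tag with an ordinary ASCII "p". The parser does no folding,
# so it does not see a matching pair of tags.
mixed = "<\uff50>text</p>"

# NFKC maps full-width letters (and the zenkaku space) to their
# ordinary-width equivalents. Applying it is this sketch's assumption,
# not something an XML parser does for you.
folded = unicodedata.normalize("NFKC", mixed)

mixed_ok = is_well_formed(mixed)
folded_ok = is_well_formed(folded)
```

The unfolded document is rejected while the folded one parses, which is the
kind of accidental non-well-formedness the counter-argument predicts.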

There is another relevant question: what use do these characters have that
is semantically different from XML's #xA?

In other words, is NEL, like TAB, an archaism that doesn't fit well with XML
(or with SGML in the absence of short references)?  XML was not designed as a
round-tripping format for binary data.  If lines are significant, perhaps
they need to be serialized as
 <line>....</line>
if you catch my drift.
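
A minimal sketch of that idea, assuming a hypothetical wrapper element named
"lines" and a hypothetical helper lines_to_xml.  It relies on Python's
str.splitlines, which treats NEL (U+0085) as a line boundary alongside the
usual line endings.

```python
import xml.etree.ElementTree as ET

def lines_to_xml(text: str) -> str:
    """Wrap each line of text in its own <line> element."""
    root = ET.Element("lines")  # hypothetical wrapper element
    # str.splitlines() splits on \n, \r\n, \r, NEL (U+0085) and a few
    # other line boundaries, so the original separator no longer matters.
    for line in text.splitlines():
        ET.SubElement(root, "line").text = line
    return ET.tostring(root, encoding="unicode")

# One NEL-terminated line and one LF-terminated line, then a final line.
xml_text = lines_to_xml("a\u0085b\nc")
```

Once lines are markup rather than control characters, it no longer matters
which line terminator the originating system used.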

Cheers
Rick Jelliffe