
Re: Gag me with a blunt …



James Clark wrote:

> The solution appears obvious to me: the EBCDIC encoding table used by
> the XML parser should map byte 0x85 to Unicode character 0xA.

This is certainly a good workaround, but it fails for EBCDIC documents
that are translated to UTF-8 or UTF-16 before being passed to the XML
parser.

When I looked at this about a year ago, it appeared that existing
parsers didn't in fact translate 0x85 to 0xA as part of their Unicode
translation, so they would have to be changed anyway to support EBCDIC
documents.  Given that, it seems that they might as well do it at the
XML line-end normalization level instead, and thus support NEL for
UTF-8 and UTF-16 encoded documents as well as EBCDIC encoded ones.
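
To make the second option concrete, here is a minimal sketch of line-end
normalization done after decoding, so that it applies regardless of the
original encoding. It is an illustration only, not code from any existing
parser: the name normalize_line_ends is hypothetical, and the set of
sequences handled (CR LF, lone CR, and NEL taken as U+0085) is an
assumption about what such a rule would cover.

```python
import re

# Hypothetical helper, not taken from any real XML parser.
# Assumption: NEL is the Unicode character U+0085, and it is treated
# like the usual CR LF and lone CR line ends.
_LINE_END = re.compile("\r\n|\r|\x85")


def normalize_line_ends(text: str) -> str:
    """Replace CR LF, lone CR, and NEL (U+0085) with a single LF."""
    return _LINE_END.sub("\n", text)


# The input is already-decoded text, so the same rule covers a document
# that arrived as UTF-8 or UTF-16 as well as one decoded from EBCDIC.
utf8_doc = "<a>\u0085</a>".encode("utf-8")
utf16_doc = "<a>\u0085</a>".encode("utf-16")

print(repr(normalize_line_ends(utf8_doc.decode("utf-8"))))
print(repr(normalize_line_ends(utf16_doc.decode("utf-16"))))
```

Because the substitution runs on characters rather than bytes, nothing in
the per-encoding tables needs a special case for line ends.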

-- Richard