   Re: [xml-dev] Some comments on the 1.1 draft

From: "John Cowan" <cowan@mercury.ccil.org>
> Rick Jelliffe scripsit:
> > Even if you only use ISO 8859-1, it is still important. The Euro=0x80
> > mistake will be increasingly common, and we need to make sure that
> > XML processors continue to catch this error.
> But they don't!
> Characters U+0080 through U+009F are legal XML content.
> You are talking about a "defense" that doesn't even exist.
No. 0x85 is not, AFAIK, a character in ISO 8859-1 (one of the design principles
of 8859-1 is that it will not fail on systems that mask the 8th bit and look for
control characters). So a document labelled as ISO 8859-1 but containing a 0x85
false Euro should fail on import. Because the 0x85 character does not exist in
8859-1, it never gets as far as Unicode.
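
A minimal sketch of the kind of import-time check described above, assuming
a decoder that treats bytes 0x80 through 0x9F as having no character assigned
in ISO 8859-1. The function name decode_strict_8859_1 is hypothetical, and this
is not how MSXML or any particular parser is implemented. Note that Python's
built-in "latin-1" codec does not reject these bytes by itself (it maps them to
U+0080 through U+009F), so the range check is done by hand.

```python
def decode_strict_8859_1(data: bytes) -> str:
    """Decode bytes labelled ISO 8859-1, rejecting the 0x80-0x9F range.

    Hypothetical helper: it assumes the strict reading in which these
    byte values are not characters of ISO 8859-1 at all.
    """
    for offset, byte in enumerate(data):
        if 0x80 <= byte <= 0x9F:
            raise ValueError(
                f"byte 0x{byte:02X} at offset {offset} "
                "is not a character in ISO 8859-1"
            )
    # Every remaining byte maps one-to-one onto U+0000..U+007F or U+00A0..U+00FF.
    return data.decode("latin-1")


# A genuine ISO 8859-1 document decodes normally.
print(decode_strict_8859_1(b"caf\xe9"))

# A Windows-1252 Euro sign (byte 0x80) mislabelled as ISO 8859-1 is caught.
try:
    decode_strict_8859_1(b"price: \x80 10")
except ValueError as err:
    print("rejected:", err)
```

With a check of this kind the mislabelled byte is reported at the decoding
stage, before any Unicode character in the U+0080 to U+009F range is produced.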

MSXML 4 gets this right and reports an error in that case. I have had a support
request about this with our validator, so I had to look into it.

The defense does exist.

Rick Jelliffe



Copyright 2001 XML.org. This site is hosted by OASIS