From: "John Cowan" <firstname.lastname@example.org>
> Rick Jelliffe scripsit:
> > Even if you only use ISO 8859-1, it is still important. The Euro=0x80
> > mistake will be increasingly common, and we need to make sure that
> > XML processors continue to catch this error.
> But they don't!
> Characters U+0080 through U+009F are legal XML content.
> You are talking about a "defense" that doesn't even exist.
No. AFAIK 0x85 is not a character in ISO 8859-1 (it is one of the design principles
of 8859-1 that it will not fail on systems that mask the 8th bit and look for control
characters). So a document labelled as ISO 8859-1 but containing a 0x85 false Euro
should fail on import. The 0x85 character not existing in 8859-1, it never gets as far
MSXML 4 gets this right and reports an error in that case. I have had a support
request about this for our validator, so I had to look into it.
The defense does exist.
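
To make the check described above concrete, here is a minimal sketch in Python of a decoder that takes the strict reading of ISO 8859-1, where bytes 0x80 through 0x9F are treated as unassigned and rejected. The function name `decode_strict_8859_1` is hypothetical, and this is not how MSXML 4 or any particular validator is implemented. Note that Python's built-in "latin-1" codec takes the permissive reading and maps those bytes straight to U+0080 through U+009F, so the range test has to be done by hand.

```python
def decode_strict_8859_1(data: bytes) -> str:
    """Decode bytes labelled ISO 8859-1, rejecting the 0x80-0x9F range.

    Assumption: the 0x80-0x9F range is treated as unassigned, following the
    strict reading argued in the message. Python's "latin-1" codec would
    accept these bytes silently.
    """
    for offset, byte in enumerate(data):
        if 0x80 <= byte <= 0x9F:
            raise ValueError(
                f"byte 0x{byte:02X} at offset {offset} is not an "
                "ISO 8859-1 graphic character (mislabelled Windows-1252?)"
            )
    # Every remaining byte maps one-to-one to the same Unicode code point.
    return data.decode("latin-1")


if __name__ == "__main__":
    print(decode_strict_8859_1(b"caf\xe9"))   # ordinary Latin-1 text passes
    try:
        decode_strict_8859_1(b"price: \x80 5")  # Windows-1252 Euro byte
    except ValueError as err:
        print("rejected:", err)
```

For comparison, `b"\x80".decode("cp1252")` yields the Euro sign, which is why a Windows-1252 document mislabelled as ISO 8859-1 is the usual source of these bytes.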