"Michael Kay" <firstname.lastname@example.org> wrote on 03/18/2010 03:01:53 PM:
>> It's not well-formed.
>> From the XML 1.0 spec :
>> "It is a fatal error if an XML entity is determined (via default,
>> encoding declaration, or higher-level protocol) to be in a certain
>> encoding but contains byte sequences that are not legal in that encoding."
> Unless of course there is a "higher-level protocol" that tells you
> it's really a different encoding. (The term higher-level protocol is
> not really defined. I think they had in mind the media-type from the
> HTTP content header.
Right, or through other methods such as InputSource.setEncoding() in the SAX API. When I gave my overly simplistic answer, I was assuming (for Roger) that the encoding was being determined by the encoding declaration.
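
To illustrate the setEncoding() case, here is a minimal sketch using the JAXP/SAX parser bundled with the JDK. The class name EncodingOverrideSketch, the helper parses(), and the sample document are made up for illustration. The bytes are ISO-8859-1 with no encoding declaration, so by default the parser treats them as UTF-8 and should reject the lone 0xE9 byte; when the application supplies the encoding externally, the same bytes should parse.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of any library.
class EncodingOverrideSketch {

    // Returns true if the bytes parse without a fatal error.
    // externalEncoding == null means "let the parser decide"
    // (BOM, encoding declaration, or the UTF-8 default).
    static boolean parses(byte[] xml, String externalEncoding) {
        SAXParser parser;
        try {
            parser = SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("no usable SAX parser", e);
        }

        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        if (externalEncoding != null) {
            // The application acts as the "higher-level protocol" here.
            source.setEncoding(externalEncoding);
        }

        try {
            parser.parse(source, new DefaultHandler());
            return true;
        } catch (SAXException | IOException e) {
            // Illegal byte sequences surface as a fatal error; depending on
            // the parser this may be a SAXParseException or an IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        // "<a>é</a>" encoded as ISO-8859-1, with no XML declaration.
        byte[] latin1 = "<a>\u00e9</a>".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("default:  " + parses(latin1, null));
        System.out.println("override: " + parses(latin1, "ISO-8859-1"));
    }
}
```

Whether an externally supplied encoding takes precedence over a conflicting encoding declaration inside the document is a separate question; the sketch avoids it by leaving the declaration out.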
> In terms of the protocol stack, that of course
> is a lower-level protocol. But it's sufficiently woolly that a phone
> call from the sender to say "Oops, I meant EBCDIC" would be enough
> to make the document well-formed.
> Michael Kay
XML Parser Development
IBM Toronto Lab