RE: How to specify a Processing Instruction? (better: how to control encoding on saving)

> From: ComCity [mailto:mikeb@comcity.com]
> Sent: Thursday, August 30, 2001 12:00 AM
> To: Julian Reschke; xml-dev@lists.xml.org
> Subject: Re: How to specify a Processing Instruction? (better: how to
> control encoding on saving)
> ...
> > Then obviously something is wrong on *your* side. Or do you happen to use
> > loadXML()? In which case I recommend to read the SDK documentation and the
> > MSDN article about encodings
> > (<http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnxml/html/xmlencodings.asp>).
> Yes, I do use loadXML and not load.  I have read this article about 10 times
> already.  However, reading again after your and others' posts has

Maybe you should read it again and pay attention to:

"The LoadXML method always takes a Unicode BSTR that is encoded in UCS-2 or
UTF-16 only. If you pass in anything other than a valid Unicode BSTR to
LoadXML, it will fail to load."
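
To illustrate the point of that quote, here is a minimal sketch in Python
(not MSXML; the sample bytes and variable names are hypothetical). It only
shows that bytes received in some encoding must be decoded to text before
they can be handed to a string-based loader, and that the resulting text can
then be expressed as UTF-16 code units, which is what a COM string holds.

```python
# Hypothetical response bytes from a second party, serialized as ISO-8859-1.
raw = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<name>M\u00fcller</name>'
).encode("iso-8859-1")

# A string-based loader needs text, not raw bytes, so the bytes must first
# be decoded with the encoding they were actually written in.
text = raw.decode("iso-8859-1")

# The same characters expressed as UTF-16 code units (little-endian, no BOM).
utf16_units = text.encode("utf-16-le")

# Treating the ISO-8859-1 bytes as UTF-8 does not work: 0xFC is not a valid
# UTF-8 start byte, so decoding raises an error.
try:
    raw.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False
```

Note that once the text is a Unicode string, the encoding named in its XML
declaration no longer describes how the characters are stored.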

> made this a little clearer to me.  It appears I'm reading an existing XML
> document with LoadXML.  That document is thus read in by default as UTF-8
> because there is no encoding declaration.  Then I receive an XML response
> from a second

No. MSXML assumes UTF-16 because that's the encoding used in COM strings.

> party.  I try and move one node from that XML document into my new one where
> that node is actually in ISO-8859-1.  It automatically converts the

Nodes aren't in any other encoding than the one used by the DOM (here: UTF-16).

> ISO-8859-1 node to UTF-8 otherwise it wouldn't be able to stick it in the
> XML document in the first place and the whole thing quits working.

Please make sure you understand the difference between a DOM (an in-memory
object-model where strings are always encoded in UTF-16) and a particular
byte serialization (the XML "file").
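
The distinction can be sketched with Python's xml.dom.minidom standing in for
MSXML (an assumption made purely for illustration; the documents and element
names below are hypothetical, and Python's in-memory strings are Unicode but
not necessarily UTF-16). The source encoding disappears at parse time, nodes
move between documents without any conversion, and an encoding is chosen
again only when the tree is written out as bytes.

```python
from xml.dom import minidom

# Hypothetical inputs: one document serialized as UTF-8, one as ISO-8859-1.
mine = minidom.parseString("<order/>".encode("utf-8"))
theirs = minidom.parseString(
    (
        '<?xml version="1.0" encoding="ISO-8859-1"?>'
        "<reply><name>M\u00fcller</name></reply>"
    ).encode("iso-8859-1")
)

# In memory, node values are plain Unicode strings; the byte encoding of the
# file they came from is no longer a property of the node.
name = theirs.getElementsByTagName("name")[0]
node_text = name.firstChild.data

# Copying a node into the other document needs no transcoding step.
mine.documentElement.appendChild(mine.importNode(name, True))

# An encoding is picked only when the DOM is serialized back to bytes.
as_utf8 = mine.toxml(encoding="utf-8")
as_latin1 = mine.toxml(encoding="iso-8859-1")
```

Both serializations describe the same tree; only the bytes on disk differ.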