- To: <xml-dev@lists.xml.org>
- Subject: utf8 -> utf16 conversion problem ?
- From: "Alex Ben-Ari" <alexb@mercadosw.com>
- Date: Wed, 12 Feb 2003 11:34:27 +0200
- Thread-index: AcLSee8HXrEMtGc6S4KoiISNgEh0ng==
- Thread-topic: utf8 -> utf16 conversion problem ?
Hello all.
I am wondering if anyone can help me with the following little XML/Unicode curiosity:
I have a Xerces-C SAX2 parser parsing an in-memory buffer of XML encoded in UTF-8.
When it reads the byte sequence C3 AA (which is 'e' with a circumflex), the characters() method
receives 00C3 00AA, which is wrong!
The correct UTF-16 value for 'e' with a circumflex is 00EA.
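To make the arithmetic concrete, here is a minimal sketch of how I expect the two bytes to combine, next to a plain byte-per-character widening (as an ISO-8859-1 decoder would do), which happens to give exactly the values I am seeing. The helper names are my own and are not part of Xerces-C, and I am not claiming this is what the parser does internally.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Xerces-C API): decode one two-byte UTF-8
// sequence of the form 110xxxxx 10xxxxxx into a single UTF-16 code unit.
std::uint16_t decodeUtf8TwoByte(std::uint8_t lead, std::uint8_t cont) {
    assert((lead & 0xE0) == 0xC0);  // lead byte must be 110xxxxx
    assert((cont & 0xC0) == 0x80);  // continuation byte must be 10xxxxxx
    // Take the low 5 bits of the lead byte and the low 6 bits of the
    // continuation byte, then join them into an 11-bit code point.
    return static_cast<std::uint16_t>(((lead & 0x1F) << 6) | (cont & 0x3F));
}

// Hypothetical helper: widen one byte to one 16-bit unit unchanged,
// which is what decoding the buffer as ISO-8859-1 would amount to.
std::uint16_t widenLatin1(std::uint8_t byte) {
    return byte;
}
```

With these, decodeUtf8TwoByte(0xC3, 0xAA) yields 0x00EA, while widenLatin1 applied to each byte yields 0x00C3 and 0x00AA.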
What is going on?
Thanks to anyone who can enlighten me on this one.
Alex.
P.S. The code I am using looks like this:
void someFunction(char *xml, int32_t xml_len) {
    SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
    // ExtractHandler is derived from ContentHandler and ErrorHandler
    ExtractHandler extHandler(xml, xml_len);
    parser->setContentHandler(&extHandler);
    parser->setErrorHandler(&extHandler);
    // Wrap the caller's buffer; "" is the buffer id, false means do not adopt the buffer
    MemBufInputSource mbis((XMLByte*)xml, xml_len, "", false);
    mbis.setCopyBufToStream(false); // not necessary to duplicate the buffer
    parser->parse(mbis);
    delete parser; // the reader returned by the factory is owned by the caller
}