- From: "Steve Kearon" <firstname.lastname@example.org>
- To: <email@example.com>
- Date: Tue, 15 Dec 1998 10:07:56 -0000
Can someone clarify the issue of character encodings for me - I think this
is an expat issue, but it may be a more general thing.
I'm trying to save/load text that might contain accented characters (>127).
Running on Windows95. I realise that when writing XML, I either have to
convert such characters to "&#xxx;" form, or note that the file format
encoding is "iso-8859-1", otherwise the XML parser (expat) objects when
subsequently reading the file.
The snag is that whether the file has utf-8 or iso-8859-1 encoding, the text
the application receives from the parser seems to be always utf-8. I've
tried specifying "iso-8859-1" as the encoding to the XML_ParserCreate()
call, but this seems to have no effect (I guess the parameter actually
overrides the default (utf-8) file encoding, rather than specifying the
encoding the client would like to see).
Is my understanding correct - does expat feed UTF-8 text to clients when
Can expat be asked to feed clients iso-8859-1?
If the client must convert manually, are there any helper functions in
If I use the unicode build of expat, does it feed utf-8, unicode or utf-16?
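For reference, this is roughly the manual conversion I have in mind if the
handlers really do always receive UTF-8. It is only a sketch: the function
name utf8_to_latin1 is my own (hypothetical, not part of expat), and it
assumes the text is passed as a pointer plus a length, as in a character
data handler. Anything outside the iso-8859-1 range, or malformed, is
replaced with '?'.

```c
#include <stddef.h>

/* Hypothetical helper, not part of expat.
   Converts len bytes of UTF-8 text in s to ISO-8859-1 in out.
   Writes at most outsize-1 bytes followed by a terminating NUL.
   Characters above U+00FF and malformed sequences become '?'.
   Returns the number of bytes written, excluding the NUL. */
size_t utf8_to_latin1(const char *s, size_t len, char *out, size_t outsize)
{
    size_t i = 0, n = 0;

    if (outsize == 0)
        return 0;

    while (i < len && n + 1 < outsize) {
        unsigned char c = (unsigned char)s[i];

        if (c < 0x80) {
            /* plain ASCII is identical in both encodings */
            out[n++] = (char)c;
            i++;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len
                   && ((unsigned char)s[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            out[n++] = (char)(((c & 0x03) << 6)
                              | ((unsigned char)s[i + 1] & 0x3F));
            i += 2;
        } else {
            /* not representable in ISO-8859-1, or malformed */
            out[n++] = '?';
            i++;
            /* skip any continuation bytes of the same sequence */
            while (i < len && ((unsigned char)s[i] & 0xC0) == 0x80)
                i++;
        }
    }
    out[n] = '\0';
    return n;
}
```

The idea would be to call this on each chunk of text before handing it to
the rest of the application, with an output buffer at least len+1 bytes
long (the iso-8859-1 form is never longer than the UTF-8 form).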