   Re: Characters having an ASCII value > 127

  • From: Richard Tobin <richard@cogsci.ed.ac.uk>
  • To: Steffen Rodig <rodig@sdm.de>, xml-dev@ic.ac.uk
  • Date: Fri, 18 Sep 1998 13:45:14 +0100 (BST)

> I guess, to correctly interpret and display those characters I have to
> know the character set which was used to encode the original text file.
> How can I communicate this character set to an XML parser?

You can do this by putting an encoding declaration in the XML
declaration at the start of the file.  For example, if the document
is in ISO Latin 1, officially named ISO-8859-1, you can use

 <?xml version="1.0" encoding="ISO-8859-1"?>

Without an encoding declaration (or a charset given with the MIME type
if the document comes from an HTTP server), a conforming parser will
treat it as UTF-8 (or as UTF-16 if it starts with a byte order mark),
and any byte above 127 will be misinterpreted or rejected as malformed.
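
As a rough illustration of the difference, here is a minimal sketch
using Python's standard xml.etree.ElementTree module (which sits on top
of expat). The sample element name and text are made up for the
example; the point is only that the same ISO-8859-1 bytes parse with
the declaration and fail without it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content containing characters above 127.
text = "<greeting>Gr\u00fc\u00dfe</greeting>"
latin1_body = text.encode("iso-8859-1")

# With an encoding declaration the parser decodes the bytes correctly.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_body
parsed_text = ET.fromstring(declared).text

# Without one the parser assumes UTF-8; these bytes are not valid
# UTF-8, so parsing fails instead of yielding the intended characters.
try:
    ET.fromstring(latin1_body)
    undeclared_error = None
except ET.ParseError as exc:
    undeclared_error = exc
```

Other parsers may report the failure differently, but a conforming one
should not silently accept the undeclared bytes as ISO-8859-1.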

Of course, a particular parser may not support the character set you
happen to be using; only UTF-8 and UTF-16 are required of every XML
processor.
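
If a parser does not support your encoding, one possible workaround
(not something any parser requires) is to transcode the file to UTF-8
before parsing. The sketch below assumes you already know the source
encoding; the helper name to_utf8 is hypothetical, and the regular
expression only handles an XML declaration at the very start of the
document.

```python
import re
import xml.etree.ElementTree as ET

def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode an XML document as UTF-8 (hypothetical helper)."""
    text = raw.decode(source_encoding)
    # Drop the old XML declaration, since its encoding name would now
    # be wrong; a document without one is read as UTF-8 by default.
    text = re.sub(r'^<\?xml\s[^>]*\?>', '', text, count=1)
    return text.encode("utf-8")

# Made-up sample document in ISO-8859-1.
raw = ('<?xml version="1.0" encoding="ISO-8859-1"?>'
       '<greeting>Gr\u00fc\u00dfe</greeting>').encode("iso-8859-1")

utf8_doc = to_utf8(raw, "iso-8859-1")
root = ET.fromstring(utf8_doc)
```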

-- Richard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

