   Re: Characters having an ASCII value > 127

  • From: Toby Speight <tms@ansa.co.uk>
  • To: "XML developers' list" <xml-dev@ic.ac.uk>
  • Date: 18 Sep 1998 14:06:28 +0100

Steffen> Steffen Rodig <URL:mailto:rodig@sdm.de>

0> In article <199809181228.OAA16525@sunfi1.fi.sdm.de>, Steffen wrote:

Steffen> imagine a plain text file which I want to markup using
Steffen> XML. Now it could be that there are characters in this file
Steffen> whose ASCII value is greater than 127 (in PCDATA sections).

No character has an ASCII value greater than 127: ASCII is a 7-bit
encoding.  Of course, it's possible to use characters beyond ASCII,
since the Document Character Set for XML is Unicode.


Steffen> If I try to use expat on the generated XML file, it tells
Steffen> me that it is not wellformed at the position where such a
Steffen> character occurs.

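Perhaps your XML declaration doesn't agree with the actual encoding
of the document (you don't say what either of these is for your
document).  Note that a document with no encoding declaration is
assumed to be UTF-8 or UTF-16.  See Sections 2.8 and 4.3.3, and
Appendix F, of the XML 1.0 Recommendation.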
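
A minimal sketch of that kind of mismatch, using Python's binding to
expat (xml.parsers.expat).  The helper name is_well_formed and the
sample document are made up for illustration; the original poster's
file may well be failing for a different reason.

```python
import xml.parsers.expat


def is_well_formed(data: bytes) -> bool:
    """Return True if expat parses the whole byte string without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


# The same ISO-8859-1 bytes, with and without a matching declaration.
body = "<doc>caf\u00e9</doc>".encode("iso-8859-1")  # e-acute is byte 0xE9

undeclared = b'<?xml version="1.0"?>' + body
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

# With no encoding declaration the parser assumes UTF-8, and a lone
# 0xE9 byte is not a valid UTF-8 sequence.
print(is_well_formed(undeclared))
print(is_well_formed(declared))
```

The first document is rejected at the position of the 0xE9 byte,
which matches the symptom described; the second is accepted.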


Steffen> I guess, to correctly interpret and display those characters
Steffen> I have to know the character set which was used to encode the
Steffen> original text file.

Of course - the parser is unlikely to be able to tell the difference
between the various parts of ISO 8859, for instance.
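
To illustrate why the parser cannot guess, here is a small sketch
that decodes one and the same byte under three parts of ISO 8859.
The byte value 0xA3 is just an arbitrary example.

```python
# One byte, three different characters depending on the assumed encoding.
raw = b"\xa3"

decoded = {
    name: raw.decode(name)
    for name in ("iso-8859-1", "iso-8859-2", "iso-8859-5")
}

for name, char in decoded.items():
    # Show the resulting Unicode code point for each interpretation.
    print(f"{name}: U+{ord(char):04X} {char}")
```

Each decoding succeeds, so nothing in the byte stream itself tells
the parser which interpretation was intended.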


Steffen> How can I communicate this character set to an XML parser?

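In the encoding declaration, which is part of the XML declaration at
the very start of the document:
<?xml version="1.0" encoding="utf-8"?> (or whatever encoding you
actually use).  The version must be given, and must come first.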

You may prefer to write the problematic characters as entities or
character references, if they are rare in your source.  This may
allow you to write your documents in a smaller character set.  (As an
example, I find it easiest to author in ISO-8859-1, but I need to
define entities for the Welsh characters w-circumflex and
y-circumflex, which lie outside Latin-1, in Unicode's Latin
Extended-A block.)
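
A rough sketch of the character-reference approach in Python.  The
function names to_latin1_xml and parse_text and the <doc> wrapper are
invented for this example; the "xmlcharrefreplace" error handler is a
standard Python codec feature, not anything specific to expat.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape


def to_latin1_xml(text: str) -> bytes:
    """Wrap text in a <doc> element encoded as ISO-8859-1.

    Characters outside Latin-1 become numeric character references.
    """
    body = escape(text).encode("iso-8859-1", errors="xmlcharrefreplace")
    return b'<?xml version="1.0" encoding="ISO-8859-1"?><doc>' + body + b"</doc>"


def parse_text(data: bytes) -> str:
    """Parse with expat and return the concatenated character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# U+0175 (w-circumflex) is not in Latin-1; U+00E9 (e-acute) is.
encoded = to_latin1_xml("d\u0175r a caf\u00e9")
print(encoded)
print(parse_text(encoded))
```

The w-circumflex is written out as &#373; while the e-acute stays a
single Latin-1 byte, and the parser hands back the original text in
both cases.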

-- 


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


Copyright 2001 XML.org. This site is hosted by OASIS