   ASCII control characters in XML

  • From: Steve Harris <sharris@primus.com>
  • To: "'xml-dev@ic.ac.uk'" <xml-dev@ic.ac.uk>
  • Date: Tue, 28 Apr 1998 09:21:58 -0700

Is it possible to transport UTF-8-encoded text that includes some
characters in the byte range x0000-x001F (ASCII control characters)?
These codes are valid within UTF-8 (via RFC 2044), but the XML
specification says that, apart from tab, line feed, and carriage
return, these codes do not constitute 'valid characters'. My
application that wraps Clark's "expat" dies upon encountering codes in
this range, citing well-formedness violations. I'm looking for the
proper method for transporting text that occasionally includes these
codes.
I've been RTFM'ing this for a while now, and I've found plenty of
archived discussion regarding raw binary data as PCDATA content, but
this seems closer to a common text-processing problem. Any advice or
further interpretation would be greatly appreciated.
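
For illustration, here is a minimal sketch that reproduces the behavior
described above. It uses Python's standard binding to expat
(xml.parsers.expat), not the wrapper mentioned in this message. The helper
name is_well_formed and the <text> element are assumptions made for the
example. Under the XML 1.0 character rules, only tab, line feed, and
carriage return from this range should be accepted.

```python
import xml.parsers.expat


def is_well_formed(document: bytes) -> bool:
    """Return True if expat parses the document without error.

    Hypothetical helper for this example only.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Embed a single control character as character data and report
# whether expat accepts or rejects the resulting document.
for code in (0x01, 0x09, 0x0A, 0x0D, 0x1F):
    doc = b"<text>" + bytes([code]) + b"</text>"
    verdict = "accepted" if is_well_formed(doc) else "rejected"
    print(f"x{code:04X}: {verdict}")
```

A check like this could be run on outgoing text before it is wrapped in
XML, so that offending codes are found at the sender rather than at the
parser.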

Steven E. Harris
Software Engineer
1601 Fifth Avenue, Suite 1900
Seattle, Washington 98101
(206) 292-1001 x436

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

