OASIS Mailing List Archives


Re: [xml-dev] US-ASCII characters versus XML characters ... why such a huge discrepancy?

You have to remember that ASCII is very old, going right back to the 
beginning of computing, when there were no fancy ISO layers.  Because it was 
sent directly over things like RS-232, ASCII had to convey not only the 
textual data but also all the framing, such as beginning of packet, end of 
packet and end of message, that today is handled by the layer 4(?) and below 
protocols.  XML is very much layer 5+ (not actually sure!), so it treats the 
ASCII characters in columns 0 and 1(?) of the code table, which are intended 
to implement that lower-layer functionality, as not relevant to it.

Put another way, in principle you should be able to take an ASCII-encoded 
XML text message and send it over an RS-232 link that uses Start of Text 
and End of Text for its own framing.  If you did that, you wouldn't want 
those characters in your XML data.
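
As a rough illustration of that framing point, here is a minimal Python 
sketch.  It assumes a hypothetical link that wraps each message in STX 
(0x02) and ETX (0x03); the frame() and unframe() helpers are invented for 
this example and do not model any real RS-232 driver.

```python
# Hypothetical framing layer: every message travels as STX <payload> ETX.
STX = 0x02  # ASCII Start of Text
ETX = 0x03  # ASCII End of Text


def frame(payload: bytes) -> bytes:
    """Wrap a payload in STX ... ETX, as the assumed link would."""
    return bytes([STX]) + payload + bytes([ETX])


def unframe(stream: bytes) -> bytes:
    """Return the bytes between the first STX and the first ETX after it."""
    start = stream.index(STX) + 1
    end = stream.index(ETX, start)
    return stream[start:end]


# A payload free of control characters survives the round trip.
xml_message = b"<greeting>hello</greeting>"
assert unframe(frame(xml_message)) == xml_message

# A payload that itself contains ETX is cut short by the receiver,
# because the link cannot tell data from framing.
bad_payload = b"<greeting>hel\x03lo</greeting>"
truncated = unframe(frame(bad_payload))
print(truncated)
```

With a payload that contains ETX, the receiver stops at the first 0x03 and 
the rest of the message is lost, which is the collision described above.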

Pete Cordell
Codalogic Ltd
Twitter: http://twitter.com/petecordell
Interface XML to C++ the easy way using C++ XML
data binding to convert XSD schemas to C++ classes.
Visit http://codalogic.com/lmx/ or http://www.xml2cpp.com
for more info

----- Original Message ----- 
From: "Costello, Roger L." <costello@mitre.org>
To: <xml-dev@lists.xml.org>
Sent: Monday, October 01, 2012 2:59 PM
Subject: [xml-dev] US-ASCII characters versus XML characters ... why such a 
huge discrepancy?

Hi Folks,

The table below lists the US-ASCII characters (by decimal value) in the 
left column; the right column indicates whether each character is allowed 
in XML documents.


1. Why does XML not support many of the US-ASCII characters? Of the 127 
US-ASCII characters, 28 characters are not allowed in XML documents; that 
is, 22% of the US-ASCII characters are not supported by XML.

2. I am creating an XML Schema for an RFC that allows all 127 US-ASCII 
characters. What should I do for the 28 US-ASCII characters that are 
supported by the RFC but not supported by XML?

Decimal value of
US-ASCII character | Is an XML character?
    1              |  No
    2              |  No
    3              |  No
    4              |  No
    5              |  No
    6              |  No
    7              |  No
    8              |  No
    9              |  Yes
   10              |  Yes
   11              |  No
   12              |  No
   13              |  Yes
   14              |  No
   15              |  No
   16              |  No
   17              |  No
   18              |  No
   19              |  No
   20              |  No
   21              |  No
   22              |  No
   23              |  No
   24              |  No
   25              |  No
   26              |  No
   27              |  No
   28              |  No
   29              |  No
   30              |  No
   31              |  No
   32-127          |  Yes
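
The table above can be reproduced with a short check against the Char 
production of XML 1.0.  This is only a sketch: the function name 
is_xml10_char is made up for the example, and it covers XML 1.0 only (XML 
1.1 has different rules for control characters).

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Same range as the table: decimal 1 through 127 (NUL is not listed there).
disallowed = [cp for cp in range(1, 128) if not is_xml10_char(cp)]

for cp in range(1, 32):
    print(f"{cp:5d} | {'Yes' if is_xml10_char(cp) else 'No'}")
print(f"32-127 | {'Yes' if all(is_xml10_char(c) for c in range(32, 128)) else 'No'}")
print(len(disallowed), "disallowed")
```

Over the range 1 to 127 this yields the 28 disallowed values from the 
table, with only 9, 10 and 13 permitted below 32.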


XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.



Copyright 1993-2007 XML.org. This site is hosted by OASIS