XML.org
OASIS Mailing List Archives

Doesn't the list of allowable characters shown in the XML specification assume a Unicode character encoding scheme? What if the XML isn't using Unicode?

Hi Folks,

The XML specification says that these are the codepoints for the characters that are allowed in XML documents:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
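
To make the production concrete, here is a rough sketch of it as a predicate over integer code points. The function name `is_xml_char` is made up for illustration; only the ranges come from the production itself.

```python
def is_xml_char(code_point):
    """Return True if code_point falls in one of the ranges of the Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD      # skips the surrogate block
        or 0x10000 <= code_point <= 0x10FFFF
    )


if __name__ == "__main__":
    for cp in (0x9, 0x0, 0xD800, 0x10FFFF):
        print(f"#x{cp:X}: {is_xml_char(cp)}")
```

Note that the predicate takes a bare number; it says nothing about which character set that number is interpreted in, which is exactly the question raised below.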

But, but, but, ....

Doesn't that list of codepoints assume the XML documents are encoded using a Unicode character encoding scheme? 

If the XML documents aren't encoded using a Unicode character encoding scheme, what are the allowable characters?

For example, in Unicode the codepoint #x9 corresponds to the "horizontal tab" character, but in EBCDIC hex 9 corresponds to the "begin superscript" character. Is the XML specification saying that an XML document using EBCDIC can use the invisible "begin superscript" character but not the "horizontal tab" character? Or is it saying that, when using a character encoding scheme other than Unicode, I am expected to convert the above list of Unicode codepoints to the corresponding characters in the non-Unicode character encoding scheme? For example, in EBCDIC the "horizontal tab" character is hex 5.
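
The second reading can be sketched as follows: decode the document's bytes to Unicode first, then test the resulting code points against the production. This is only an illustration under stated assumptions. It picks Python's `cp037` codec as one representative EBCDIC code page (EBCDIC has many variants, and they do not all agree), and the helper names `is_xml_char` and `xml_chars_after_decoding` are made up for the example.

```python
def is_xml_char(code_point):
    """Return True if code_point falls in one of the ranges of the Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def xml_chars_after_decoding(raw, encoding):
    """Decode raw bytes with the given encoding, then test each code point."""
    text = raw.decode(encoding)
    return [(ord(ch), is_xml_char(ord(ch))) for ch in text]


if __name__ == "__main__":
    ebcdic = "cp037"  # assumption: one common EBCDIC code page
    for raw_byte in (0x05, 0x09):
        for code_point, allowed in xml_chars_after_decoding(bytes([raw_byte]), ebcdic):
            print(f"EBCDIC byte 0x{raw_byte:02X} -> U+{code_point:04X}, "
                  f"matches Char: {allowed}")
```

Under this reading the byte value in the file never meets the production directly: EBCDIC byte hex 5 decodes to U+0009, and it is that code point which is compared against `#x9`.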

/Roger

Copyright 1993-2007 XML.org. This site is hosted by OASIS