OASIS Mailing List Archives

Your XML documents may use different sets of characters, depending on which implementer you select?

Hi Folks,

The XML specification lists the set of characters that may be used in XML documents.

The characters are Unicode characters.

Unicode has something called categories. A category is a set of characters.

Here is a category: Nd

The Nd category consists of decimal digit characters.

Unicode is an evolving standard. Thus, there are different versions. 

The set of decimal digit characters in the Nd category may vary, depending on the version of Unicode.
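
One rough way to see this variation, sketched in Python (my choice of tool, not something the specs mention): the standard unicodedata module exposes the Unicode database bundled with the interpreter, plus a frozen Unicode 3.2.0 database as unicodedata.ucd_3_2_0. Those two versions merely stand in for "an older" and "a newer" Unicode here, and the helper name nd_characters is made up for illustration.

```python
import sys
import unicodedata


def nd_characters(db):
    """Return the set of code points whose general category is Nd in `db`."""
    return {
        cp
        for cp in range(sys.maxunicode + 1)
        if db.category(chr(cp)) == "Nd"
    }


# Frozen Unicode 3.2.0 database shipped with Python.
old = nd_characters(unicodedata.ucd_3_2_0)
# Whatever Unicode version this interpreter was built with.
new = nd_characters(unicodedata)

print("Unicode 3.2.0 Nd count:", len(old))
print(f"Unicode {unicodedata.unidata_version} Nd count:", len(new))
print("In Nd now but not in 3.2.0:", len(new - old))
```

The exact counts depend on the Python build, so none are quoted here; the point is only that the two sets differ.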

The XML specification says that XML documents can use the characters in the Nd category.

But, but, but, ...

The characters in the Nd category may vary, depending on the version of Unicode. Do we have an ever-changing base of characters that are permitted in XML documents?

The XML specification mandates version 2.0 of Unicode.

Phew! That removes the variability in the set of characters that may be used in XML documents.

But wait! 

There are technologies that build on top of XML, and some of them are lax about which version of Unicode must be used. For example, with XML Schema:

    As far as conformant processors are concerned, the spec offers
    implementers freedom to choose which version of Unicode they will
    support. So if the definitions of character groups like Nd change from
    one Unicode version to the next, this may be reflected in differences
    between schema processors. [1]

So, one XML Schema validator may support Unicode 2.0 and another XML Schema validator may support Unicode 2.1. Suppose that in Unicode 2.0 there are 600 characters in the Nd category and in Unicode 2.1 there are 610 characters in the Nd category. An XML instance document may validate against one validator and fail against another. 
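
To make the disagreement concrete, here is a minimal sketch in Python. It is not a real schema validator: the function is_decimal_digits is a hypothetical stand-in for checking a value against a pattern such as \p{Nd}+, and the two Unicode databases are the ones Python happens to ship (the frozen 3.2.0 one and the interpreter's current one), assumed here to play the role of two validators built on different Unicode versions. The test value uses NKo digits, which entered Unicode in version 5.0.

```python
import unicodedata


def is_decimal_digits(text, db):
    # Hypothetical stand-in for a validator checking the pattern \p{Nd}+
    # against the Unicode database `db`.
    return len(text) > 0 and all(db.category(ch) == "Nd" for ch in text)


# NKo digits one and two (U+07C1, U+07C2), added in Unicode 5.0.
value = "\u07c1\u07c2"

# "Validator A": built on Unicode 3.2.0, where these code points are unassigned.
print("3.2.0 database accepts value:",
      is_decimal_digits(value, unicodedata.ucd_3_2_0))

# "Validator B": built on the interpreter's current Unicode version.
print(f"{unicodedata.unidata_version} database accepts value:",
      is_decimal_digits(value, unicodedata))
```

The same string is rejected under the older database and accepted under the newer one, which is the kind of split described above.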

Ouch!

Are other XML applications similarly lax, permitting implementers to pick which version of Unicode they will support?

Does the XSLT spec allow implementers freedom to choose which version of Unicode they will support?

Does the UBL spec allow implementers freedom to choose which version of Unicode they will support?

Does the RELAX NG spec allow implementers freedom to choose which version of Unicode they will support?

Does the XBRL spec allow implementers freedom to choose which version of Unicode they will support?

Does the SVG spec allow implementers freedom to choose which version of Unicode they will support?

/Roger

[1] http://lists.w3.org/Archives/Public/xmlschema-dev/2011May/0024.html 

Copyright 1993-2007 XML.org. This site is hosted by OASIS