Re: [xml-dev] Your XML documents may use different sets of characters, depending on which implementer you select?
- From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
- To: XML-Dev Mailing list <xml-dev@lists.xml.org>
- Date: Tue, 17 May 2011 09:29:53 -0400
At 2011-05-17 09:10 -0400, Costello, Roger L. wrote:
>The XML specification mandates version 2.0 of Unicode.
Ummmmmmm .... that isn't my read.
http://www.w3.org/TR/2008/REC-xml-20081126/
Unicode
The Unicode Consortium. The Unicode Standard, Version 5.0.0, defined
by: The Unicode Standard, Version 5.0 (Boston, MA, Addison-Wesley,
2007. ISBN 0-321-48091-0).
Furthermore, it is now open-ended so as to allow future versions of
Unicode to implicitly be valid in XML documents:
Almost all characters are permitted in names, except those which
either are or reasonably could be used as delimiters. The
intention is to be inclusive rather than exclusive, so that writing
systems not yet encoded in Unicode can be used in XML names.
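To make the "inclusive rather than exclusive" point concrete, here is a minimal Python sketch of a name check built from the NameStartChar and NameChar ranges of XML 1.0 (Fifth Edition). The function and constant names are my own, not from any library. Note that the ranges are plain code point intervals, so a code point with no character assigned in today's Unicode still passes.

```python
# Code point ranges from the NameStartChar production of XML 1.0 (Fifth Edition).
NAME_START_RANGES = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional ranges that NameChar allows after the first character.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2D), (0x2E, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls inside any (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


def is_xml_name(candidate):
    """Check a string against the Name production (hypothetical helper)."""
    if not candidate:
        return False
    if not _in_ranges(ord(candidate[0]), NAME_START_RANGES):
        return False
    return all(
        _in_ranges(ord(ch), NAME_START_RANGES)
        or _in_ranges(ord(ch), NAME_EXTRA_RANGES)
        for ch in candidate[1:]
    )
```

For example, is_xml_name("\u00e9l\u00e9ment") is accepted while is_xml_name("1abc") is rejected, because a digit may not start a name.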
>Are other XML applications similarly lax, permitting implementers to
>pick which version of Unicode they will support?
>
>Does the XSLT spec allow implementers freedom to choose which
>version of Unicode they will support?
>
>Does the UBL spec allow implementers freedom to choose which version
>of Unicode they will support?
>
>Does the RELAX NG spec allow implementers freedom to choose which
>version of Unicode they will support?
>
>Does the XBRL spec allow implementers freedom to choose which
>version of Unicode they will support?
>
>Does the SVG spec allow implementers freedom to choose which version
>of Unicode they will support?
My guess is the above is all moot. As I understand it, XML allows
any Unicode character (now or defined in the future) to be
used. Those specifications would simply cite XML and not impose
Now, if the purpose of your question is to prevent a processing
system from having to process unknown or unsupported Unicode
characters, check out the ISO/IEC 19757-7 CREPDL - Character
Repertoire Description Language:
ISO/IEC 19757-7:2009 specifies a Character Repertoire Description
Language (CREPDL); a CREPDL schema describes a character repertoire.
ISO/IEC 19757-7:2009 introduces kernels and hulls of repertoires,
then specifies the syntax of CREPDL schemas and the semantics of a
correct CREPDL schema; the semantics specify when a character is in
a repertoire described by a CREPDL schema. ISO/IEC 19757-7:2009
defines CREPDL processors and their behaviour. Finally, it describes
differences of conformant CREPDL processors, and provides examples
of CREPDL schemas.
Using a character repertoire schema you can validate an instance for
having only the Unicode characters you want it to have.
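As a rough illustration of the idea only (this is not CREPDL syntax and not a CREPDL processor), the following Python sketch checks text against a repertoire expressed as code point ranges. The repertoire shown and the names ALLOWED_RANGES, in_repertoire and find_violations are assumptions made up for the example.

```python
import unicodedata

# Hypothetical repertoire: tab, line feed, carriage return,
# printable ASCII, and the Latin-1 Supplement letters and symbols.
ALLOWED_RANGES = [
    (0x09, 0x0A),
    (0x0D, 0x0D),
    (0x20, 0x7E),
    (0xA0, 0xFF),
]


def in_repertoire(ch, ranges=ALLOWED_RANGES):
    """Return True if the character's code point lies in an allowed range."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in ranges)


def find_violations(text, ranges=ALLOWED_RANGES):
    """List (offset, character, Unicode name) for each character outside the repertoire."""
    return [
        (offset, ch, unicodedata.name(ch, "UNNAMED"))
        for offset, ch in enumerate(text)
        if not in_repertoire(ch, ranges)
    ]
```

You would run find_violations over the text content of the instance after parsing. An empty list means every character is within the repertoire you chose.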
I hope this helps.
. . . . . . . . . . . . . Ken
--
Contact us for world-wide XML consulting & instructor-led training
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Legal business disclaimers: http://www.CraneSoftwrights.com/legal