RE: [xml-dev] Your XML documents may use different sets of characters, depending on which implementer you select?
- From: Liam R E Quin <liam@w3.org>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Tue, 17 May 2011 17:07:25 +0200
On Tue, 2011-05-17 at 10:49 -0400, Costello, Roger L. wrote:
[...]
> The following statements apply to "data" not to "markup" (i.e.,
> element names, attribute names).
>
> 1. Except for unpaired surrogate codepoints and a few control
> characters, you can use any character you want in XML documents.
In particular, codepoint 0 is not allowed.
> 2. The characters don't have to be defined in the Unicode
> specification.
The codepoints do not have to have Unicode characters associated with
them.
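
As a rough illustration of points 1 and 2, here is a minimal Python sketch of the Char production from XML 1.0. The function name is_xml10_char is made up for this example; the ranges are the ones the XML 1.0 Recommendation lists. The test is purely about code point ranges, so an unassigned code point passes as long as it falls inside an allowed range.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # everything up to the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # skips surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


# Code point 0 and a lone surrogate are rejected; U+0378, which has no
# character assigned at the time of writing, is accepted.
for cp in (0x0, 0xD800, 0x0378):
    print(f"U+{cp:04X}", is_xml10_char(cp))
```

Nothing in the check consults the Unicode character database, which is why the set of allowed code points does not change from one Unicode version to the next.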
>
> 3. For characters that don't have a visual representation or aren't in
> the Unicode character set, you can use them via XML's character
> entity mechanism, e.g., ■
You can do that with any allowed character, and you can also include the
character directly.
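
A small sketch of that equivalence, using Python's standard xml.etree.ElementTree parser. The code point U+E000 (private use) is an arbitrary choice for this example and is not taken from the original message. The second part shows that a character reference does not get around the restriction on code point 0.

```python
import xml.etree.ElementTree as ET

# U+E000 is a private-use code point, chosen here only as an example.
by_reference = ET.fromstring("<t>&#xE000;</t>").text
written_directly = ET.fromstring("<t>\uE000</t>").text

# Both spellings give the application the same one-character string.
same_content = by_reference == written_directly == "\uE000"

# A reference to a disallowed code point such as 0 is a well-formedness
# error, so the parser refuses the document.
try:
    ET.fromstring("<t>&#0;</t>")
    nul_rejected = False
except ET.ParseError:
    nul_rejected = True

print(same_content, nul_rejected)
```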
>
> 4. Implementers of XML applications are free to choose which version
> of Unicode they will support. Thus, one implementer of an XML Schema
> validator may choose to support Unicode 2.0, while another implementer
> of an XML Schema validator may choose to support Unicode 2.1. One
> implementer of an XSLT processor may choose to support Unicode 2.0,
> while another implementer of an XSLT processor may choose to support
> Unicode 2.1.
Or the version of Unicode understood may depend on the operating
environment, e.g. on the Java VM in use.
>
> 5. In XML applications that use regular expressions (e.g. XML Schema,
> XSLT), be careful about using regexes that contain regex categories
> such as Nd. The characters in those regex categories may vary
> depending on which version of Unicode an implementer supports. Thus,
> your application may execute without errors with one vendor's tool and
> fail on another.
That may be what you want, it turns out. "When our system is upgraded
our schema is ready for it"...
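
To make points 4 and 5 concrete, here is a hedged Python sketch. Python is only a stand-in for whatever XML tool is in use: the unicodedata module reports the Unicode version that the running interpreter was built with, and \d in Python's re module follows the Nd category for str patterns. The helper name is_decimal_digit is made up for this example.

```python
import re
import unicodedata

# The Unicode version depends on the runtime, not on the document.
print("Unicode data version:", unicodedata.unidata_version)


def is_decimal_digit(ch: str) -> bool:
    """True if the running environment classifies ch as category Nd."""
    return unicodedata.category(ch) == "Nd"


arabic_indic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, category Nd

# A category-based test accepts digits from any script the runtime knows...
category_match = re.fullmatch(r"\d+", arabic_indic_three) is not None

# ...while an explicit range gives the same answer under every Unicode version.
explicit_match = re.fullmatch(r"[0-9]+", arabic_indic_three) is not None

print(is_decimal_digit(arabic_indic_three), category_match, explicit_match)
```

If the intent is "whatever counts as a digit on the installed system", the category form is the right one; if the intent is a fixed set, the explicit range avoids surprises after an upgrade.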
> 6. CREPDL is a technology that allows you to precisely define the
> universe of characters that you want to allow in your XML documents.
You can also do this with an XSD facet.
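
A rough sketch of the idea behind such a facet, emulated in Python rather than written as XSD. The allowed set (ASCII letters, ASCII digits and space) and the names ALLOWED and only_allowed_characters are assumptions made for this example. An XSD pattern must match the whole value, which is why the sketch uses fullmatch.

```python
import re

# Hypothetical allowed set: ASCII letters, ASCII digits and the space character.
# Spelling the set out explicitly keeps it independent of the Unicode version.
ALLOWED = re.compile(r"[A-Za-z0-9 ]*")


def only_allowed_characters(value: str) -> bool:
    """True if every character of value is in the allowed set."""
    return ALLOWED.fullmatch(value) is not None


print(only_allowed_characters("Order 66"))
print(only_allowed_characters("Order \u0663"))
```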
Liam
--
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://www.fromoldbooks.org/
Occasional blog: http://www.barefootliam.org/
The barefoot typographer