RE: [xml-dev] Storing illegal XML 1.0 characters in the Unicode Private Use Area

From: "Costello, Roger L." <costello@mitre.org>
To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date: Fri, 2 Nov 2012 15:27:58 +0000

Liam, hello,

> Another is to use an escaping mechanism - 
> what I call "UCODE", in which an upper-case 
> "U" is followed by hexadecimal and a trailing 
> X to mark the end

To be sure I understand, this text (where 2 stands for the control character with hex code 2, and 3 for the control character with hex code 3):

    2Hello World3

is converted to this XML:

   <text>U2XHello WorldU3X</text>
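
Here is a minimal Python sketch of that conversion as I understand it, not a documented implementation of UCODE. The function names are my own. One assumption I have added: a literal upper-case "U" in the input is itself escaped (as U55X), so that decoding is unambiguous; the description above does not say how a literal "U" is handled. The check for legal characters follows the Char production of XML 1.0.

```python
import re


def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def ucode_encode(text: str) -> str:
    """Replace characters illegal in XML 1.0 with U<hex>X.

    Assumption (not from the description): a literal "U" is escaped
    too, so every "U" in the output starts an escape sequence.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "U" or not is_xml10_char(cp):
            out.append(f"U{cp:X}X")
        else:
            out.append(ch)
    return "".join(out)


# Upper-case "U", one or more upper-case hex digits, then "X".
_UCODE = re.compile(r"U([0-9A-F]+)X")


def ucode_decode(text: str) -> str:
    """Reverse ucode_encode by turning each U<hex>X back into a character."""
    return _UCODE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    raw = "\x02Hello World\x03"
    encoded = ucode_encode(raw)
    print(f"<text>{encoded}</text>")
    assert ucode_decode(encoded) == raw
```

This sketch covers element content only. Using the result in an element name would need further characters escaped (a space, for instance), which is not handled here.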

Is that the approach, Liam?

Have you documented this approach anywhere?

/Roger

-----Original Message-----
From: Liam R E Quin [mailto:liam@w3.org] 
Sent: Friday, November 02, 2012 11:19 AM
To: Costello, Roger L.
Cc: xml-dev@lists.xml.org
Subject: Re: [xml-dev] Storing illegal XML 1.0 characters in the Unicode Private Use Area

On Wed, 2012-10-31 at 18:04 +0000, Costello, Roger L. wrote:
[...]
> One approach is to move any illegal characters into the Private Use Area: 
Another is to use an escaping mechanism - e.g. what I call "UCODE", in
which an upper-case "U" is followed by hexadecimal and a trailing X to
mark the end; this can safely be used in XML element names for example.

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
Co-author, 5th edition of "Beginning XML" - Wrox, 2012
