xml-dev - RE: control characters

From: Eldar Musayev <eldarm@microsoft.com>
To: xml-dev@xml.org
Date: Wed, 21 Jun 2000 10:14:34 -0700

 > Mapping 0x0000 - 0x001F to the private use area sounds like the "correct"
 > unicode thing to do, But for US-ASCII/UTF-8 documents I would map to
 > 0x0080 - 0x009F instead.
 > This way you preserve the deprecated anglo centric english-only bigoted
 > assumption of 1 character == 1 byte.
 >
 > The only downside is that someone might actually have data in this range. I
 > think this is about as likely as someone having data in the private use
 > area.
 > -Wayne Steele
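
For reference, the mapping proposed above could be sketched roughly as follows. This is only an illustration: the function names and the fixed offset of 0x80 are assumptions made here, not part of any library or of the original proposal's wording, and the sketch shifts the whole 0x00 - 0x1F range, including tab, line feed and carriage return.

```python
# Hypothetical helpers; the names are illustrative only.
OFFSET = 0x80  # assumed shift from 0x00-0x1F up to 0x80-0x9F


def escape_c0(text: str) -> str:
    """Shift every character in U+0000..U+001F up to U+0080..U+009F."""
    return "".join(chr(ord(c) + OFFSET) if ord(c) < 0x20 else c for c in text)


def unescape_c0(text: str) -> str:
    """Shift U+0080..U+009F back down to U+0000..U+001F.

    Any genuine data already in U+0080..U+009F is rewritten too,
    which is exactly the downside mentioned in the quoted message.
    """
    return "".join(
        chr(ord(c) - OFFSET) if 0x80 <= ord(c) <= 0x9F else c for c in text
    )


sample = "a\x01b"
escaped = escape_c0(sample)
restored = unescape_c0(escaped)
latin1_len = len(escaped.encode("latin-1"))
utf8_len = len(escaped.encode("utf-8"))
```

Comparing `latin1_len` and `utf8_len` shows that the one-byte-per-character property holds when the result is written as ISO-8859-1; in UTF-8 each code point in 0x0080 - 0x009F takes two bytes.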

In case you are interested: many charsets and encodings use this range
(0x80 - 0x9F) as well. The reason is historical: using it allowed them to keep
the Latin alphabet in its ASCII positions while still encoding a national
alphabet. This is not directly related to the question, but it makes the
chance "that someone might actually have data in this range" much higher.
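
As a rough check of the point above, the following sketch decodes the bytes 0x80 - 0x9F under a few single-byte encodings and counts how many come out as ordinary (non-control) characters. The helper name `printable_count` and the choice of encodings (ISO-8859-1, CP866, KOI8-R) are assumptions made for illustration; only standard Python codecs are used.

```python
import unicodedata

# The 32 byte values under discussion.
c1_bytes = bytes(range(0x80, 0xA0))


def printable_count(encoding: str) -> int:
    """Count bytes in 0x80..0x9F that decode to a non-control character."""
    decoded = c1_bytes.decode(encoding, errors="replace")
    return sum(
        1
        for ch in decoded
        if ch != "\ufffd" and unicodedata.category(ch) != "Cc"
    )


for enc in ("latin-1", "cp866", "koi8_r"):
    print(enc, printable_count(enc))
```

In ISO-8859-1 these bytes are control characters, while in CP866 the same bytes hold Cyrillic capital letters, so text in such an encoding routinely contains data in this range.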

Eldar Musayev
