   RE: control characters

  • From: Eldar Musayev <eldarm@microsoft.com>
  • To: xml-dev@xml.org
  • Date: Wed, 21 Jun 2000 10:14:34 -0700

 > Mapping 0x0000 - 0x001F to the private use area sounds like the "correct"
 > unicode thing to do, But for US-ASCII/UTF-8 documents I would map to
 > 0x0080 - 0x009F instead.
 > This way you preserve the deprecated anglo centric english-only bigoted
 > assumption of 1 character == 1 byte.
 > The only downside is that someone might actually have data in this range. I
 > think this is about as likely as someone having data in the private use
 > area.
 > -Wayne Steele
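
For readers who want to see the quoted proposal concretely, here is a minimal Python sketch of shifting 0x0000 - 0x001F up into 0x0080 - 0x009F and back. The helper names are hypothetical and not from the thread, and the sketch assumes every code point in 0x0000 - 0x001F is shifted, including tab, LF and CR.

```python
# Hypothetical helpers illustrating the quoted mapping; not from the thread.
C0_FIRST, C0_LAST = 0x00, 0x1F   # the range the quote proposes to remap
C1_FIRST, C1_LAST = 0x80, 0x9F   # the proposed target range
OFFSET = C1_FIRST - C0_FIRST     # 0x80


def escape_controls(text: str) -> str:
    """Shift each code point in 0x00-0x1F up by 0x80."""
    return "".join(
        chr(ord(ch) + OFFSET) if C0_FIRST <= ord(ch) <= C0_LAST else ch
        for ch in text
    )


def unescape_controls(text: str) -> str:
    """Shift each code point in 0x80-0x9F back down by 0x80."""
    return "".join(
        chr(ord(ch) - OFFSET) if C1_FIRST <= ord(ch) <= C1_LAST else ch
        for ch in text
    )


if __name__ == "__main__":
    original = "a\x01b"
    escaped = escape_controls(original)
    print([hex(ord(ch)) for ch in escaped])
    print(unescape_controls(escaped) == original)
```

The round trip only works when the input has no data of its own in 0x0080 - 0x009F; such data would be shifted down on the way back, which is the downside the quote mentions.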

In case you are interested: many charsets/encodings use this range as well.
The reason is historical: using it made it possible to keep the Latin
alphabet in its ASCII place while having national alphabets at the same
time. That's not directly related to the question, but it makes the chance
"that someone might actually have data in this range" much higher.
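
As a rough illustration of the point above, the following Python sketch decodes the bytes 0x80 - 0x9F under a few single-byte encodings and counts how many come out as printable characters. The choice of encodings (cp866, cp437, latin-1) is mine, made only for the example; the function name is hypothetical.

```python
# Bytes 0x80 through 0x9F, the range under discussion.
HIGH_RANGE = bytes(range(0x80, 0xA0))


def decode_high_range(encoding: str) -> str:
    """Decode bytes 0x80-0x9F with the given codec.

    Undefined byte values, if any, become U+FFFD instead of raising.
    """
    return HIGH_RANGE.decode(encoding, errors="replace")


if __name__ == "__main__":
    # Example encodings only; many other legacy charsets could be listed.
    for enc in ("cp866", "cp437", "latin-1"):
        decoded = decode_high_range(enc)
        printable = sum(ch.isprintable() for ch in decoded)
        print(f"{enc}: {printable} of {len(decoded)} bytes are printable text")
```

In cp866 this range holds the uppercase Cyrillic letters, and in cp437 it holds accented Latin letters and currency signs, so documents in such encodings routinely carry real data there. In latin-1 the same bytes are control characters.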

Eldar Musayev

This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/



Copyright 2001 XML.org. This site is hosted by OASIS