xml-dev - Re: [xml-dev] Microsoft FUD on binary XML...



To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Subject: Re: [xml-dev] Microsoft FUD on binary XML...
From: Alaric B Snell <alaric@alaric-snell.com>
Date: Sat, 22 Nov 2003 23:37:41 +0000
Cc: Tony Graham <Tony.Graham@Sun.COM>, xml-dev@lists.xml.org
In-reply-to: <p06010201bbe4835029d4@[192.168.254.4]>
References: <004201c3af99$e9944010$650aa8c0@BOBDEV> <3FBDAFDA.3010905@allette.com.au> <3FBDF386.9090307@alaric-snell.com> <20031121.121152.50253888.Tony.Graham@Sun.COM> <3FBE14D8.7040405@alaric-snell.com> <p06010201bbe4835029d4@[192.168.254.4]>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030704 Debian/1.4-1

Elliotte Rusty Harold wrote:

> One should keep in mind that Chinese and similar languages are quite
> compressed to start with, far more so than English text is. For example,
> in UTF-8 the English word "tree" takes four bytes. The Japanese word for
> tree takes three bytes.

Good point, actually... I suppose that any language which routinely uses
more than 256 code points is quite likely to be one that uses roughly one
code point per word. So languages like Arabic, which are alphabet-based
but not very compact in UTF-8 because their letters have high code points
(the basic Arabic letters sit in U+0600 to U+06FF, so they take 2 bytes
each in UTF-8), would be better served by an encoding that mainly uses a
shiftable window with single-byte characters, I guess.
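
To put rough numbers on this, here is a minimal Python sketch that counts
bytes per word. The sample words and the helper names are my own choices
for illustration, and ISO-8859-6 is used only as a stand-in for "one byte
per letter inside a fixed window"; it is not the shiftable-window encoding
itself.

```python
# Compare how many bytes a word takes in UTF-8 versus a single-byte codec.
# Helper names and sample words are illustrative assumptions.

def utf8_size(text: str) -> int:
    """Number of bytes the text occupies in UTF-8."""
    return len(text.encode("utf-8"))

def single_byte_size(text: str, codec: str = "iso8859_6") -> int:
    """Number of bytes in a single-byte codec (ISO-8859-6 covers Arabic)."""
    return len(text.encode(codec))

samples = {
    "English": "tree",
    "Japanese": "\u6728",                  # the kanji for "tree"
    "Arabic": "\u0634\u062c\u0631\u0629",  # an Arabic word for "tree", 4 letters
}

for language, word in samples.items():
    print(language, "code points:", len(word), "UTF-8 bytes:", utf8_size(word))

# The same Arabic word at one byte per letter in a fixed single-byte window.
print("Arabic, ISO-8859-6 bytes:", single_byte_size(samples["Arabic"]))
```

The English word is 4 bytes, the Japanese one 3 bytes, and the Arabic one
8 bytes in UTF-8 but 4 bytes at one byte per letter, which is the gap a
windowed encoding would be trying to close.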

ABS

Follow-Ups:
- RE: [xml-dev] Microsoft FUD on binary XML...
  - From: "Alessandro Triglia" <sandro@mclink.it>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: Tim Bray <tbray@textuality.com>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: John Cowan <cowan@mercury.ccil.org>

References:
- RE: [xml-dev] Microsoft FUD on binary XML...
  - From: "Bob Wyman" <bob@wyman.us>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: Rick Jelliffe <ricko@allette.com.au>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: Alaric B Snell <alaric@alaric-snell.com>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: Tony Graham <Tony.Graham@Sun.COM>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: Alaric B Snell <alaric@alaric-snell.com>
- Re: [xml-dev] Microsoft FUD on binary XML...
  - From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
