- From: Tony Graham <tgraham@mulberrytech.com>
- To: <xml-dev@ic.ac.uk>
- Date: Sat, 13 Nov 1999 14:11:25 -0400 (EST)
At 13 Nov 1999 15:46 -0000, Richard Anderson wrote:
> But UTF-8 can support "foreign" characters so I dont see the argument for
> having UTF-16 too. Also, generally speaking UTF-8 encoding results in
> smaller output for most cases.
Different people have different ideas of what constitutes "foreign".
For the majority of the characters in the Unicode Standard, UTF-8 uses
three bytes per character. However, for the US-ASCII characters, it
uses only one byte per character.
For all characters in the Basic Multilingual Plane, UTF-16 uses two
bytes per character; characters outside it are encoded as surrogate
pairs, which take four bytes.
Whether a given file is smaller as UTF-8 or as UTF-16 is largely a
function of the proportion of unaccented Latin characters in the file.
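To make the trade-off concrete, here is a minimal Python sketch that
counts the encoded size of a few sample strings. The helper name
encoded_sizes and the sample strings are made up for illustration; the
"utf-16-le" codec is used so that no byte-order mark is counted.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte-order mark."""
    utf8 = len(text.encode("utf-8"))
    # Explicit endianness, so the codec does not prepend a BOM.
    utf16 = len(text.encode("utf-16-le"))
    return utf8, utf16


# Illustrative samples, written with escapes so the code points are unambiguous.
samples = {
    "ascii": "encoding",                               # US-ASCII only
    "accented": "caf\u00e9",                           # one accented Latin letter
    "greek": "\u03ba\u03cc\u03c3\u03bc\u03bf\u03c2",   # six Greek letters
    "japanese": "\u65e5\u672c\u8a9e",                  # three CJK ideographs
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: {len(text)} chars, UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

In these samples the ASCII string is half the size in UTF-8, the Greek
string comes out the same in both, and the Japanese string is smaller
in UTF-16.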
Moreover, most legacy encodings for a single script use one byte per
character, although Chinese, Japanese, and Korean encodings use two or
more bytes per character. UTF-8, therefore, isn't as efficient as the
legacy encodings of most scripts. (Its advantage is that it can
represent more scripts than any legacy encoding.)
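The comparison with legacy encodings can be checked the same way. The
sketch below is only an illustration: the helper bytes_per_char and the
sample strings are made up, and ISO 8859-7 and Shift_JIS are picked
here as examples of a single-byte and a double-byte legacy encoding.

```python
def bytes_per_char(text, encoding):
    """Average number of encoded bytes per character of text."""
    return len(text.encode(encoding)) / len(text)


greek = "\u03ba\u03cc\u03c3\u03bc\u03bf\u03c2"   # six Greek letters
japanese = "\u65e5\u672c\u8a9e"                  # three CJK ideographs

# Single-byte legacy encoding for Greek versus UTF-8.
print("Greek    ISO 8859-7:", bytes_per_char(greek, "iso8859_7"))
print("Greek    UTF-8:     ", bytes_per_char(greek, "utf-8"))

# Double-byte legacy encoding for Japanese versus UTF-8.
print("Japanese Shift_JIS: ", bytes_per_char(japanese, "shift_jis"))
print("Japanese UTF-8:     ", bytes_per_char(japanese, "utf-8"))
```

For these strings the legacy encoding is the smaller one in both cases,
but neither legacy codec can encode the other script's text at all.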
Regards,
Tony Graham
======================================================================
Tony Graham mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9632
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)