- From: "Fabio Arciniegas A." <l-arcini@uniandes.edu.co>
- To: xml-dev@ic.ac.uk
- Date: Tue, 4 Jan 2000 13:37:45 -0500 (GMT+5)
<snip/>
> Note that Java uses UTF-16, which isn't quite fixed-width, though no
> one really notices.
Err... David, I thought Java used UTF-8, actually a version slightly
different from the "typical" version that expresses:
Characters in the range \u0001 to \u007F in one byte: 0[bits 0-6]
Characters in the range \u0080 to \u07FF and \u0000 in two bytes:
110[bits 6-10] 10[bits 0-5]
Characters in the range \u0800 to \uFFFF in three bytes: 1110[bits 12-15]
10[bits 6-11] 10[bits 0-5]
(what's different from typical is that NUL is encoded in two bytes, so there
are no embedded null bytes in Java VM strings)
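To make the three rules concrete, here is a rough sketch of an encoder that follows them. The class and method names (ModifiedUtf8Sketch, encode) are made up for illustration and are not part of any Java API. The sketch assumes the two-byte form carries 5 + 6 payload bits, as in ordinary UTF-8, and it treats every 16-bit char on its own.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; not part of the JDK.
class ModifiedUtf8Sketch {

    // Encode each 16-bit char of s with the one-, two-, or three-byte rule.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                // One byte: 0[bits 0-6]
                out.write(c);
            } else if (c <= 0x07FF) {
                // Two bytes: 110[bits 6-10] 10[bits 0-5].
                // \u0000 also lands here and becomes C0 80.
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // Three bytes: 1110[bits 12-15] 10[bits 6-11] 10[bits 0-5]
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 'A', NUL, e-acute, euro sign: one case for each rule plus NUL.
        for (byte b : encode("A\0\u00E9\u20AC")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println(); // prints: 41 C0 80 C3 A9 E2 82 AC
    }
}
```

Note that the NUL comes out as C0 80, so the encoded bytes never contain a zero byte. As far as I know, java.io.DataOutputStream.writeUTF writes this same byte format, preceded by a two-byte length.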
....
However, it has been quite a while since I last looked... Has
this changed in the latest versions?
Best,
Fabio
--
Fabio Arciniegas A. Viaduct Technologies, Inc.
fabio@viaduct.com Software Engineer
Interests: XML, Wittgenstein and just about everything in between.
Oblique Strategy of the day: "Abandon normal instruments"