- From: David Megginson <firstname.lastname@example.org>
- To: <email@example.com>
- Date: 13 Nov 1999 06:53:58 -0500
"Don Park" <firstname.lastname@example.org> writes:
> Right. This means that SML is not a good choice for 'documents' nor
> encoding data with lots of foreign characters.
Like, say, a database with the names of subscribers to a Chinese
e-mag, or a collection of information about Arabic movies.
Right now, it happens that a few large English-speaking former British
colonies (U.S., Canada, Australia, New Zealand) and Western Europe
make up a majority of the computer-using world, but since we make up a
small minority of the world in general, I expect that things will
change rapidly -- data from Turkey or Saudi Arabia or Japan or Korea
can have exactly the same problem as a document.
Actually, in the end I found that supporting UTF-16/UCS-2 as well as
UTF-8 wasn't that hard in AElfred (again, just a few lines of code,
and I didn't use the Java 1.1 library stuff). The hard part about
Unicode is that there's such a wide range of characters allowed and
not allowed in each context, and XML 1.0 *requires* parsers to report
all of those errors. That's a problem with UTF-8 as well as UTF-16.
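To make the two halves of that concrete, here is a minimal sketch in Python (not AElfred's actual Java code). It assumes the byte-signature approach described in Appendix F of XML 1.0 to tell UTF-8 from UTF-16, then applies the Char production [2] from the same spec to every decoded character. The function names (sniff_encoding, is_xml_char, first_illegal_char) are hypothetical, chosen for illustration only.

```python
import codecs


def sniff_encoding(data: bytes) -> str:
    """Guess UTF-8 vs. UTF-16 from the leading bytes (after XML 1.0 Appendix F)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"      # decoder strips the UTF-8 BOM
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"         # decoder reads the BOM to pick byte order
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"      # "<?" with no BOM, big-endian
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"      # "<?" with no BOM, little-endian
    return "utf-8"              # default when nothing else matches


def is_xml_char(cp: int) -> bool:
    """True if the code point matches XML 1.0 production [2] Char."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_illegal_char(data: bytes):
    """Return the index of the first character XML 1.0 forbids, or None."""
    text = data.decode(sniff_encoding(data))
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index
    return None
```

The sniffing is the short part; the character check runs on the decoded text and is identical for both encodings. A real parser also has to enforce the narrower, context-specific productions (names, public identifiers, and so on), which this sketch leaves out.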
All the best,
David Megginson email@example.com
xml-dev: A list for W3C XML Developers. To post, mailto:firstname.lastname@example.org
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:email@example.com)