- From: David Megginson <david@megginson.com>
- To: <xml-dev@ic.ac.uk>
- Date: 13 Nov 1999 06:53:58 -0500
"Don Park" <donpark@docuverse.com> writes:
> Right. This means that SML is not a good choice for 'documents' nor
> encoding data with lots of foreign characters.
Like, say, a database with the names of subscribers to a Chinese
e-mag, or a collection of information about Arabic movies.
Right now, a few large English-speaking former British colonies (U.S.,
Canada, Australia, New Zealand) and Western Europe happen to make up a
majority of the computer-using world. But since we are a small minority
of the world in general, I expect that to change rapidly -- data from
Turkey or Saudi Arabia or Japan or Korea can have exactly the same
problem as a document.
Actually, in the end I found that supporting UTF-16/UCS-2 as well as
UTF-8 wasn't that hard in AElfred (again, just a few lines of code,
and I didn't use the Java 1.1 library stuff). The hard part about
Unicode is that there's such a wide range of characters allowed and
not allowed in each context, and XML 1.0 *requires* parsers to report
all of those errors. That's a problem with UTF-8 as well as UTF-16.
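
To make the point above concrete, here is a minimal sketch in Java. It
is not AElfred's actual code: the class name XmlCharSketch and both
method names are made up for illustration, and it assumes big-endian
UTF-16 input with no byte-order mark. The decoding really is only a few
lines. The character check covers just the XML 1.0 Char production,
which is one of several context-dependent rules a parser has to
enforce (names, public identifiers, and so on each have their own).

```java
import java.util.Arrays;

// Hypothetical helper, not taken from AElfred.
public final class XmlCharSketch {
    private XmlCharSketch() {}

    /** True if the code point matches the XML 1.0 Char production. */
    public static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Decodes big-endian UTF-16 bytes (assumed: no BOM) into code points,
     * rejecting unpaired surrogates and anything outside Char.
     */
    public static int[] decodeUtf16BE(byte[] in) {
        if (in.length % 2 != 0) {
            throw new IllegalArgumentException("odd number of bytes");
        }
        int[] out = new int[in.length / 2];
        int n = 0;
        for (int i = 0; i < in.length; i += 2) {
            int unit = ((in[i] & 0xFF) << 8) | (in[i + 1] & 0xFF);
            int cp = unit;
            if (unit >= 0xD800 && unit <= 0xDBFF) {
                // High surrogate: a low surrogate must follow.
                if (i + 3 >= in.length) {
                    throw new IllegalArgumentException("truncated surrogate pair");
                }
                int low = ((in[i + 2] & 0xFF) << 8) | (in[i + 3] & 0xFF);
                if (low < 0xDC00 || low > 0xDFFF) {
                    throw new IllegalArgumentException("unpaired high surrogate");
                }
                cp = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                i += 2; // consume the low surrogate as well
            } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
                throw new IllegalArgumentException("unpaired low surrogate");
            }
            if (!isXmlChar(cp)) {
                throw new IllegalArgumentException(
                    "character not allowed in XML 1.0: 0x" + Integer.toHexString(cp));
            }
            out[n++] = cp;
        }
        return Arrays.copyOf(out, n);
    }
}
```

The same isXmlChar test applies unchanged after UTF-8 decoding, which
is why the error reporting is a cost of Unicode itself rather than of
any one encoding.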
All the best,
David
--
David Megginson david@megginson.com
http://www.megginson.com/