
   Re: Locale Example for SAX(Sun's XML parser)

  • From: David Brownell <david-b@pacbell.net>
  • To: xml-dev@ic.ac.uk
  • Date: Mon, 10 May 1999 07:44:35 -0700

Lars Marius Garshol wrote:
> 
> * Mi-Jeong Koo
> |
> | I'm a Korean and developing a project using the 'Project X'.
> | I'm trying to process tag names written in Korean characters but I have
> | some problem.
> | Can I get some example codes about locale?
> 
> The SAX Locale shouldn't affect things like that. Which characters are
> allowed in element type names is defined in the XML recommendation,
> and parsers just have to follow that.
> 
> The Locale is more intended for things like localized error messages
> and so on.

And since Sun doesn't provide a resource with Korean localizations
(a com/sun/xml/parser/resources/Messages_ko.java file), if you set
the Korean locale for diagnostics you'll just see the message IDs
rather than diagnostics in Korean.  (You have source, and could
provide such a resource file if you like.)
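
For what it's worth, here is a rough sketch of setting the diagnostic
locale through the SAX1 Parser interface. It assumes a current JDK,
where the bundled JAXP parser can still be reached as an
org.xml.sax.Parser via SAXParser.getParser(); the class name
DiagnosticLocale and the helper trySetLocale are made up for this
example and are not part of Project X.

```java
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Parser;
import org.xml.sax.SAXException;

public class DiagnosticLocale {
    // Hypothetical helper: asks the parser to report diagnostics in the
    // given locale. SAX1 allows a parser to reject a locale it cannot
    // support by throwing SAXException, so that case is reported as false.
    @SuppressWarnings("deprecation")
    public static boolean trySetLocale(Parser parser, Locale locale) {
        try {
            parser.setLocale(locale);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws Exception {
        // Assumption: the JAXP default parser stands in for Sun's parser here.
        Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
        System.out.println("Korean locale accepted: "
                + trySetLocale(parser, Locale.KOREAN));
    }
}
```

Either way, this only affects the language of error messages, not
which characters are legal in names.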


> So I would suggest that you look in the XML recommendation to see
> which characters you're allowed to use and then either file a bug
> report, switch parser and/or change to using legal characters.

The usual problems I've seen relate to character encodings.  If you
don't use UTF-8 or UTF-16 (not many editors do, yet :-) then you must
declare the encoding at the beginning of each file, perhaps something
like

    <?xml version='1.0' encoding='EUC-KR'?>

or "ISO-2022-KR" etc.  (Perhaps "cp949", for a PC-oriented encoding?)
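
To make that concrete, here is a small sketch that builds a document
with Hangul element names, encodes it as EUC-KR with a matching
encoding declaration, and collects the element names through SAX. It
assumes a current JDK whose JAXP parser and charset support include
EUC-KR; the class name EucKrDemo and its methods are invented for
illustration, and the element names are written as Unicode escapes so
that the source file's own encoding doesn't matter.

```java
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class EucKrDemo {
    // Builds a tiny document whose element names are Hangul syllables,
    // then encodes it as EUC-KR to match the encoding declaration.
    public static byte[] sampleDocument() throws UnsupportedEncodingException {
        String root = "\uCC45";          // one Hangul syllable
        String child = "\uC81C\uBAA9";   // two Hangul syllables
        String doc = "<?xml version='1.0' encoding='EUC-KR'?>\n"
                + "<" + root + "><" + child + ">text</" + child + "></" + root + ">";
        return doc.getBytes("EUC-KR");
    }

    // Parses the bytes and returns element names in document order.
    public static List<String> elementNames(byte[] xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames(sampleDocument()).size() + " elements parsed");
    }
}
```

If the declaration is dropped or names a different encoding than the
bytes actually use, the parser will typically report a malformed
document rather than a locale problem.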

The official list of encoding names supported by Java is linked through
the package docs for the parser, at

   http://java.sun.com/products/jdk/1.2/docs/guide/internat/encoding.doc.html

One problem you may have is that some of the standard encoding names
are not recognized, even for encodings which _are_ supported.  A bug
has been filed against the Java i18n support, but I don't know when
the more standard names will get better support.  So if the standard
encoding names don't work, use the ones listed in the URL above.
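
One way to find out which spelling your Java runtime accepts is to
probe a few candidates. This sketch uses java.nio.charset.Charset,
which is an assumption about your environment (it arrived in JDK
releases later than the 1.2 docs linked above); the class name
EncodingNames and the helper firstSupported are made up for the
example. A name that Java accepts is not guaranteed to be accepted by
every parser in an encoding declaration, so treat the result as a
hint.

```java
import java.nio.charset.Charset;

public class EncodingNames {
    // Hypothetical helper: returns the first candidate name this Java
    // runtime recognizes, or null if none of them is supported.
    public static String firstSupported(String... candidates) {
        for (String name : candidates) {
            try {
                if (Charset.isSupported(name)) {
                    return name;
                }
            } catch (IllegalArgumentException e) {
                // Syntactically illegal charset name; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String name = firstSupported("EUC-KR", "EUC_KR", "KSC5601");
        System.out.println("usable name: " + name);
        if (name != null) {
            // Other spellings the runtime treats as the same encoding.
            System.out.println("aliases: " + Charset.forName(name).aliases());
        }
    }
}
```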

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 
