   Re: [xml-dev] getting undisered results , when trying to display japane

So your problem is

- the XSLT generates the correct HTML data, which is labelled correctly
- when displayed by browsers, it is OK
- when opening in Win2K Notepad, the Japanese characters are represented
  by a ?

I have just tested my Win2K Notepad with some UTF-8 text containing
kanji, and it works OK.  The only trick is to make sure you have selected
a font that contains kanji.  Unless you are working in a CJK country, the
default fonts on your system probably do not have kanji.  If your system
has [Arial Unicode MS] then try that.

You can see whether any of your fonts have kanji on Win2K by going to
    Start>Programs>Accessories>Character Map
then
    set "character set" to [Unicode]
then
    set "group by" to [Unicode subrange]
then
    in the box that appears select [CJK Ideographs]

You can use this progressively to go through every font
to find a suitable one. 

Cheers
Rick Jelliffe

P.S. If you find Notepad a little tedious, you might like to try my company's
editor, the Topologi Collaborative Markup Editor. It supports Japanese
text, as well as validation of XSLT, and gives a little more help with
figuring out encoding issues.  http://www.topologi.com/

P.P.S Developers who wish to test their XML tools with CJK (Chinese/
Korean/Japanese) data may find the Academia Sinica "Chinese XML Now!"
test suite useful.  It is at
http://www.ascc.net/xml/test/en/utf-8/index.html
In particular, a very simple test file is
http://www.ascc.net/xml/test/wfns/utf-8/text_xml/zh-utf8-8.xml


----- Original Message ----- 
From: "asim" <qazi@advcomm.net>
To: "Rick Jelliffe" <ricko@allette.com.au>
Cc: <xml-dev@lists.xml.org>
Sent: Tuesday, March 04, 2003 7:27 AM
Subject: Re: [xml-dev] getting undisered results , when trying to display japanese characters


Hello Rick, thanks for answering. I checked my XSL file by opening it in
the binary editor of Visual Studio .NET; the one Japanese character takes
3 bytes, i.e. 家

Rick, the problem is that if I view the XSL file directly in the browser, it
shows me the Japanese characters. Also, if I send these characters in any XML
packet, transform it using the MSXML 4.0 parser, and display the value of
that packet, the HTML is generated there too and I can see the Japanese
characters fine.

Please help :)
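
The three-byte observation above is what UTF-8 predicts for a kanji. As a minimal sketch (Python is used here only for illustration; it is an assumption of this note and not a tool used in the thread), the same character can be encoded in UTF-8 and, for contrast, in Shift_JIS, and the byte counts compared:

```python
# Encode the single kanji from the message above in two encodings.
# Python is an illustrative choice; the thread itself uses MSXML and Notepad.
kanji = "家"  # U+5BB6

utf8_bytes = kanji.encode("utf-8")       # three bytes in UTF-8
sjis_bytes = kanji.encode("shift_jis")   # two bytes in a legacy Japanese encoding

print(len(utf8_bytes), utf8_bytes.hex())  # 3 e5aeb6
print(len(sjis_bytes))                    # 2
```

If the binary editor shows E5 AE B6 for this character, the XSL file really is stored as UTF-8.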

----- Original Message -----
From: "Rick Jelliffe" <ricko@allette.com.au>
To: <xml-dev@lists.xml.org>
Sent: Sunday, March 02, 2003 1:58 AM
Subject: Re: [xml-dev] getting undisered results , when trying to display
japanese characters


> From: "asim" <qazi@advcomm.net>
>
>
> > This is my XSL file with Japanese characters. If I transform it (my
> > transformation function is written below the file) it shows me "?"
> > question marks. Please help. I saved this file as UTF-8 using Win2K
> > Notepad.
>
> HTML is horrible to work with, for multilingual work.
>
> There are several places where problems can creep in:
>
> 1) The browser is using the wrong encoding.  Check whether
> your browser has been set to "auto-detect" the encoding, or
> whether it is fixed to some other encoding.   (In this case, the
> "?" means "unexpected code".)
>
> 2) Your system may have fonts installed which do not have
> the Japanese characters.  This is less likely nowadays, but
> still can happen.  (In this case, the "?" means "unavailable
> character".)
>
> 3) You are reading the files over the web, and the webserver
> is not labelling the data correctly. You need to check your
> web-server's documentation for this, for example to set the
> .htaccess file correctly if you are using Apache. (In this case,
> the "?" means "unexpected code".)
>
> I hope these are of some use.  A systematic approach is better than
> trial and error: in a hex editor, look at the HTML file your XSLT
> script produces. If the Japanese characters each take three bytes,
> all of them 0x80 or above, then your file is indeed UTF-8
> and you can concentrate on the HTTP and browser side of things.
> If the Japanese characters take two bytes each, then it is not
> UTF-8 and you need to look at your XSLT code and implementation.
>
> Cheers
> Rick Jelliffe
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
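
The hex-editor check described in the quoted message can also be scripted. The following is a rough sketch only: the function name `looks_like_utf8_cjk` is hypothetical, and Python is an assumption of this note, not something the thread uses. It reports whether a byte string is valid UTF-8 and contains at least one three-byte sequence, which is what kanji produce.

```python
def looks_like_utf8_cjk(data: bytes) -> bool:
    """Return True if data is valid UTF-8 and contains at least one
    three-byte sequence (lead byte 0xE0-0xEF), as kanji would produce.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8 at all, e.g. a two-byte legacy Japanese encoding.
        return False
    # In valid UTF-8, a byte in 0xE0-0xEF can only be the lead byte
    # of a three-byte sequence.
    return any(0xE0 <= b <= 0xEF for b in data)


# Usage: the same markup encoded two ways.
html = "<p>家</p>"
print(looks_like_utf8_cjk(html.encode("utf-8")))      # True
print(looks_like_utf8_cjk(html.encode("shift_jis")))  # False
```

To check real XSLT output, the bytes would be read from the generated file in binary mode, for example with `open(path, "rb").read()`, where `path` stands for wherever the output was saved.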






 
