So your problem is
- the XSLT generates the correct HTML data, which is labelled correctly
- when displayed by browsers, it is OK
- when opening in Win2K Notepad, the Japanese characters are represented
by a ?
I have just tested my Win2K Notepad with some UTF-8 with kanji, and
it works OK. The only trick is to make sure you have selected a font
with the kanji. Unless you are working in a CJK country, the default
fonts on your system probably do not have kanji. If your system has
[Arial Unicode MS] then try that.
You can see whether any of your fonts have kanji on Win2K by going
set "character set" to [Unicode]
set "group by" to [Unicode subrange]
in the box that appears select [CJK Ideographs]
You can use this progressively to go through every font
to find a suitable one.
P.S. If you find Notepad a little tedious, you might like to try my company's
editor, the Topologi Collaborative Markup Editor. It supports Japanese
text, as well as validation of XSLT, and gives a little more help with
figuring out encoding issues. http://www.topologi.com/
P.P.S Developers who wish to test their XML tools with CJK (Chinese/
Korean/Japanese) data may find the Academia Sinica "Chinese XML Now!"
test suite useful. It is at
In particular, a very simple test file is
----- Original Message -----
From: "asim" <firstname.lastname@example.org>
To: "Rick Jelliffe" <email@example.com>
Sent: Tuesday, March 04, 2003 7:27 AM
Subject: Re: [xml-dev] getting undisered results , when trying to display japanese characters
Hello Rick, thanks for answering. I checked my XSL file by opening it in
the binary editor of Visual Studio .NET; each Japanese character takes 3 bytes.
Rick, the problem is that if I view the XSL file directly in the browser, it shows
me the Japanese characters. Also, if I send these characters in any XML
packet, transform it using the MSXML 4.0 parser, and display the
value of that packet, the HTML is generated and I can see the
Japanese characters fine.
Please help :)
----- Original Message -----
From: "Rick Jelliffe" <firstname.lastname@example.org>
Sent: Sunday, March 02, 2003 1:58 AM
Subject: Re: [xml-dev] getting undisered results , when trying to display
> From: "asim" <email@example.com>
> > This is my XSL file with Japanese characters. If I transform it (my
> > transformation function is written below the file) it shows me "?"
> > question marks. Please help; I did save this file as UTF-8 and using Win2K
> HTML is horrible to work with, for multilingual work.
> There are several places where problems can creep in:
> 1) The browser is using the wrong encoding. Check whether
> your browser has been set to "auto-detect" the encoding, or
> whether it is fixed to some other encoding. (In this case, the
> "?" means "unexpected code".)
> 2) Your system may have fonts installed which do not have
> the Japanese characters. This is less likely nowadays, but
> still can happen. (In this case, the "?" means "unavailable
> 3) You are reading the files over the web, and the webserver
> is not labelling the data correctly. You need to check your
> web-server's documentation for this, for example to set the
> .htaccess file correctly if you are using Apache. (In this case,
> the "?" means "unexpected code".)
> I hope these are of some use. A systematic approach is better than
> trial and error: in a hex editor, look at the HTML file your XSLT
> script produces. If the Japanese characters each take three bytes,
> all of them 0x80 or above, then your file is indeed UTF-8
> and you can concentrate on the HTTP and browser side of things.
> If the Japanese characters take two bytes each, then it is not
> UTF-8 and you need to look at your XSLT code and implementation.
> Rick Jelliffe
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> The list archives are at http://lists.xml.org/archives/xml-dev/
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
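
For anyone who would rather script the hex-editor check from the quoted message than do it by eye, here is a minimal sketch in Python. It is only an illustration: the function names classify and hex_dump are made up for this example, and the check is a rough heuristic (strict UTF-8 decoding), not a full encoding detector. Shift_JIS is used purely as an example of a two-byte legacy encoding.

```python
def classify(data: bytes) -> str:
    """Roughly classify raw file bytes, mirroring the manual hex-editor check."""
    if all(b < 0x80 for b in data):
        # No multi-byte characters at all, e.g. the kanji were already
        # replaced by literal "?" characters before the file was written.
        return "ascii-only"
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        # Probably a legacy encoding (Shift_JIS is one possibility).
        return "not-utf-8"
    return "utf-8"


def hex_dump(data: bytes) -> str:
    """Render bytes the way a hex editor would show them."""
    return " ".join(f"{b:02X}" for b in data)


if __name__ == "__main__":
    sample = "\u65e5\u672c"  # two kanji, written as escapes to keep this file ASCII
    for name in ("utf-8", "shift_jis"):
        raw = sample.encode(name)
        print(name, "->", hex_dump(raw), "->", classify(raw))
```

To use it on the real output, read the generated HTML in binary mode (open(path, "rb").read()) and pass the bytes to classify. A result of "ascii-only" on a file that should contain kanji suggests the characters were lost during the transformation or when writing the file, rather than at display time.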