   Re: [xml-dev] getting undisered results , when trying to display japane

I think maybe you are going around in circles.   

We established before that
 1) The HTML generated could be read OK by web browsers
       (from your comment)
 2) The HTML generated was labelled as UTF-8 (from your comment)
 3) The HTML generated was in fact UTF-8 (you confirmed by inspection
   in a binary editor that a Japanese character takes three bytes)

So that points to Notepad as causing the problem, not XSLT.    Did you try 
different fonts?    Have you tried opening the file in Wordpad or other 
editors?    Or is there some other information that suggests that XSLT
is causing the problem?

Another issue that you need to double check is that the Japanese characters
that are being drawn are in fact the correct ones. For example, if you
have a UTF-8 file it is possible that the file could be read in by an editor
using the wrong encoding but still display (bogus) Japanese characters. 
Just because there are Japanese characters does not mean that they are
not bogus.  
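
To illustrate the point, here is a minimal Python sketch. The choice of Shift_JIS as the "wrong" encoding is only an assumption for the demonstration; I do not know which encoding your editor actually picks. It takes the UTF-8 bytes of one kanji and decodes them both ways:

```python
# The character 家 (U+5BB6) encoded as UTF-8 is the three bytes E5 AE B6.
utf8_bytes = "家".encode("utf-8")

# Correct reading: decode the bytes with the encoding they were written in.
right = utf8_bytes.decode("utf-8")

# Wrong reading: decode the same bytes as Shift_JIS (assumed for this demo).
# errors="replace" keeps the demo from stopping on any undecodable byte.
wrong = utf8_bytes.decode("shift_jis", errors="replace")

print(repr(right), repr(wrong))
```

The wrong reading still contains Japanese script (the final byte B6 on its own is a half-width katakana in Shift_JIS), but it is not the character that was written.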

The approach is to divide and conquer: first figure out whether the problem
is before the HTML or after the HTML.  I thought we had established it
was after.  
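
One way to make that check without relying on how any editor draws the text is to look at the raw bytes. Here is a minimal sketch, assuming Python 3.8 or later. The function name `describe_bytes` and the sample markup are illustrative only, not taken from your files:

```python
def describe_bytes(data: bytes) -> list:
    """Decode data strictly as UTF-8 and list each non-ASCII character
    together with the hex of the bytes that encode it."""
    # Strict decoding raises UnicodeDecodeError if the data is not valid UTF-8.
    text = data.decode("utf-8")
    return [(ch, ch.encode("utf-8").hex(" ")) for ch in text if ord(ch) > 0x7F]


# Hypothetical sample standing in for the HTML your XSLT produces.
sample = "<p>家</p>".encode("utf-8")
for ch, hex_bytes in describe_bytes(sample):
    print(ch, hex_bytes)
```

To use it on a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes in. If the call raises `UnicodeDecodeError`, the file is not UTF-8 and the problem is before the HTML. If every kanji is listed with three bytes, the file is fine and the problem is after it.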

By the way, why are you using Notepad to establish that a file
is correctly encoded?  It does not even tell you what input encoding
the file was opened with....
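
Since you mentioned saving as UTF-8 without signature, it may also be worth checking whether a file really starts with a signature (byte-order mark) or not, because without one a plain-text editor has to guess the encoding. This is a rough Python sketch; the function name `sniff_bom` is hypothetical, and it only looks at the first few bytes, so a result of `None` does not say what the encoding is:

```python
import codecs
from typing import Optional

# UTF-32 signatures are checked before UTF-16 because the UTF-32 LE
# signature (FF FE 00 00) begins with the UTF-16 LE signature (FF FE).
_SIGNATURES = (
    ("utf-8-sig", codecs.BOM_UTF8),
    ("utf-32-le", codecs.BOM_UTF32_LE),
    ("utf-32-be", codecs.BOM_UTF32_BE),
    ("utf-16-le", codecs.BOM_UTF16_LE),
    ("utf-16-be", codecs.BOM_UTF16_BE),
)


def sniff_bom(data: bytes) -> Optional[str]:
    """Return a codec name if data starts with a known signature, else None."""
    for name, bom in _SIGNATURES:
        if data.startswith(bom):
            return name
    return None


print(sniff_bom(codecs.BOM_UTF8 + b"<html/>"))
print(sniff_bom("家".encode("utf-8")))
```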

Cheers
Rick Jelliffe



----- Original Message ----- 
From: "asim" <qazi@advcomm.net>
To: "Rick Jelliffe" <ricko@allette.com.au>
Cc: <xml-dev@lists.xml.org>
Sent: Wednesday, March 05, 2003 4:14 PM
Subject: Re: [xml-dev] getting undisered results , when trying to display japanese characters


Hi Rick
    <?xml version="1.0" encoding="iso-8859-1"?>
After adding this declaration to my XSL files I can see most of the Japanese
characters fine, but some of them are still missing, and I don't know what the
problem is.

I save the files using Visual Studio .NET's "Save As" with encoding, choosing
UTF-8 (without signature). The XSL transformation runs fine, but some of the
Japanese characters are not shown and are replaced by ".". What could be the
problem? Or tell me which encoding and save method I should adopt so that I
can see all the Japanese characters.

Thanks
Qazi Asim


----- Original Message -----
From: "Rick Jelliffe" <ricko@allette.com.au>
To: "asim" <qazi@advcomm.net>
Cc: <xml-dev@lists.xml.org>
Sent: Monday, March 03, 2003 1:07 AM
Subject: Re: [xml-dev] getting undisered results , when trying to display
japanese characters


> So your problem is
>
> - the XSLT generates the correct HTML data, which is labelled correctly
> - when displayed by browsers, it is OK
> - when opening in Win2K Notepad, the Japanese characters are represented
>   by a ?
>
> I have just tested my Win2K Notepad with some UTF-8 with kanji, and
> it works OK.  The only trick is to make sure you have selected a font
> with the kanji.   Unless you are working in a CJK country, the default
> fonts on your system probably do not have kanji.     If your system has
> [Arial Unicode MS] then try that.
>
> You can see whether any of your fonts have kanji on Win2K by going
>     Start>Programs>Accessories>Character Map
> then
>    set "character set" to [Unicode]
> then
>    set "group by" to [Unicode subrange]
> then
>    in the box that appears select [CJK Ideographs]
>
> You can use this progressively to go through every font
> to find a suitable one.
>
> Cheers
> Rick Jelliffe
>
> P.S. If you find Notepad a little tedious, you might like to try my
> company's
> editor, the Topologi Collaborative Markup Editor. It supports Japanese
> text, as well as validation of XSLT, and gives a little more help with
> figuring out encoding issues.  http://www.topologi.com/
>
> P.P.S Developers who wish to test their XML tools with CJK (Chinese/
> Korean/Japanese) data may find the Academia Sinica "Chinese XML Now!"
> test suite useful.  It is at
> http://www.ascc.net/xml/test/en/utf-8/index.html
> In particular, a very simple test file is
> http://www.ascc.net/xml/test/wfns/utf-8/text_xml/zh-utf8-8.xml
>
>
> ----- Original Message -----
> From: "asim" <qazi@advcomm.net>
> To: "Rick Jelliffe" <ricko@allette.com.au>
> Cc: <xml-dev@lists.xml.org>
> Sent: Tuesday, March 04, 2003 7:27 AM
> Subject: Re: [xml-dev] getting undisered results , when trying to display
> japanese characters
>
>
> Hello Rick, thanks for answering. I checked my XSL file by opening it in
> the binary editor of Visual Studio .NET; a single Japanese character takes
> 3 bytes, e.g. 家
>
> Rick, the problem is that if I view the XSL file directly in the browser, it
> shows me the Japanese characters. Also, if I send these characters in any XML
> packet, transform it using the MSXML 4.0 parser, and display the value of
> that packet, the HTML is generated and I can see the Japanese characters
> fine.
>
> Please help :)
>
> ----- Original Message -----
> From: "Rick Jelliffe" <ricko@allette.com.au>
> To: <xml-dev@lists.xml.org>
> Sent: Sunday, March 02, 2003 1:58 AM
> Subject: Re: [xml-dev] getting undisered results , when trying to display
> japanese characters
>
>
> > From: "asim" <qazi@advcomm.net>
> >
> >
> > > This is my XSL file with Japanese characters. If I transform it (my
> > > transformation function is written below the file) it shows me "?"
> > > question marks. Please help. I saved this file as UTF-8 using Win2K
> > > Notepad.
> >
> > HTML is horrible to work with, for multilingual work.
> >
> > There are several places where problems can creep in:
> >
> > 1) The browser is using the wrong encoding.  Check whether
> > your browser has been set to "auto-detect" the encoding, or
> > whether it is fixed to some other encoding.   (In this case, the
> > "?" means "unexpected code".)
> >
> > 2) Your system may have fonts installed which do not have
> > the Japanese characters.  This is less likely nowadays, but
> > still can happen.  (In this case, the "?" means "unavailable
> > character")
> >
> > 3) You are reading the files over the web, and the webserver
> > is not labelling the data correctly. You need to check your
> > web-server's documentation for this, for example to set the
> > .htaccess file correctly if you are using Apache. (In this case,
> > the "?" means "unexpected code".)
> >
> > I hope these are some use.  A systematic approach is better than
> > trial and error: in a HEX editor, look at the HTML file your XSLT
> > script produces: if the Japanese characters each take three bytes
> > where all the bytes are >= 0x80, then your file is indeed UTF-8
> > and you can concentrate on the HTTP and browser side of things.
> > If the Japanese characters take two bytes each, then it is not
> > UTF-8 and you need to look at your XSLT code and implementation.
> >
> > Cheers
> > Rick Jelliffe
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
Copyright 2001 XML.org. This site is hosted by OASIS