   Re: encoding problem fixed

  • From: John Cowan <cowan@locke.ccil.org>
  • To: XML Dev <xml-dev@ic.ac.uk>
  • Date: Fri, 30 Jul 1999 10:59:58 -0400

James Tauber wrote:

> In other words, rather than creating an InputSource using a FileReader, I
> used James Clark's "fileInputSource" method in XT to make a URL out of a
> file and create the InputSource from the URL string.

Yes, indeed.  You should never hand the parser a Reader of any sort when
processing XML (unless you have a non-standard Reader class that understands
the XML declaration).  Always use an InputSource built from a byte stream or
a system identifier (URL), so that the parser can install its own
bytes-to-chars converter based on the declaration.
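
A minimal sketch of the approach described above, assuming a JAXP SAX parser
is available.  The class name SystemIdParse, the helper names, and the sample
document are made up for illustration and are not taken from XT.  The file is
turned into a URL string and handed to the parser as a system identifier, so
no Reader is involved and the parser chooses the decoder itself:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SystemIdParse {
    /** Parses the file via a system ID (URL string) and returns its character data. */
    static String textOf(Path file) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        // No Reader here: the parser opens the bytes itself and picks the
        // decoder from the encoding named in the XML declaration.
        InputSource source = new InputSource(file.toUri().toString());
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return text.toString();
    }

    /** Writes a small UTF-8 document to a temporary file (illustrative data only). */
    static Path writeSample() throws Exception {
        Path file = Files.createTempFile("sample", ".xml");
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><p>caf\u00e9</p>";
        Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws Exception {
        Path file = writeSample();
        // The text comes back as the single character U+00E9 after "caf".
        System.out.println(textOf(file).equals("caf\u00e9"));
        Files.deleteIfExists(file);
    }
}
```

Passing a byte stream (new InputSource(someInputStream)) should have the same
effect; the point is only that the parser, not the caller, does the decoding.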
 
> The culprit is FileReader. It is the one doing the strange "read UTF-8 as
> Windows code page".

Actually, it's doing what it's expected to: reading the platform's native
charset, which on your machine is CP-1252.  (Unix JVMs typically use 8859-1
instead.)  It has no way of knowing that *you* think the document charset
is UTF-8.
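
To make the mismatch concrete, here is a small sketch of what a Reader does
with UTF-8 bytes when it decodes them with a single-byte charset.  The class
name CharsetMismatch and the helper readAll are hypothetical, and an explicit
ISO-8859-1 decoder stands in for the platform default so that the result does
not depend on the machine running it:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {
    /** Decodes the bytes through a Reader using the given charset. */
    static String readAll(byte[] bytes, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset the document was written in: one character.
        System.out.println(readAll(utf8, StandardCharsets.UTF_8).length());      // prints 1

        // Decoded with a single-byte charset: each byte becomes its own character.
        System.out.println(readAll(utf8, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

The Reader reports no error in the second case; it just hands the parser two
characters where the author wrote one.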

-- 
	John Cowan	http://www.ccil.org/~cowan	cowan@ccil.org
Schlingt dreifach einen Kreis um dies! / Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau / Und trank die Milch vom Paradies.
			-- Coleridge / Politzer







 
