OASIS Mailing List Archives

Re: Malformed UTF-8 char



9/11/01 2:30:55 AM, Raffaele Bello <raffaele.bello@eurotechsrl.it> wrote:

>   when I try to parse an xml document with character like "è" i got the
>   following error:

That character is most likely stored in your file in the ISO-8859-1 encoding, where "è" is the single byte 0xE8.

>   org.xml.sax.SAXParseException: Character conversion error: "Malformed
>   UTF-8 char

And you get that error because the parser is expecting a character in the UTF-8 encoding.
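To make the mismatch concrete, here is a minimal sketch that prints the bytes Java produces for "è" in each encoding. The class and method names are made up for illustration; only the standard `String.getBytes(Charset)` call is relied on.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EGraveBytes {
    // Formats a byte array as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eGrave = "\u00e8"; // the character "è" (U+00E8)

        // One byte in ISO-8859-1.
        System.out.println(hex(eGrave.getBytes(StandardCharsets.ISO_8859_1))); // E8

        // Two bytes in UTF-8.
        System.out.println(hex(eGrave.getBytes(StandardCharsets.UTF_8)));      // C3 A8
    }
}
```

A lone 0xE8 byte looks to a UTF-8 decoder like the start of a three-byte sequence, so when ordinary markup follows it the decoder reports a malformed character.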

>   The strings in the xml document are in Italian, maybe I should use a
>   different encoding?
>
>
>   I'd use
>
>   <?xml version = "1.0" encoding = "UTF-8"?>

You're telling the parser that your characters are encoded in UTF-8, but in reality they're encoded
in ISO-8859-1.  The two encodings use different byte sequences for every character outside the ASCII
subset.  If your documents are encoded in anything other than UTF-8, you need to specify the correct
encoding in your XML declaration; for ISO-8859-1 that means encoding = "ISO-8859-1".
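The effect of the declaration can be checked with a small sketch along these lines. It assumes the JAXP parser bundled with the JDK; `EncodingDemo`, `sample`, and `parseOrNull` are hypothetical names, and the `<msg>` element is an invented stand-in for the real document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, for illustration only.
class EncodingDemo {
    // Builds a tiny document whose XML declaration names `declared`,
    // then serializes it to bytes using the `actual` charset.
    static byte[] sample(String declared, Charset actual) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declared + "\"?>"
                   + "<msg>\u00e8</msg>";
        return xml.getBytes(actual);
    }

    // Parses the bytes and returns the root element's text,
    // or null if the parser rejects the input for any reason.
    static String parseOrNull(byte[] bytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Declaration says UTF-8, bytes are ISO-8859-1: the situation in this thread.
        byte[] mismatched = sample("UTF-8", StandardCharsets.ISO_8859_1);
        System.out.println(parseOrNull(mismatched) == null); // true: parser rejects the bytes

        // Declaration matches the bytes actually written.
        byte[] matched = sample("ISO-8859-1", StandardCharsets.ISO_8859_1);
        System.out.println("\u00e8".equals(parseOrNull(matched))); // true: text is read correctly
    }
}
```

The other way out is to leave the declaration as UTF-8 and re-save the file so that its bytes really are UTF-8; either way, the declaration and the bytes have to agree.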