We are generating the XML document from data taken from database tables, and
the default character set encoding for the database is ISO-8859-1.
We updated the database column with a value containing the Danish characters
and hardcoded the encoding declaration as "ISO-8859-1". We are still getting
the same exception, "Invalid UTF-8 encoding".
What could be the reason for this?
"Christopher R. Maden" wrote:
> At 22:53 16-12-2001, Neeraja Divakaruni wrote:
> >We are getting an exception while parsing the XML document using the Oracle
> >parser's "parseCLOB" procedure. The exception is "5-byte UTF-8 encoding
> >not supported".
> >
> >One more observation: for other foreign characters such as æ, Æ, Ø
> >(these are also Danish characters), we get the exception
> >"Invalid UTF8 encoding".
> >What are the possible causes of these two exceptions? Please do
> >respond.
>
> It sounds like the parser is trying to parse the CLOB as UTF-8 despite its
> actual encoding. The character "ø" is 0xF8 (11111000) in ISO 8859-1, which
> would be interpreted as the start of a 5-byte UTF-8 sequence; the other
> characters you mention (0xE6, 0xC6, 0xD8) would be read as the start of 3-
> or 2-byte sequences that the following bytes are unlikely to complete.
>
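To make the byte-level reasoning concrete, here is a minimal Python sketch (not
Oracle code) that prints, for each of the Danish characters mentioned, its
ISO-8859-1 byte and the sequence length a UTF-8 decoder would infer from that
byte. The helper name utf8_lead_length is made up for illustration, and it
follows the older UTF-8 scheme that still allowed 5- and 6-byte sequences,
which appears to be what the parser's error message refers to.

```python
def utf8_lead_length(byte: int) -> int:
    """Sequence length implied by a lead byte under the older UTF-8 scheme
    (up to 6 bytes). Returns 0 if the byte cannot start a sequence."""
    if byte < 0x80:
        return 1  # 0xxxxxxx: plain ASCII
    if byte < 0xC0:
        return 0  # 10xxxxxx: continuation byte, not a starter
    if byte < 0xE0:
        return 2  # 110xxxxx
    if byte < 0xF0:
        return 3  # 1110xxxx
    if byte < 0xF8:
        return 4  # 11110xxx
    if byte < 0xFC:
        return 5  # 111110xx
    if byte < 0xFE:
        return 6  # 1111110x
    return 0      # 0xFE and 0xFF never appear in UTF-8


for ch in "øæÆØ":
    b = ch.encode("iso-8859-1")[0]  # the single byte stored for this character
    print(f"{ch}: 0x{b:02X} {b:08b} -> looks like the start of a "
          f"{utf8_lead_length(b)}-byte UTF-8 sequence")
```

Under these assumptions, a decoder that wrongly treats the data as UTF-8 would
expect continuation bytes after each of these characters; ordinary text that
follows them will not supply those bytes.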
> What code are you using to parse the CLOB and to set the encoding? I
> suspect that, rather than simply inserting an XML declaration in the CLOB,
> you need to actually instruct the parser what encoding to use for reading
> the input.
>
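As an illustration of "instructing the parser" rather than relying on a
declaration, here is a small sketch using Python's standard-library XML parser.
It is only an analogue: it does not show the Oracle parseCLOB API, and the
sample document and the helper name parse_latin1 are invented for the example.
The point is that the encoding is passed to the parser itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: ISO-8859-1 bytes with no encoding declaration,
# so a parser left to its defaults will assume UTF-8.
raw = "<navn>Søren Ærø</navn>".encode("iso-8859-1")


def parse_latin1(data: bytes) -> ET.Element:
    """Parse XML bytes, telling the parser explicitly that they are ISO-8859-1."""
    parser = ET.XMLParser(encoding="iso-8859-1")
    return ET.fromstring(data, parser=parser)


try:
    ET.fromstring(raw)  # default: bytes are treated as UTF-8
except ET.ParseError as exc:
    print("default parse failed:", exc)

print("explicit encoding:", parse_latin1(raw).text)
```

Whether the Oracle parser offers an equivalent setting for CLOB input is
something to check in its documentation; the sketch only shows the general
idea.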
> ~Chris
> --
> Christopher R. Maden, Principal Consultant, HMM Consulting Int'l, Inc.
> DTDs/schemas - conversion - ebooks - publishing - Web - B2B - training
> <URL: http://www.hmmci.com/ > <URL: http://crism.maden.org/consulting/ >
> PGP Fingerprint: BBA6 4085 DED0 E176 D6D4 5DFC AC52 F825 AFEC 58DA
--
Neeraja D
Applications Engineer,
Applications Technology Group
Oracle Software India Ltd.
Work : +91 (40) 311 0222 Extn. : 4067
Email : neeraja.divakaruni@oracle.com