OASIS Mailing List Archives


RE: [xml-dev] An XML document is not well-formed if encoding="..." does not match the actual encoding of the characters in the document, right?

Hi Folks,

I spoke with George Cristian Bina from oXygen XML and he gave me the scoop on how things work inside oXygen.

George told me to do this:

1. Create an iso-8859-1 encoded XML file.

2. Using a hex editor, change encoding="iso-8859-1" to encoding="utf-8"

3. Drag and drop the file into oXygen.

4. oXygen will generate an encoding exception: 

    Cannot open the specified file. Got a character 
    encoding exception [snip]
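The same experiment can be approximated outside oXygen. Here is a rough Python sketch, not oXygen's actual code: the sample document and the helper names `try_parse` and `decodes_as_utf8` are made up for illustration. Note that the error only shows up if the file contains at least one non-ASCII character, since pure ASCII bytes are identical in both encodings.

```python
import xml.etree.ElementTree as ET

# Step 1: a document whose bytes really are ISO-8859-1 (sample content is an assumption).
text = '<?xml version="1.0" encoding="iso-8859-1"?><doc>caf\u00e9</doc>'
latin1_bytes = text.encode("iso-8859-1")

# Step 2: the "hex editor" change. Only the label changes; the byte 0xE9 for
# the accented e stays exactly as it was.
mislabeled = latin1_bytes.replace(b'encoding="iso-8859-1"', b'encoding="utf-8"')


def decodes_as_utf8(data):
    """Return True if the raw bytes are a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def try_parse(data):
    """Parse the bytes as XML; return the root text or an 'error: ...' string."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError as exc:
        return f"error: {exc}"


print(try_parse(latin1_bytes))      # label matches the bytes, so parsing succeeds
print(decodes_as_utf8(mislabeled))  # a lone 0xE9 is not valid UTF-8
print(try_parse(mislabeled))        # label no longer matches, so the parser rejects it
```

The parser trusts the declaration, so once the label says utf-8 the byte 0xE9 followed by `<` is an invalid sequence and the document is rejected.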

Next, here is something George told me. It is mind-blowing:

    If you have an iso-8859-1 encoded XML file loaded into oXygen 
    and change encoding="iso-8859-1" to encoding="utf-8" then 
    oXygen will automatically change the encoding of every character 
    in the document to UTF-8.


That is so fantastic, I jumped out of my chair when I read it.

I just received this additional information from George:

    Please note that the encoding is important only when the file is loaded 
    and saved. When the file is loaded the bytes are converted to characters 
    and then the application works only with characters. When the file is 
    saved then those characters need to be converted to bytes and the 
    encoding used will be determined from the XML header with a default to 
    UTF-8 if no encoding can be detected.
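To make the load/save model concrete, here is a minimal Python sketch of what George describes. It is only an illustration under stated assumptions: `load`, `save`, `declared_encoding` and the regular expression are hypothetical helpers, BOM handling and UTF-16 detection are left out, and none of this is taken from oXygen itself.

```python
import re

# Simplified matcher for the encoding pseudo-attribute of the XML declaration.
# An assumption for this sketch, not a full implementation of the XML grammar.
DECL_RE = re.compile(r"^<\?xml[^>]*?encoding=[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def declared_encoding(text, default="utf-8"):
    """Return the encoding named in the XML declaration, or the default."""
    match = DECL_RE.match(text)
    return match.group(1) if match else default


def load(data):
    """Bytes -> characters, using the encoding the file declares."""
    # Peek at the declaration via latin-1, which maps every byte to a
    # character and so never fails; this assumes an ASCII-compatible encoding.
    head = data[:200].decode("latin-1")
    return data.decode(declared_encoding(head))


def save(text):
    """Characters -> bytes, using whatever the declaration says at save time."""
    return text.encode(declared_encoding(text))


on_disk = '<?xml version="1.0" encoding="iso-8859-1"?><doc>caf\u00e9</doc>'.encode("iso-8859-1")
in_editor = load(on_disk)  # from here on there are only characters
in_editor = in_editor.replace('encoding="iso-8859-1"', 'encoding="utf-8"')
saved = save(in_editor)    # the accented e is now written as the two bytes C3 A9
```

Because the editor holds characters rather than bytes, changing the declaration while the file is open is harmless: the new label simply selects the encoder used at save time.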



Copyright 1993-2007 XML.org. This site is hosted by OASIS