xml-dev - baffling encoding problem


To: xml-dev@lists.xml.org
Subject: baffling encoding problem
From: Bryan Rasmussen <bry@itnisk.com>
Date: Wed, 9 Feb 2005 13:45:44 +0100
User-agent: Internet Messaging Program (IMP) 3.2.2


Hi,
I need some insight on an encoding problem. I have an XML instance that
passed through several parties before it got to me.

I'm on a Windows system. The instance has a declared encoding of UTF-8 and a
UTF-8 BOM (EF BB BF).
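
A quick way to confirm the BOM independently of any editor is to look at the
first three bytes of the file. The following is only a minimal Python sketch;
the helper name has_utf8_bom and the sample bytes are made up for
illustration and are not taken from the actual instance.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    # Hypothetical sample, standing in for the first bytes of the real file.
    sample = b"\xef\xbb\xbf<?xml version='1.0' encoding='UTF-8'?><a/>"
    print(has_utf8_bom(sample))
```

Note that a BOM only says how the file is labelled; it does not guarantee that
every byte after it is valid UTF-8.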

In this XML there is a text node that should contain the word Børnehave.
When I open it in various text editors (Notepad, WordPad), it shows Brnhave.
When I open it in various XML editors, some crash, saying there is an
encoding problem; the same happens when I open it in IE. Some editors,
however, treat it as correct, for example Crimson and XML SPY, although both
of them display it without the ø character. When I open it in 010 Editor,
the hex editor I use, it shows 42F8 726E 6568 6176 6, which is right, isn't
it? If I copy that, I get Børnehave.

When I try to load the instance into a DOM using MSXML, I get an encoding
error. When I run it through XSV to validate, I get
 Input error: Illegal UTF-8 byte 2 <0x72> at file offset 1070
 in unnamed entity at line 18 char 27 
0x72 is the r, which is of course at the position where the ø should be.
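
To see how a strict decoder treats these bytes, here is a minimal Python
sketch. It uses only the eight complete bytes quoted from the hex editor
above (the last byte in that dump is cut off, so it is left out), and the
helper name try_decode is hypothetical. It checks two well-known facts: in
ISO-8859-1 the byte F8 is ø, while in UTF-8 ø is the two-byte sequence C3 B8
and F8 cannot be followed by 0x72, because continuation bytes must lie in the
range 80 to BF.

```python
# The eight complete bytes from the hex dump; the truncated ninth byte is omitted.
raw = bytes.fromhex("42F8726E65686176")


def try_decode(data: bytes, encoding: str) -> str:
    """Decode data with the given encoding, or describe where decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error at byte offset {exc.start}: {exc.reason}"


if __name__ == "__main__":
    print("as utf-8:     ", try_decode(raw, "utf-8"))
    print("as iso-8859-1:", try_decode(raw, "iso-8859-1"))
    print("ø in utf-8:   ", "ø".encode("utf-8").hex())
```

If the bytes decode cleanly as ISO-8859-1 but not as UTF-8, that would be
consistent with the text having been written in a single-byte encoding while
the declaration and BOM say UTF-8. That is only a guess from these few bytes,
not something I have confirmed for the whole file.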

If I save it from any editor that considers it correct, I then have a file
that opens in all the other editors and in IE, loads into a DOM, and so on.

-- 
Bryan Rasmussen
