An XML document is not well-formed if encoding="..." does not match the actual encoding of the characters in the document, right?
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Fri, 28 Dec 2012 20:37:40 +0000
Thanks, Chris, for pointing us to that article: "XML on the Web Has Failed".
I am making my way through it.
This statement in the article piqued my interest:
... determining the actual character encoding of an
XML document is a prerequisite for determining its
well-formedness ...
I decided to do an experiment.
I created this XML document, encoded every character in it using iso-8859-1, and declared that same encoding in encoding="...":
<?xml version="1.0" encoding="iso-8859-1"?>
<Name>López</Name>
I checked the document for well-formedness and the XML parser said it is well-formed.
Good.
Then I changed encoding="iso-8859-1" to encoding="utf-8":
<?xml version="1.0" encoding="utf-8"?>
<Name>López</Name>
I checked it for well-formedness and the parser said it is still well-formed.
Huh?
Shouldn't I have gotten a well-formedness error?
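One way to repeat the experiment while controlling the bytes exactly is to build the document in code, which rules out the possibility that an editor re-encodes the file when it is saved. Here is a minimal sketch, assuming Python's standard xml.etree.ElementTree module. That module is backed by expat, not Xerces, so it shows only how one other parser treats the mismatch and says nothing about what Xerces does. The helper name check is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The element content stored as ISO-8859-1 bytes: the o-acute is the single byte 0xF3.
# These bytes stay the same in both runs; only the declaration changes.
body = "<Name>L\u00f3pez</Name>".encode("iso-8859-1")


def check(declared_encoding):
    """Parse the fixed bytes under the given declaration (hypothetical helper).

    Returns the element text if the parser accepts the document,
    otherwise a string describing the parser's error.
    """
    prolog = f'<?xml version="1.0" encoding="{declared_encoding}"?>\n'.encode("ascii")
    try:
        return ET.fromstring(prolog + body).text
    except ET.ParseError as err:
        return f"not well-formed: {err}"


# Declaration matches the bytes.
print(check("iso-8859-1"))
# Declaration says utf-8, but 0xF3 followed by "p" is not a valid UTF-8 sequence.
print(check("utf-8"))
```

With this setup the first call returns the name and the second reports a parse error, because the byte 0xF3 announces a multi-byte UTF-8 sequence that the following ASCII bytes do not complete.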
I did my experiment using the latest version of Oxygen XML. I think that it uses the Xerces XML Parser, right?
Is this a bug in Xerces?
/Roger