java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.

There's only one explanation for that: the parser expects the document to be
encoded in UTF-8, but it isn't. To understand why, you need to examine how the
document was created and any transcoding it went through before it reached
the parser.

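
One way to confirm the mismatch before involving the parser at all is to test whether the raw bytes decode cleanly as UTF-8. The following is a minimal sketch using only the JDK; the class name `Utf8Check` and the method `isValidUtf8` are made up for illustration, and the sample strings stand in for the real file contents.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Xerces or JDOM.
class Utf8Check {

    /** Returns true if the given bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Same text, written out in two different encodings.
        String text = "caf\u00e9 au lait";
        byte[] asUtf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] asLatin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes valid: " + isValidUtf8(asUtf8));
        System.out.println("ISO-8859-1 bytes valid: " + isValidUtf8(asLatin1));
    }
}
```

To apply this to state.xml, the bytes would come from something like `Files.readAllBytes`. If the check fails, the likely fixes are either to re-save the file as real UTF-8 or to change the encoding declaration so that it matches the bytes that were actually written.
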
    at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
    at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:489)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:928)
    at JDOMTrAXPojoInvestmentBean.main(JDOMTrAXPojoInvestmentBean.java:45)
The header of state.xml is as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html (View Source for full doctype...)>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">

Any ideas on what is causing this issue and how to overcome it? Likewise, how
do I define the correct namespace prefix? Is it possible that this document
has two namespaces, a default one and one with the prefix 'html'? If so, which
one should I use?
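
On the namespace question: in the header shown, the default declaration and the 'html' prefix are both bound to the same URI, so they name a single namespace rather than two. Namespace-aware APIs compare the URI, not the prefix. The sketch below illustrates this with the JDK's built-in DOM parser instead of JDOM, so that it runs without extra libraries; the class name `NamespaceDemo`, its `parse` helper, and the sample body content are assumptions made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

// Hypothetical demo class; uses the JDK DOM API rather than JDOM.
class NamespaceDemo {

    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Parses an XML string with namespace processing switched on. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Same two declarations as the state.xml header; the body is invented.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\""
                + " xmlns:html=\"http://www.w3.org/1999/xhtml\">"
                + "<body><html:p>prefixed</html:p><p>default</p></body></html>";

        Document doc = parse(xml);
        Element root = doc.getDocumentElement();

        // Both lookups resolve to the same namespace URI.
        System.out.println("default -> " + root.lookupNamespaceURI(null));
        System.out.println("html    -> " + root.lookupNamespaceURI("html"));

        // Matching by URI finds both <p> and <html:p>.
        int count = doc.getElementsByTagNameNS(XHTML_NS, "p").getLength();
        System.out.println("p elements in XHTML namespace: " + count);
    }
}
```

In JDOM the corresponding step would be to obtain a namespace object with `Namespace.getNamespace("html", "http://www.w3.org/1999/xhtml")` and pass it to calls such as `getChild`. The prefix chosen in code is only a local label, so any prefix should work as long as the URI matches.
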