XML.org
OASIS Mailing List Archives

 


Re: [xml-dev] Cannot close an XML file used for parsing

It sounds like light-html2xml failed to convert the HTML page to well-formed XML, but I tried it with www.abc.com (where there is just one form element) and the resulting XML was correct...
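
One quick way to test that hypothesis is to run the converter's output through a plain XML parser before handing it to JDOM. The following is only a sketch using the JDK's built-in SAX parser; the class name WellFormedCheck and the sample strings are made up for illustration and are not part of light-html2xml or JDOM.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks whether a string is well-formed XML.
class WellFormedCheck {

    // Returns true if the text parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Well-formedness errors (e.g. a missing end-tag) end up here.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not well-formedness results.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative inputs only, not the real page content.
        System.out.println(isWellFormed("<html><form><input/></form></html>")); // true
        System.out.println(isWellFormed("<html><form><input></html>"));         // false
    }
}
```

If the converter's output passes such a check but the parser still reports an unterminated element, the file being parsed is probably not the one the converter wrote.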

Alain COUTHURES
<agenceXML>
http://www.agencexml.com
Bordeaux, France

Jack Bush wrote:
Hi All,

I appear to be having difficulty closing (and possibly flushing first) an XML file that is subsequently parsed, without success. The error generated is:

org.jdom.input.JDOMParseException: Error on line 23: The element type "form" must be terminated by the matching end-tag "</form>".

Below are the code snippets. readData() retrieves (HTML) data from a website, saves it to a file, then converts it to XML format before returning the new filename:
public String readData() {
 
    try {
          URL url  = new URL("http://www.abc.com");
          URLConnection connection = url.openConnection();      
          InputStream isInHtml = url.openStream();   // throws an IOException    
          disInHtml = new DataInputStream(new BufferedInputStream(isInHtml));         
          System.out.flush();
          FileOutputStream fosOutHtml = null;
          fosOutHtml = new FileOutputStream("C:\\Temp\\ABC.html");
          int oneChar, count=0;
          while ((oneChar=disInHtml.read()) != -1)
              fosOutHtml.write(oneChar);
          isInHtml.close();
          disInHtml.close();
          fosOutHtml.flush();    // optional
          fosOutHtml.close();
          .....
    }
 
    try {
          File fileInHtml = new File("C:\\Temp\\ABC.html");
          FileReader frInHtml = new FileReader(fileInHtml);
          BufferedReader brInHtml = new BufferedReader(frInHtml);
          String string = "";
          while (brInHtml.ready())
              string += brInHtml.readLine() + "\n";
          fwOutXml  = new FileWriter("C:\\Temp\\ABC.xml");
          pwOutXml  = new PrintWriter(fwOutXml);
          light_html2xml html2xml = new light_html2xml();
          pwOutXml.print(html2xml.Html2Xml(string));
          System.out.flush();    // optional
          fwOutXml.flush();      // optional
          fwOutXml.close();
          pwOutXml.flush();      // optional
          pwOutXml.close();
          return fileInHtml.getAbsolutePath();
          ....
    }
}
 
// parseData reads the XML file using the name returned by readData()
public void parseData(String XMLFilename)
{
    try
    {
        FileReader frInXml = new FileReader(FileName);
        BufferedReader brInXml = new BufferedReader(frInXml);
        SAXBuilder saxBuilder = new SAXBuilder("org.apache.xerces.parsers.SAXParser"); // JDOMParseException generated.
        ....
}
This code worked when it was all in a single method, but I have since restructured it into a number of methods.

This issue has arisen in the past, and I was able to resolve it by closing the XML file before reading it again. However, I don't have a solution for it this time round.
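
For reference, the write, close, then re-read sequence can be sketched as below. This is a minimal illustration, not the code from the post: the class and method names (WriteThenParse, writeXml, rootName) are hypothetical, it uses try-with-resources (which requires Java 7 or later), and the JDK's DocumentBuilder stands in for JDOM's SAXBuilder so that it runs without extra libraries.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical sketch: write an XML file, close it, then parse it again.
class WriteThenParse {

    // Writes the XML text, closes the writer, and returns the XML file's path.
    static String writeXml(File xmlFile, String xml) throws IOException {
        // try-with-resources closes the writer (and the file under it) on exit.
        try (PrintWriter out = new PrintWriter(xmlFile, "UTF-8")) {
            out.print(xml);
            // PrintWriter swallows IOExceptions, so check for errors explicitly.
            if (out.checkError()) {
                throw new IOException("Failed to write " + xmlFile);
            }
        }
        return xmlFile.getAbsolutePath();
    }

    // Parses the file at the given path and returns the root element's name.
    static String rootName(String xmlPath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File(xmlPath));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("abc", ".xml"); // illustrative temp file
        String path = writeXml(tmp, "<html><form/></html>");
        System.out.println(rootName(path)); // html
        tmp.delete();
    }
}
```

Closing the outermost writer flushes and closes the underlying stream, so no separate flush() calls are needed. Note that writeXml returns the path of the XML file it wrote, whereas readData() as posted returns fileInHtml.getAbsolutePath().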

I am running JDK 1.6.0_10, NetBeans 6.1 and JDOM 1.1 on the Windows XP platform.

Any assistance would be appreciated.

Many thanks,

Jack



Copyright 1993-2007 XML.org. This site is hosted by OASIS