Re: [xml-dev] How to parse XML document with default namespace with JDOM XPath


From: Philippe Poulard <philippe.poulard@sophia.inria.fr>
To: Jack Bush <netbeansfan@yahoo.com.au>
Date: Thu, 06 Nov 2008 10:10:00 +0100

hi,

another piece of code that might be useful:

<xcl:active-sheet
     xmlns:xcl="http://ns.inria.org/active-tags/xcl"
     xmlns:ns="http://www.w3.org/1999/xhtml">
   <xcl:parse-html name="myHtml" source="file:///path/to/file.html"/>
   <xcl:for-each name="myElem" select="{ 
$myHtml/ns:html/ns:body/ns:div[@id='container']/ns:div[@id='content']/ns:table[@class='sresults']/ns:tr/ns:td/ns:a 
}">
     <xcl:echo value="{ string( $myElem ) }"/>
     <xcl:echo value="{ string( $myElem/@href ) }"/>
   </xcl:for-each>
</xcl:active-sheet>

the underlying HTML parser is not tagsoup but nekohtml; if your source 
file is XHTML and not HTML, you can simply use <xcl:parse> (an XML 
parser) instead of <xcl:parse-html> (an HTML parser)
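for comparison with the original JDOM question, here is a minimal Java sketch of the same prefix-binding idea. It uses only the JDK's javax.xml.xpath API rather than JDOM or XCL, and the class name XhtmlLinks and the inline XHTML sample are made up for illustration. The point it tries to show: elements in a default namespace are only matched when the XPath expression uses a prefix bound to that namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlLinks {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Binds the prefix "ns" to the XHTML namespace, like xmlns:ns in the sheet.
    static final NamespaceContext NS_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "ns".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return XHTML_NS.equals(uri) ? "ns" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            List<String> prefixes = new ArrayList<>();
            if (XHTML_NS.equals(uri)) prefixes.add("ns");
            return prefixes.iterator();
        }
    };

    // Returns "text -> href" for every element selected by the expression.
    static List<String> links(String xhtml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // namespace awareness is off by default
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(NS_CONTEXT);
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element a = (Element) nodes.item(i);
            result.add(a.getTextContent() + " -> " + a.getAttribute("href"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; the real page layout is not known here.
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
                + "<table class='sresults'><tr><td><a href='/one'>One</a></td></tr></table>"
                + "</body></html>";
        // Prefixed names match the elements in the default namespace.
        System.out.println(links(xhtml,
                "/ns:html/ns:body/ns:table[@class='sresults']/ns:tr/ns:td/ns:a"));
        // Unprefixed names mean "no namespace" in XPath 1.0, so nothing matches.
        System.out.println(links(xhtml, "/html/body/table/tr/td/a"));
    }
}
```

the prefix itself is arbitrary: only the namespace URI it is bound to has to match the one declared in the document.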

I don't really know what happens for a pure HTML source with the XHTML 
namespace, but you can trace what has been parsed by inserting the 
following operation, for example after the for-each statement:
   <xcl:transform source="{ $myHtml }" output="{ $sys:out }"
     xmlns:sys="http://ns.inria.org/active-tags/sys"/>
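a rough Java counterpart of that tracing step, assuming the parsed tree is available as a W3C DOM Document: an identity transform from the JDK's javax.xml.transform API writes the tree back out as text, which shows the element names the parser actually produced. The class name DumpTree and the sample input are made up for illustration; this is not how RefleX implements xcl:transform.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DumpTree {
    // Parses a well-formed XML string into a namespace-aware DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the tree unchanged (a transformer with no stylesheet is an identity transform).
    static String serialize(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Force XML output so a root element named "html" does not switch the serializer to HTML mode.
        identity.setOutputProperty(OutputKeys.METHOD, "xml");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for what an HTML parser might hand back.
        Document doc = parse("<HTML><BODY><A href='/one'>One</A></BODY></HTML>");
        System.out.println(serialize(doc));
    }
}
```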
for pure HTML sources, nekohtml capitalizes the element names to conform to 
the HTML spec (as stated in the nekohtml parser documentation), so the 
element names in the XPath expression must be capitalized the same way
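a small Java sketch of why the case matters, using only the JDK's XPath API: XPath name tests are case-sensitive, so an expression written in lowercase selects nothing from a tree whose elements are uppercase. The sample document only imitates what an uppercasing HTML parser might produce, and it assumes the elements end up in no namespace; the class name CaseSensitiveXPath is made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CaseSensitiveXPath {
    // Counts the nodes selected by the expression in the given document.
    static int count(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Assumed shape of the parsed tree: uppercase names, no namespace.
        String parsed = "<HTML><BODY><DIV id='content'><A href='/one'>One</A></DIV></BODY></HTML>";
        System.out.println(count(parsed, "/HTML/BODY/DIV[@id='content']/A")); // prints 1
        System.out.println(count(parsed, "/html/body/div[@id='content']/a")); // prints 0
    }
}
```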

to launch the script, just download reflex 
(http://reflex.gforge.inria.fr/) and type in a console:
java -jar reflex-0.3.2.jar run active-sheet.xcl

-- 
Cordialement,

               ///
              (. .)
  --------ooO--(_)--Ooo--------
|      Philippe Poulard       |
  -----------------------------
  http://reflex.gforge.inria.fr/
        Have the RefleX !

