Re: [xml-dev] How to parse XML document with default namespace with JDOM XPath
- From: Philippe Poulard <philippe.poulard@sophia.inria.fr>
- To: Jack Bush <netbeansfan@yahoo.com.au>
- Date: Thu, 06 Nov 2008 10:10:00 +0100
hi,
another piece of code that might be useful:
<xcl:active-sheet
    xmlns:xcl="http://ns.inria.org/active-tags/xcl"
    xmlns:ns="http://www.w3.org/1999/xhtml">
  <xcl:parse-html name="myHtml" source="file:///path/to/file.html"/>
  <xcl:for-each name="myElem" select="{
    $myHtml/ns:html/ns:body/ns:div[@id='container']/ns:div[@id='content']/ns:table[@class='sresults']/ns:tr/ns:td/ns:a
  }">
    <xcl:echo value="{ string( $myElem ) }"/>
    <xcl:echo value="{ string( $myElem/@href ) }"/>
  </xcl:for-each>
</xcl:active-sheet>
the underlying HTML parser is not TagSoup but NekoHTML; if your source
file is XHTML rather than HTML, you can simply use <xcl:parse> (an XML
parser) instead of <xcl:parse-html> (an HTML parser)
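
For XHTML parsed with an XML parser, the same prefix-binding idea can be sketched in plain Java. This is only an illustration under assumptions: it uses the JDK's javax.xml.xpath API rather than JDOM or RefleX, and the class name XhtmlLinks, the SAMPLE document and the select helper are made up for the example. The point is that the prefix "ns" is bound to the XHTML namespace for the XPath expression, whatever prefix (or none) the document itself uses.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlLinks {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Made-up sample input; the default namespace puts every element in XHTML_NS.
    static final String SAMPLE =
        "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
        + "<div id='container'><div id='content'>"
        + "<table class='sresults'><tr><td>"
        + "<a href='http://example.org/a'>First</a>"
        + "</td></tr></table></div></div></body></html>";

    // Binds the prefix "ns" to the XHTML namespace, for XPath evaluation only.
    static final NamespaceContext NS_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "ns".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return XHTML_NS.equals(uri) ? "ns" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(prefix).iterator();
        }
    };

    // Hypothetical helper: returns "text -> href" for each element selected by the expression.
    static List<String> select(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // otherwise the parser ignores namespaces
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(NS_CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element a = (Element) nodes.item(i);
                result.add(a.getTextContent() + " -> " + a.getAttribute("href"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or query the document", e);
        }
    }

    public static void main(String[] args) {
        String path = "/ns:html/ns:body/ns:div[@id='container']/ns:div[@id='content']"
            + "/ns:table[@class='sresults']/ns:tr/ns:td/ns:a";
        System.out.println(select(SAMPLE, path));            // prefixed path: one link
        System.out.println(select(SAMPLE, "/html/body//a")); // unprefixed path: no match
    }
}
```

with this sample, the prefixed path selects the single link, while the unprefixed path selects nothing, because an unprefixed name in XPath 1.0 only matches elements in no namespace.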
I don't really know what happens for a pure HTML source with the XHTML
namespace, but you can trace what has been parsed by inserting the
following operation, for example after the for-each statement:
<xcl:transform source="{ $myHtml }" output="{ $sys:out }"
    xmlns:sys="http://ns.inria.org/active-tags/sys"/>
for pure HTML sources, NekoHTML upper-cases the element names to conform
to the HTML spec (as stated in the NekoHTML parser documentation), so the
element names in the XPath expression must be upper-case as well
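
To see why the case matters, here is a small Java sketch, again with the JDK's XPath API and not with NekoHTML itself. The upper-case SAMPLE string is only a hand-written stand-in for what an HTML parser that upper-cases element names might produce; the class name CaseSensitiveXPath and the count helper are hypothetical. XPath name tests are case-sensitive, so only the upper-case path matches.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CaseSensitiveXPath {
    // Hand-written stand-in for a tree whose element names were upper-cased by the parser.
    static final String SAMPLE =
        "<HTML><BODY><A href='http://example.org/a'>First</A></BODY></HTML>";

    // Hypothetical helper: counts the nodes selected by the given XPath location path.
    static int count(String xml, String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate("count(" + path + ")", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or query the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(SAMPLE, "/HTML/BODY/A")); // same case as the tree: 1
        System.out.println(count(SAMPLE, "/html/body/a")); // different case: 0
    }
}
```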
to launch the script, save it as active-sheet.xcl, download RefleX
(http://reflex.gforge.inria.fr/) and type in a console:
java -jar reflex-0.3.2.jar run active-sheet.xcl
--
Cordialement,
///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !