- From: "Laurent Bossavit" <laurent@mmania.com>
- To: xml-dev@xml.org
- Date: Mon, 13 Mar 2000 17:08:59 +0100
Ken MacLeod wrote:
> * A parser can [should?] maintain a partial DOM tree, at least
> parents, that would allow other XML functions to be used. For
> example, using XPath to perform matching.
I have been doing just that: trying to implement 'higher-level' event
dispatching from a SAX event stream to a listener, which declares the
data it is interested in as XPath expressions.
The API goes roughly as follows (simplified for illustration):
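    public interface XPathListener {
        // Called with the nodes that matched a registered expression.
        public abstract void handleData(Node[] nodes);
    }

    public interface XPathFilter {
        // Registers a listener for the nodes selected by 'match'.
        public abstract void addListener(XPathListener l, String match);
        // Parses the input with the given SAX parser and dispatches matches.
        public abstract void process(Parser p, InputSource i);
    }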
Client code which wants to retrieve some data from an XML stream
registers a node set expression identifying 'data of interest', and
only this data will be returned.
Assume the following XML data (partial document):

    <stream>
      <data type="int">1</data>
      <data type="str">x</data>
      <data type="str">y</data>
      .../...
One would register interest in the value of 'data' elements of type
'str' using the following code:

    XPathFilter xpf;  // some XPathFilter implementation, obtained elsewhere
    xpf.addListener(this, "data[@type='str']/text()");
    xpf.process(somesaxparser, someinputsource);
The above data would result in two handleData() calls, one for each
text node of a 'data' element of type 'str'. This is much cleaner
than the alternative - keeping track of state information in an
object's startElement()/characters()/endElement() methods -
especially if the element tree is deeper than a couple of levels.
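
For comparison, the hand-coded alternative for this one expression
might look roughly like the sketch below. It is an illustration only:
it assumes the SAX2 DefaultHandler and the JAXP SAXParserFactory that
ship with current JDKs rather than the Parser interface used above,
and the class name StrDataHandler and its collect() helper are made
up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects the text of data elements with type="str". */
class StrDataHandler extends DefaultHandler {
    private final List<String> values = new ArrayList<>();
    // State: non-null only while we are inside a matching element.
    private StringBuilder current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("data".equals(qName) && "str".equals(atts.getValue("type"))) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current != null && "data".equals(qName)) {
            values.add(current.toString());
            current = null;
        }
    }

    /** Parses the given XML text and returns the collected values. */
    static List<String> collect(String xml) {
        try {
            StrDataHandler handler = new StrDataHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Every further expression of interest needs its own flags and checks
in all three callbacks, which is the bookkeeping the XPath-based
registration is meant to avoid.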
Naturally, not all XPath features 'work' over SAX: the following-*
axes or position() calls, for example, may be unavailable, depending
on how much of the DOM tree you are willing to build as you go along.
I'm fairly sure though that with suitable restrictions this would be
a worthwhile addition to the XML developer's arsenal, because XPath
expressions are a concise way of identifying only the parts of an XML
data stream that your program is interested in - without hand-coding
specific automata every time.
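
To make the 'suitable restrictions' concrete, here is a minimal
sketch of one possible restricted matcher. It is not the
proof-of-concept implementation described in this message. It assumes
a tiny subset of XPath (child steps with an optional [@attr='value']
predicate, ending in text()), keeps only the stack of open elements
as its 'partial tree', and hands text to a java.util.function.Consumer
instead of Node[]. The class name PathTextFilter and the collect()
helper are hypothetical, and SAX2/JAXP classes from the JDK are used
in place of the SAX1 Parser.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical filter for paths like data[@type='str']/text(). */
class PathTextFilter extends DefaultHandler {
    private static final Pattern STEP =
            Pattern.compile("(\\w+)(?:\\[@(\\w+)='([^']*)'\\])?");

    /** One open element: name, a copy of its attributes, pending text. */
    private static final class Frame {
        final String name;
        final Attributes atts;
        boolean matched;
        final StringBuilder text = new StringBuilder();
        Frame(String name, Attributes atts) { this.name = name; this.atts = atts; }
    }

    // Each step is {element name, attribute name or null, attribute value or null}.
    private final List<String[]> steps = new ArrayList<>();
    private final Consumer<String> listener;
    private final Deque<Frame> open = new ArrayDeque<>(); // innermost element first

    PathTextFilter(String match, Consumer<String> listener) {
        String[] parts = match.split("/");
        if (!"text()".equals(parts[parts.length - 1])) {
            throw new IllegalArgumentException("path must end in text(): " + match);
        }
        for (int i = 0; i < parts.length - 1; i++) {
            Matcher m = STEP.matcher(parts[i]);
            if (!m.matches()) {
                throw new IllegalArgumentException("unsupported step: " + parts[i]);
            }
            steps.add(new String[] {m.group(1), m.group(2), m.group(3)});
        }
        this.listener = listener;
    }

    /** True if the innermost open elements match the steps, last step innermost. */
    private boolean tailMatches() {
        if (open.size() < steps.size()) return false;
        Iterator<Frame> it = open.iterator();
        for (int i = steps.size() - 1; i >= 0; i--) {
            Frame f = it.next();
            String[] s = steps.get(i);
            if (!s[0].equals(f.name)) return false;
            if (s[1] != null && !s[2].equals(f.atts.getValue(s[1]))) return false;
        }
        return true;
    }

    /** Delivers and clears the pending text run of a matching element. */
    private void flush(Frame f) {
        if (f != null && f.matched && f.text.length() > 0) {
            listener.accept(f.text.toString());
            f.text.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush(open.peek()); // a child element ends the parent's current text run
        Frame f = new Frame(qName, new AttributesImpl(atts)); // parser may reuse atts
        open.push(f);
        f.matched = tailMatches();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        Frame f = open.peek();
        if (f != null && f.matched) f.text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush(open.pop());
    }

    /** Parses the XML text and returns every text run selected by 'match'. */
    static List<String> collect(String match, String xml) {
        List<String> out = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new PathTextFilter(match, out::add));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return out;
    }
}
```

Because only ancestors are kept, memory use is bounded by document
depth, and data is delivered as soon as each text run ends. The same
design is what rules out following-* axes and position(): nothing
about siblings is remembered.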
If you are parsing whole documents, an XPath matcher on top of the
DOM will do fine - but this does not work if you need to parse an
incoming XML-formatted data stream and process data as it becomes
available, and the class of application I'm working on (real-time
chat using an XML-formatted protocol) requires that.
I have a quick-and-dirty, proof-of-concept implementation which
works, kind of. The 'best' way of delivering data-of-interest to
client code is not obvious (whether to use a DOM-compliant Node class
or something more lightweight, whether to use arrays or more complex
collections), and the XPath expression parser is unbelievably crude,
mostly because in current XPath implementations the parsing code
cannot be easily separated from code that relies on the DOM.
If anyone is working on something similar, or has suggestions on API
or implementation, I'm interested in your comments.
========================================
Laurent Bossavit - Ingénieur R&D
>>> laurent@mmania.com <<<
>> ICQ#39281367 <<
MultiMania http://www.multimania.fr/
========================================