i think this is generally called "pull" DOM and i believe it was
implemented in Python first (!?).
in Java i wrote XPP2 (available since August 2001), which as part of its
implementation has XmlPullNode. it allows not only building the tree
incrementally but also going back to the underlying parser and returning
to parsing the XML directly for parts of the tree (or skipping parts of
the XML that are not needed).
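
a rough sketch of the idea, using the JDK's StAX API (javax.xml.stream) as a stand-in rather than XPP2 itself. Node, buildSubtree, skipSubtree and buildOnly are made-up names for illustration, not the XmlPullNode API:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullTreeSketch {
    /** Hypothetical minimal tree node (illustration only). */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) { this.name = name; }
    }

    /** Builds a subtree for the element the reader is positioned on (START_ELEMENT). */
    static Node buildSubtree(XMLStreamReader r) throws XMLStreamException {
        Node node = new Node(r.getLocalName());
        while (r.hasNext()) {
            int ev = r.next();
            if (ev == XMLStreamConstants.START_ELEMENT) {
                node.children.add(buildSubtree(r));
            } else if (ev == XMLStreamConstants.CHARACTERS || ev == XMLStreamConstants.CDATA) {
                node.text.append(r.getText());
            } else if (ev == XMLStreamConstants.END_ELEMENT) {
                return node; // matching end tag of this element
            }
        }
        throw new XMLStreamException("unexpected end of input");
    }

    /** Consumes the current element's events without building any nodes. */
    static void skipSubtree(XMLStreamReader r) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int ev = r.next();
            if (ev == XMLStreamConstants.START_ELEMENT) depth++;
            else if (ev == XMLStreamConstants.END_ELEMENT) depth--;
        }
    }

    /** Builds a tree only for the first child of the root named 'wanted'; other children are skipped. */
    static Node buildOnly(String xml, String wanted) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            r.nextTag(); // move to the root start tag
            while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
                if (r.getLocalName().equals(wanted)) {
                    return buildSubtree(r); // stop here; the rest is never pulled
                }
                skipSubtree(r);
            }
            return null; // reached the root end tag without a match
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<msg><header><to>alice</to></header>"
                + "<body><big>data</big></body></msg>";
        Node header = buildOnly(xml, "header");
        System.out.println(header.name + " -> " + header.children.get(0).text);
    }
}
```

here the application decides per element whether to build a subtree or skip it, and once it has what it needs it stops pulling events from the parser altogether.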
this is a very flexible and powerful approach to processing and dispatching
XML messages. parsing _and_ tree building can start as soon as the first XML
start tag is received, and progress is easily monitored, so the application
builds only as much of the XML tree as it needs. this has very positive
implications for performance; see the results for XPP pull compared with
other XML tree APIs in Java:
i am now working on XB1, which is the direct successor of XPP2 XmlPullNode
but has an easier-to-use API (the XPP2 API was very minimal, even _ascetic_)
and directly models the XML infoset in Java.
Elliotte Rusty Harold wrote:
> Streaming APIs like SAX and XMLPULL by their nature provide some of
> the content of a malformed document to the client application before
> the first well-formedness error is detected. The XML specification
> implicitly says this is OK, though in some use-cases roll-back or
> failure to commit may be desirable.
> Now consider the case of a tree-based API such as DOM, JDOM, or XOM
> which encounters a malformedness error. Traditionally, these APIs have
> reported no information from a malformed document to the client
> application. However, recently Laurent Bihanic submitted a patch to
> JDOM in which as much of the document as had been able to be
> successfully parsed was made available through the exception that was
> thrown to indicate the malformedness error. This was quite clever. It
> had never occurred to me, and I had never noticed any other API do
> anything similar.
> What I'd like to get broader discussion of is whether this is a good
> idea. There are certainly use cases for it. Bihanic wanted to read the
> envelope of an XML message even if the data was malformed. However,
> there are also problems. For instance, if the well-formedness error is
> a missing end-tag, then the element with the missing end-tag will
> still appear in the partial tree. And if the problem is a missing root
> element, then this may produce a Document object with no root element.
> On the other hand, rollback, failure to commit, or simply ignoring the
> malformed document is much easier than with a streaming API since you
> know in advance that the document is malformed.
> Is this approach something to be encouraged? Should other tree-based
> APIs like XOM and DOM copy this innovation? What advantages and
> disadvantages have I not thought of?
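
The partial-tree idea discussed in the quoted message can be sketched roughly as follows. This is an illustration built on the JDK's StAX API, not the JDOM patch itself; Node, PartialParseException and parse are hypothetical names, and exactly where the error is reported depends on the underlying parser:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PartialTreeSketch {
    /** Hypothetical minimal tree node (illustration only). */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) { this.name = name; }
    }

    /** Hypothetical exception that carries whatever was built before the error. */
    static final class PartialParseException extends Exception {
        final Node partialRoot; // null if the error came before the root start tag
        PartialParseException(Node partialRoot, XMLStreamException cause) {
            super(cause.getMessage(), cause);
            this.partialRoot = partialRoot;
        }
    }

    /** Builds the whole tree, or throws with the part built so far attached. */
    static Node parse(String xml) throws PartialParseException {
        Node root = null;
        Deque<Node> open = new ArrayDeque<>(); // currently open elements
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                switch (r.next()) {
                    case XMLStreamConstants.START_ELEMENT: {
                        Node n = new Node(r.getLocalName());
                        if (open.isEmpty()) root = n;
                        else open.peek().children.add(n);
                        open.push(n);
                        break;
                    }
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        if (!open.isEmpty()) open.peek().text.append(r.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        open.pop();
                        break;
                    default:
                        break; // comments, PIs, etc. are ignored in this sketch
                }
            }
            return root;
        } catch (XMLStreamException e) {
            // Hand the caller the tree built before the well-formedness error.
            throw new PartialParseException(root, e);
        }
    }

    public static void main(String[] args) {
        try {
            parse("<order><id>42</id><item>pen</order>"); // <item> is never closed
        } catch (PartialParseException e) {
            System.out.println("malformed; partial root: " + e.partialRoot.name
                    + ", children built: " + e.partialRoot.children.size());
        }
    }
}
```

Note that an element whose end tag is missing can still end up in the partial tree, which is the concern raised in the quoted message. A caller that wants roll-back semantics can simply ignore partialRoot.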
"Mr. Pauli, we in the audience are all agreed that your theory is crazy.
What divides us is whether it is crazy enough to be true." Niels H. D. Bohr