Re: Some clarificatiosn -- RE: [Question] How to do incremental parsing?

* Mousheng Xu
| The problem of SAX is that you will have to write all those tedious
| "startElement", "endElement" stuff 

Getting used to SAX takes a little readjustment of the brain (a
process I personally tend to enjoy), but once you have made it, writing
SAX applications is really easy. A couple of little utilities also make
life a lot easier.

Other than that, it is really no different from writing a DOM
application, where you have to make all those tedious calls to do what
you want. :-)

| and the parsing never stops!

Throw a SAXException and parsing will stop immediately.
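
Here is a minimal sketch of that idea, using only the SAX classes that
ship with the JDK. The class name StopEarly, the marker exception
StopParsing, the helper firstRows, and the <row id="..."> document shape
are all made up for illustration; nothing here comes from the original
poster's code. The handler collects row ids and throws once it has seen
enough, and the caller treats that particular exception as a normal
early exit rather than an error.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StopEarly {
    /** Hypothetical marker exception: signals a deliberate stop, not a failure. */
    static class StopParsing extends SAXException {
        StopParsing() {
            super("stop requested by handler");
        }
    }

    /** Returns the id attributes of the first `limit` row elements, then aborts the parse. */
    static List<String> firstRows(String xml, int limit) throws Exception {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                if ("row".equals(qName)) {
                    seen.add(atts.getValue("id"));
                    if (seen.size() >= limit) {
                        // Throwing from a callback makes the parser abandon the document.
                        throw new StopParsing();
                    }
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Deliberate early exit; any other SAXException still propagates.
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rows><row id='1'/><row id='2'/><row id='3'/></rows>";
        System.out.println(firstRows(xml, 2)); // prints [1, 2]
    }
}
```

Using a dedicated subclass keeps the deliberate stop distinguishable
from genuine parse errors, which are still reported as ordinary
SAXExceptions.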

| Some mentioned the row processing feature of dom4j, kXML, SAXON,
| minidom, easydom, and Orchard. Do they read the whole doc into
| memory before parsing anyway, like the DOM lazy eval? 

No. That would entirely defeat the purpose of these processing
modes. Given that you're working with Java, I would start by looking at
dom4j and SAXON (in that order).

| If these parsers are based on xerces SAX, the chances are the whole
| doc is read into the memory.

SAXON can use any SAX parser; it uses Ælfred by default. 

I think dom4j is also parser-independent.

| * What is "persistent DOM"?

A DOM that accesses a structure stored in a persistent store, rather
than an in-memory object structure.

--Lars M.