At 2:24 PM +0200 6/27/03, Miguel A. Robles wrote:
Dear colleagues,
I usually work with XML for sending information between
different servers or applications. For example, I'm currently
working with web services and everything works fine. The
problem appears now, because I have to parse a document
containing a lot of information. DOM is not enough
to accomplish this because the document
is extremely large, and I don't know how SAX deals with this
kind of file.
I know that XML is not intended for containing so much
information, but I have to think about a possible solution.
What do you think I could do to fix the problem?
You could use XOM: http://www.cafeconleche.org/XOM/
The latest version provides a streaming, tree-based approach that
allows you to work with pieces of the tree in sequence and then
discard them. For many record-like documents this is much more
convenient than SAX while still using only slightly more memory than
the underlying SAX parser. Indeed, it can process arbitrarily large
documents with effectively constant memory usage. And unlike SAX, the
document is fully read-write. Unlike some similar approaches, this
doesn't require you to learn XPath to preidentify the nodes of
interest.
There are limits. This won't help if you really do need to have the
entire document in memory at once, or if you need to move about
randomly in the tree. However, if you can work with a peephole view
of the document stream, then XOM gives you a larger peephole than
SAX. It can show you whole elements of your choice rather than
individual tokens. I've found this approach to be shockingly useful
for many applications. In many cases elements are the right
granularity for processing a document. DOM's document-level view is
too large. SAX's token-level view is too small. XOM's element-level
view is just right. :-)
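
To make the element-at-a-time idea concrete, here is a minimal sketch.
Note that it does not use XOM's API. It uses only the SAX classes that
ship with the JDK to imitate an element-sized peephole: each record
element is collected, handed to a callback, and then dropped, so memory
use depends on the size of one record rather than the whole document.
The class name ElementPeephole, the method forEachRecord, and the
sample element name "order" are all made up for illustration, and the
sketch assumes record elements are not nested inside each other.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of XOM or the JDK.
class ElementPeephole {

    /**
     * Streams the input and passes the text content of each element
     * named recordName to sink, one record at a time.
     * Assumes records do not nest and the parser is not namespace-aware.
     */
    public static void forEachRecord(InputSource in, String recordName,
                                     Consumer<String> sink) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside a record element.
            private StringBuilder text;

            @Override
            public void startElement(String uri, String local,
                                     String qName, Attributes atts) {
                if (qName.equals(recordName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (text != null && qName.equals(recordName)) {
                    sink.accept(text.toString()); // process one whole record
                    text = null;                  // then discard it
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders><order>A-1</order><order>B-2</order></orders>";
        List<String> seen = new ArrayList<>();
        forEachRecord(new InputSource(new StringReader(xml)), "order", seen::add);
        System.out.println(seen);
    }
}
```

For a real file you would pass an InputSource wrapping a file stream
instead of a StringReader, and the callback would do the actual work
(write to a database, emit a transformed record, and so on) rather
than collect the records into a list.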
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Processing XML with Java (Addison-Wesley, 2002)
http://www.cafeconleche.org/books/xmljava
http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA