RE: [Question] How to do incremental parsing?

> -----Original Message-----
> From: Xu, Mousheng (SEA) [mailto:Mousheng.Xu@sea.celltechgroup.com]
> Sent: Tuesday, July 03, 2001 8:27 PM
> To: 'xml-dev@lists.xml.org'
> Subject: [Question] How to do incremental parsing?
> Dear all,
> A problem of all the current XML parsers is that they at least read the
> whole XML document into the input stream, which can consume a lot of memory
> when the XML is big (e.g. 1 GB).

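Not true.  Use a SAX parser: it reports the document as a stream of events
while reading it, so it never needs to hold the whole file in memory.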
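
As a rough illustration of the SAX approach, here is a minimal sketch using
Python's standard xml.sax module (my choice for the example; Xerces and other
parsers expose the same event model).  The ElementCounter class, the
count_elements helper and the sample document are all made up for
illustration.  The handler keeps only a small dictionary of counts, so memory
use does not grow with the size of the input.

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags by name, stores nothing else."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag as the parser streams through the input.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(source):
    """Parse a filename or file-like object and return tag counts."""
    handler = ElementCounter()
    xml.sax.parse(source, handler)  # reads the input in chunks
    return handler.counts


# Tiny in-memory sample; a real 1 GB file would be passed by filename instead.
sample = io.BytesIO(b"<db><rec id='1'/><rec id='2'/><note/></db>")
counts = count_elements(sample)
print(counts)
```

The trade-off is that the handler must track any state it needs itself, since
no tree is built for it.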

> One way to get around the problem would be to read the XML file into memory
> gradually and when needed. I would like to build such a DOM parser, but I am

There is a module that does this for Perl.  XML::Twig reads in only the
parts of the document that you specify and builds a DOM-style tree for each
of them, so you avoid loading the parts that you will not use.
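
To show the same idea of building a tree only for the parts you ask for, here
is a sketch of a Python analogue.  It is not XML::Twig itself; it uses
iterparse from the standard xml.etree.ElementTree module.  The iter_records
helper, the tag names and the sample data are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Hypothetical helper: yield each complete <tag> subtree, then discard it."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem    # the caller sees a small, fully built subtree
            elem.clear()  # drop its children and text once the caller is done


# Tiny in-memory sample; a large file would be passed by filename instead.
sample = io.BytesIO(
    b"<db><rec id='1'><name>a</name></rec>"
    b"<skip>x</skip>"
    b"<rec id='2'><name>b</name></rec></db>"
)
names = [rec.findtext("name") for rec in iter_records(sample, "rec")]
print(names)
```

Each record must be used before the loop advances, because it is emptied
afterwards.  The root element still keeps a reference to every emptied child,
so for extremely long documents you may also want to clear the root from time
to time.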


> not familiar with the design of the Xerces XML parsers. Could someone give
> me a suggestion on how to tackle the problem? The most critical part
> would be the method to parse an element. If reading the whole document into
> memory is inevitable, then I would like to borrow the method that parses the
> input stream to get the next element.
> Your help is highly appreciated.
> Thanks in advance.
> -- Mousheng Xu