RE: [Question] How to do incremental parsing?
- From: "Sterin, Ilya" <Isterin@ciber.com>
- To: "Xu, Mousheng (SEA)" <Mousheng.Xu@sea.celltechgroup.com>,email@example.com
- Date: Wed, 04 Jul 2001 13:31:53 -0400
> -----Original Message-----
> From: Xu, Mousheng (SEA) [mailto:Mousheng.Xu@sea.celltechgroup.com]
> Sent: Tuesday, July 03, 2001 8:27 PM
> To: 'firstname.lastname@example.org'
> Subject: [Question] How to do incremental parsing?
> Dear all,
> A problem of all the current XML parsers is that they at least read the
> whole XML document into the input stream, which can consume a lot of memory
> when the XML is big (e.g. 1 GB).
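Not true. Use a SAX parser. It reports the document as a stream of events
(start of element, character data, end of element) as it reads, so it never
needs to hold the whole document in memory.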
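To make the SAX point concrete, here is a minimal sketch using Python's standard xml.sax module as a stand-in (the thread itself is about Xerces, whose SAX API has the same shape). The ElementCounter class, the count_elements helper, and the sample document are made up for illustration. The handler keeps only a small dictionary of counts, not the document.

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts elements by name as they stream past."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing else about the element is kept.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(stream):
    """Parse a file-like object with SAX and return {element name: count}."""
    handler = ElementCounter()
    xml.sax.parse(stream, handler)  # the parser reads the stream in chunks
    return handler.counts


# Tiny in-memory sample; in practice this would be an open file.
sample = io.BytesIO(b"<db><rec id='1'/><rec id='2'/><note>x</note></db>")
counts = count_elements(sample)
print(counts)
```

The same handler should work on a very large file opened with open(path, "rb"), since the handler state does not grow with document size.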
> One way to get around the problem would be to read the XML file into memory
> gradually and when needed. I would like to build such a DOM parser, but I am
There is a module that does this for Perl. XML::Twig reads in only the
parts of the document that you specify and builds a DOM-like tree
representation of them, so you avoid loading the parts that you will not use.
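The idea of building a tree only for selected parts can be sketched as follows. This is not XML::Twig's API; it is an analogous approach using Python's standard xml.dom.pulldom module, which streams parse events and expands a chosen element into a DOM subtree on request. The iter_records helper, the element names, and the sample data are assumptions made for illustration.

```python
import io
from xml.dom import pulldom


def iter_records(stream, tag):
    """Yield a DOM subtree for each <tag> element, one at a time."""
    events = pulldom.parse(stream)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Build DOM nodes for this element and its children only;
            # elements that are never expanded get no subtree.
            events.expandNode(node)
            yield node


# Tiny in-memory sample; <skip> stands for content we do not care about.
sample = io.BytesIO(
    b"<db><skip>big</skip>"
    b"<rec id='1'><v>a</v></rec>"
    b"<rec id='2'><v>b</v></rec></db>"
)

ids = []
values = []
for rec in iter_records(sample, "rec"):
    ids.append(rec.getAttribute("id"))
    values.append(rec.getElementsByTagName("v")[0].firstChild.data)

print(ids, values)
```

Each yielded node supports the usual DOM calls, and once the loop moves on the caller can drop it, so only one record's subtree needs to be live at a time.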
> not familiar with the design of the Xerces XML parsers. Could someone give
> me a suggestion on how to tackle the problem? The most critical part
> would be the method to parse an element. If reading the whole document into
> memory is inevitable, then I would like to borrow the method which parses the
> input stream to get the next element.
> Your help is highly appreciated.
> Thanks in advance.
> -- Mousheng Xu
> The xml-dev list is sponsored by XML.org, an initiative of OASIS
> The list archives are at http://lists.xml.org/archives/xml-dev/