RE: [Question] How to do incremental parsing?




Hi:

We currently read XML documents incrementally using the MSXML SAX parser.
You'll have to use the Suspend() and Resume() methods of the IMXReaderControl
interface. However, the MSDN documentation warns: "Be aware that this
interface is non-standard."
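
For readers without MSXML, the same idea of letting the caller set the pace can be sketched in Python. This is not the IMXReaderControl API; it is a rough analogue that uses the feed()/close() methods of the standard library's incremental SAX parser, where parsing advances only when the caller hands over the next chunk. The names TitleCollector and parse_in_chunks, and the <title> element, are made up for illustration.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means we are not inside a <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


def parse_in_chunks(chunks, handler):
    """Feed byte chunks to the parser one at a time.

    The caller decides when the next chunk is supplied, so parsing can
    be paused between chunks simply by not calling feed() yet.
    """
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()  # signals end of input
    return handler
```

The chunks can come from a file read in fixed-size blocks or from a socket; only the current chunk and whatever the handler chooses to keep stay in memory.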

Hope this helps.

Thanks,

Ajay


-----Original Message-----
From: xml-dev-errors@lists.xml.org [mailto:xml-dev-errors@lists.xml.org] On Behalf Of Sterin, Ilya
Sent: Wednesday, July 04, 2001 11:02 PM
To: Xu, Mousheng (SEA); xml-dev@lists.xml.org
Subject: RE: [Question] How to do incremental parsing?




> -----Original Message-----
> From: Xu, Mousheng (SEA) [mailto:Mousheng.Xu@sea.celltechgroup.com]
> Sent: Tuesday, July 03, 2001 8:27 PM
> To: 'xml-dev@lists.xml.org'
> Subject: [Question] How to do incremental parsing?
>
>
> Dear all,
>
> A problem of all the current XML parsers is that they at least read the
> whole XML document into the input stream, which can consume a lot of memory
> when the XML is big (e.g. 1 GB).

Not true.  Use a SAX parser; it reports the document as a stream of events
instead of loading the whole thing.
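
As a minimal sketch of what that looks like (Python's xml.sax here, chosen only for illustration; the thread itself is about Xerces and MSXML), the handler below tallies element names while the parser reads from a stream. The names ElementCounter and count_elements are hypothetical.

```python
import io
import xml.sax
from collections import Counter


class ElementCounter(xml.sax.ContentHandler):
    """Counts start tags by name; keeps no part of the document itself."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElement(self, name, attrs):
        self.counts[name] += 1


def count_elements(stream):
    """Parse a file-like object and return a Counter of element names."""
    handler = ElementCounter()
    xml.sax.parse(stream, handler)
    return handler.counts


if __name__ == "__main__":
    sample = io.BytesIO(b"<a><b/><b/><c/></a>")
    print(count_elements(sample))
```

Memory use depends on what the handler stores, not on the size of the input, which is the point of the reply above.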


>
> One way to get around the problem would be to read the XML file into memory
> gradually and when needed. I would like to build such a DOM parser, but I am
There is a module that does this for Perl.  XML::Twig reads in only the
parts of the document that you specify and builds a DOM-style tree of
them, so you avoid loading the parts that you will not use.
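
A rough Python analogue of that approach, offered as a sketch and not as XML::Twig's actual behaviour: xml.etree.ElementTree.iterparse hands back each element once it is complete, and the caller clears it after use. Unlike XML::Twig, this still builds every element briefly and leaves cleared elements attached to their parent as empty shells. It also assumes the elements of interest do not nest inside each other. The function name iter_subtrees and the sample tag names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_subtrees(stream, tag):
    """Yield each complete <tag> element, then discard its contents."""
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            yield elem    # caller sees a small, fully built subtree
            elem.clear()  # drop its children, text and attributes


if __name__ == "__main__":
    doc = io.BytesIO(
        b"<log><entry id='1'><msg>x</msg></entry>"
        b"<skip/><entry id='2'><msg>y</msg></entry></log>"
    )
    for entry in iter_subtrees(doc, "entry"):
        print(entry.get("id"), entry.findtext("msg"))
```

Each subtree must be used inside the loop body, because it is emptied as soon as the generator resumes.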

Ilya



> not familiar with the design of the Xerces XML parsers. Could someone give
> me a suggestion on how to tackle the problem? The most critical part
> would be the method to parse an element. If reading the whole document into
> memory is inevitable, then I would like to borrow the method which parses the
> input stream to get the next element.
>
> Your help is highly appreciated.
>
> Thanks in advance.
>
> -- Mousheng Xu
>
> ------------------------------------------------------------------
> The xml-dev list is sponsored by XML.org, an initiative of OASIS
> <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To unsubscribe from this elist send a message with the single word
> "unsubscribe" in the body to: xml-dev-request@lists.xml.org
