If I understand you correctly, you want to stream-parse a potentially huge
XML document so as not to put it all in memory, but you want to identify
certain branches to be tree-parsed and/or validated, etc?
If you're not averse to using Perl, the XML::Twig module is designed to do
exactly that, and easily.
The site (http://www.xmltwig.com/) seems to be down right now but there's a
tutorial here: http://www.xml.com/pub/a/2001/03/21/xmltwig.html and module
documentation here: http://search.cpan.org/~mirod/XML-Twig-3.15/Twig.pm
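
For anyone who wants to see the idea without installing anything, here is a rough sketch of the same pattern in Python, using the standard library's xml.etree.ElementTree.iterparse. To be clear, this is not XML::Twig and not its API, only an analogue of "stream the document, build a tree just for the branches you care about". The function name process_branches, the sample envelope, and the un-namespaced tag names are all assumptions made up for illustration; a real SOAP message would use namespace-qualified tags such as "{namespace-uri}Header".

```python
import io
import xml.etree.ElementTree as ET


def process_branches(source, wanted_tag, handler):
    """Stream-parse `source`; hand each complete `wanted_tag` subtree to `handler`.

    Elements outside a wanted branch are cleared as soon as they close, so
    their text and children are not kept in memory. Returns the number of
    branches passed to `handler`.
    """
    inside = 0   # nesting depth of wanted branches currently open
    handled = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if elem.tag == wanted_tag:
                inside += 1
            continue
        # event == "end": the element and all of its children are complete
        if elem.tag == wanted_tag:
            inside -= 1
            if inside == 0:
                handler(elem)   # full subtree is available here
                handled += 1
                elem.clear()    # release it once processed
        elif inside == 0:
            elem.clear()        # not part of a wanted branch: discard content
    return handled


# Hypothetical, un-namespaced stand-in for a SOAP-like message.
SAMPLE = b"""<Envelope>
  <Header><Auth user="alice"/></Header>
  <Body><Payload>aGVsbG8=</Payload><Payload>d29ybGQ=</Payload></Body>
</Envelope>"""

seen = []
process_branches(
    io.BytesIO(SAMPLE),
    "Header",
    lambda header: seen.append(header.find("Auth").get("user")),
)
print(seen)
```

The handler fires as soon as the closing tag of the branch has been read, so a header check can run before the rest of the body has arrived. Note that the cleared elements still leave empty shells attached to their parents, so a truly huge document would need those detached as well.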
> -----Original Message-----
> From: Guillaume Lebleu [mailto:email@example.com]
> Sent: Wednesday, June 16, 2004 3:02 PM
> To: firstname.lastname@example.org
> Subject: [xml-dev] progressive parsing of XML. Any technology
> out there?
> Assuming I receive a very large XML document coming in
> through HTTP, for instance, a big SOAP message that embeds
> other XML documents and unstructured data in base64 encoded
> values, and let's say I want to validate information in my
> SOAP Header and return a response as fast as possible, but I
> want to process (esp. validate) the rest of my XML document
> What I would like to do is control the branches of my tree I
> want to parse and to what depth (hence the term "progressive
> parsing"), and end up with some objects that point to
> unparsed xml for the branches and depth I don't need for my
> preliminary processing.
> Of course, I want something that is pretty dynamic and does
> not require low-level SAX coding from the application developer.
> (Right now, the only way out there is to use things like
> SOAP+Attachments, where you actually are not using XML so
> that there is an explicit separation between multiple XML
> documents that can be parsed separately, but this approach
> then makes a design when you have XML documents containing
> pointers to other documents in the MIME message received that
> needs to be resolved, etc.).
> Are there technologies out there to do this in a better way?