
   RE: [xml-dev] progressive parsing of XML. Any technology out there?


If I understand you correctly, you want to stream-parse a potentially huge
XML document so as not to put it all in memory, but you want to identify
certain branches to be tree-parsed and/or validated, etc?

If you're not averse to using Perl, the XML::Twig module is designed to do
exactly that, and easily.

The site (http://www.xmltwig.com/) seems to be down right now but there's a
tutorial here: http://www.xml.com/pub/a/2001/03/21/xmltwig.html and module
documentation here: http://search.cpan.org/~mirod/XML-Twig-3.15/Twig.pm
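
To make the stream-plus-subtree idea concrete, here is a minimal sketch. It is not XML::Twig (whose API is not shown in this message); it swaps in Python's standard-library xml.etree.ElementTree.iterparse, which takes a similar approach: the document is read incrementally and only the branch of interest is kept as a tree. The function name read_header, the Token element, and the sample message are hypothetical, and the SOAP 1.1 envelope namespace is assumed.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the message uses the SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def read_header(stream):
    """Parse `stream` incrementally and return the Header subtree.

    Parsing stops as soon as the Header end tag is seen, so the
    (possibly huge) Body is never turned into a tree here.
    Returns None if the document contains no Header element.
    """
    header_tag = f"{{{SOAP_ENV}}}Header"
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == header_tag:
            # On an "end" event the element and its children are complete.
            return elem
    return None


# Hypothetical sample message; a real one would arrive over HTTP.
doc = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">'
    "<env:Header><Token>abc123</Token></env:Header>"
    "<env:Body><Payload>large content here</Payload></env:Body>"
    "</env:Envelope>"
)

header = read_header(io.BytesIO(doc.encode("utf-8")))
print(header.findtext("Token"))  # prints "abc123"
```

The returned Header is an ordinary element tree, so it can be searched or validated as usual. Note that the parser reads its input in chunks, so on return it may already have consumed part of the stream beyond the Header; continuing with the Body would mean continuing the same iteration rather than re-reading the stream.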

- Mark.

> -----Original Message-----
> From: Guillaume Lebleu [mailto:gl@brixlogic.com] 
> Sent: Wednesday, June 16, 2004 3:02 PM
> To: xml-dev@lists.xml.org
> Subject: [xml-dev] progressive parsing of XML. Any technology 
> out there?
> 
> 
> Hello,
> 
> Assuming I receive a very large XML document coming in 
> through HTTP, for instance, a big SOAP message that embeds 
> other XML documents and unstructured data in base64 encoded 
> values, and let's say I want to validate information in my 
> SOAP Header and return a response as fast as possible, but I 
> want to process (esp. validate) the rest of my XML document 
> asynchronously.
> 
> What I would like to do is control the branches of my tree I 
> want to parse and to what depth (hence the term "progressive 
> parsing"), and end up with some objects that point to 
> unparsed xml for the branches and depth I don't need for my 
> preliminary processing.
> 
> Of course, I want something that is pretty dynamic and does 
> not require low-level SAX coding from the application developer.
> 
> (Right now, the only way out there is to use things like 
> SOAP+Attachments, where you actually are not using XML so 
> that there is an explicit separation between multiple XML 
> documents that can be parsed separately, but this approach 
> then complicates the design when you have XML documents containing 
> pointers to other documents in the received MIME message that 
> need to be resolved, etc.).
> 
> Are there technologies out there to do this in a better way?
> 
> Thanks
> 
> Guillaume
> 
> 