OASIS Mailing List Archives

Re: [xml-dev] regular expression on xml

On Tue, 2010-01-26 at 15:47 -0500, ycao5@scs.carleton.ca wrote:

>       In my xml application, I want to write a small parser which can 
> include useful parts in an xml document but ignore the rest. The  
> overhead of sax/dom parser is large. So is it reasonable to use  
> regular expressions to parse xml?

A SAX parser should not give you a large overhead, as it doesn't
build a tree (you do that yourself, if you need one) and doesn't
use much memory.
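
As a rough illustration of the SAX approach (a sketch, not code from
this thread), the following uses Python's standard xml.sax module to
keep the text of one element type and ignore the rest. The class name
TextCollector, the function extract_text and the element name "title"
are made up for the example.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collect the text of every element named `wanted`; ignore the rest.

    Hypothetical helper for illustration only.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0      # greater than 0 while inside a wanted element
        self.buf = []       # text chunks of the element being read
        self.results = []   # finished text values, in document order

    def startElement(self, name, attrs):
        if name == self.wanted:
            self.depth += 1
            if self.depth == 1:
                self.buf = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate.
        if self.depth:
            self.buf.append(content)

    def endElement(self, name):
        if name == self.wanted:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.buf))


def extract_text(xml_bytes, wanted):
    """Return the text content of each `wanted` element in xml_bytes."""
    handler = TextCollector(wanted)
    xml.sax.parseString(xml_bytes, handler)
    return handler.results
```

No tree is built here: the only state kept is the text of the element
currently being read plus the list of results, and CDATA sections and
comments are dealt with by the parser rather than by your own code.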

You can use regular expressions (and XML was defined with this in
mind) as long as you know there are no CDATA sections; otherwise
it tends to get too hairy too quickly to be useful.  (I am assuming
PCRE/Java-style extended regular expressions, of course.)  Commented-out
markup can cause problems too.
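
To make the CDATA and comment caveats concrete, here is a small
sketch using Python's re module. The pattern, the function name
titles_by_regex and the sample documents are invented for this
example; a real document may need a rather different pattern.

```python
import re

# Naive pattern: a <title> start tag, the shortest possible content,
# then the matching end tag. It knows nothing about comments or CDATA.
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title\s*>", re.DOTALL)


def titles_by_regex(text):
    """Return whatever the naive pattern takes to be title contents."""
    return TITLE_RE.findall(text)


# Plain markup: the pattern behaves as hoped.
simple = "<doc><title>A</title><title>B</title></doc>"

# Commented-out markup: the dead <title> is reported as if it were live.
commented = "<doc><!-- <title>old</title> --><title>new</title></doc>"

# CDATA section containing something that looks like an end tag:
# the match stops in the middle of the character data.
cdata = "<doc><title><![CDATA[x </title> y]]></title></doc>"

print(titles_by_regex(simple))     # ['A', 'B']
print(titles_by_regex(commented))  # ['old', 'new']
print(titles_by_regex(cdata))      # ['<![CDATA[x ']
```

If you control the input and know that neither construct occurs, the
first case is all you will ever see, and that is the situation in
which the regular expression approach is reasonable.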

But it is likely to be faster to use a proper native C XML parser.

A parser that builds a DOM is likely overkill if you are only using
a small fraction of the document.


Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org
