   Re: parsing XML using regular expressions

  • From: tpassin@home.com
  • To: XML-Dev Mailing list <xml-dev@xml.org>
  • Date: Wed, 09 Aug 2000 13:37:01 -0400

Simon St.Laurent asks:

> We have occasional battles here about the wisdom of using
> non-XML-parser-based tools to process XML, and regular expressions always
> seem to come up.
>
> I've got a reader question that sort of ties into a (non-regex-based) Java
> project I'm working on, about the viability of regex and other text-based
> processing for XML work.
>
> Has anyone written a generic XML parser, even a somewhat broken one, that's
> built on regular expressions?  I remember hearing of something a long while
> ago, but I can't find it.
>
> I'm not concerned with the efficiency/viability/profitability/wisdom of
> such a solution, just whether or not it's been done - especially if it's
> available open source.
>

Yes, the Python parser in the standard module "xmllib" uses regular
expressions.  It's not validating, but it will process the DTD, insert
default values, and all that good stuff.  Python is available at
http://www.python.org.

There are also the "shallow parsing" regular expressions at
ftp://fas.sfu.ca/pub/cs/TR/1998/CMPT1998-17.html:

REX: XML Shallow Parsing with Regular Expressions
Robert D. Cameron
School of Computing Science
Simon Fraser University

This work provides Python, Perl, JavaScript, and lex/flex regexes for XML.  To
quote from the site:

"... this paper documents a set of XML shallow parsing expressions that can
be used as a basis for simple, correct, efficient, robust and
language-independent XML shallow parsing. Complete shallow parser
implementations of less than 50 lines each in Perl, JavaScript and Lex/Flex
are also given."
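
To give a feel for what "shallow parsing" means, here is a minimal Python
sketch.  It is NOT the REX expression set from Cameron's paper; the pattern
and the names TOKEN_RE and shallow_parse are my own simplified assumptions.
It only splits a document into markup and text pieces and does not check
well-formedness.

```python
import re

# Simplified illustration only; these are not the REX expressions.
# Alternation order matters: comments, CDATA sections and processing
# instructions are tried before the generic tag pattern.
TOKEN_RE = re.compile(
    r"""
    (?P<comment><!--.*?-->)
  | (?P<cdata><!\[CDATA\[.*?\]\]>)
  | (?P<pi><\?.*?\?>)
  | (?P<tag></?[^<>]+>)
  | (?P<text>[^<]+)
    """,
    re.VERBOSE | re.DOTALL,
)


def shallow_parse(xml):
    """Split xml into (kind, string) pairs, in document order."""
    # m.lastgroup is the name of the alternative that matched.
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(xml)]


if __name__ == "__main__":
    for kind, piece in shallow_parse('<a x="1">hi<!-- c --><b/></a>'):
        print(kind, repr(piece))
```

This sketch will stumble on things like a ">" inside an attribute value or a
DOCTYPE with an internal subset, and it silently skips a stray "<".  For
anything beyond a quick illustration, see Cameron's paper, which is written to
handle such cases.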

The site includes an online demo.  Regexes can be very useful, especially for
processing your own XML, where you know something about its structure
beforehand.
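
As a rough example of the "your own XML" case, suppose you know your data
consists of flat <item id="..."> elements holding plain text, with no nesting,
no entities and double-quoted attributes.  That format, and the names ITEM_RE
and extract_items, are hypothetical and chosen only for illustration.  Under
those assumptions a single expression is enough:

```python
import re

# Assumes a known, flat format: <item id="...">text</item>, with no
# nested elements, no entity references and double-quoted attributes.
ITEM_RE = re.compile(r'<item\s+id="([^"]*)"\s*>([^<]*)</item>')


def extract_items(xml):
    """Return a dict mapping each item id to its stripped text content."""
    return {item_id: text.strip() for item_id, text in ITEM_RE.findall(xml)}


if __name__ == "__main__":
    doc = '<list><item id="a">Apple</item>\n<item id="b"> Pear </item></list>'
    print(extract_items(doc))
```

If the data can ever come from somewhere else, or the format might grow
nesting or entities, a real parser is the safer choice.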

Regards,

Tom Passin





 
