- From: firstname.lastname@example.org
- To: XML-Dev Mailing list <email@example.com>
- Date: Wed, 09 Aug 2000 13:37:01 -0400
Simon St.Laurent asks:
> We have occasional battles here about the wisdom of using
> non-XML-parser-based tools to process XML, and regular expressions always
> seem to come up.
> I've got a reader question that sort of ties into a (non-regex-based) Java
> project I'm working on, about the viability of regex and other text-based
> processing for XML work.
> Has anyone written a generic XML parser, even a somewhat broken one,
> built on regular expressions? I remember hearing of something a long
> time ago, but I can't find it.
> I'm not concerned with the efficiency/viability/profitability/wisdom of
> such a solution, just whether or not it's been done - especially if it's
> available open source.
Yes, the Python parser in the standard module "xmllib" uses regular
expressions. It's not validating but it will process the DTD and insert
default values, and all that good stuff. Python is at
There's also the "shallow parsing" regular expressions at
REX: XML Shallow Parsing with Regular Expressions
Robert D. Cameron
School of Computing Science
Simon Fraser University
quote from the site:
"... this paper documents a set of XML shallow parsing expressions that can
be used as a basis for simple, correct, efficient, robust and
language-independent XML shallow parsing. Complete shallow parser
implementations of less than 50 lines each in Perl, JavaScript and Lex/Flex
are also given."
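
To give a feel for what "shallow parsing" means, here is a rough Python
sketch. It is not Cameron's REX expression, only a much-simplified stand-in
that I am assuming is close enough in spirit: one regular expression splits a
document into markup items and text items without checking nesting or
well-formedness. The names SHALLOW and shallow_parse are made up for this
illustration.

```python
import re

# Simplified shallow-parsing pattern (an illustration, NOT the REX expressions).
# Each alternative matches one kind of item; order matters, so comments,
# CDATA sections and processing instructions are tried before generic markup.
SHALLOW = re.compile(
    r"""
      (?P<comment><!--.*?-->)
    | (?P<cdata><!\[CDATA\[.*?\]\]>)
    | (?P<pi><\?.*?\?>)
    | (?P<markup><[^<>]*>)
    | (?P<text>[^<]+)
    """,
    re.VERBOSE | re.DOTALL,
)


def shallow_parse(xml):
    """Split xml into a flat list of (kind, item) pairs.

    No tree is built and nothing is validated. Known limitations of this
    simplified pattern: a '>' inside an attribute value ends the tag early,
    DOCTYPE internal subsets are not handled, and a stray '<' that starts
    no recognizable item is silently skipped.
    """
    return [(m.lastgroup, m.group()) for m in SHALLOW.finditer(xml)]


if __name__ == "__main__":
    for kind, item in shallow_parse('<a x="1">hi<!-- c --><b/></a>'):
        print(kind, repr(item))
```

For well-behaved input the items concatenate back to the original string,
which is what makes this kind of pass handy for light filtering or rewriting
jobs where building a full tree would be overkill.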
It includes an on-line demo. Regexes can be very useful, especially if they
are going to be processing your own XML, where you know something about its