[Bill de hÓra]
>And I don't understand this disdain for regular expressions over XML.
>Regexes are a perfectly useful tool for manipulating text.
I use regexps myself - I'd say about 30% of the time when processing XML.
It makes me nervous, though, and I try not to do it in any mission-critical
context.
The trouble comes in having a degree of confidence in the correctness of
For example, on the face of it, using a regexp to catch occurrences of:
is simple. Not so, for many reasons. Writing regexps capable of getting
in the full generality of XML 1.0 is tantamount to writing a full XML 1.0
The standard answer I get when I harp on about this is something
like "ah, but I know the XML I'm processing is machine generated and consistent
I always feel uneasy relying on the upstream XML supplier like this! It
degree of brittle coupling in systems that is best avoided if possible.
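
To make the brittleness concrete, here is a minimal Python sketch. The element
name "item", the sample documents and the helper names by_regex/by_parser are
all hypothetical, made up for illustration. It compares a naive regexp with a
real parser on three well-formed documents that carry the same "item" content.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical element name "item", chosen only for illustration.
NAIVE = re.compile(r"<item>(.*?)</item>")

# Three well-formed XML 1.0 documents with equivalent "item" content.
docs = [
    "<r><item>a</item></r>",
    # Whitespace before the closing ">" of a tag is legal XML.
    "<r><item >a</item\n></r>",
    # A commented-out element, plus character data wrapped in CDATA.
    "<r><!-- <item>x</item> --><item><![CDATA[a]]></item></r>",
]

def by_regex(doc):
    """Return whatever the naive regexp captures between the tags."""
    return NAIVE.findall(doc)

def by_parser(doc):
    """Return the text of every item element as seen by a real parser."""
    return [el.text for el in ET.fromstring(doc).iter("item")]

for doc in docs:
    print(by_regex(doc), by_parser(doc))
```

The parser reports the same content for all three documents. The regexp
misses the second one entirely and, on the third, picks up the commented-out
element and returns the raw CDATA markup. It only holds up as long as the
upstream supplier never changes how it serializes.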
I can only see two routes to making XML regexping as safe as it is convenient:
1) Make a profile of XML 1.0 *syntax* that is regexp safe (permathread anyone?)
2) Use a post-parse syntax for regexp work like PYX notation
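
Route 2 can be sketched roughly as follows. This is an illustrative
approximation of PYX output, not a reference implementation: the function
name to_pyx, the sample document and the element name "item" are assumptions,
and the escaping rules are PYX as I understand it (newline, tab and backslash
escaped). A real parser (expat here) does the hard work, and the regexp only
ever sees one event per line.

```python
import re
import xml.parsers.expat

def to_pyx(xml_text):
    """Parse xml_text with expat and return a list of PYX-style lines."""
    lines = []

    def esc(s):
        # Keep every event on a single line.
        return s.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")

    def start(name, attrs):
        lines.append("(" + name)
        for key, value in attrs.items():
            lines.append("A" + key + " " + esc(value))

    def end(name):
        lines.append(")" + name)

    def chars(data):
        lines.append("-" + esc(data))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # coalesce adjacent character data
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return lines

# Hypothetical input with awkward but legal syntax.
doc = "<r><!-- <item>x</item> --><item  id = 'a1'>a</item\n></r>"
pyx = to_pyx(doc)

# After parsing, a very simple line-anchored regexp is enough.
item_starts = [line for line in pyx if re.match(r"^\(item$", line)]
print("\n".join(pyx))
```

Comments, odd whitespace and quoting styles have all been normalized away by
the time the regexp runs, so the pattern no longer depends on how the
upstream supplier happened to serialize the document.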