[Bill de hÓra]
>And I don't understand this disdain for regular expressions over XML.
>Regexes are a perfectly useful tool for manipulating text.
Hi Bill,
I use regexps myself, I'd say about 30% of the time when processing XML.
It makes me nervous, though, and I try not to do it in any mission-critical context.
The trouble comes in having a degree of confidence in the correctness of
the regexps.
For example, on the face of it, using a regexp to catch occurrences of:
<name>Sean</name>
is simple. Not so, for many reasons. Writing regexps capable of getting
this right in the full generality of XML 1.0 is tantamount to writing a
full XML 1.0 WF parser.
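To make the point concrete, here is a rough sketch in Python comparing a naive regexp with a real parser. The sample documents and the helper names (`regex_finds`, `parser_finds`, `VARIANTS`) are made up for illustration; the list covers only a few of the ways XML 1.0 can spell the same element.

```python
import re
import xml.etree.ElementTree as ET

# The "obvious" pattern for <name>Sean</name>.
NAIVE = re.compile(r"<name>Sean</name>")

# Hypothetical inputs. All but the last contain a name element whose
# text is "Sean"; the last only mentions it inside a comment.
VARIANTS = [
    "<doc><name>Sean</name></doc>",               # plain form
    "<doc><name >Sean</name ></doc>",             # whitespace inside the tags
    "<doc><name>&#83;ean</name></doc>",           # character reference for "S"
    "<doc><name><![CDATA[Sean]]></name></doc>",   # CDATA section
    '<doc><name id="1">Sean</name></doc>',        # attribute on the start tag
    "<doc><!-- <name>Sean</name> --></doc>",      # comment, not an element
]

def regex_finds(doc):
    """True if the naive regexp matches anywhere in the raw text."""
    return bool(NAIVE.search(doc))

def parser_finds(doc):
    """True if a parsed name element has exactly the text 'Sean'."""
    root = ET.fromstring(doc)
    return any(el.text == "Sean" for el in root.iter("name"))

for doc in VARIANTS:
    print(regex_finds(doc), parser_finds(doc), doc)
```

The regexp matches only the plain form and the commented-out one, so it both misses real occurrences and reports a false one. Each case can be patched in the pattern, but the patches add up to a parser.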
The standard answer I get when I harp on about this is something
like "ah, but I know the XML I'm processing is machine generated and consistent
therefore...".
I always feel uneasy relying on the upstream XML supplier like this! It
introduces a degree of brittle coupling in systems that is best avoided
if possible.
I can only see two routes to making XML regexping as safe as it is convenient:
1) Make a profile of XML 1.0 *syntax* that is regexp safe (permathread anyone?)
2) Use a post-parse syntax for regexp work like PYX notation
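As a sketch of route 2, the following Python converts a document to PYX-style lines with expat and then runs a regexp over the result. It covers only elements, attributes and character data. The function names (`to_pyx`, `has_sean`), the escaping rules and the pattern are illustrative assumptions, not a complete PYX implementation.

```python
import re
import xml.parsers.expat

def _esc(s):
    # Keep each PYX record on one line (escaping scheme assumed here).
    return s.replace("\\", "\\\\").replace("\n", "\\n")

def to_pyx(xml_text):
    """Return PYX-style lines: '(' start tag, 'A' attribute,
    '-' character data, ')' end tag."""
    lines = []
    text = []  # character data may arrive in several chunks

    def flush():
        if text:
            lines.append("-" + _esc("".join(text)))
            text.clear()

    def start(name, attrs):
        flush()
        lines.append("(" + name)
        for key, value in attrs.items():
            lines.append("A" + key + " " + _esc(value))

    def end(name):
        flush()
        lines.append(")" + name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = text.append
    parser.Parse(xml_text, True)
    return "\n".join(lines)

# After parsing, the target has one spelling: a "(name" line, any number
# of attribute lines, a "-Sean" line, then ")name".
SEAN = re.compile(r"^\(name\n(?:A.*\n)*-Sean\n\)name$", re.MULTILINE)

def has_sean(xml_text):
    return bool(SEAN.search(to_pyx(xml_text)))

print(to_pyx('<doc><name id="1">&#83;ean</name></doc>'))
```

The parser has already resolved character references, CDATA sections, tag whitespace and comments by the time the regexp runs, so the pattern only has to deal with a line-oriented format.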
regards,
Sean
http://seanmcgrath.blogspot.com