OASIS Mailing List Archives

 



 


 

   Re: [xml-dev] Re: If XML is too hard for a programmer, perhaps he'd be b


Sean McGrath wrote:
> [Bill de hÓra]
>  >And I don't understand this disdain for regular expressions over XML.
>  >Regexes are a perfectly useful tool for manipulating text.
> 
> Hi Bill,
> 
> I use regexps myself, I'd say about 30% of the time when processing
> XML. It makes me nervous, though, and I try not to do it in any
> mission-critical context.
> 
> The trouble comes in having a degree of confidence in the correctness of 
> the regexps.

I think we're agreeing, but I'm looking at it the other way round:
you'd want to know that what you're looking for is regular and not,
say, context-free, rather than hope the regex doesn't produce false
positives and negatives. If you know it's not regular, or just don't
care to know, and reach for a regex anyway, that's willful engagement
in incompetence.
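
As a rough illustration of the regular versus context-free point, here is
a minimal Python sketch. The `<d>` element name, the sample string and both
helper functions are made up for the example; it assumes Python's standard
`re` module, which has no recursive patterns. A lazy regex pairs the first
opening tag with the first closing tag, which is wrong once elements nest,
whereas a simple depth counter gets the pairing right.

```python
import re

# Hypothetical nested input; <d> is an invented element name.
NESTED = "<d>a<d>b</d>c</d>"

# Lazy match from the first <d> to the first </d> that follows it.
PATTERN = re.compile(r"<d>(.*?)</d>")


def first_regex_match(text):
    """Return what the regex believes is the first <d> element."""
    m = PATTERN.search(text)
    return m.group(0) if m else None


def first_balanced_match(text):
    """Return the first <d> element, tracking nesting depth by hand."""
    start = text.find("<d>")
    if start < 0:
        return None
    depth = 0
    i = start
    while i < len(text):
        if text.startswith("<d>", i):
            depth += 1
            i += len("<d>")
        elif text.startswith("</d>", i):
            depth -= 1
            i += len("</d>")
            if depth == 0:
                return text[start:i]
        else:
            i += 1
    return None  # unbalanced input


if __name__ == "__main__":
    print(first_regex_match(NESTED))     # stops at the inner </d>
    print(first_balanced_match(NESTED))  # the whole outer element
```

The counter is not an XML parser either; it only shows why nesting needs
some notion of depth that a plain regular expression does not have.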


> The standard answer I get when I harp on about this is something
> like "ah, but I know the XML I'm processing is machine generated and 
> consistent therefore...".

For regexing XML though I'm really talking about little 
admin/console jobs and sed scripts over the likes of config files 
rather than something sitting in front of a data stream (where Son 
of Regex, XPath, can do nicely).
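
For the data-stream side of that split, a small sketch of the XPath route
in Python. It assumes the standard library's `xml.etree.ElementTree`, which
supports only a limited XPath subset; the config layout, the `server`
element and the `port_for` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical config document, not taken from any real system.
CONFIG = (
    '<config>'
    '<server name="web" port="8080"/>'
    '<server name="db" port="5432"/>'
    '</config>'
)


def port_for(xml_text, name):
    """Look up a server's port by name using an XPath-style predicate."""
    root = ET.fromstring(xml_text)
    node = root.find("./server[@name='%s']" % name)
    return node.get("port") if node is not None else None


if __name__ == "__main__":
    print(port_for(CONFIG, "db"))
```

The lookup does not care about attribute order, quoting style or
whitespace inside the tag, which is where a sed one-liner tends to go
wrong.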

One tempting exception might be for templating languages with what 
you might call 'magic tags' that get expanded. So instead of using

  $Revision

you end up with:

  <magic:Revision/>

This works so long as the same system both produces and consumes the
tags, with nothing in the middle (often the case with templating).
Once another system is inserted in between and rewrites the tag like
this:

  <magic:Revision>
  </magic:Revision>

you're stuffed, or quickly refactoring to

  <magic:Revision value="$Revision"/>.

But if you hate attributes that's ok, there is an industrial 
strength, time-honoured option. If in the grand tradition we simply 
add,

  <!-- =============================================
       (DPH 2003-04-01) This Can't Happen:
         <magic:Revision>
         </magic:Revision>
       =============================================  -->
  <magic:Revision/>

we're all set... ;)
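
To make the 'magic tag' failure concrete, a minimal Python sketch. The
namespace URI `urn:example:magic`, the sample documents and both counting
functions are assumptions made for the example. The regex only knows the
exact self-closing spelling, so it misses the tag once an intermediate
system has expanded it; a namespace-aware parser sees the same element
either way.

```python
import re
import xml.etree.ElementTree as ET

MAGIC_NS = "urn:example:magic"  # hypothetical namespace URI

# The form the template author writes.
SHORT = '<doc xmlns:magic="%s">rev: <magic:Revision/></doc>' % MAGIC_NS

# The same element after some other system has re-serialised it.
EXPANDED = (
    '<doc xmlns:magic="%s">rev: <magic:Revision>\n'
    '</magic:Revision></doc>' % MAGIC_NS
)

# Matches only the literal self-closing spelling.
NAIVE = re.compile(r"<magic:Revision/>")


def count_naive(text):
    """Count magic tags by literal text match."""
    return len(NAIVE.findall(text))


def count_parsed(text):
    """Count magic tags by parsing and comparing qualified names."""
    root = ET.fromstring(text)
    return sum(1 for _ in root.iter("{%s}Revision" % MAGIC_NS))


if __name__ == "__main__":
    for label, doc in (("short", SHORT), ("expanded", EXPANDED)):
        print(label, count_naive(doc), count_parsed(doc))
```

Loosening the regex to accept the expanded form is possible, but it then
has to track every other spelling a serialiser may legally emit.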

Bill






 
