OASIS Mailing List Archives

   Re: [xml-dev] regular expressions


David Tolpin wrote:

 >>>    s-pattern="""
 >>>      comment = "\(([^\(\)\\]|\\.)*\)"
 >>>      atom = "[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
 >>>      atoms = atom "(\." atom ")*"
 >>>      [...]
 >>>
 >>>Why isn't it done?
 >>
 >>
 >>HyLex used a similar syntax for regular expressions.
 >>I've always wondered why the idea never caught on elsewhere.
 >>(Then again, none of the ideas from HyTime ever really
 >>caught on...)
 >
 >
 > In fact, I've implemented it in an extension datatype library for my
 > Relax NG validator; it is only 70 lines of code in Scheme, after all.
 > Proved to be very useful for debugging.

Very clever. But a naive implementation would just recursively 
concatenate the strings to make a single regex string. Could you 
elaborate on the debugging advantage, i.e., how it makes it easier for a 
schema writer to debug regular expressions?
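
To make the "naive implementation" concrete, here is a minimal sketch in 
Python (not the Scheme code mentioned above, which I have not seen). The 
rule representation, the RULES table and the expand() function are my own 
assumptions for illustration; only the atom and atoms definitions come 
from the quoted example.

```python
import re

# Hypothetical encoding of the quoted definitions: each rule is a list of
# parts, where ("lit", text) is a raw regex fragment and ("ref", name)
# refers to another named rule.
RULES = {
    "atom": [("lit", r"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+")],
    "atoms": [("ref", "atom"), ("lit", r"(\."), ("ref", "atom"), ("lit", r")*")],
}


def expand(name, rules, seen=()):
    """Recursively concatenate a named rule into one flat regex string."""
    if name in seen:
        # A flat regex cannot express a rule that refers to itself.
        raise ValueError(f"recursive reference to {name!r}")
    parts = []
    for kind, value in rules[name]:
        if kind == "ref":
            # Wrap the referenced rule so its operators stay self-contained.
            parts.append("(?:" + expand(value, rules, seen + (name,)) + ")")
        else:
            parts.append(value)
    return "".join(parts)


ATOMS = re.compile(expand("atoms", RULES))
```

With this approach a failed match reports nothing about which named rule 
was responsible, which is why I am asking where the debugging benefit 
comes from.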

Jeni Tennison used the same idea with a slightly different syntax in her 
DTLL proposal (I've lost the URL). Her idea had the added twist that an 
application could receive the results of the regular expression parse as 
a structured result, e.g., through a SAX API. Thus, using your example, 
the string "(David Tolpin)David.Tolpin@nospam.net" might produce the 
'infoset':

<start>
   <comment>(David Tolpin)</comment>
   <local-part>
     <atoms>
       <atom>David</atom>.<atom>Tolpin</atom>
     </atoms>
   </local-part>@<domain>
     <atoms>
       <atom>nospam</atom>.<atom>net</atom>
     </atoms>
   </domain>
</start>

This still seems a fruitful avenue to explore.
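
A rough sketch of the structured-result idea, again in Python. This is 
not DTLL's syntax or API. The top-level rule is elided in the quoted 
example, so the ADDRESS pattern (optional comment, local part, "@", 
domain) is an assumption, as are the names parse_address() and 
atoms_element(). It also builds an ElementTree rather than emitting SAX 
events, purely to keep the example short.

```python
import re
import xml.etree.ElementTree as ET

ATOM = r"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
ATOMS = rf"{ATOM}(?:\.{ATOM})*"
COMMENT = r"\((?:[^()\\]|\\.)*\)"

# Assumed top-level rule: optional comment, local part, "@", domain.
ADDRESS = re.compile(
    rf"(?P<comment>{COMMENT})?(?P<local>{ATOMS})@(?P<domain>{ATOMS})"
)


def atoms_element(text):
    """Wrap each dot-separated atom in <atom>, keeping the dots as text."""
    atoms = ET.Element("atoms")
    for part in text.split("."):
        ET.SubElement(atoms, "atom").text = part
    for atom in list(atoms)[:-1]:
        atom.tail = "."  # the separator follows every atom but the last
    return atoms


def parse_address(text):
    """Return a <start> element describing the match, or None if no match."""
    m = ADDRESS.fullmatch(text)
    if m is None:
        return None
    start = ET.Element("start")
    if m.group("comment") is not None:
        ET.SubElement(start, "comment").text = m.group("comment")
    local = ET.SubElement(start, "local-part")
    local.append(atoms_element(m.group("local")))
    local.tail = "@"
    domain = ET.SubElement(start, "domain")
    domain.append(atoms_element(m.group("domain")))
    return start


tree = parse_address("(David Tolpin)David.Tolpin@nospam.net")
print(ET.tostring(tree, encoding="unicode"))
```

Because Python's re module keeps only the last capture of a repeated 
group, the atoms are recovered here by a second pass over the matched 
text. An implementation that worked from the named rules directly could 
report them in a single pass.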

Bob Foster
http://xmlbuddy.com/





 


Copyright 2001 XML.org. This site is hosted by OASIS