OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

 


 

   regular expressions

[ Lists Home | Date Index | Thread Index ]

Some schema languages use string regular expressions to check lexical space of 
attributes and character data. The regex strings often become uncomprehensible,
such as

(([a-zA-Z][0-9a-zA-Z+\-\.]*:)?/{0,2}[0-9a-zA-Z;/?:@&=+$\.\-_!~*'()%]+)?(#[0-9a-zA-Z;/?:@&=+$\.\-_!~*'()%]+)?

for any URI. 

Providing a structured syntax, similar to that for XML, would help reading and debugging
them, for example,

    s-pattern="""
      comment = "\(([^\(\)\\]|\\.)*\)"
      atom = "[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
      atoms = atom "(\." atom ")*"
      person = "\"([^\"\\]|\\.)*\""
      location = "\[([^\[\]\\]|\\.)*\]"
      local-part = "(" atoms "|" person ")"
      domain = "(" atoms "|" location ")"
      start = "(" comment " )?" local-part "@" domain "( " comment ")?"
    """

instead of 

    pattern=
      "(\(([^\(\)\\]|\\.)*\) )?"
    ~ """([a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+(\.[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+)*|"([^"\\]|\\.)*")"""
    ~ "@" 
    ~ "([a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+(\.[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+)*|\[([^\[\]\\]|\\.)*\])"
    ~ "( \(([^\(\)\\]|\\.)*\))?"

Why isn't it done?

David




 

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS