OASIS Mailing List Archives

   Re: XML Schemas: Regular Expression Question

  • From: tpassin@home.com
  • To: "Roger L. Costello" <costello@mitre.org>, xml-dev@lists.xml.org,xerces-j-dev@xml.apache.org
  • Date: Sun, 03 Sep 2000 18:35:54 -0400

 Roger L. Costello asked -

> Consider this regular expression:
>
> (.)+\.(gif|jpg|jpeg|bmp)
>
> As I interpret this regular expression it says, "one or more occurrences
> of any character, followed by a dot, followed by either gif or jpg or
> jpeg or bmp".  Correct?
>
> Here's my question - why is it that two schema validators (Oracle and
> xerces 1.2) both accept the following strings:
>
> images\mighty_oj.gif
> images\omega.jpg
> images\wheateena.jpg
>
> but reject these strings:
>
> images\champion.gif
> images\greenPower.jpg
> images\juiceman.jpg
>
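I'm not sure whether it works the same way in XML Schemas, but try this version:

(.+)\.(gif|jpg|jpeg|bmp)

In Python, the original expression still matches the whole string, but its
first group captures only the last character of the name, because a repeated
group such as (.)+ keeps only its final repetition. The extension is returned
as well, since it is also in parentheses. The new version captures the whole
name, at least in Python: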

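import re

# Original version: the repeated group (.)+ keeps only its last repetition
patre = r"(.)+\.(gif|jpg|jpeg|bmp)"
pat = re.compile(patre)

print(pat.findall(r'images\champion.gif'))
# prints [('n', 'gif')]

# New version: the quantifier is inside the group, so the whole name is captured
patre = r"(.+)\.(gif|jpg|jpeg|bmp)"
pat = re.compile(patre)

print(pat.findall(r'images\champion.gif'))
# prints [('images\\champion', 'gif')]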
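To compare the two versions against all six file names from the question, something like the sketch below could be used. It relies on re.fullmatch because an XML Schema pattern has to match the entire value; that is only an approximation, since Python's regular expression dialect is not identical to the XML Schema one. The names NAMES, ORIGINAL, REVISED and check are made up for this illustration. In Python both patterns accept all six names and only the captured groups differ, so this does not reproduce the rejections the validators reported.

```python
import re

# The six file names from the question (raw strings keep the backslash literal).
NAMES = [
    r"images\mighty_oj.gif",
    r"images\omega.jpg",
    r"images\wheateena.jpg",
    r"images\champion.gif",
    r"images\greenPower.jpg",
    r"images\juiceman.jpg",
]

ORIGINAL = re.compile(r"(.)+\.(gif|jpg|jpeg|bmp)")
REVISED = re.compile(r"(.+)\.(gif|jpg|jpeg|bmp)")


def check(pattern, names):
    """Map each name to its captured groups, or None if the whole string fails to match."""
    results = {}
    for name in names:
        m = pattern.fullmatch(name)
        results[name] = m.groups() if m else None
    return results


original_results = check(ORIGINAL, NAMES)
revised_results = check(REVISED, NAMES)

# Print each name with the groups captured by the original and revised patterns.
for name in NAMES:
    print(name, original_results[name], revised_results[name])
```

A None in either column would mean that Python itself rejects that name under the corresponding pattern.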

Cheers,

Tom Passin





 

Copyright 2001 XML.org. This site is hosted by OASIS