- From: tpassin@home.com
- To: "Roger L. Costello" <costello@mitre.org>, xml-dev@lists.xml.org, xerces-j-dev@xml.apache.org
- Date: Sun, 03 Sep 2000 18:35:54 -0400
Roger L. Costello asked -
> Consider this regular expression:
>
> (.)+\.(gif|jpg|jpeg|bmp)
>
> As I interpret this regular expression it says, "one or more occurrences
> of any character, followed by a dot, followed by either gif or jpg or
> jpeg or bmp". Correct?
>
> Here's my question - why is it that two schema validators (Oracle and
> xerces 1.2) both accept the following strings:
>
> images\mighty_oj.gif
> images\omega.jpg
> images\wheateena.jpg
>
> but reject these strings:
>
> images\champion.gif
> images\greenPower.jpg
> images\juiceman.jpg
>
I'm not sure whether it works the same way in XML Schema, but try this version:
(.+)\.(gif|jpg|jpeg|bmp)
In Python, both expressions match the whole string. In the original, though,
the group (.) is itself repeated, so it only captures the last character of
the name (the extension is returned too, since it is also in parentheses).
The new version captures the whole name, at least in Python:
import re
# Original version: the one-character group (.) is repeated
patre=r"(.)+\.(gif|jpg|jpeg|bmp)"
pat=re.compile(patre)
print(pat.findall(r'images\champion.gif'))
# prints: [('n', 'gif')]
# New version: the group wraps the whole repetition
patre=r"(.+)\.(gif|jpg|jpeg|bmp)"
pat=re.compile(patre)
print(pat.findall(r'images\champion.gif'))
# prints: [('images\\champion', 'gif')]
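For what it's worth, here is a rough sketch that runs all six names from the question through both expressions. It assumes Python 3.4 or later for fullmatch(), which I am using as a stand-in for "the pattern must match the whole value"; whether that is how the schema validators apply the pattern is my assumption. The names NAMES, ORIGINAL, REVISED and check are just made up for this example. It only shows what Python's regex engine does, not what Oracle or xerces do.

```python
import re

# The six file names from the question (raw strings keep the backslash literal).
NAMES = [
    r"images\mighty_oj.gif",
    r"images\omega.jpg",
    r"images\wheateena.jpg",
    r"images\champion.gif",
    r"images\greenPower.jpg",
    r"images\juiceman.jpg",
]

ORIGINAL = re.compile(r"(.)+\.(gif|jpg|jpeg|bmp)")
REVISED = re.compile(r"(.+)\.(gif|jpg|jpeg|bmp)")


def check(pattern, names):
    """Map each name to its captured groups, or None if the whole string does not match."""
    results = {}
    for name in names:
        m = pattern.fullmatch(name)  # whole-string match (assumed stand-in for schema behavior)
        results[name] = m.groups() if m else None
    return results


original_results = check(ORIGINAL, NAMES)
revised_results = check(REVISED, NAMES)

for name in NAMES:
    print(name, original_results[name], revised_results[name])
```

In Python, at least, neither expression rejects any of the six names; the two differ only in what the first group captures.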
Cheers,
Tom Passin