- To: <xml-dev@lists.xml.org>
- Subject: regular expression question
- From: "Paul Hermans" <paul.hermans@amplexor.com>
- Date: Fri, 26 Aug 2005 11:16:11 +0200
- Thread-index: AcWqHs0HSRAin46QQJS90gGDOhGTeQ==
- Thread-topic: regular expression question
<xsd:simpleType name="emailType">
<xsd:restriction base="xsd:string">
<xsd:pattern
value="[\p{L}_-]+(\.[\p{L}_-]+)*@[\p{L}_]+(\.[\p{L}_]+)+"/>
</xsd:restriction>
</xsd:simpleType>
The following tools do not throw an error: XML Spy, Stylus Studio, Oxygen.
Saxon8SA and IPSI-XQ, on the other hand, do.
If the definition is changed to
<xsd:simpleType name="emailType">
<xsd:restriction base="xsd:string">
<xsd:pattern
value="[\p{L}_\-]+(\.[\p{L}_\-]+)*@[\p{L}_]+(\.[\p{L}_]+)+"/>
</xsd:restriction>
</xsd:simpleType>
Saxon8SA and IPSI-XQ do not complain anymore.
I think the rationale is that the hyphen "-" has a special meaning within
square brackets (which define character classes) and therefore needs to be
escaped.
But to my surprise, the first (unescaped) regular expression is accepted by a
dedicated regular expression engine (RegExBuddy), which clearly indicates
that the hyphen is matched as the character itself.
The rationale here could be that, since no other character follows the
hyphen, it is not used to indicate a range in the character class, but
stands for itself.
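As a rough illustration of this second reading, here is a minimal Python
sketch. It says nothing about what XML Schema itself requires: Python's re
module is a different regex dialect, and since it does not support \p{L},
the range a-z is used as a stand-in (an assumption for illustration only,
not part of the schema above). In that dialect, a hyphen placed last in a
character class is taken literally, so the escaped and unescaped spellings
behave the same way.

```python
import re

# Assumption: [a-z] stands in for \p{L}, which Python's re does not support.
trailing = re.compile(r"[a-z_-]+")   # hyphen last in the class, unescaped
escaped = re.compile(r"[a-z_\-]+")   # hyphen explicitly escaped

sample = "jean-paul_x"  # hypothetical local part containing a hyphen

# Both patterns treat the hyphen as a literal character.
print(trailing.fullmatch(sample) is not None)  # True
print(escaped.fullmatch(sample) is not None)   # True

# A dot is not in the class, so neither pattern matches it.
print(trailing.fullmatch("a.b") is not None)   # False
```

Whether a schema processor should read the unescaped form the same way is
exactly the open question here.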
Which interpretation is the correct one?
Thanks,
Paul