Re: [xml-dev] A better way to construct regular expressions in XMLSchemas?
- From: "Liam R. E. Quin" <liam@w3.org>
- To: "Costello, Roger L." <costello@mitre.org>, "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Tue, 21 Nov 2017 14:51:42 -0500
On Mon, 2017-11-20 at 12:35 +0000, Costello, Roger L. wrote:
> Hi Folks,
>
> I have an XML Schema that needs some complex regular expressions. I
> have been using <!ENTITY> to construct the regexes. See below. I find
> it pretty hard to debug these regular expressions. Is there a better
> way to construct regexes?
I'd probably use fewer of them, and I might test them in a standalone
program (as long as one doesn't rely on any of the very minor differences
between XSD and the more widely used Perl expressions), e.g.
#! /usr/bin/perl -w
my $LanguageTag = "(${langtag}) | (${privateuse}) | (${grandfathered})";
and so on (in reverse order so everything is defined before it's used)
and then,
while (<>) {
    chomp; # remove the newline at the end of the input line
    # XSD patterns must match the whole value, so anchor the Perl match
    if (m/\A(?:${LanguageTag})\z/x) {
        print "OK: $_\n";
    } else {
        print "unmatched: $_\n";
    }
}
Then, if things that I expected to match don't, I'd try matching against
the individual components of the expression.
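
As a rough sketch of that component-by-component check: the three
sub-patterns below are hypothetical, heavily simplified stand-ins (they
are not the real language-tag grammar), and matching_components() is a
made-up helper name. Each component is tried on its own, anchored with
\A and \z because an XSD pattern has to match the whole value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, simplified stand-ins for the real sub-patterns.
# They are written for the /x flag, so whitespace inside them is ignored.
my %component = (
    langtag       => '[a-zA-Z]{2,3} (?: - [a-zA-Z0-9]{2,8} )*',
    privateuse    => 'x (?: - [a-zA-Z0-9]{1,8} )+',
    grandfathered => 'i-klingon | i-default',
);

# Return the names of every component that matches the whole input.
sub matching_components {
    my ($input) = @_;
    return grep {
        my $re = $component{$_};
        $input =~ m/\A(?:$re)\z/x;
    } sort keys %component;
}

# Report, for each test string, which component (if any) accepted it.
for my $input ('en-US', 'x-private', 'i-klingon', 'not a tag') {
    my @hits = matching_components($input);
    printf "%-10s %s\n", $input, @hits ? "matched by: @hits" : 'unmatched';
}
```

If a string that should be valid comes back unmatched, the component
it was expected to hit is the one to look at first.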
If that sort of scripting isn't comfortable for you, though, you could
make an XML document with lots of test cases, one for each branch of
the regex. The validator might stop on the first error, which is less
than helpful at times :) so in that case perhaps use XSLT to split it
into lots of separate test case documents.
You could also test with XSLT or XQuery or standalone XPath 2 or later,
with replace(), e.g. replace($input, $pattern, "[1=$1,2=$2,3=$3]") to
see which ()-group matched. Watch that XSD patterns have different
meanings for \-escaped "special" thingies like \i, \a etc.
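
One way to cope with those escapes when testing in Perl might be to
rewrite them first. The sketch below is only an approximation under
stated assumptions: it handles just \i and \c, replaces them with
ASCII-only character classes (the real XSD definitions cover many more
Unicode characters), and does not cope with \I, \C, or with \i or \c
used inside a [...] class. The helper name xsd_to_perl() is made up
for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASCII-only approximations of the XSD name-character escapes.
my %xsd_escape = (
    'i' => '[A-Za-z_:]',          # \i: initial name character
    'c' => '[A-Za-z0-9_:.\-]',    # \c: any name character
);

# Translate an XSD pattern into a compiled, anchored Perl regex.
sub xsd_to_perl {
    my ($xsd) = @_;
    my $perl = $xsd;
    # Walk the backslash pairs left to right, so an escaped backslash
    # followed by "i" is left alone rather than treated as \i.
    $perl =~ s/\\(.)/exists $xsd_escape{$1} ? $xsd_escape{$1} : "\\$1"/ges;
    # XSD patterns must match the whole value, hence \A and \z.
    return qr/\A(?:$perl)\z/;
}

my $name = xsd_to_perl('\i\c*');
for my $input ('xml:lang', '1abc') {
    print "$input: ", ($input =~ $name ? "OK" : "unmatched"), "\n";
}
```

Escapes that are not in the table, such as \d or \., pass through
unchanged, so any other place where the two dialects differ still
needs checking by hand.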
Finally, be careful about baking in a specific version of a spec into
software if it's not necessary - consider a future revision of the RFC
that adds something your regexp doesn't match...
Liam
--
Liam Quin, W3C, http://www.w3.org/People/Quin/
Staff contact for Verifiable Claims WG, SVG WG, XQuery WG
Web slave for http://www.fromoldbooks.org/