XML.org
Re: [xml-dev] A better way to construct regular expressions in XMLSchemas?

On Mon, 2017-11-20 at 12:35 +0000, Costello, Roger L. wrote:
> Hi Folks,
> 
> I have an XML Schema that needs some complex regular expressions. I
> have been using <!ENTITY> to construct the regexes. See below. I find
> it pretty hard to debug these regular expressions. Is there a better
> way to construct regexes?

I'd probably use fewer of them, and I might test them in a standalone
program (as long as one doesn't rely on any of the very minor differences
between XSD and the more widely used Perl expressions), e.g.

#! /usr/bin/perl -w

my $LanguageTag = "(${langtag}) | (${privateuse}) | (${grandfathered})";

and so on (in reverse order so everything is defined before it's used)
and then,

while (<>) {
  chomp; # remove the trailing newline from the input line
  if (m/${LanguageTag}/x) {
    print "OK: $_\n";
  } else {
    print "unmatched: $_\n";
  }
}

Then if things that I expected to match don't, I'd try matching against
the individual components of the expression.
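
A minimal sketch of that component-by-component check, in the same Perl
style. The three component patterns here are simplified, hypothetical
stand-ins (the real language-tag grammar is much longer), and
matching_components is just an illustrative helper name. Each pattern is
wrapped in \A(?:...)\z because an XSD pattern has to match the whole
value, whereas a bare Perl m// is satisfied by any matching substring.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simplified, hypothetical stand-ins for the real components.
my %component = (
    langtag       => qr/[a-z]{2,3} (?: - [A-Za-z0-9]{2,8} )*/x,
    privateuse    => qr/x (?: - [A-Za-z0-9]{1,8} )+/x,
    grandfathered => qr/i-klingon | i-default/x,
);

# Return the names of every component that matches the whole value.
sub matching_components {
    my ($value) = @_;
    return grep { $value =~ /\A(?:$component{$_})\z/ } sort keys %component;
}

# Try a few sample values and report which component accepted each one.
for my $value ("en-US", "x-private", "i-klingon", "en-US!") {
    my @hits = matching_components($value);
    print "$value: ", (@hits ? join(", ", @hits) : "no component matched"), "\n";
}
```

A value that no component accepts, or that more than one accepts, points
at the part of the combined expression worth looking at first.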

If that sort of scripting isn't comfortable for you, though, you could
make an XML document with lots of test cases, one for each branch of
the regex. The validator might stop on the first error, though, which is
less than helpful at times :) so in that case perhaps use XSLT to split
it into lots of separate test case documents.
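
As a rough illustration of the one-document-per-test-case idea, here is a
sketch that builds the small documents directly in Perl rather than
splitting a larger one with XSLT. The element name languageTag, the file
naming scheme and the helper names are all assumptions made for the
example; a real test document would use whatever element your schema
constrains.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build one tiny XML document per test value, keyed by a file name.
# "languageTag" is a hypothetical element name.
sub test_documents {
    my @values = @_;
    my %doc;
    my $n = 0;
    for my $value (@values) {
        my $name = sprintf "case-%03d.xml", ++$n;
        $doc{$name} = qq{<?xml version="1.0"?>\n<languageTag>}
                    . xml_escape($value)
                    . qq{</languageTag>\n};
    }
    return %doc;
}

# Print each generated document; writing them to files is left out here.
my %doc = test_documents("en-US", "x-private", "not a tag");
for my $name (sort keys %doc) {
    print "== $name ==\n$doc{$name}";
}
```

Each document can then be validated on its own, so one failing case
doesn't hide the others.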

You could also test with XSLT, XQuery, or standalone XPath 2.0 or later,
using replace(), e.g. replace($input, $pattern, "[1=$1,2=$2,3=$3]"), to
see which ()-group matched. Watch out: XSD patterns give different
meanings to \-escaped "special" thingies like \i, \a etc.
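
The same "which group matched" report can be approximated in Perl, for
anyone testing there instead of in XPath. This is only an analogue of the
replace() trick, not XPath itself: describe_groups is a made-up helper
name and the sample pattern is a simplified stand-in. Groups that did not
take part in the match are shown as empty, which is also what replace()
substitutes for them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Match $input against the whole of $pattern (with /x, so whitespace in
# the pattern is ignored) and report the text captured by each group.
sub describe_groups {
    my ($input, $pattern) = @_;
    return "no match" unless $input =~ /\A(?:$pattern)\z/x;
    my @parts;
    # $#+ is the number of capture groups in the last successful match;
    # $-[$n] is undefined for a group that did not participate.
    for my $n (1 .. $#+) {
        my $text = defined $-[$n]
            ? substr($input, $-[$n], $+[$n] - $-[$n])
            : "";
        push @parts, "$n=$text";
    }
    return "[" . join(",", @parts) . "]";
}

# Simplified, hypothetical pattern with one group per branch.
my $pattern = '([a-z]{2,3}(?:-[A-Za-z0-9]{2,8})*) | (x(?:-[A-Za-z0-9]{1,8})+) | (i-klingon)';
for my $input ("en-US", "x-private", "i-klingon") {
    print "$input => ", describe_groups($input, $pattern), "\n";
}
```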

Finally, be careful about baking in a specific version of a spec into
software if it's not necessary - consider a future revision of the RFC
that adds something your regexp doesn't match...

Liam

-- 
Liam Quin, W3C, http://www.w3.org/People/Quin/
Staff contact for Verifiable Claims WG, SVG WG, XQuery WG

Web slave for http://www.fromoldbooks.org/


Copyright 1993-2007 XML.org. This site is hosted by OASIS