RE: simpleType hh:mm:ss:ff
- From: Robin Cover <robin@isogen.com>
- To: Bob Kline <bkline@rksystems.com>
- Date: Mon, 22 Jan 2001 17:19:17 -0600 (CST)
I think a regex pattern for dates has been published on
the Net before, but I forget where... perhaps an online
paper by Michael Sperberg-McQueen. A possible hint:
Sperberg-McQueen, C. Michael. "Regular expressions for dates." [SQUIB]
Markup Languages: Theory & Practice 1/4 (Fall 1999) 20-26. ISSN:
1099-6622 [MIT Press]. Author's affiliation: World Wide Web
Consortium/MIT Laboratory for Computer Science; Email: cmsmcq@acm.org.
"[One may hear:] 'Validation of dates, therefore, must (so goes the
argument) be left to application-specific code.' While I agree that
date checking is probably not best done using SGML content models, I
feel compelled to point out that the claim just paraphrased, as
stated, is false. It is possible to write a regular expression which
recognizes dates. At the Markup Technologies '98 conference, I
exhibited a deterministic finite-state automaton for recognizing
Gregorian dates in the form yyyy-mm-dd (as specified by ISO 8601),
which can be represented as lex code thus... The editors challenge our
readers to find shorter expressions for recognizing dates, which
accepts all valid dates, and only valid dates; in particular, the
expression should accept a 29th day in February only in leap years,
using the Gregorian rules for leap years. In particular, we are
interested in (a) the shortest regular expression and (b) the shortest
such expression which is unambiguous in the sense of SGML (or,
synonymously, deterministic in the sense of the XML
specification). For concreteness, we will specify that the expression
should use the syntax of lex, and need only accept four-digit
years. Variant date formats specified by ISO 8601 need not be
accepted. The shortest correct expressions received by the editors
before 1 July 2000 will be published in a later issue of this
journal. Judgement of a panel of peer reviewers as to the correctness
and length of the submissions is final."
- Robin Cover
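
To make the abstract's claim concrete, here is a minimal sketch in Python of a regular expression that accepts yyyy-mm-dd dates and allows 29 February only in Gregorian leap years. It is not the expression from the cited paper, it uses Python `re` syntax rather than lex, and it makes no attempt to be short or deterministic in the SGML sense. The names `LEAP_YY`, `LEAP_YEAR`, `DATE_RE`, and `is_valid_date` are made up for illustration, and year 0000 is assumed to count as a leap year under the proleptic rule.

```python
import re

# Two-digit multiples of 4, excluding 00: 04, 08, 12, ..., 96.
LEAP_YY = r"(?:0[48]|[2468][048]|[13579][26])"

# Leap years with four digits: either the last two digits are a nonzero
# multiple of 4, or the year ends in 00 and the century is a multiple of 4.
LEAP_YEAR = r"(?:\d\d" + LEAP_YY + r"|(?:00|" + LEAP_YY + r")00)"

# Dates that are valid in every year.
ANY_YEAR = (
    r"\d{4}-(?:"
    r"(?:0[13578]|1[02])-(?:0[1-9]|[12]\d|3[01])"  # 31-day months
    r"|(?:0[469]|11)-(?:0[1-9]|[12]\d|30)"         # 30-day months
    r"|02-(?:0[1-9]|1\d|2[0-8])"                   # February, days 01-28
    r")"
)

# 29 February, leap years only.
LEAP_DAY = LEAP_YEAR + r"-02-29"

# re.ASCII keeps \d limited to the digits 0-9.
DATE_RE = re.compile(ANY_YEAR + "|" + LEAP_DAY, re.ASCII)


def is_valid_date(text: str) -> bool:
    """Return True if text is a valid Gregorian date in yyyy-mm-dd form."""
    return DATE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("2000-02-29", "1900-02-29", "2001-04-31", "2001-12-31"):
        print(sample, is_valid_date(sample))
```

The leap-year logic needs only the last two digits of the year, or the first two when the year ends in 00, so four-digit years can be covered without enumerating centuries one by one.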
-------------------------------------------------
On Mon, 22 Jan 2001, Bob Kline wrote:
> On Mon, 22 Jan 2001, José Manuel Beas wrote:
>
> > Right. It works. But what if I wanted to define it by deriving from
> > xsd:timeDuration?
> >
>
> No idea.
>
> > And if I wanted to define a date like "yyyymmdd"?
>
> It's conceivable that you could use a regular expression to capture the
> rules for which months have how many days (including leap years), but it
> would be a *very* hairy RE, and it would be restricted to only those
> centuries for which you enumerate the patterns. In fact, it would be
> such a complicated expression that it's hard to imagine it would be
> worth it just to get rid of the standard punctuation.
>
> Now, if you only want an approximate solution, you might be happy with
> something like
>
> "\d{4}[01]\d[0-3]\d"
>
> There's a continuum, of course. You could fiddle with this without too
> much pain and get a little more control (restricting mm to 01 through 12
> and dd to 01 through 31), but if you really want to nail down the right
> ranges for dd depending on what yyyymm is, you're in for some tedious
> work.
>
> --
> Bob Kline
> mailto:bkline@rksystems.com
> http://www.rksystems.com
>
>
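
The "continuum" Bob describes for the punctuation-free yyyymmdd form can be sketched in Python as follows. `LOOSE` is the approximate pattern quoted in his message; `TIGHTER` is one possible reading of "restricting mm to 01 through 12 and dd to 01 through 31". Both names are made up for illustration, and the patterns are written in Python `re` syntax; an XML Schema pattern facet would need checking against the Schema regex rules.

```python
import re

# The approximate pattern from the message: any four digits, then a month
# starting with 0 or 1 and a day starting with 0 through 3.
LOOSE = re.compile(r"\d{4}[01]\d[0-3]\d", re.ASCII)

# A somewhat tighter sketch: month 01-12, day 01-31. It still does not
# relate the day to the month, so 31 February passes.
TIGHTER = re.compile(r"\d{4}(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])", re.ASCII)

if __name__ == "__main__":
    for sample in ("20011939", "20010100", "20010231", "20011231"):
        loose_ok = LOOSE.fullmatch(sample) is not None
        tighter_ok = TIGHTER.fullmatch(sample) is not None
        print(sample, "loose:", loose_ok, "tighter:", tighter_ok)
```

The tighter pattern rejects month 19 and day 00, which the loose one lets through, but getting the day range right for each month and year is the tedious step the message refers to.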