1/11/2002 5:51:13 PM, Jonathan Robie
<jonathan.robie@softwareag.com> wrote:
>
> Some work remains to be done here. First,
> I have not represented word boundaries or
> punctuation here. ...

Well, Osama/Usama has apparently set back XQuery by
a few hours, and I can understand why Michael Rhys
thinks it's time to put this thread out of its
misery. But I think that at least the sub-thread
about validating limericks raises some fairly
interesting issues.

At one point today, I was thinking "jeez, if a pure
XML validator can't even identify a 'properly
scanning' limerick without extensive markup, just
what IS it good for?" Limericks are indeed a fairly
trivial use case for XML, but many of the
"validation" issues discussed here, such as
cleverness, neologisms, weird rhyming schemes, etc.,
are no worse than the non-structural validation
criteria that apply to purchase orders or whatever:
A purchase order must not only have the right tags
and types in the right places, it must be from an
established customer with sufficient credit to pay,
and the products ordered must not only be in the
catalog but in inventory as well, and the shipping
address must be a real place within our shipping
area, and all sorts of other things that no XML type
system could ever cover.
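
A rough sketch of what those non-structural checks might look
like as code, run only after the order has already passed
structural validation. Every name and data value here
(CREDIT_LIMITS, CATALOG, SHIPPING_AREA, the shape of the order
dict) is hypothetical and only illustrates the kind of rule no
type system expresses:

```python
# Hypothetical reference data; a real system would query live services.
CREDIT_LIMITS = {"acme": 500.0}          # customer -> available credit
CATALOG = {"widget": (10, 20.0)}         # sku -> (units in stock, unit price)
SHIPPING_AREA = {"US", "CA"}             # countries we ship to


def business_errors(order):
    """Return the reasons to reject a structurally valid order."""
    errors = []
    total = 0.0
    for sku, qty in order["items"]:
        if sku not in CATALOG:
            errors.append(f"{sku}: not in catalog")
            continue
        stock, price = CATALOG[sku]
        if qty > stock:
            errors.append(f"{sku}: only {stock} in stock")
        total += qty * price
    limit = CREDIT_LIMITS.get(order["customer"])
    if limit is None:
        errors.append("unknown customer")
    elif total > limit:
        errors.append("insufficient credit")
    if order["country"] not in SHIPPING_AREA:
        errors.append("outside shipping area")
    return errors
```

An empty list means the order survives these particular checks;
it says nothing about the "all sorts of other things" a real
business would also test.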

But then I remembered John Cowan's original
complaint -- the thread may be full of inspired
doggerel, but not very many actual "limericks."
Intelligent people can be quite clever at coming up
with interesting neologisms, bizarre rhymes, and
cryptic flames, but they don't seem to be able to
count the damn syllables!!!! So, even if the
limerick validator merely did a rough check to
reject mechanical violations of the limerick
"schema", it would be doing some good. So -- if we
lived in a world where properly-scanning limericks
had some economic value and there were various
limerick processors that understood instances of
some limerick schema -- I could imagine using an XML
validator to catch the truly stupid mistakes rather
than wasting the human judges' time on them.

ON THE OTHER HAND, if someone is investing in some
automated (presumably heuristic or AI-based)
"judge" of the quality of the limericks (or the
business value of the purchase order), it's not
clear that the XML validation step adds anything of
value. Sure, it's simple (once that DTD is
finished, get to it, Jonathan!), but what value does
the XML validation phase add? I'm still of the
opinion that the RE parse and dictionary lookup
*procedure* would be a lot easier to code than the
DTD/schema would be to develop, so why not just put
that quick-screen code at the front end of the more
complex "validation" process? Counter-arguments
that I can think of include: a) the declarative
description of the constraint system is easier to
prove correct than the equivalent procedural code
to check the constraints; b) the declarative
validation can be done with well-tested
off-the-shelf tools rather than buggy one-off code
[but the schema could be buggy!]; c) ordinary blokes
using wizzy Schema editors could more easily produce
the schema than they could produce the equivalent
code ... These all seem a bit contrived to me!
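
For a sense of scale, the RE parse and dictionary lookup
quick-screen could be sketched roughly as below. This is only an
illustration: the tiny SYLLABLES table stands in for a real
pronouncing dictionary, the per-line syllable ranges in
LINE_RANGES are an assumed approximation of limerick meter, and
rhyme is not checked at all.

```python
import re

# Toy syllable dictionary (hypothetical); a real screen would load a
# full pronouncing dictionary instead.
SYLLABLES = {
    "a": 1, "coder": 2, "who": 1, "validates": 3, "verse": 1,
    "found": 1, "counting": 2, "by": 1, "hand": 1, "was": 1,
    "curse": 1, "she": 1, "wrote": 1, "quick": 1, "screen": 1,
    "to": 1, "keep": 1, "the": 1, "feet": 1, "clean": 1,
    "and": 1, "judges": 2, "grew": 1, "steadily": 3, "terse": 1,
}

# Assumed (min, max) syllables for each of the five lines.
LINE_RANGES = [(7, 10), (7, 10), (5, 7), (5, 7), (7, 10)]

WORD_RE = re.compile(r"[a-z']+")

SAMPLE = """A coder who validates verse
Found counting by hand was a curse
She wrote a quick screen
To keep the feet clean
And the judges grew steadily terse"""


def count_syllables(line):
    """Sum dictionary syllable counts; None if any word is unknown."""
    total = 0
    for word in WORD_RE.findall(line.lower()):
        if word not in SYLLABLES:
            return None
        total += SYLLABLES[word]
    return total


def quick_screen(text):
    """Reject texts that mechanically cannot be a limerick."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) != len(LINE_RANGES):
        return False
    for line, (low, high) in zip(lines, LINE_RANGES):
        n = count_syllables(line)
        if n is None or not low <= n <= high:
            return False
    return True
```

Here quick_screen(SAMPLE) passes, while a text with the wrong
number of lines, an unknown word, or an over-long line is
rejected before any human (or AI) judge sees it. Whether unknown
words should reject or merely flag is a design choice this sketch
makes arbitrarily.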

So, what do people think ... have we just stumbled
on a couple of non-typical examples (limericks and
purchase orders) where "type" validation is a fairly
trivial subset of what a "real" validator would do?
Are XML validations mainly useful when they are
relatively easily "programmed" and can be put into a
human-oriented evaluation process with lots of
chances to override an inappropriate rejection?
Is it REALLY going to be easier for inhabitants of
the Real World to develop useful declarative schemas
than useful procedural code? Educate me!