Jeff Lowery wrote:
>
> Today I've been staring at an industry spec whose XML data model is not
> definable by any schema language that I know of. Validation is currently
> done by what I call "little gray boxes": open-source executables that
> validate the XML document in nice, lengthy, idiosyncratic C++.
>
> Not being long in XML tooth, I wonder if any of you older hounds might care
> to comment on how commonly such is the case in the big, wide, metalanguage
> world.
I don't know how common it is, but here's another data point:
In one of our applications we have something like the following:
<!ATTLIST field
    name        CDATA           #REQUIRED
    units       %a.units;       #IMPLIED
    min         %a.value;       #IMPLIED
    max         %a.value;       #IMPLIED
    default     %a.value;       #IMPLIED
>
%a.units; and %a.value; are defined as "CDATA" in the DTD;
the intent is that %a.units; is a unit specification like
"meters/sec^2", and %a.value; is a floating point number
followed by a unit specification.
So far so good: although these constraints aren't expressible
in a DTD, units and values can be described by a regular grammar,
so HyLex, W3C XML Schema, or any other schema language with
regular expressions would work.
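
As a rough illustration of what such a regular grammar might look like,
here is a minimal sketch in Python (not the Tcl used in the application,
and not HyLex or XML Schema syntax). The exact token rules are
assumptions: a unit term is taken to be a name with an optional integer
exponent, terms are joined by "/" or "*", and a value is a number, then
whitespace, then a unit specification. The names UNIT_TERM, UNIT_SPEC,
NUMBER, is_units, and is_value are hypothetical.

```python
import re

# Assumed grammar: a unit term is a name with an optional integer
# exponent ("sec^2"); a unit spec is terms joined by "/" or "*".
UNIT_TERM = r"[A-Za-z]+(?:\^-?\d+)?"
UNIT_SPEC = rf"{UNIT_TERM}(?:[*/]{UNIT_TERM})*"

# A plain floating point literal with an optional exponent part.
NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"

UNITS_RE = re.compile(UNIT_SPEC)
VALUE_RE = re.compile(rf"({NUMBER})\s+({UNIT_SPEC})")


def is_units(text):
    """True if the whole string is a unit specification."""
    return UNITS_RE.fullmatch(text) is not None


def is_value(text):
    """True if the whole string is a number followed by a unit spec."""
    return VALUE_RE.fullmatch(text) is not None
```

Under these assumptions, is_units("meters/sec^2") and is_value("1000 rpm")
both hold, while is_units("5 feet") does not.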
The tough constraint is that the 'units', 'min', 'max', and
'default' attributes must be dimensionally compatible.
So for example
<field
    name    = "OMEGA"
    units   = "radian/sec"
    min     = "0 deg/sec"
    max     = "1000 rpm"
    default = "10 hz"
>
would be legal, but 'units = "meter/sec", default="5 feet"'
would not be.
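
To make the compatibility rule concrete, here is a minimal sketch in
Python (again, not the application's Tcl code). It reduces each unit
specification to exponents of base dimensions and compares the results.
The unit table BASE is hypothetical and covers only the units mentioned
here; treating angles as dimensionless is an assumption, chosen so that
"radian/sec", "deg/sec", "rpm", and "hz" come out compatible as in the
example above. The function names dimension and compatible are also
made up for illustration.

```python
import re

# Hypothetical unit table: unit name -> exponents of base dimensions.
# Angles are assumed dimensionless, so a rate like "rpm" is just 1/time.
BASE = {
    "radian": {}, "deg": {},
    "sec": {"time": 1},
    "rpm": {"time": -1}, "hz": {"time": -1},
    "meter": {"length": 1}, "meters": {"length": 1}, "feet": {"length": 1},
}


def dimension(spec):
    """Return the dimension exponents of a unit spec like 'meters/sec^2'.

    Operators apply left to right; an unknown unit raises KeyError.
    """
    dims = {}
    sign = 1
    # The capturing group makes re.split keep the "*" and "/" tokens.
    for tok in re.split(r"([*/])", spec):
        if tok == "*":
            sign = 1
        elif tok == "/":
            sign = -1
        else:
            name, _, power = tok.partition("^")
            exponent = sign * (int(power) if power else 1)
            for dim, n in BASE[name].items():
                dims[dim] = dims.get(dim, 0) + n * exponent
    # Drop dimensions that cancelled out so equal dimensions compare equal.
    return {d: n for d, n in dims.items() if n != 0}


def compatible(units, *values):
    """True if every 'number unit' value has the same dimension as units."""
    want = dimension(units)
    return all(dimension(v.split(None, 1)[1]) == want for v in values)
```

With this table, compatible("radian/sec", "0 deg/sec", "1000 rpm", "10 hz")
is true and compatible("meter/sec", "5 feet") is false. A real checker
would also need conversion factors to compare min, max, and default
numerically; this sketch only checks dimensions.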
I don't know of any schema language that can enforce this
constraint (and if there were one, I doubt that I'd want
to use it :-). We use a combination of DTDs, Schematron,
and Tcl for validation.
The Tcl code isn't terribly lengthy or idiosyncratic though.
We use a Cost specification to look up a regular expression,
enumerated list, Tcl expression, or Tcl procedure for each
attribute and for element content. The validation driver
and lookup table are very straightforward (although some
of the individual validation procedures have gotten hairy).
The "schema", such as it is, looks a bit like this:
specification verifySpec {
    {el} {
        refid.proc      checkCrossref
        filename.regex  {[a-z][a-z0-9]+(/[a-z][a-z0-9]+)*\.[a-z]+}
        minoccur.test   {string is integer}
        maxoccur.test   {string is integer}
    }
    {element field} {
        units.proc      checkUnits
        min.proc        checkValue
        max.proc        checkValue
        default.proc    checkValue
    }
}
proc checkCrossref {refid} {
    if {[query# doctree el withattval "id" $refid] == 0} {
        error "cross-reference to nonexistent ID <$refid>"
    }
}
proc checkUnits {units} { ... }
proc checkValue {value} { ... this bit gets ugly }
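
For readers without Cost at hand, the table-driven idea can be sketched
in Python with the standard xml.etree.ElementTree module. This is not
the actual driver: the SPEC layout, the "*" wildcard for "any element",
the whole-string regex match, and the names check_integer and validate
are all assumptions made for illustration, and check_integer is only an
approximation of Tcl's "string is integer".

```python
import re
import xml.etree.ElementTree as ET


def check_integer(value):
    """Raise ValueError unless value is an optionally signed decimal integer."""
    if not re.fullmatch(r"[-+]?\d+", value):
        raise ValueError(f"not an integer: {value!r}")


# Hypothetical table: element name ("*" = any element) -> attribute -> rule.
# A rule is either a regular expression (a string) or a callable that
# raises ValueError on bad input.
SPEC = {
    "*": {
        "filename": r"[a-z][a-z0-9]+(/[a-z][a-z0-9]+)*\.[a-z]+",
        "minoccur": check_integer,
        "maxoccur": check_integer,
    },
    "field": {},  # per-element procedures (units, min, max, ...) would go here
}


def validate(root):
    """Walk the tree, apply the matching rule to each attribute, collect errors."""
    errors = []
    for el in root.iter():
        # Element-specific rules override the wildcard rules.
        rules = {**SPEC.get("*", {}), **SPEC.get(el.tag, {})}
        for attr, value in el.attrib.items():
            rule = rules.get(attr)
            if rule is None:
                continue  # no constraint registered for this attribute
            try:
                if isinstance(rule, str):
                    if not re.fullmatch(rule, value):
                        raise ValueError(f"{value!r} does not match {rule}")
                else:
                    rule(value)
            except ValueError as exc:
                errors.append(f"<{el.tag}> {attr}: {exc}")
    return errors
```

Calling validate(ET.fromstring(text)) returns a list of error strings,
empty when every checked attribute passes. The driver stays small
because all the domain knowledge lives in the table and in the
individual check functions.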
--Joe English
jenglish@flightlab.com