** Reply to message from Kal Ahmed <kal@techquila.com> on Tue, 29 Oct 2002
21:16:51 +0000
> How about coding to an abstract data model with greater expressive power than
> a simple XML schema (small s). Like a topic map or RDF abstraction for
> example ? (I'm only slightly joking here ;-)
Interesting suggestion, so let me give you a real-life example. I'm a
co-editor of MDDL, the Market Data Definition Language (www.mddl.org). MDDL
treats most of its elements as "properties" whose values can be inherited from
ancestors if they haven't been directly defined on a child element. This makes
the instance files much smaller where there is repeated information (like the
currency), but with >260 property elements, it makes the Schema impossible to
write by hand with any semblance of quality control.
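To make the inheritance idea concrete, here is a minimal sketch in Python of how a consumer might resolve an inherited property. The element names (market, instrument, currency, price) and the helper resolve_property are made up for illustration; they are not real MDDL names, and real MDDL processing may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: the element names are illustrative only.
DOC = """
<market>
  <currency>USD</currency>
  <instrument id="a">
    <price>10.5</price>
  </instrument>
  <instrument id="b">
    <currency>EUR</currency>
    <price>9.0</price>
  </instrument>
</market>
"""


def resolve_property(root, element, name):
    """Return the text of property `name` on `element`, or on its nearest
    ancestor that defines it, or None if nothing defines it."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        prop = node.find(name)  # direct children only
        if prop is not None:
            return prop.text
        node = parent_of.get(node)  # walk up; None once past the root
    return None


root = ET.fromstring(DOC)
a, b = root.findall("instrument")
print(resolve_property(root, a, "currency"))  # inherited from <market>
print(resolve_property(root, b, "currency"))  # defined directly on b
```

The currency only has to be stated once near the top of the instance, which is where the size saving comes from.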
So, for version 1.0 of MDDL, we (the editors) wrote a basic Schema by hand
without any inheritance or other shorthand tricks, and I wrote an XSLT
stylesheet to process that into the full Schema (as well as creating the nearest
equivalent DTD). This worked, but it wasn't completely manageable. There are
typically multiple ways to construct an XML Schema to achieve the instance
document format that you want. That becomes a nightmare when you want to
process the Schema automatically to enhance it, because of the need to cope with
likely variations. Worse, as time goes by, there is a real risk that new
editors will introduce new and unforeseen kinds of variations in the Schema, and
then everything would fall in a heap.
What I needed was a way of imposing a particular style on the base Schema. You
could do this by writing code to check the style, but for a generic language of
any sort (schema, topic map, RDF, programming language, or anything else), this
is not a trivial task. Instead, I went back 20 years to when people advocated
the creation of small, custom programming languages for particular problem
areas (using lex/yacc or flex/bison), so that developers were forced to focus
on the problem itself rather than on which of the many features of their bloated
generic language they would use. My solution, then, was to create a small,
tight, custom schema language in XML for the MDDL data model, one which
allows just enough functionality to do the things needed for MDDL, and nothing
more. We have used this for MDDL 2.0, and it works! While the main MDDL Schema
is 600K, the data model is 30K, and I have been able to use the data model to
produce not only the Schema but a visual representation of MDDL
http://www.mddl.org/LMS_MDDL/index.html
which would have been all but impossible to produce from the final Schema.
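As a rough sketch of the approach (not the actual MDDL data model format or tooling), the following Python shows a tiny custom model being expanded into XML Schema declarations. The model vocabulary (model, property, container) and the function generate_schema are assumptions invented for this example. Because the model can only say "this is a property" or "this is a container", the generator only ever has one style of input to cope with.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical data model: deliberately tiny vocabulary, not the MDDL format.
MODEL = """
<model>
  <property name="currency" type="string"/>
  <property name="price" type="decimal"/>
  <container name="instrument"/>
</model>
"""


def generate_schema(model_xml):
    """Expand the small model into an xs:schema element tree."""
    model = ET.fromstring(model_xml)
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    props = model.findall("property")

    # One global element declaration per property.
    for prop in props:
        ET.SubElement(schema, f"{{{XS}}}element",
                      name=prop.get("name"), type="xs:" + prop.get("type"))

    # Every container may optionally carry every property (assumed here so
    # that an absent property can be inherited from an ancestor).
    for cont in model.findall("container"):
        element = ET.SubElement(schema, f"{{{XS}}}element", name=cont.get("name"))
        ctype = ET.SubElement(element, f"{{{XS}}}complexType")
        seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
        for prop in props:
            ET.SubElement(seq, f"{{{XS}}}element",
                          ref=prop.get("name"), minOccurs="0")
    return schema


print(ET.tostring(generate_schema(MODEL), encoding="unicode"))
```

The repetitive part (each property referenced from each container) lives in the generator rather than in a hand-edited Schema, which is why the model stays so much smaller than the output.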
Now, this is not a criticism of XML Schema, Topic Maps, RDF, etc. XML
Schema is a good input format for Schema validation engines that help remove a
ton of verification code from applications consuming XML. However, the variety
of things that XML Schema must support in order to do that job means that it can
be hard to use it as a design format and maintain sufficient consistency of
design. Similarly, Topic Maps and RDF are good formats for information engines
to work with, but when authoring, the same questions of how to maintain
consistency arise. Until design tools allow that consistency of style and usage
to be imposed on Schemas, Topic Maps, RDF, & whatever, there is a lot to be said
for starting from a small, custom XML format, and then generating the various
generic formats from that. That said, I expect that Topic Maps will be a
similarly useful starting format once ISO finishes the Topic Map Constraint
Language and tools start supporting it.
Cheers,
Tony.
====
Anthony B. Coates, Information & Software Architect
mailto:abcoates@TheOffice.net
MDDL Editor (Market Data Definition Language)
http://www.mddl.org/