   Generating typed code from DTDs, why not?

  • From: Luke Gorrie <luke@javagroup.org>
  • To: xml-dev@ic.ac.uk
  • Date: 11 Mar 1999 04:49:12 +1000

Hi all,

I'm pretty new to XML, but as I've poked around I've observed what
seem to be some strange things.  XML parsers all seem to provide
interfaces which ignore the static structure information provided by
DTDs and rely on "one fits all" interfaces to elements, in stark
contrast to the conventions of statically typed languages.

For instance, the first thing I played with in XML was SAX using
Python.  I was impressed by how easily it worked and how naturally it
fit in with a dynamically typed language like Python.  Then I had a
look at the Java interface and found that it was just the same, which
I thought very odd!  The natural mapping for SAX onto Java, to get the
(significant) benefits of static typing, would be to generate a
Visitor interface.  The Visitor interface would have a method for
"visiting" each type of element in the document, and the argument to
this method would be an object which presents the element contents
through typed accessor methods.  At least, that's how it looks to me.
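
To make the idea concrete, here is a rough sketch of what such generated
code might look like. Everything in it is an assumption made for
illustration: the two-element DTD shown in the comment, and the names
LibraryVisitor, BookElement, AuthorElement and VisitorAdapter. It is built
on the later SAX2 DefaultHandler class and the JAXP parser factory rather
than on the SAX interfaces available when this was written, and it is
hand-written rather than produced by any real generator.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TypedSaxSketch {
    // Hypothetical DTD the "generated" types below correspond to:
    //   <!ELEMENT book (author*)>  <!ATTLIST book title CDATA #REQUIRED>
    //   <!ELEMENT author EMPTY>    <!ATTLIST author name CDATA #REQUIRED>

    // Typed view of a <book> start tag; only valid during the callback,
    // because SAX may reuse the Attributes object afterwards.
    static final class BookElement {
        private final Attributes attrs;
        BookElement(Attributes attrs) { this.attrs = attrs; }
        String getTitle() { return attrs.getValue("title"); }
    }

    // Typed view of an <author> start tag, with the same lifetime caveat.
    static final class AuthorElement {
        private final Attributes attrs;
        AuthorElement(Attributes attrs) { this.attrs = attrs; }
        String getName() { return attrs.getValue("name"); }
    }

    // One method per element type declared in the DTD.
    interface LibraryVisitor {
        void visitBook(BookElement book);
        void visitAuthor(AuthorElement author);
    }

    // Bridges the untyped SAX callback to the typed visitor methods.
    static final class VisitorAdapter extends DefaultHandler {
        private final LibraryVisitor visitor;
        VisitorAdapter(LibraryVisitor visitor) { this.visitor = visitor; }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            switch (qName) {
                case "book":   visitor.visitBook(new BookElement(attrs)); break;
                case "author": visitor.visitAuthor(new AuthorElement(attrs)); break;
                default:       break; // elements outside the DTD are ignored here
            }
        }
    }

    // Parses the given XML and records what the typed visitor saw.
    public static List<String> collect(String xml) throws Exception {
        final List<String> seen = new ArrayList<>();
        LibraryVisitor visitor = new LibraryVisitor() {
            @Override public void visitBook(BookElement book) {
                seen.add("book:" + book.getTitle());
            }
            @Override public void visitAuthor(AuthorElement author) {
                seen.add("author:" + author.getName());
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), new VisitorAdapter(visitor));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<book title=\"T\"><author name=\"A\"/></book>"));
    }
}
```

With this shape, a misspelled accessor such as getTitel() is rejected by
the compiler, whereas a misspelled attribute name passed to the generic
interface only shows up at run time as a null.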

In the case of DOM, again generating typed accessor code would provide
these great benefits.  People could use a DTD (or similar) as the
definition language for their abstract data types, and generate
DOM-compliant classes which they can both use "natively" in their
language and also manipulate as part of a genuine DOM tree at the same
time.
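
As a rough illustration of the DOM case, the sketch below uses a simpler
technique than the one described above: a typed wrapper around a standard
org.w3c.dom.Element instead of a generated class that itself implements
the DOM interfaces. The Book class, its accessors, and the one-element
document are hypothetical; only the JAXP and DOM calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class TypedDomSketch {
    // Hypothetical typed wrapper for <book title="..."/>. It keeps no state
    // of its own, so typed and generic access always see the same data.
    static final class Book {
        private final Element element;

        Book(Element element) {
            if (!"book".equals(element.getTagName())) {
                throw new IllegalArgumentException("not a <book> element");
            }
            this.element = element;
        }

        String getTitle() { return element.getAttribute("title"); }
        void setTitle(String title) { element.setAttribute("title", title); }

        // Escape hatch back to the generic DOM node.
        Element asElement() { return element; }
    }

    // Parses an XML string into a DOM tree using the default JAXP parser.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Changes the title through the typed wrapper, then reads it back
    // through the plain DOM API to show both views share one tree.
    public static String retitle(String xml, String newTitle) throws Exception {
        Document doc = parse(xml);
        Book book = new Book(doc.getDocumentElement());
        book.setTitle(newTitle);
        return doc.getDocumentElement().getAttribute("title");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(retitle("<book title=\"old\"/>", "new"));
    }
}
```

A generated class that implemented Element directly would avoid the extra
wrapper object, at the cost of tying the generated code to one DOM
implementation.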

It seems like these methods which ignore the wealth of static structure
information available will begin to show serious problems if they try
to scale to the features proposed in some specifications like SOX,
where more fine-grained relationships and constraints can be
expressed.

So, my question is: are there any efforts around working towards
creating mappings from DTD or other XML type definition
languages to various programming languages (or to other IDLs like
OMG's), or is there some reason why this is considered a bad idea?

I'm excited by the possibility of using a visual modelling tool
(perhaps using an extension of the UML) to model document structure,
and from the model be able to generate a DTD, from which to generate
classes which give me access to the XML data in a natural way for my
programming language.  I'm amazed that more people don't seem to share
this enthusiasm.  What we're doing with vanilla DOM and SAX interfaces
seems analogous to using CORBA IDL as documentation, and making all
object calls using the dynamic invocation interface!

P.S. I was told today that Oracle have recently done something similar to
this, which sounds great.  I look forward to taking a look, but I
can't help but wonder if there's a reason that it took this long - and
how much the Oracle product does.  If someone could point me to some
other products which do similar things, I'd be much obliged.



