There's been surprisingly little discussion of Sean's original question:
>Is the strong/weak/runtime typing argument over XML any different from
>that debate in programming languages.
I do think that this argument is different for XML. In programming
languages, it is rare to argue whether integers should be allowed to exist
in data, or whether the serialization should be considered the true form of
the data. Most programming languages that allow complex structures also
have fairly strong typing - though it may well be runtime typing rather
than static typing.
Personally, I believe that XML documents contain a wide range of type
information, from very loosely typed information without even a DTD, to
rigidly structured data corresponding to relational data or objects. A
language designed for processing XML needs to be able to deal with this
fact gracefully, avoid imposing assumptions on what is allowed that
conflict with what is actually found in the data, allow data to be managed
without turning explicit casts into a common idiom, and allow the
programmer to focus on the documents being processed and the task to be
performed rather than the quirks of the type system.
Some people seem to feel that only weakly typed systems meet those
requirements. I disagree. I think that a language whose type system matches
exactly the types found in XML documents will be most graceful for
processing XML. If the XML is governed only by a DTD or has no schema, the
appropriate types are document, element, attribute, node, text node,
processing instruction, comment, ID, IDREF, IDREFS, etc. A strongly typed
language that does not support these types tends to get in the way, because
it is a poor match for the data being processed. It insists on the wrong
things.
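As a rough illustration of a type system that matches schema-less XML, here is a minimal TypeScript sketch. All of the names (XmlNode, AttributeValue, Attribute, textOf) are hypothetical and chosen only for this example; they are not taken from the DOM or any existing library.

```typescript
// Hypothetical node model for XML governed by a DTD or no schema at all.
// Each variant corresponds to one of the node types listed above.
type XmlNode =
  | { kind: "document"; children: XmlNode[] }
  | { kind: "element"; name: string; attributes: Attribute[]; children: XmlNode[] }
  | { kind: "text"; value: string }
  | { kind: "comment"; value: string }
  | { kind: "processing-instruction"; target: string; data: string };

// Attribute values keep their DTD-level type, so an ID is not just a string.
type AttributeValue =
  | { type: "CDATA"; value: string }
  | { type: "ID"; value: string }
  | { type: "IDREF"; value: string }
  | { type: "IDREFS"; values: string[] };

interface Attribute {
  name: string;
  value: AttributeValue;
}

// Concatenate the text content of a node. The switch covers every variant of
// XmlNode, so the compiler can check that no node type has been forgotten.
function textOf(node: XmlNode): string {
  switch (node.kind) {
    case "document":
    case "element":
      return node.children.map(textOf).join("");
    case "text":
      return node.value;
    case "comment":
    case "processing-instruction":
      return "";
  }
}

// A small document built directly from the types above.
const doc: XmlNode = {
  kind: "document",
  children: [
    { kind: "comment", value: " draft " },
    {
      kind: "element",
      name: "note",
      attributes: [{ name: "id", value: { type: "ID", value: "n1" } }],
      children: [
        { kind: "text", value: "Hello, " },
        {
          kind: "element",
          name: "em",
          attributes: [],
          children: [{ kind: "text", value: "XML" }],
        },
      ],
    },
  ],
};

const docText = textOf(doc);
```

Here the type system insists only on things that are actually true of the data: a comment has no children, an IDREFS value is a list, and any node may be asked for its text.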
If W3C XML Schema is used, then documents can also contain the kinds of
types typically found in strongly typed programming languages, plus some
types typically not found anywhere else. Again, the most graceful type system
for a programming language is the one that best matches the data being
processed. And even in this case, the type system must be very flexible,
because the output types may be quite different from the input types.
Strong typing is not synonymous with needing to write the names of types
everywhere and do lots of explicit casts, though this is common in many
strongly typed languages. Implicit typing allows a programming language to
infer a type, assigning the correct type to a variable without requiring
the programmer to write down the name of the type.
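A small TypeScript sketch of implicit typing, under the assumption that attribute values arrive as strings. The variable names and the data are made up for illustration. No type name is written anywhere, yet every variable has a static type that the compiler checks.

```typescript
// No type annotations appear below; the compiler infers each type.
const items = [
  { name: "item", attrs: { price: "4.50" } },
  { name: "item", attrs: { price: "1.25" } },
];

// Inferred as number[]: Number(...) returns a number for each element.
const prices = items.map((item) => Number(item.attrs.price));

// Inferred as number, from the initial value 0 and the callback's result.
const total = prices.reduce((sum, price) => sum + price, 0);

// The inferred types are still enforced. Uncommenting the next line is a
// compile-time error, because a number cannot be assigned to a string:
// const label: string = total;
```

The only conversion written by hand is Number(...), at the point where the string value of an attribute is deliberately treated as a number.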
The ideal is to come up with a type system that catches errors early
without imposing too much overhead on programmers. This is difficult and
subtle, and you have to look at it both theoretically and by working through
lots and lots of examples. But I think that XML has become important enough
to deserve such a type system.
Jonathan