Tim Bray wrote:
> I have a good solution but it's probably not generally applicable - I
> interchange information with XML but I usually mash it as quickly as
> possible into application native data structures, which have their own APIs
> aimed at their own needs.
>
> I have no problem saying in public that XML is really really good for
> interchange and really really irritating for in-memory manipulation. I
> think we all ought to be more up-front about this.

For several years now I have advocated this approach as the basis of an XML
processing model. Logically implemented, however, it contradicts what is in
practice a central dogma of XML orthodoxy. The question is whether an
XML-consuming application ought to be built around the data structures as
interchanged or should rather use a private local data structure best suited
to the application's own implementation of some particular expertise. This is
actually a fundamental choice, which does not permit the answer 'do both', if
that means using an agreed interchange data structure for data acquisition
and then a more appropriate structure for the application internals.

Validation, whether by DTD, schema, or otherwise, is grounded in the
expectation that an XML-consuming application adheres to a contract to process
only input which conforms to a pre-agreed schematic. This is a legacy SGML
notion that has never successfully translated to the very different
environment 'on the Web'. XML 1.0 recognized this problem and indeed offered
the best solution to it in simple well-formedness. WFness acknowledges the
autonomy of any processing node in the internetwork topology of the Web by
recognizing that such a node will make its own decisions about what to
process, and how, regardless of the semantics apparently intended, or hoped to
be conveyed, by the particular structure to which a document instance might
conform. The real problem is actually the belief, whether explicitly
acknowledged or not, that moving from simple well-formedness to
validity--which is to say, casting an instance document in a particular
structure--adds semantics and, what is more, semantics of consequence to an
XML-consuming application receiving that document.
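
To make the distinction concrete, here is a minimal sketch in Python of a node that insists only on well-formedness and then applies its own judgment about what to take. Everything in it is an assumption made for illustration: the `PricePoint` structure, the `acquire` function, and the `symbol` and `price` attribute names are hypothetical and do not come from any agreed vocabulary. It is one possible reading of the approach, not a reference implementation.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class PricePoint:
    """Private structure shaped by this application's needs (hypothetical)."""
    symbol: str
    price: float


def acquire(document: str) -> List[PricePoint]:
    """Require well-formedness only, then take whatever this node can use."""
    # No DTD or schema is consulted; ET.ParseError is raised only if the
    # input is not well-formed XML.
    root = ET.fromstring(document)
    points = []
    for elem in root.iter():
        # Private judgment: any element carrying both attributes is usable,
        # whatever it is named and wherever the sender chose to nest it.
        symbol = elem.get("symbol")
        price = elem.get("price")
        if symbol is None or price is None:
            continue
        try:
            points.append(PricePoint(symbol, float(price)))
        except ValueError:
            # Unusable price text: skip the item, do not reject the document.
            continue
    return points
```

Given input such as `<feed><quote symbol="ABC" price="12.5"/><note>ignored</note></feed>`, the function keeps the one element it can use and passes over the rest. The sender's element names and nesting play no part in the decision, which is the point: the structure as interchanged does not dictate what the receiving node does with it.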

There is a fundamental divide between believing in one case that the first
priority of an XML-consuming application is to adhere to, and to enforce, the
schematic contract which is perceived as the basis of document
interchangeability, and believing in the other case that the point of an
application is to apply specific expertise in process, which includes
particular, and probably private, judgments about what input data to accept,
and generally about where and how to get the data that the application's expertise
requires. Transaction protocols and remote procedure invocations depend
ultimately on the premise that presenting a particular data structure to an
application will cause it to execute a particular process. That assumption
depends on the homogeneity of an enterprise network where application nodes
understand each other's processes intimately, and is utterly invalid in the
internetwork topology of the Web. The autonomous processing nodes of the
internetwork topology are of value because of what they produce, which is to
say the expertise which they implement. But they must also be designed around
that expertise, one consequence of which is that they apply their private
expertise to their own data acquisition and data instantiation. By design,
such applications cannot be invoked by the presentation of a particular data
structure if by 'invoked' we mean be made somehow to respect the intent or
otherwise accept the semantics of whoever would invoke them. This is why the
answer to the question of whether an application uses a public or a private
data structure cannot be 'do both'.

Respectfully,
Walter Perry