Mike Champion wrote:
> Maybe I'm trying to reconcile my irreconcilable desires for loose coupling /
> document-centric XML applications on one hand and ease of use of XML
> technologies by ordinary programmers on the other.
You are not alone in those desires, nor are they in fact irreconcilable. What
is missing, so far as I can see, in all of these 'direct access to data
conveyed by XML' arguments is the necessary acknowledgment that every XML
instance must be parsed on every occasion of every use. That is a mechanical,
functional requirement which is intrinsic to the nature of XML. If it does not
require parsing (and therefore is not a document of UnicodeWithAngleBrackets),
then what you have is not in any meaningful sense XML and what you have lost is
every advantage (interoperability among them) for which you looked to XML.
We have to build 'XML technologies' upon parsers, and therefore upon the
documents which those parsers parse, and not upon infosets because you can't
have an XML infoset until you instantiate one from a parse. Nothing in the
schism of dataheads from docheads is irreconcilable given only this one
premise, which is the ultimate premise of XML itself. Infosets produced from
anything other than the output of a parse of an XML document are not XML
infosets. Take away that premise and you take away XML; it's that simple.
Therefore the 'XML technologies' which we build--which may very well (should,
in fact!) include direct access to data instantiated in particular forms--must
be built first upon parser output. This means that what is built by a
particular, code-specific or data-specific technology on any one occasion will
not be the same as what is built from a parse of the same document on another
occasion or by another technology. That is, there is no predictable 'infoset'
to be derived from a document, and no one infoset inheres in any document.
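
To make the point concrete, here is a minimal sketch, assuming Python's
standard library as the two 'technologies'. The sample document and the
variable names are my own illustration, not anything taken from a real
application. The same document goes through two parsers, and each one hands
back a different set of information items for the same root element.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document: one comment, one element, trailing newline.
DOC = "<order><!-- rush --><item>widget</item>\n</order>"

# Technology 1: ElementTree's default parser discards comments and folds the
# trailing whitespace into the 'tail' attribute of the preceding element.
et_root = ET.fromstring(DOC)
et_children = [child.tag for child in et_root]

# Technology 2: minidom keeps the comment and the whitespace as child nodes.
dom_root = xml.dom.minidom.parseString(DOC).documentElement
dom_children = [node.nodeName for node in dom_root.childNodes]

print(et_children)   # ['item']
print(dom_children)  # ['#comment', 'item', '#text']
```

Neither result is wrong. Each is simply what one particular technology chose
to build from its parse of the document.
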
Acknowledge that, and dataheads and docheads can peacefully and profitably make
use of the same XML entities. Deny that and there simply is no identifiable