Elliotte Harold wrote:
> Here's a perhaps more useful question. Could we define an alternate
> source interface that would allow validators, transformers, and query
> tools to hook into arbitrary models? Specifically, could we define one
> that would be complete, unlike Source; and would not require these
> tools to provide special support for each different object model? What
> would such an interface look like?
Perhaps there could be a consistent API that represents the input at
various levels of "parsedness", that can effectively replace the SAX
InputSource for SAX/StAX and/or DOM parsers, and provide more
information for object graphs with or without PSVI annotations.
> Possibly the issues of transforms are different from query tools and
> validators. All transform engines I've seen build their own internal
> model. They do not work directly on top of DOM, SAX, XOM, or other
The GNU JAXP transformer works directly with DOM Level 3 Core trees.
Two new trees are generated during the transformation: a normalised
version of the source tree, and the result tree. Both of these are DOM
Level 3 Core.
> Validators and query tools, by contrast, tend not to construct new
> object models and do work directly on top of the preexisting in-memory
> representations of the XML document.
In many ways the issue is the same for validators: the process of
validation takes as input a DOM tree and outputs an annotated DOM tree.
Since there is no Node.setTypeInfo method, the validator must either
construct a new tree or have a priori knowledge of the Node subclass
and the means of associating the type information with it.
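To illustrate the "construct a new tree" route, here is a minimal sketch using the JAXP javax.xml.validation API with a DOMSource as input and an empty DOMResult as output. The class name ValidateToNewTree, the one-element schema, and the sample document are made up for illustration. Whether getSchemaTypeInfo() on the result tree reports a useful type name depends on the implementation, so the sketch only prints what it finds.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.TypeInfo;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of JAXP or GNU JAXP.
final class ValidateToNewTree {
    // Illustrative schema: a single element "n" of type xs:int.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='n' type='xs:int'/></xs:schema>";

    static ByteArrayInputStream bytes(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    // Parses xml into a DOM tree, validates it, and returns the NEW tree
    // that the validator builds as its output.
    static Document validate(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // schema validation needs namespaces
            Document input = dbf.newDocumentBuilder().parse(bytes(xml));

            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(bytes(XSD)));

            // No target node is supplied, so the validator creates one.
            DOMResult result = new DOMResult();
            schema.newValidator().validate(new DOMSource(input), result);
            return (Document) result.getNode();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = validate("<n>42</n>").getDocumentElement();
        TypeInfo type = root.getSchemaTypeInfo();
        System.out.println(root.getNodeName() + " : "
            + (type == null ? null : type.getTypeName()));
    }
}
```

The caller never writes type information onto the input nodes. Any annotations live on whatever Node implementation the validator chose for its output, which is the limitation described above.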
> Does this seem plausible? Does this seem worth doing? Does anyone have
> any other ideas?
I believe that it would be worth doing, if possible.
1. The stream source must be able to provide a byte stream and entity
metadata (SYSTEM and/or PUBLIC id). I believe it's a design error to
provide a character stream: determination of the encoding should be
made by the parser.
2. The tree source must be able to provide either:
a. an object implementing the Node interface (simple but DOM-specific),
b. an object resembling a tree navigator that can be used to iterate
over the nodes in the tree and retrieve individual node objects (more
complex but object-model agnostic; should perhaps be combined with a
property, possibly a URI, indicating the object model(s) supported).
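
As a rough sketch only, points 1 and 2 might look something like this in Java. The names EntityStreamSource, TreeNavigator, DomNavigator and NavigatorDemo are hypothetical and do not exist in JAXP or any other API. The only existing pieces used are the standard DOM and JAXP classes, including the XPathConstants.DOM_OBJECT_MODEL URI, which is borrowed here as the object model identifier.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical, point 1: bytes plus entity metadata, no character stream,
// so that the parser decides the encoding.
interface EntityStreamSource {
    InputStream getByteStream();
    String getPublicId(); // may be null
    String getSystemId(); // may be null
}

// Hypothetical, point 2b: a cursor over a tree of any object model.
interface TreeNavigator<N> {
    String getObjectModel(); // URI naming the underlying object model
    N current();             // the node object at the cursor
    boolean toFirstChild();  // each move returns false and stays put on failure
    boolean toNextSibling();
    boolean toParent();
}

// One possible DOM binding; point 2a is the Node returned by current().
final class DomNavigator implements TreeNavigator<Node> {
    private Node node;
    DomNavigator(Node start) { node = start; }
    public String getObjectModel() { return XPathConstants.DOM_OBJECT_MODEL; }
    public Node current() { return node; }
    public boolean toFirstChild() { return move(node.getFirstChild()); }
    public boolean toNextSibling() { return move(node.getNextSibling()); }
    public boolean toParent() { return move(node.getParentNode()); }
    private boolean move(Node target) {
        if (target == null) return false;
        node = target;
        return true;
    }
}

final class NavigatorDemo {
    // Builds an in-memory stream source; systemId is metadata only.
    static EntityStreamSource inMemory(String xml, String systemId) {
        byte[] data = xml.getBytes(StandardCharsets.UTF_8);
        return new EntityStreamSource() {
            public InputStream getByteStream() { return new ByteArrayInputStream(data); }
            public String getPublicId() { return null; }
            public String getSystemId() { return systemId; }
        };
    }

    // A consumer of the stream source: here, an ordinary DOM parser.
    static Document parse(EntityStreamSource src) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(src.getByteStream(), src.getSystemId());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // A consumer of the tree source that never mentions DOM: counts the
    // node at the cursor plus all its descendants, leaving the cursor
    // where it started.
    static <N> int countNodes(TreeNavigator<N> nav) {
        int count = 1;
        if (nav.toFirstChild()) {
            do {
                count += countNodes(nav);
            } while (nav.toNextSibling());
            nav.toParent();
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse(inMemory("<a><b/><c>text</c></a>", "file:///tmp/example.xml"));
        System.out.println(countNodes(new DomNavigator(doc)));
    }
}
```

A validator or query tool written against TreeNavigator, as countNodes is, would only need a different binding class to work over XOM or another model. Attributes, namespaces and type annotations are left out of this sketch entirely.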