On Wed, 2002-03-06 at 04:16, Rick Jelliffe wrote:
> So stream-based processing is efficient because it operates on a single
> traversal. Contrast this with DOM, if you implement "layered"
> functions naively as separate passes.
> But not all layers can be implemented well using streams. (XPath
> systems requiring arbitrary context, for example.) So instead, to
> get efficiency of tree-based data structures, we need to perform
> as many functions as possible during one traversal.
I think it makes sense to consider how best to implement particular
features based on whether or not they depend on content which comes
later in the document. I don't think this justifies moving toward a
single-pass traversal in general, however.
I'd like to see more atomic rules vocabularies which specify different
kinds of processing and their sequence. XML Pipeline Definition
Language is a good start, but I think the category "Schema Processing"
is too broad, for instance, because "Schema" currently encompasses far
too many different kinds of processing.
Ron Bourret's message, noting the monolithic manner in which XML Schema
is written and the consequences thereof, is a good explanation of why.
> So it makes sense for an implementation, for efficiency reasons,
> for a schema processor to do datatyping, augmentation, and
> defaulting at the same time.
I agree with Sean that this is premature optimization. I'd rather see a
schema standard specify its parts modularly (which isn't the case today), as
if they are to be performed separately, and permit implementations to
figure out conformant ways of making that work if a single pass is
> So the modularity of Schema languages should not only be seen
> in terms of "what functions can be split out into independent
> passes?", but rather "what functions can be split out into notional
> independent passes, but implemented using the same pass?"
I don't think it's worthwhile to consider how to combine the passes
during the design of the schema. Even notionally independent passes
would be an improvement on a monolith, however.
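To make "notionally independent passes" concrete, here is a minimal sketch in Python. It is only an illustration, not any schema processor's actual design: the event shape, the two passes (apply_defaults, check_types), and the rule tables are all hypothetical. Each pass is written as if it ran alone, and chaining generators lets an implementation run both during one traversal of the source.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (kind, element name, attributes).
Event = Tuple[str, str, Dict[str, str]]


def apply_defaults(events: Iterable[Event],
                   defaults: Dict[str, Dict[str, str]]) -> Iterator[Event]:
    """Notional pass 1: fill in missing attribute defaults on start events."""
    for kind, name, attrs in events:
        if kind == "start":
            # Explicit attributes win over defaults.
            attrs = {**defaults.get(name, {}), **attrs}
        yield kind, name, attrs


def check_types(events: Iterable[Event],
                int_attrs: Dict[str, List[str]]) -> Iterator[Event]:
    """Notional pass 2: check that selected attributes parse as integers."""
    for kind, name, attrs in events:
        if kind == "start":
            for attr in int_attrs.get(name, ()):
                if attr in attrs:
                    int(attrs[attr])  # raises ValueError on a bad value
        yield kind, name, attrs


def run_pipeline(source: Iterable[Event]) -> List[Event]:
    """Compose the passes; the source is still traversed only once."""
    defaults = {"item": {"qty": "1"}}   # assumed rules, for illustration
    int_attrs = {"item": ["qty"]}
    return list(check_types(apply_defaults(source, defaults), int_attrs))
```

The specification-level view here is the two separate functions; whether they end up fused into one traversal is left to the composition step, which is where I think that decision belongs.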
> In practice (i.e. for designers of schema languages and scissor-happy
> layerists), it means that for efficiency the node-selection mechanism
> should be shared, while the node manipulation mechanism should
> be modular.
Perhaps, if you want to think of all of this as operations on a tree.
I'm trying instead to blend events and trees in MOE, letting operations
collect information into trees as they need it and then releasing the
tree back into events if desired. Seems a more promising route to me,
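As a rough illustration of that idea (this is not MOE's actual API; Node, with_subtrees, and sort_children are names made up for the sketch), an operation can pass events straight through until it meets an element it cares about, collect that element into a small tree, work on the tree, and then release it back into the stream as events:

```python
from typing import Callable, Iterable, Iterator, List, Tuple, Union

# Hypothetical event shape: ("start", name), ("end", name) or ("text", data).
Event = Tuple[str, str]


class Node:
    """A tiny tree node; children are Nodes or text strings."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List[Union["Node", str]] = []


def tree_to_events(node: Node) -> Iterator[Event]:
    """Release a collected tree back into the event stream."""
    yield ("start", node.name)
    for child in node.children:
        if isinstance(child, Node):
            yield from tree_to_events(child)
        else:
            yield ("text", child)
    yield ("end", node.name)


def with_subtrees(events: Iterable[Event], target: str,
                  transform: Callable[[Node], Node]) -> Iterator[Event]:
    """Pass events through, but buffer each `target` element as a tree."""
    stack: List[Node] = []
    for kind, value in events:
        if not stack:
            if kind == "start" and value == target:
                stack.append(Node(value))      # begin collecting a tree
            else:
                yield (kind, value)            # plain event pass-through
        elif kind == "start":
            child = Node(value)
            stack[-1].children.append(child)
            stack.append(child)
        elif kind == "text":
            stack[-1].children.append(value)
        else:  # "end"
            node = stack.pop()
            if not stack:                      # the target element is complete
                yield from tree_to_events(transform(node))


def sort_children(node: Node) -> Node:
    """Example operation that needs the whole subtree: order child elements."""
    node.children.sort(key=lambda c: c.name if isinstance(c, Node) else "")
    return node
```

Only the elements an operation actually needs are ever held as a tree; everything else stays a stream, so memory use follows the size of the collected fragments rather than the whole document.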
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!