xml-dev - Re: [xml-dev] What should TrAX look like? (Was: Re: [xml-dev] Article on


To: Elliotte Harold <elharo@metalab.unc.edu>
Subject: Re: [xml-dev] What should TrAX look like? (Was: Re: [xml-dev] Article on JAXP 1.3 "Fast and Easy XML Processing")
From: Chris Burdess <dog@bluezoo.org>
Date: Thu, 17 Feb 2005 13:41:39 +0000
Cc: XML Developers List <xml-dev@lists.xml.org>
In-reply-to: <421486A8.5050303@metalab.unc.edu>
References: <830178CE7378FC40BC6F1DDADCFDD1D1042DAC47@RED-MSG-31.redmond.corp.microsoft.com> <4213F5CE.1000102@propylon.com> <421486A8.5050303@metalab.unc.edu>

Elliotte Harold wrote:
> Here's a perhaps more useful question. Could we define an alternate 
> source interface that would allow validators, transformers, and query 
> tools to hook into arbitrary models? Specifically, could we define one 
> that would be complete, unlike Source; and would not require these 
> tools to provide special support for each different object model? What 
> would such an interface look like?

Perhaps there could be a consistent API that represents the input at 
various levels of "parsedness", that can effectively replace the SAX 
InputSource for SAX/StAX and/or DOM parsers, and provide more 
information for object graphs with or without PSVI annotations.

> Possibly the issues of transforms are different from query tools and 
> validators. All transform engines I've seen build their own internal 
> model. They do not work directly on top of DOM, SAX, XOM, or other 
> things.

The GNU JAXP transformer works directly with DOM Level 3 Core trees. 
Two new trees are generated during the transformation: a normalised 
version of the source tree, and the result tree. Both of these are DOM 
Level 3 Core.
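
As a rough illustration of a DOM-in, DOM-out transformation through the standard JAXP API, the sketch below runs the identity transformation from a DOMSource into a DOMResult. It uses whatever TransformerFactory the runtime provides, so it says nothing about how the GNU JAXP transformer works internally; the class and method names (DomToDomTransform, copyViaTransformer, sampleDocument) are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not part of any library.
class DomToDomTransform {

    // Runs the identity transformation from one DOM tree into a new DOM tree.
    static Document copyViaTransformer(Document source) throws Exception {
        // newTransformer() with no stylesheet yields the identity transformation.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // An empty DOMResult asks the transformer to create the result Document.
        DOMResult result = new DOMResult();
        identity.transform(new DOMSource(source), result);
        return (Document) result.getNode();
    }

    // Builds a one-element source document in memory.
    static Document sampleDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();
        Element root = doc.createElementNS(null, "greeting");
        root.setTextContent("hello");
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document out = copyViaTransformer(sampleDocument());
        System.out.println(out.getDocumentElement().getTagName());
    }
}
```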

> Validators and query tools, by contrast, tend not to construct new 
> object models and do work directly on top of the preexisting in-memory 
> representations of the XML document.

In many ways the issue is the same for validators: the process of 
validation takes as input a DOM tree and outputs an annotated DOM tree. 
Since there is no Node.setTypeInfo method, the validator must either 
construct a new tree or have a priori knowledge of the Node subclass 
and the means of associating the type information with it.
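
To make the difficulty concrete: because the Node interface offers no setter for type information, any annotation has to live somewhere the validator controls. The sketch below is only an illustration of that point, not one of the two options above and not anything an existing validator is known to do. It keeps the annotations in a side table keyed by node identity; TypeAnnotations and its methods are hypothetical names, and the type is reduced to a plain string.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper: records type names beside the tree, not inside it.
class TypeAnnotations {

    // Keyed by object identity, so it works for any Node implementation.
    private final Map<Node, String> typeNames = new IdentityHashMap<>();

    void annotate(Node node, String typeName) {
        typeNames.put(node, typeName);
    }

    // Returns null when the node has not been annotated.
    String typeNameOf(Node node) {
        return typeNames.get(node);
    }

    // Annotates the root element of a small document and reads the value back.
    static String demo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element price = doc.createElementNS(null, "price");
        doc.appendChild(price);

        TypeAnnotations annotations = new TypeAnnotations();
        annotations.annotate(price, "xs:decimal");
        return annotations.typeNameOf(price);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The obvious cost is that consumers of the tree must be handed the table as well, which is the same coupling problem in another form.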

> Does this seem plausible? Does this seem worth doing? Does anyone have 
> any other ideas?

I believe that it would be worth doing, if possible.

1. The stream source must be able to provide a byte stream and entity 
metadata (SYSTEM and/or PUBLIC id). I believe it's a design error to 
provide a character stream: determination of the encoding should be 
made by the parser.

2. The tree source must be able to provide either:
a. an object implementing the Node interface (simple but DOM-specific), 
or
b. an object resembling a tree navigator that can be used to iterate 
over the nodes in the tree and retrieve individual node objects (more 
complex but object model agnostic, should perhaps be combined with a 
property (a URI?) indicating the object model(s) supported).
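
A minimal sketch of what points 1 and 2b might look like in Java follows. Every name in it is hypothetical (StreamInput, TreeNavigator, DomNavigator, BytesInput, the urn:example URI); none of this exists in JAXP. The DOM binding is there only to show that a consumer can be written against the navigator without knowing the object model behind it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical, point 1: raw bytes plus entity metadata, no character stream.
interface StreamInput {
    InputStream getByteStream();
    String getSystemId(); // may be null
    String getPublicId(); // may be null
}

// Hypothetical, point 2b: navigation over opaque node handles of type N.
interface TreeNavigator<N> {
    String getObjectModelUri(); // identifies the underlying object model
    N getRoot();
    List<N> getChildren(N node);
    String getName(N node);
}

// In-memory StreamInput used only by the demo.
class BytesInput implements StreamInput {
    private final byte[] bytes;
    private final String systemId;
    BytesInput(byte[] bytes, String systemId) { this.bytes = bytes; this.systemId = systemId; }
    public InputStream getByteStream() { return new ByteArrayInputStream(bytes); }
    public String getSystemId() { return systemId; }
    public String getPublicId() { return null; }
}

// One possible binding of the navigator to DOM.
class DomNavigator implements TreeNavigator<Node> {
    private final Document doc;
    DomNavigator(Document doc) { this.doc = doc; }
    public String getObjectModelUri() { return "urn:example:object-model:dom"; } // made-up URI
    public Node getRoot() { return doc; }
    public List<Node> getChildren(Node node) {
        List<Node> children = new ArrayList<>();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            children.add(c);
        }
        return children;
    }
    public String getName(Node node) { return node.getNodeName(); }
}

class NavigatorDemo {
    // Depth-first walk written only against TreeNavigator, never against DOM.
    static <N> void collectNames(TreeNavigator<N> nav, N node, List<String> names) {
        names.add(nav.getName(node));
        for (N child : nav.getChildren(node)) {
            collectNames(nav, child, names);
        }
    }

    // Parses bytes (the parser determines the encoding) and walks the result.
    static List<String> demo() throws Exception {
        StreamInput in = new BytesInput("<a><b/><c/></a>".getBytes("UTF-8"),
                "file:///example.xml");
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(in.getByteStream(), in.getSystemId());
        TreeNavigator<Node> nav = new DomNavigator(doc);
        List<String> names = new ArrayList<>();
        collectNames(nav, nav.getRoot(), names);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

A real navigator would need far more than names and children (attributes, namespaces, text, type annotations), which is where the "more complex" in 2b comes from.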
-- 
Chris Burdess
