
RE: About Infosets

From: Didier PH Martin <martind@netfolder.com>
To: Sean McGrath <sean@digitome.com>, Xml-Dev <xml-dev@xml.org>
Date: Sun, 04 Mar 2001 09:00:08 -0500

Hi Sean,

[Sean's answer... It's OK, we have what's needed]

Didier replies:
Not exactly, and this is why I am happy with the thread Charles started,
"We need an XPath API". I totally agree with Charles: we need an XPath API,
and furthermore we need an XPath/XPointer API. Why?

Simply because what is happening today is that, on one hand, you have a SAX
parser that can be used by an XSLT processor, but it is the XSLT processor
that builds the infoset (with its own internal model). This implies that the
infoset can only be accessed by the XSLT processor. So the point is: forget
about the pipe - everybody is playing in their own courtyard and nobody
shares a common data structure. You have a pipe only if the processors can
access the same data structure. For instance, here is a real pipe:

XML doc---------------->XSLT doc
   |                       |
   |                       |
Converted into          templates are applied
an info set with        on the infoset created
a parser                by the parser

I could as well insert a new processing step, applied to the infoset, between
the parser that converts the text into an infoset and the XSLT processor: for
instance, an XInclude processor. The document is first transformed into an
infoset, and this same infoset is accessed by the XInclude processor (which
means that the infoset is modified by the XInclude processor). Later on, the
modified infoset is accessed by an XSLT processor (which creates a new
infoset). We can also imagine other processors applying new processes to the
infosets. We now have a data flow processing model in hand.
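
To make the data flow idea concrete, here is a minimal sketch in Python. It is
only an illustration under stated assumptions: the standard library's
ElementTree stands in for the infoset, and the "include" element, the
FRAGMENTS table and the three stage functions are hypothetical names invented
for this example. They are toy stand-ins, not real XInclude or XSLT
processors. The point is only that every stage receives the tree produced by
the previous one.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments that the include stage can pull in (assumption).
FRAGMENTS = {"intro": "<para>Hello from an included fragment</para>"}

def parse_stage(text):
    """De-serialize the text once; every later stage works on this tree."""
    return ET.fromstring(text)

def include_stage(root):
    """Toy stand-in for an XInclude processor: modifies the tree in place."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "include":
                parent[index] = ET.fromstring(FRAGMENTS[child.get("ref")])
    return root

def transform_stage(root):
    """Toy stand-in for an XSLT processor: builds a new tree from the old one."""
    out = ET.Element("html")
    body = ET.SubElement(out, "body")
    for para in root.findall(".//para"):
        ET.SubElement(body, "p").text = para.text
    return out

def run_pipeline(text, stages):
    """Parse once, then hand the same in-memory structure from stage to stage."""
    tree = parse_stage(text)
    for stage in stages:
        tree = stage(tree)
    return tree

doc = "<doc><include ref='intro'/><para>Local text</para></doc>"
result = run_pipeline(doc, [include_stage, transform_stage])
print(ET.tostring(result, encoding="unicode"))
```

Adding another processor is then just one more entry in the list of stages,
which is exactly what is not possible when each tool keeps its own private
model.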

The whole point here is that without a way to access the nodes through an API
(based on an addressing scheme, as XSLT currently does) it is hardly a
processing pipeline, just a wet dream :-) And if you still remember your
teen years, you know how frustrating wet dreams can be when you wake up :-))
Seriously, infosets need to be more than simple abstractions, and we already
learned that with groves. In the OpenJade project we learned that when the
text is de-serialized we in fact build a mini hierarchical database, and it
is this hierarchical database that is accessed by the DSSSL engine. It is
not necessary to standardize the internal structure of this hierarchical
database, only its interface. The processing modules get what they need
through the interface. We have created query languages like XPath and
XPointer that allow you to obtain a node list from an internal structure.
As you know, node list processing is at the heart of Lisp (oops...)... XML
processing. Thus, if the internal structure can be accessed with a common
API, we can build processing pipes that access a common data structure. If
we compare XPath/XPointer to the s-expressions we had in the SGML world,
XPath/XPointer expressions are, to a certain point, a lot more compact and
intuitive: they closely resemble file paths, so we can transfer the knowledge
we gained with file paths to XPath/XPointer expressions. This is not the case
with s-expressions.
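
As a rough sketch of what "standardize the interface, not the internal
structure" could look like, assume a hypothetical NodeSource interface with a
single select() method that takes a path expression and returns a node list.
None of these names come from an existing standard. The backing store here is
Python's ElementTree, which implements only a limited subset of XPath, but the
two modules below never look behind the interface.

```python
import xml.etree.ElementTree as ET
from typing import List, Protocol

class NodeSource(Protocol):
    """Hypothetical shared interface: a path expression in, a node list out."""
    def select(self, path: str) -> List[ET.Element]: ...

class TreeBackedSource:
    """One possible backing store; callers never see how the nodes are kept."""
    def __init__(self, text: str) -> None:
        self._root = ET.fromstring(text)

    def select(self, path: str) -> List[ET.Element]:
        # ElementTree supports only a limited subset of XPath.
        return self._root.findall(path)

def count_titles(source: NodeSource) -> int:
    """A read-only module: it only needs the node list."""
    return len(source.select(".//title"))

def mark_drafts(source: NodeSource) -> None:
    """A modifying module: it changes nodes obtained through the same API."""
    for chapter in source.select("./chapter[@status='draft']"):
        chapter.set("reviewed", "no")

source = TreeBackedSource(
    "<book><chapter status='draft'><title>One</title></chapter>"
    "<chapter status='final'><title>Two</title></chapter></book>"
)
mark_drafts(source)
titles = count_titles(source)
flagged = source.select("./chapter[@reviewed='no']")
```

Both modules see the same nodes, so a change made by one is visible to the
other, and the backing store could be swapped for another implementation
without touching either of them.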

Conclusion: until we have a real API to access nodes with XPath/XPointer, we
will keep having wet dreams, but nothing concrete with which to mix and match
processing modules. We are close, but not there yet. The full power of XML
processing will arrive when the W3C WG agrees on a common API to access
nodes, and if this API uses XPath/XPointer expressions (and if XQuery is
built on top of XPath/XPointer and thus becomes a superset of
XPath/XPointer). It may not be possible to have a language-independent API,
but we should at least get modules written in the same language to share
this common API.

So we are very close but not there yet. Processing modules need a common
way to interact with the data structure. Said differently, we now have to
move away from parsing (syntax) to processing (doing something with the
result), and define how modules can access and share a common data
structure (the hierarchical database created by de-serialization during the
parsing process).

cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences: xml devcon 2000 (http://www.xmldevcon2000.com)
             Wireless Summit NY (http:www.pulver.com)
             xml devcon 2001 London (http://www.xmldevcon2000.com)
Book: XML Professional (http://www.wrox.com)
column: xml.com (http://www.xml.com)

Follow-Ups:
- Re: About Infosets
  - From: "W. E. Perry" <wperry@fiduciary.com>

References:
- Re: About Infosets
  - From: Sean McGrath <sean@digitome.com>
