About Infosets


Even though I am very busy putting the final touches on the talvastudio.com
site, I still have time to lurk on the list and think about what is said.
Something caught my attention: Infosets.

I am glad Henry presented the underlying model behind XML. To what he said,
let me add this:

Food for thought:

a) An infoset can also be perceived as a hierarchical database.
b) A particular node, being a document model abstraction, may benefit from
also being something else. For instance, an <invoice> element is an element
_and_ an invoice entity. The invoice entity is the real semantic object; the
element is tied to the syntax. So it helps if we find ways to have a node
inherit the document abstraction (i.e. a syntax entity) and _also be_ a
semantic entity. Taken as a node abstraction, the element's hierarchy, and
therefore the node's position within the hierarchy, supports the composite
pattern (see Gamma et al., Design Patterns). The fact that the document
structure is based on the composite pattern is sufficient, but not
necessary, to say that we have here the basis for a hierarchical database.

So, as we said, point b) leads to the fact that an infoset is a potential
hierarchical database. Therefore an XML document is a serialized
hierarchical database fragment.
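
To make point b) concrete, here is a minimal Python sketch of a node that is
both a syntax entity and a semantic entity. The sample document, the Invoice
class, and the attribute names (number, qty, price) are all assumptions made
up for illustration; nothing here is a standard API for infosets.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical serialized fragment; element and attribute names are illustrative.
SAMPLE = """<invoice number="42">
  <line sku="A1" qty="2" price="10.00"/>
  <line sku="B7" qty="1" price="5.50"/>
</invoice>"""


@dataclass
class Invoice:
    """Semantic view layered over a syntactic <invoice> node."""

    node: ET.Element  # the document-model (syntax) side of the same object

    @property
    def number(self) -> str:
        return self.node.get("number", "")

    def total(self) -> float:
        # Walk the child nodes (the composite structure) to compute a
        # value that only makes sense for the invoice entity.
        return sum(
            int(line.get("qty")) * float(line.get("price"))
            for line in self.node.findall("line")
        )


root = ET.fromstring(SAMPLE)  # deserialize the fragment into a tree
invoice = Invoice(root)       # same node, now also an invoice entity
```

The wrapper keeps a reference to the node rather than copying it, so the
element's position in the hierarchy stays available next to the invoice
behaviour.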

Current limitations of the infoset:

Until we have a way to query an infoset with an API, forget about the
processing pipeline. Of course, that holds until we have at least a
recommended one, not a standard one :-). For instance, the coupling between a
parser transforming a serialized format (i.e. an XML document) into an infoset
(i.e. a mini hierarchical database) and an XSLT engine may be through
something like Microsoft's selectNodes function (Doctor's recommendation:
if you think Microsoft is the evil empire, just imagine that your mother or
your best friend invented this useful API :-). This function lets you
extract a node list from an infoset. If you pay close attention to an XSLT
engine's needs, you'll notice that it is starved for node lists
extracted from an infoset (using XPath addressing).
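
As a rough illustration of the kind of call meant here, the sketch below
uses Python's xml.etree.ElementTree, whose findall method accepts a limited
subset of XPath. The select_nodes wrapper is a hypothetical name chosen to
echo selectNodes; it is not Microsoft's API, and the sample document is
invented.

```python
import xml.etree.ElementTree as ET


def select_nodes(root, path):
    """Return the list of elements under `root` that match `path`.

    `path` must stay within the XPath subset that ElementTree supports.
    """
    return root.findall(path)


# Hypothetical infoset built from a small serialized document.
doc = ET.fromstring(
    "<orders>"
    "<invoice id='1'><line/></invoice>"
    "<invoice id='2'/>"
    "<note/>"
    "</orders>"
)

invoices = select_nodes(doc, "./invoice")          # all invoice children
with_lines = select_nodes(doc, "./invoice[line]")  # invoices having a <line> child
```

A full XPath engine would accept richer expressions, but the shape is the
same: an expression goes in, a node list comes out.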

So, what is needed to help access an infoset is an API that extracts
node lists with XPath expressions. I thought DOM3 would do that, but I have
neither seen nor heard anything about it in the presented specifications
(from the W3C). Why? Haven't we had three versions to learn what is useful
and necessary? Maybe there are some good political reasons behind this
decision. But there are also some good practical reasons to add a query
function to the current DOM API. Until then, forget about the dream of
having pipelined infosets and so on and so forth.

So, in fact, the whole processing pipeline also means that:
a) a parser can be used to build the hierarchical database (i.e. the
b) processors may query elements from this hierarchical database to perform
some processing, for instance creating a new hierarchical database by
extracting node lists and performing some processing on them, by
modifying the infoset structure, or by any other useful process.

In some ways, a processing pipeline resembles data flow engines in which data
structures move from one processor to another, each processor altering the
data structure (though an XSLT engine creates a new, altered
copy of the data structure instead).
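
The pipeline described above could be sketched as follows, again with
Python's xml.etree.ElementTree standing in for the infoset. The stage names
(keep_unpaid, mark_reviewed), the run_pipeline helper, and the sample data
are hypothetical. Each stage returns a new tree, in the XSLT style, instead
of altering its input.

```python
import copy
import xml.etree.ElementTree as ET


def keep_unpaid(tree):
    """Build a NEW tree holding only invoices with paid='no'."""
    out = ET.Element(tree.tag)
    for node in tree.findall("./invoice[@paid='no']"):
        out.append(copy.deepcopy(node))  # copy, so the input stays untouched
    return out


def mark_reviewed(tree):
    """Return an altered copy with reviewed='yes' set on every invoice."""
    result = copy.deepcopy(tree)
    for node in result.findall("./invoice"):
        node.set("reviewed", "yes")
    return result


def run_pipeline(text, stages):
    """Parse the serialized document, then pass the tree through each stage."""
    tree = ET.fromstring(text)
    for stage in stages:
        tree = stage(tree)
    return tree


SOURCE = "<orders><invoice id='1' paid='yes'/><invoice id='2' paid='no'/></orders>"
result = run_pipeline(SOURCE, [keep_unpaid, mark_reviewed])
```

Every stage depends on a query call (findall here) to get its node lists,
which is why the lack of a query function in the DOM blocks this style of
processing.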

Should I ask for a wake up call at the W3C hotel's front desk for the DOM
group? :-)

Didier PH Martin
Email: martind@netfolder.com
Conferences: xml devcon 2000 (http://www.xmldevcon2000.com)
             Wireless Summit NY (http://www.pulver.com)
             xml devcon 2001 London (http://www.xmldevcon2000.com)
             wireless one: (http://www.wirelessonecon.com/lasvegas/index.asp)
Book: XML Professional (http://www.wrox.com)
column: xml.com (http://www.xml.com)