To: Dimitre Novatchev <dnovatchev@yahoo.com>
Subject: Re: [xml-dev] Re: Streaming XML (WAS: More on taming SAX (was Re:[xml-dev] ANN: Amara XML Toolkit 0.9.0))
From: Uche Ogbuji <Uche.Ogbuji@fourthought.com>
Date: Thu, 30 Dec 2004 14:19:08 -0700
Cc: xml-dev@lists.xml.org
In-reply-to: <cqsvbp$fbi$1@sea.gmane.org>
Organization: Fourthought, Inc.
References: <830178CE7378FC40BC6F1DDADCFDD1D10276723C@RED-MSG-31.redmond.corp.microsoft.com> <30291DBF-590E-11D9-A33A-000393DC762C@mac.com> <cqsvbp$fbi$1@sea.gmane.org>

On Wed, 2004-12-29 at 11:56 +1100, Dimitre Novatchev wrote: 
> Why I think Daniela Florescu is right?

Please forgive me for suggesting that based on what I've read her write
in the thread, I'm not sure she herself would recognize in your post
just how it is you think she's right :-)

[Snip stuff along the lines of:] 
> If I pass as parameters other functions, I'll perform other processing on a 
> (any!) tree.

Sure.  And I can do this using similar functional techniques in Python.
My understanding of HaXML is that it makes this sort of thing even more
elegant than anything any of us have posted in Python, XQuery, XSLT,
Java, etc., but that's just hearsay.
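
To make the Python claim concrete, here is a minimal sketch of the
"pass other functions, get other processing" idea as a generic fold over
an element tree.  The names fold_tree, visit and combine are my own
inventions for illustration, not part of any toolkit mentioned in this
thread, and the standard library's ElementTree stands in for whatever
tree API you prefer.

```python
import xml.etree.ElementTree as ET

def fold_tree(node, visit, combine):
    """Reduce a tree bottom-up.

    visit(node) yields a value for one node; combine(own, child_results)
    merges that value with the already-folded results of the children.
    Both names are hypothetical, chosen only for this sketch.
    """
    child_results = [fold_tree(child, visit, combine) for child in node]
    return combine(visit(node), child_results)

doc = ET.fromstring("<a><b>1</b><c><d>2</d><e>3</e></c></a>")

# Same traversal, three different computations, selected purely by the
# functions passed in.
element_count = fold_tree(doc, lambda n: 1,
                          lambda own, kids: own + sum(kids))

text_total = fold_tree(
    doc,
    lambda n: int(n.text) if n.text and n.text.strip().isdigit() else 0,
    lambda own, kids: own + sum(kids))

depth = fold_tree(doc, lambda n: 1,
                  lambda own, kids: own + max(kids, default=0))
```

Nothing here is specific to Python; the point is only that the technique
is available without switching languages.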

My point is that I think the ideas of declarativity, divide-and-conquer,
and data-driven processing are agreed upon by most in this thread.
Florescu seems to like to claim XQuery is the only means to such ends.
I know you like XSLT, especially with your functional library in tow.
My point is: to each his own.  I just don't buy the "drop everything
you're doing and write it all in XQuery first".  You've provided me
nothing to support that position.

> Therefore, let's just provide the required two functions and not worry how 
> the function engine does streaming -- there could be reasonably efficient 
> implementations. The most obvious example is a lazy implementation -- no 
> subtrees are ever processed unless ultimately required.

In my toolkit, there are tools for when it's OK to leave such details to
the framework, and tools for when the programmer needs to open the hood.
This is based on observation of actual need.  Many others in this thread
have corroborated that with concrete examples.

> Just as a side note -- streaming a tree implies linearization -- this may go 
> against efficiency when opposed to parallelization (e.g. using a DVC (divide 
> and conquer) approach)

Oh.  No.  You're wrong here.  This is what I like to call the "slander
on von Neumann's good name".  The fact that most of today's computer
processing can be abstracted into a strict sequence of instructions,
inputs and outputs does not mean that in the actual execution this
sequence has to be executed serially.  At the lowest level, cf.
pipelining in modern microprocessors.  At the level we're discussing in
this thread, there is no reason why a stream of events cannot be divided
into concurrent streamlets.  SAX says nothing that prevents temporally
simultaneous events.  And at the gee-whiz futuristic level, von
Neumann's own Universal Constructors suggest possibilities beyond all
our current imaginations.
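
As a rough sketch of what I mean by streamlets, assuming Python's
standard xml.sax and concurrent.futures modules: the handler below cuts
the event stream at each child of the root element and hands every
completed slice to a worker pool.  StreamletSplitter, total_qty and the
sample document are hypothetical names made up for this example, and
threads in CPython will not buy real CPU parallelism, so read it as an
illustration of the structure rather than a tuned implementation.

```python
import xml.sax
from concurrent.futures import ThreadPoolExecutor

class StreamletSplitter(xml.sax.ContentHandler):
    """Cut the event stream into one streamlet per child of the root."""

    def __init__(self, pool, worker):
        super().__init__()
        self.pool = pool
        self.worker = worker
        self.depth = 0
        self.events = []    # events of the streamlet being collected
        self.futures = []   # one pending result per dispatched streamlet

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth >= 2:
            self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        if self.depth >= 2:
            self.events.append(("chars", content))

    def endElement(self, name):
        if self.depth >= 2:
            self.events.append(("end", name))
        if self.depth == 2:
            # A child of the root just closed: its streamlet is complete,
            # so dispatch it and start collecting the next one.
            self.futures.append(self.pool.submit(self.worker, self.events))
            self.events = []
        self.depth -= 1

def total_qty(events):
    """Hypothetical worker: sum the qty attributes seen in one streamlet."""
    return sum(int(e[2].get("qty", 0)) for e in events if e[0] == "start")

XML = (b"<orders>"
       b"<order><item qty='2'/><item qty='3'/></order>"
       b"<order><item qty='5'/></order>"
       b"</orders>")

with ThreadPoolExecutor(max_workers=2) as pool:
    splitter = StreamletSplitter(pool, total_qty)
    xml.sax.parseString(XML, splitter)
    totals = [f.result() for f in splitter.futures]
```

The parser still delivers events in document order; it is the consumers
that are free to run side by side, and each streamlet's buffer can be
discarded as soon as its worker finishes.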

> Parallelization may require that different threads share the same data, 
> which will delay the possibility to discard this data from memory.

For parallelization to be practical in most cases, there will be an
essential isolation of the state's processing from the state's storage
management (good old logical layer versus physical layer).

The fact that I use XPatterns to govern the state table does not affect
this matter in any way.

-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com