I wrote:
> When one writes:
>
> f:foldl-tree($f:add, $f:add(), 0, /*)
Must be:
When one writes:
f:foldl-tree(f:add(), f:add(), 0, /*)
Dimitre.
"Dimitre Novatchev" <dnovatchev@yahoo.com> wrote in message
news:cqsvbp$fbi$1@sea.gmane.org...
> Why do I think Daniela Florescu is right?
>
> Please, bear with my style, which has nothing to do with SAX and any kind
> of APIs mentioned in this thread. Just read on, I promise you'll agree
> that my message is relevant.
>
> This is the code of the f:foldl-tree() function, which is part of the FXSL
> library:
>
> <xsl:stylesheet version="2.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:f="http://fxsl.sf.net/"
> xmlns:int="http://fxsl.sf.net/int/folfl-tree"
> exclude-result-prefixes="f int"
> >
> <xsl:import href="func-apply.xsl"/>
>
> <xsl:function name="f:foldl-tree">
> <xsl:param name="pFuncNode" as="element()"/>
> <xsl:param name="pFuncSubtrees" as="element()"/>
> <xsl:param name="pA0"/>
> <xsl:param name="pNode" as="element()"/>
>
> <xsl:choose>
> <xsl:when test="not($pNode)">
> <xsl:copy-of select="$pA0"/>
> </xsl:when>
> <xsl:otherwise>
> <xsl:variable name="vSubtrees" select="$pNode/*"/>
>
> <xsl:sequence select=
> "f:apply($pFuncNode,
> $pNode/@tree-nodeLabel,
> int:foldl-tree_($pFuncNode, $pFuncSubtrees, $pA0,
> $vSubtrees)
> )"
> />
> </xsl:otherwise>
> </xsl:choose>
> </xsl:function>
>
> <xsl:function name="int:foldl-tree_">
> <xsl:param name="pFuncNode" as="element()"/>
> <xsl:param name="pFuncSubtrees" as="element()"/>
> <xsl:param name="pA0"/>
> <xsl:param name="pSubTrees" as="element()*"/>
>
> <xsl:sequence select=
> "if(empty($pSubTrees))
> then $pA0
> else
> f:apply($pFuncSubtrees,
> f:foldl-tree($pFuncNode, $pFuncSubtrees, $pA0,
> $pSubTrees[1]),
> int:foldl-tree_($pFuncNode, $pFuncSubtrees, $pA0,
> $pSubTrees[position() > 1])
> )"
> />
> </xsl:function>
> </xsl:stylesheet>
>
> In a few words, this is a generic fold() but over a tree (not just over a
> list). As such, it needs two functions to be provided as parameters -- one
> for processing the current node and one for processing all subtrees of the
> current node.
>
> When one writes:
>
> f:foldl-tree($f:add, $f:add(), 0, /*)
>
> the result of evaluating this is the sum of the values of all
> @tree-nodeLabel attributes of all nodes in the tree.
>
> If I pass as parameters other functions, I'll perform other processing on
> a (any!) tree.
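
For readers who do not use FXSL, the shape of this fold can be sketched in Python. This is an illustrative restatement under stated assumptions, not the library's code: `Node`, `foldl_tree`, and `_fold_subtrees` are hypothetical names, and `label` stands in for the @tree-nodeLabel attribute.

```python
import operator
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Node:
    label: Any                                   # stands in for @tree-nodeLabel
    children: List["Node"] = field(default_factory=list)


def foldl_tree(f_node: Callable, f_subtrees: Callable, a0: Any, node: Node) -> Any:
    # Combine the node's own label with the folded result of all its subtrees.
    return f_node(node.label, _fold_subtrees(f_node, f_subtrees, a0, node.children))


def _fold_subtrees(f_node: Callable, f_subtrees: Callable, a0: Any,
                   subtrees: List[Node]) -> Any:
    # No subtrees left: return the initial accumulator value.
    if not subtrees:
        return a0
    # Otherwise combine the fold of the first subtree with the fold of the rest.
    return f_subtrees(
        foldl_tree(f_node, f_subtrees, a0, subtrees[0]),
        _fold_subtrees(f_node, f_subtrees, a0, subtrees[1:]),
    )


tree = Node(1, [Node(2, [Node(4)]), Node(3)])

# Same role as passing the add function twice: sum of all labels.
total = foldl_tree(operator.add, operator.add, 0, tree)               # 10

# Other functions, other processing on the same tree.
count = foldl_tree(lambda _label, acc: 1 + acc, operator.add, 0, tree)  # 4
depth = foldl_tree(lambda _label, acc: 1 + acc, max, 0, tree)           # 3
```

Only the two function arguments change between the three calls; the traversal itself is written once.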
>
> So, in case of XSLT/XQuery processing, we pass the necessary two functions
> as parameters to f:foldl-tree() and we have implemented an XSLT/XQuery
> processor.
>
> Why is this all relevant to the current discussion?
>
> Because: a fold() processing of any kind is essentially streaming.
>
> Therefore, let's just provide the required two functions and not worry how
> the function engine does streaming -- there could be reasonably efficient
> implementations. The most obvious example is a lazy implementation -- no
> subtrees are ever processed unless ultimately required.
>
> What is more, in a lazy implementation the source tree can itself be
> evaluated lazily -- only those nodes/subtrees will need to be parsed,
> which are ultimately required.
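
A rough Python sketch of the lazy variant, under stated assumptions: the combining functions receive zero-argument callables (thunks) instead of values, and children come from a factory that runs only on demand. `LazyNode`, `make`, `lazy_fold`, and the `built` log are hypothetical names introduced for illustration, and each thunk is expected to be called at most once.

```python
built = []  # labels of the nodes that were actually constructed


class LazyNode:
    def __init__(self, label, make_children=lambda: iter(())):
        self.label = label
        self._make_children = make_children  # invoked only when children are needed

    def children(self):
        return self._make_children()


def make(label, make_children=lambda: iter(())):
    # Construct a node and record that it was built ("parsed").
    built.append(label)
    return LazyNode(label, make_children)


def lazy_fold(f_node, f_subtrees, a0, node):
    # f_node gets the label and a thunk; it decides whether to descend at all.
    return f_node(
        node.label,
        lambda: _lazy_fold_subtrees(f_node, f_subtrees, a0, node.children()),
    )


def _lazy_fold_subtrees(f_node, f_subtrees, a0, subtrees):
    first = next(subtrees, None)  # pulls (and builds) at most one child
    if first is None:
        return a0
    return f_subtrees(
        lambda: lazy_fold(f_node, f_subtrees, a0, first),
        lambda: _lazy_fold_subtrees(f_node, f_subtrees, a0, subtrees),
    )


def root_children():
    # A generator: each child is built only when the fold asks for it.
    yield make("b", lambda: iter([make("c")]))
    yield make("d")


root = make("a", root_children)

# "Does any node carry the label 'b'?" with short-circuit evaluation.
found = lazy_fold(
    lambda label, rest: label == "b" or rest(),
    lambda first, rest: first() or rest(),
    False,
    root,
)
# found is True and built == ["a", "b"]: nodes "c" and "d" were never built.
```

Because `or` stops at the first true operand, the fold never forces the remaining thunks, so the untouched part of the tree is neither visited nor constructed.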
>
> Just as a side note -- streaming a tree implies linearization -- this may
> work against efficiency compared with parallelization (e.g. using a DVC
> (divide and conquer) approach), which is the ultimate strength of
> functional languages and will start to matter more and more as explained
> in the paper "The Free Lunch Is Over: A Fundamental Turn Toward
> Concurrency in Software"
> (http://www.gotw.ca/publications/concurrency-ddj.htm) by Herb Sutter.
>
> Parallelization may require that different threads share the same data,
> which will delay the point at which this data can be discarded from memory.
>
>
> Cheers,
>
> Dimitre Novatchev.
>
>
> "Daniela Florescu" <dflorescu@mac.com> wrote in message
> news:30291DBF-590E-11D9-A33A-000393DC762C@mac.com...
>>> As someone who was until very recently "one of those implementers" I
>>> completely disagree with you. We had customers who want to process XML
>>> documents that are hundreds of megabytes to gigabytes in size who can't
>>> afford to materialize even a fraction of these documents in certain
>>> cases.
>>
>>
>> Dare,
>>
>> what exactly are you disagreeing with ?
>>
>> This discussion is going in zig-zag. Did you read my postings ? Did I
>> ever tell
>> you that XQuery was the solution for **everything** !? I don't remember
>> saying that.
>>
>> I was just reading this SAX/streaming/memory consumption discussion, and
>> being a person who actually designed and implemented such a streaming XML
>> query processor, I had a terrible sensation of deja vu. There are solid
>> solutions
>> in the published and implemented state of the art already.
>>
>> I was just curious to know if there are deep technical issues why people
>> have to
>> reinvent such techniques. I learned that there are cases where indeed
>> there is
>> no point in using preexisting XML processors, simply because they don't
>> apply,
>> and people have to do it by hand.
>>
>> But I also learned that a lot of reinventing the wheel is also for fun.
>> I'm not going to comment on that. Next time I take a plane I can only
>> cross my fingers that the people who designed the air traffic control
>> system optimized for something different than their programmers' fun.
>>
>> So I reiterate my point: there are well known techniques to maximize
>> streaming and minimize memory consumption. Many of them are already
>> implemented in existing systems, and many will show up in the next
>> versions of various industrial strength products.
>>
>> In a big majority of the cases, people who need to process XML don't need
>> to understand
>> the gory details of buffer management. And they shouldn't. They should
>> concentrate only
>> on the logic of their application, and rely on good XSLT/XQuery
>> compilers and runtimes
>> to do the right job concerning the implementation strategy.
>>
>> As for the well known techniques for minimizing memory consumption, I am
>> afraid that
>> I cannot point to any specific technique on this mailing list, for the
>> following reasons:
>>
>> (a) it's too much literature to be discussed in such a forum
>> (b) a lot of it is folklore
>> (c) a lot of it is simply inherited from streaming and lazy evaluation of
>> SQL query processors, using the iterator model (Goetz Graefe can tell you
>> much more about that than me, and he's closer to you), and you can
>> imagine how much folklore there is too after 30 years
>>
>> The best idea that comes to my mind is to encourage somebody to write a
>> survey of such techniques, that might be helpful.
>>
>> My conclusion: please rely on good compilers, good optimizers and good
>> runtimes
>> instead of writing XML processors by hand if you don't *really* have to
>> (and few people
>> really have to). And trust the vendors/open source implementors that they
>> will produce
>> such good compilers, optimizers and runtimes when time comes.
>>
>> As far as I am concerned, the horse is dead, I don't have much else to
>> add.
>>
>> Best regards, have a wonderful holiday season,
>> Dana
>>
>>
>>
>>
>> -----------------------------------------------------------------
>> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>> initiative of OASIS <http://www.oasis-open.org>
>>
>> The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>> To subscribe or unsubscribe from this list use the subscription
>> manager: <http://www.oasis-open.org/mlmanage/index.php>
>>
>>