> As someone who was until very recently "one of those implementers" I
> completely disagree with you. We had customers who want to process XML
> documents that hundreds of megabytes to gigabytes in size who can't
> afford to materialize even a fraction of these documents in certain
> cases.
Dare,
what exactly are you disagreeing with?
This discussion is zigzagging. Did you read my postings? Did I ever
tell you that XQuery was the solution for **everything**? I don't
remember saying that.
I was just reading this SAX/streaming/memory consumption discussion,
and being a person who actually designed and implemented such a
streaming XML query processor, I had a terrible sensation of deja vu.
There are solid solutions in the published and implemented state of
the art already.
I was just curious to know whether there are deep technical reasons
why people have to reinvent such techniques. I learned that there are
indeed cases where there is no point in using preexisting XML
processors, simply because they don't apply, and people have to do it
by hand.
But I also learned that a lot of the wheel reinventing is done for
fun. I'm not going to comment on that. Next time I take a plane I can
only cross my fingers that the people who designed the air traffic
control system optimized for something other than their programmers'
fun.
So I reiterate my point: there are well-known techniques to maximize
streaming and minimize memory consumption. Many of them are already
implemented in existing systems, and many will show up in the next
versions of various industrial-strength products.
In the great majority of cases, people who need to process XML don't
need to understand the gory details of buffer management. And they
shouldn't. They should concentrate only on the logic of their
application, and rely on good XSLT/XQuery compilers and runtimes to
choose the right implementation strategy.
As for the well-known techniques for minimizing memory consumption, I
am afraid that I cannot point to any specific technique on this
mailing list, for the following reasons:
(a) there is too much literature to be discussed in such a forum
(b) a lot of it is folklore
(c) a lot of it is simply inherited from the streaming and lazy
    evaluation done by SQL query processors, using the iterator model
    (Goetz Graefe can tell you much more about that than I can, and
    he's closer to you), and you can imagine how much folklore there
    is too after 30 years
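
To make the iterator model mentioned in (c) a bit more concrete, here
is a minimal sketch of the general idea: each operator pulls items
from the operator below it only on demand, so nothing is read until a
consumer asks for it. It is only an illustration, not the design of
any processor discussed in this thread. The operator names (scan,
select, project) and the sample element names (order, id, total) are
hypothetical; the only library call assumed is Python's standard
xml.etree.ElementTree.iterparse.

```python
import io
import xml.etree.ElementTree as ET
from itertools import islice


def scan(source, tag):
    """Leaf operator: yield each <tag> element once its end tag is parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            # Runs when the consumer asks for the next item: release the
            # children and text of the element that was just consumed.
            elem.clear()


def select(rows, predicate):
    """Filter operator: pass through only the rows matching the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def project(rows, field):
    """Projection operator: reduce each row to the text of one child."""
    for row in rows:
        yield row.findtext(field)


if __name__ == "__main__":
    doc = io.BytesIO(
        b"<orders>"
        b"<order><id>1</id><total>5</total></order>"
        b"<order><id>2</id><total>50</total></order>"
        b"<order><id>3</id><total>70</total></order>"
        b"</orders>"
    )
    # Composing the operators does no work; parsing starts at the first pull.
    plan = project(
        select(scan(doc, "order"), lambda o: float(o.findtext("total")) > 10),
        "id",
    )
    # Pull a single result; the third order is never examined by the plan.
    print(list(islice(plan, 1)))
```

The application only states what it wants (orders above a threshold,
projected to their ids); how much of the input is buffered is decided
inside the operators. One caveat of this particular sketch: the
cleared elements stay attached to the root as empty shells, so a real
implementation would have to do more to keep memory bounded.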
The best idea that comes to my mind is to encourage somebody to write
a survey of such techniques; that might be helpful.
My conclusion: please rely on good compilers, good optimizers and good
runtimes instead of writing XML processors by hand if you don't
*really* have to (and few people really have to). And trust the
vendors and open source implementors to produce such good compilers,
optimizers and runtimes when the time comes.
As far as I am concerned, the horse is dead; I don't have much else
to add.
Best regards, have a wonderful holiday season,
Dana