<snip/>
> It appears that what is needed is:
> - also 4 kinds of sizes
> - a means to read ahead in the SAX events
> To achieve this, I intend to write a cache that could store some events
> (limited to 100 or 1000 or whatever you set as a default parameter).
> Thus, when a size is requested, the engine goes on reading the input
> until the information is known, then the step is evaluated, and later
> the stored events are fetched.
> This is a smart strategy because it is not limited to count(), but
> applies to any operation that needs to read further, so a predicate that
> contains following-sibling:: may also be handled. The idea is to use the
> cache only when it is explicitly needed (putting a whole document in a
> cache wouldn't be SAX, but DOM). The events could also be cached in a
> tree fragment; I don't know yet what the best way to achieve this is.
>
> Of course there are cases where the expected information is not
> reachable within the limit of the cache size, or is lost because it has
> already been read, but it will help in many other cases.
> As I have some code that pours SAX events into DOM trees, I'll
> provide a smart means to match a pattern on the SAX input and process
> the subtree with full XPath capabilities; this might be very helpful
> for very large documents.
>
> What do you think of such a strategy?
> Did you do something similar in Saxon?
<snip/>
You might want to take a look at XML Path Event API (XPEA) [1] by Karl
Waclawek. Also, there is some work in .NET on caching-based XmlReader
implementations (I think Oleg or Dare had a blog post on this). In
general, decoupling the caching allowed for a fair amount of flexibility
down the road.
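
To make the quoted idea concrete, here is a minimal sketch of a bounded
look-ahead cache kept separate from the evaluator. It is not the XPEA,
Saxon, or .NET API: the names LookaheadCache, read_ahead, next_event and
count_children are hypothetical, and events are simplified to
(kind, name) tuples rather than real SAX callbacks.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, Optional, Tuple

Event = Tuple[str, str]  # simplified event: (kind, name), e.g. ("start", "item")


class LookaheadCache:
    """Hypothetical bounded buffer that lets a consumer read ahead in an event stream."""

    def __init__(self, source: Iterable[Event], limit: int = 100) -> None:
        self._source: Iterator[Event] = iter(source)
        self._pending: Deque[Event] = deque()  # events read ahead, not yet delivered
        self._limit = limit

    def read_ahead(self, found: Callable[[Event], bool]) -> Optional[int]:
        """Scan forward until found(event) is true.

        Returns how many events were examined (including the match), or None
        if the cache limit or the end of input was reached first. Every event
        examined stays in the cache for later replay.
        """
        # Events cached by an earlier look-ahead are examined first.
        for index, event in enumerate(self._pending):
            if found(event):
                return index + 1
        # Then pull new events from the input, up to the cache limit.
        while len(self._pending) < self._limit:
            event = next(self._source, None)
            if event is None:
                return None
            self._pending.append(event)
            if found(event):
                return len(self._pending)
        return None

    def next_event(self) -> Optional[Event]:
        """Deliver the next event: cached events first, then fresh input."""
        if self._pending:
            return self._pending.popleft()
        return next(self._source, None)


def count_children(cache: LookaheadCache) -> Optional[int]:
    """Count child elements of the element whose start event was just consumed.

    Returns None when the parent's end event is not reachable within the limit.
    """
    depth = 0
    children = 0

    def closes_parent(event: Event) -> bool:
        nonlocal depth, children
        kind, _name = event
        if kind == "start":
            if depth == 0:
                children += 1  # a direct child of the current element
            depth += 1
        elif kind == "end":
            if depth == 0:
                return True  # this end event belongs to the parent itself
            depth -= 1
        return False

    return children if cache.read_ahead(closes_parent) is not None else None
```

A caller would consume the parent's start event with next_event(), ask
count_children() for the size, and then keep calling next_event() to
replay the cached events in their original order. When the limit is hit
first, the answer is None and nothing is lost: the events already read
are still replayed.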
Cheers,
Jeff Rafter
[1] http://sourceforge.net/projects/xpea