On Wed, Aug 17, 2005 at 06:53:41PM +0100, Michael Kay wrote:
>> If the document falls out of scope then both XSLT 1 and 2 allow
>> an implementation to discard it. I don't think we'll see a
>> procedural way to discard a document otherwise, except as
>> part of something like the XQuery update facility perhaps.
> In practice it's quite difficult to discard the document automatically. The
> spec offers two guarantees:
>
> (a) if the same document (URI) is loaded again, you'll get the same node
> identifiers
>
> (b) if the same document (URI) is loaded again, it will have the same
> content
>
> It would be possible to discard the document and achieve (a) by remembering
> the node identifiers and reusing them if needed.
Yes.
> Achieving (b) though is really hard, given that the URI might in the
> worst case identify a random number generator. The only real way to do
> it is to serialize a private copy of the document to disk.
You could also behave differently depending on the URI scheme --
an extension to say "trust http expiry times and that the stylesheet
will take no more than 3 hours to run :-) and trust that input files
won't change on disk" might be interesting.
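A minimal sketch of what such a per-scheme policy might look like inside a processor, written in Python purely for illustration. The `STABLE_SCHEMES` table and the `may_discard` helper are hypothetical names, not part of any XSLT processor, and the sketch ignores HTTP expiry headers entirely:

```python
from urllib.parse import urlparse

# Hypothetical policy table: URI schemes the user has said can be trusted to
# return identical content if the document is fetched again during one run.
STABLE_SCHEMES = {"file", "http", "https"}

def may_discard(uri: str) -> bool:
    """Return True if the document could be dropped and re-fetched later."""
    scheme = urlparse(uri).scheme.lower()
    # A relative reference has no scheme; treat it as a local file.
    return (scheme or "file") in STABLE_SCHEMES
```

Anything outside the trusted set (the random number generator case) would still have to be kept in memory or copied to disk.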
> The real problem though is in deciding when it's a good idea to discard the
> document. For example, if the stylesheet is working its way through the
> @href links from the primary source document, what's the chance that you'll
> want to visit the same target document more than once?
Are there some special cases that are big wins in practice?
E.g. consider:
<xsl:template match="foo">
  <!--* load a 500MByte XML file: *-->
  <xsl:variable name="oed" select="doc('oed.xml')" />
  <!--* do stuff with the document *-->
  <xsl:element name="word-of-the-day">
    <xsl:copy-of select="$oed/dictionary/a/entry[@id = 'ascii']" />
  </xsl:element>
</xsl:template>
If you don't know how often the template matches, I can see that you
might want to cache the whole document in memory, but you have a
couple of other choices --
(1) save the result of the template -- in this case it doesn't depend on
anything other than the input document, and I've seen this usage
often, e.g. to get a document title
(2) drop the document if you get low on memory
This case is very clear, but I don't know at what point it stops
being optimisable, and I'm sure you've thought about it a lot more
than I have! :-)
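The two choices above can be sketched as a toy document pool, in Python for illustration only. `DocumentPool`, `loader`, and `max_docs` are hypothetical names; `max_docs` is a crude stand-in for "low on memory", and a real processor would also have to keep the node-identity and same-content guarantees when it reloads an evicted document, which this sketch does not attempt:

```python
class DocumentPool:
    """Toy document pool; `loader` stands in for parsing a URI into a tree."""

    def __init__(self, loader, max_docs=1):
        self.loader = loader
        self.max_docs = max(1, max_docs)  # crude stand-in for a memory budget
        self.docs = {}                    # uri -> parsed document
        self.results = {}                 # (uri, query) -> saved result

    def doc(self, uri):
        if uri not in self.docs:
            # Choice (2): evict the oldest document when over budget.
            while len(self.docs) >= self.max_docs:
                self.docs.pop(next(iter(self.docs)))
            self.docs[uri] = self.loader(uri)
        return self.docs[uri]

    def select(self, uri, query):
        # Choice (1): save the result, which depends only on the document.
        key = (uri, query)
        if key not in self.results:
            self.results[key] = query(self.doc(uri))
        return self.results[key]
```

Once a result is saved, the big document can be evicted without being reloaded the next time the same template fires, since `select` answers from `results` alone.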
> That's why I decided
> that in this case having a user function to tell me when the document is no
> longer needed is rather more useful.
I think it's a good compromise, but I agree with you it'd be hard
to get consensus to add that to XPath F&O.
Liam
--
Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin/
http://www.holoweb.net/~liam/