- To: Peter Hunsberger <peter.hunsberger@gmail.com>
- Subject: Re: [xml-dev] XML Performance in a Transacation and recursion
- From: Rick Marshall <rjm@zenucom.com>
- Date: Mon, 03 Apr 2006 10:00:20 +1000
- Cc: Michael Kay <mike@saxonica.com>, xml-dev@lists.xml.org, daniel@veillard.com
- In-reply-to: <cc159a4a0603270636x6ca4a96cv12390632272531e0@mail.gmail.com>
- Organization: Zenucom Pty Ltd
- References: <200603260900.k2Q902ET031627@zmail.zenucom.com> <44269384.3060308@zenucom.com> <cc159a4a0603270636x6ca4a96cv12390632272531e0@mail.gmail.com>
- User-agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929)
i've found the performance problem, and it ties together with the
discussion on recursion.
here's the problem - the stylesheet vocabulary is used to write a
postscript program. but postscript strings can't contain unbalanced,
unescaped '(' or ')' (the delimiters for a postscript string).
so every output string has to be parsed and the parentheses escaped with
a '\'.
we have no control over the source of the documents so i tried the
recursive examples for replacing one substring with another throughout a
string. e.g. "ABC(DEF" ends up as "ABC\(DEF" and "AB(DEF)GHI()JK(" ends up
as "AB\(DEF\)GHI\(\)JK\(".
i've solved my performance problem for the moment by preprocessing the
input with sed instead.
but i'm not happy because sed has no knowledge of the dom and blindly
applies the transformation, instead of only applying it to the content
of elements.
so here's a real challenge - write a template for the above
transformation with an example of how to call it; i'll put it into the
style sheet and test it against the examples and we'll find out which
techniques are linear or better and which aren't.
the solution (in this case) must work with xsltproc.
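
One candidate for this challenge is a divide-and-conquer template in plain XSLT 1.0. It is a minimal sketch, not a measured result: the template name ps-escape and the parameter name s are made up for illustration, and it escapes only '(' and ')' as described above (a literal '\' in the input is passed through unchanged). A chunk with no parentheses is copied as-is; otherwise the string is split in half and each half is processed, so the recursion depth grows with the logarithm of the string length rather than with the number of parentheses.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- ps-escape (hypothetical name): writes $s with every '(' and ')'
       preceded by a backslash. -->
  <xsl:template name="ps-escape">
    <xsl:param name="s"/>
    <xsl:variable name="len" select="string-length($s)"/>
    <xsl:choose>
      <!-- No parentheses in this chunk (or it is empty): copy it through. -->
      <xsl:when test="not(contains($s, '(') or contains($s, ')'))">
        <xsl:value-of select="$s"/>
      </xsl:when>
      <!-- A single character that reached this branch must be a parenthesis. -->
      <xsl:when test="$len = 1">
        <xsl:text>\</xsl:text>
        <xsl:value-of select="$s"/>
      </xsl:when>
      <!-- Otherwise split in half and handle each half separately. -->
      <xsl:otherwise>
        <xsl:variable name="half" select="floor($len div 2)"/>
        <xsl:call-template name="ps-escape">
          <xsl:with-param name="s" select="substring($s, 1, $half)"/>
        </xsl:call-template>
        <xsl:call-template name="ps-escape">
          <xsl:with-param name="s" select="substring($s, $half + 1)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Example call: escape the content of every text node, so only
       element content is touched, never the markup. -->
  <xsl:template match="text()">
    <xsl:call-template name="ps-escape">
      <xsl:with-param name="s" select="."/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Saved as escape.xsl (a hypothetical file name), it can be tried with `xsltproc escape.xsl input.xml`. In a real stylesheet the call-template would go wherever a string is written into the postscript output instead of in a catch-all text() rule.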
currently, on a 410k input document, 41 of the 43 seconds of processing
time are taken up by the string escaping function.
writers of xsl processors can then compare their performance results
over the various techniques as well.
regards
rick
Peter Hunsberger wrote:
>On 3/26/06, Rick Marshall <rjm@zenucom.com> wrote:
>
>>Michael Kay wrote:
>>
>>>>o(n2) is what you get when something is wrong.
>>>
>>>No, there are many problems for which no solution exists that is better than
>>>O(n^2) - in any language.
>>
>>and it's a problem, because those problems can't be scaled, but instead
>>have to be tackled by decomposition (at least you then get linear response)
>
>Think about what you're saying here: if you can break the data apart
>by hand and get linear processing times you've demonstrated that
>there is an algorithm that has linear processing times. The issue is
>coding it up...
>
>--
>Peter Hunsberger