


   Re: [xml-dev] XML Performance in a Transacation and recursion

  • To: Peter Hunsberger <peter.hunsberger@gmail.com>
  • Subject: Re: [xml-dev] XML Performance in a Transacation and recursion
  • From: Rick Marshall <rjm@zenucom.com>
  • Date: Mon, 03 Apr 2006 10:00:20 +1000
  • Cc: Michael Kay <mike@saxonica.com>, xml-dev@lists.xml.org, daniel@veillard.com
  • In-reply-to: <cc159a4a0603270636x6ca4a96cv12390632272531e0@mail.gmail.com>
  • Organization: Zenucom Pty Ltd
  • References: <200603260900.k2Q902ET031627@zmail.zenucom.com> <44269384.3060308@zenucom.com> <cc159a4a0603270636x6ca4a96cv12390632272531e0@mail.gmail.com>
  • User-agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929)

i've found the performance problem, and it ties together with the 
discussion on recursion.

here's the problem - the stylesheet vocabulary is used to write a
postscript program. but '(' and ')' are the delimiters for a postscript
string, so an unbalanced one can't appear inside a string unescaped.

so every output string has to be parsed and the parentheses escaped with 
a '\'.

we have no control over the source of the documents, so i tried the
recursive examples for replacing every occurrence of one substring with
another. eg "ABC(DEF" ends up as "ABC\(DEF" and "AB(DEF)GHI()JK(" ends up
as "AB\(DEF\)GHI\(\)JK\(".
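
to pin down the expected behaviour independently of xslt, here is a minimal reference sketch in python. the function name `escape_ps_parens` is made up for illustration. it escapes only the two parentheses, as in the examples above; a literal backslash in the input would also need escaping in real postscript, which this sketch leaves out.

```python
def escape_ps_parens(text: str) -> str:
    """Backslash-escape every '(' and ')' in one left-to-right pass."""
    out = []
    for ch in text:
        if ch in "()":
            out.append("\\")  # emit the escape character first
        out.append(ch)
    # Joining once at the end keeps the whole pass linear in len(text).
    return "".join(out)


if __name__ == "__main__":
    print(escape_ps_parens("ABC(DEF"))
    print(escape_ps_parens("AB(DEF)GHI()JK("))
```

any candidate template can be checked against this on the two example strings.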

i've solved my performance problem for the moment by preprocessing the 
input with sed instead.

but i'm not happy because sed has no knowledge of the dom and blindly 
applies the transformation, instead of only applying it to the content 
of elements.
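
as a rough illustration of the difference, a preprocessing step that does understand the document structure could look like the python sketch below. it is only a sketch under assumptions: it uses the standard library's `xml.etree.ElementTree`, the names `PAREN_ESCAPES` and `escape_element_text` and the sample document are invented, and comments and processing instructions in the input are dropped by the default parser.

```python
import xml.etree.ElementTree as ET

# Translation table: each parenthesis becomes backslash + parenthesis.
PAREN_ESCAPES = str.maketrans({"(": "\\(", ")": "\\)"})


def escape_element_text(xml_source: str) -> str:
    """Escape parentheses in element content only, not in attributes."""
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        if elem.text:  # text directly after the element's start tag
            elem.text = elem.text.translate(PAREN_ESCAPES)
        if elem.tail:  # text following the element's end tag
            elem.tail = elem.tail.translate(PAREN_ESCAPES)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = '<doc note="keep (this)"><p>AB(DEF)</p></doc>'
    print(escape_element_text(doc))
```

the attribute value keeps its parentheses, while the text inside `<p>` is escaped, which is the distinction sed cannot make.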

so here's a real challenge - write a template for the above
transformation with an example of how to call it; i'll put it into the
style sheet and test it against the examples, and we'll find out which
techniques are linear or better and which ones aren't.

the solution (in this case) must work with xsltproc.

currently, on a 410k input document, 41 of the 43 seconds of processing
time are taken up by the string escaping function.
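
one plausible cause, and this is an assumption since the template itself isn't shown here, is that a head/tail recursion re-copies the remaining string at every step, which adds up to quadratic work. the python sketch below shows the divide-and-conquer shape often suggested instead: split the string in half, escape each half, concatenate. the function name is made up. recursion depth grows with log n rather than n, and the copying is roughly n log n characters rather than n squared. it is not an xslt template and says nothing about how xsltproc would actually perform.

```python
def escape_divide_and_conquer(text: str) -> str:
    """Escape '(' and ')' by recursively splitting the string in half."""
    if len(text) <= 1:
        # Base case: an empty string or a single character.
        return "\\" + text if text in ("(", ")") else text
    mid = len(text) // 2
    # Each half is processed independently, so the recursion depth is
    # about log2(len(text)) instead of len(text).
    return (escape_divide_and_conquer(text[:mid])
            + escape_divide_and_conquer(text[mid:]))


if __name__ == "__main__":
    print(escape_divide_and_conquer("AB(DEF)GHI()JK("))
```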

writers of xsl processors can then compare their performance results 
over the various techniques as well.



Peter Hunsberger wrote:

>On 3/26/06, Rick Marshall <rjm@zenucom.com> wrote:
>>Michael Kay wrote:
>>>>o(n^2) is what you get when something is wrong.
>>>No, there are many problems for which no solution exists that is better than
>>>O(n^2) - in any language.
>>and it's a problem, because those problems can't be scaled, but instead
>>have to be tackled by decomposition (at least you then get linear response)
>Think about what you're saying here:  if you can break the data apart
>by hand and  get linear processing times you've demonstrated that
>there is an algorithm that has linear processing times. The issue is
>coding it up...
>Peter Hunsberger



Copyright 2001 XML.org. This site is hosted by OASIS