Re: [xml-dev] XML Schema: "Best used with the ______ tool"

Hi Michael,

Michael Kay <mike@saxonica.com> writes:

> Running this from the Saxon command line, with the query as written, I get
> the following results:
> 
> Query compile time: 161ms
> 
> Document parsing/build time: 125ms
> 
> Query execution time (average over 100 queries):
>   age=1, gender=male: 12.5ms
>   age=100, gender=male: 34.9ms
> 
> I wasn't sure what query parameters you were using.
>
> [...]
>
> Removing the string concatenation and just returning the nodes (without
> serialization) reduces the query times to 12.3ms and 17.7ms respectively.

The test runs 100 queries with various max_age values. I was hoping
you would implement the same test in Java using Saxon instead of
simply running the query from the command line. I don't think it
is meaningful to compare the benchmark result to these numbers,
since they represent only one query and omit a substantial part of
the benchmark, notably interfacing with the non-XML data sources.

Even if we ignore all this (as well as the fact that you have a faster
CPU), your average query execution time is 15.0ms vs 0.9ms for data
binding, which makes data binding over 15 times faster on this benchmark.
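
As a worked check, the ratio above can be reproduced from figures
quoted in this message. The snippet assumes that the 15.0ms average is
the mean of the two no-serialization timings (12.3ms and 17.7ms) and
that 0.9ms is the 0.09s data binding total divided over 100 queries;
the variable names are illustrative only.

```python
# Worked check of the speedup, using only figures quoted in this thread.
# Assumption: 15.0 ms is the mean of the two no-serialization timings.
xquery_ms = [12.3, 17.7]      # Saxon, nodes returned without serialization
binding_total_s = 0.09        # data binding, total for 100 queries
queries = 100

xquery_avg_ms = sum(xquery_ms) / len(xquery_ms)
binding_avg_ms = binding_total_s * 1000 / queries
speedup = xquery_avg_ms / binding_avg_ms

print(f"XQuery average:       {xquery_avg_ms:.1f} ms")
print(f"Data binding average: {binding_avg_ms:.1f} ms")
print(f"Speedup:              {speedup:.1f}x")
```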



> Also, this kind of query would often benefit from document projection,
> but because the data in the file is so artificially minimal, projection
> yields no benefits.

I don't think document projection will be helpful here since we are
considering the situation where the data is accessed multiple times.
Re-parsing XML every time surely cannot be faster.


> I haven't tried making the query schema-aware, but I did try modifying it to
> do an integer comparison on age ($x/xs:integer(@age) < $age) and this
> reduces the execution times marginally, to 12.0ms and 17.2ms respectively.

Do you expect an average user to do this kind of optimization?


> Now, it's not clear to me here what I'm comparing with. You reported a
> figure of 0.09s for the data binding test: I assume that is 100 executions
> of the query (with what parameters, though?) and excludes the XML
> parsing/marshalling cost?

The test executes 100 queries with varying genders and max ages. The
time measurement excludes XML parsing and includes passing input
parameters to the query and extracting the result. Check the xquery.cxx
file for details.
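
The measurement described above can be sketched roughly as follows.
This is not the code from xquery.cxx; it is a minimal Python
illustration in which the Person type, its fields, the synthetic data,
and the function names are all assumptions made for the example. It
shows only the shape of the test: the object model is built beforehand
(so parsing is excluded), while passing the parameters and collecting
the result happen inside the timed region.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical bound type; the real schema is not shown in this thread.
    name: str
    age: int
    gender: str


def query(people, max_age, gender):
    """Return the names of people younger than max_age with the given gender."""
    return [p.name for p in people if p.age < max_age and p.gender == gender]


def run_benchmark(people, runs=100, seed=0):
    """Time `runs` queries over an already-built object model.

    Parsing is assumed to have happened before this call, so it is excluded.
    Passing parameters and collecting the result are inside the timed region.
    """
    rng = random.Random(seed)
    params = [(rng.randint(1, 100), rng.choice(["male", "female"]))
              for _ in range(runs)]

    start = time.perf_counter()
    total_matches = 0
    for max_age, gender in params:
        total_matches += len(query(people, max_age, gender))
    elapsed = time.perf_counter() - start
    return elapsed / runs, total_matches


# Synthetic stand-in for the parsed document (made-up data).
people = [Person(f"person{i}", i % 100, "male" if i % 2 else "female")
          for i in range(10_000)]
avg_seconds, matches = run_benchmark(people)
print(f"average per query: {avg_seconds * 1000:.3f} ms, {matches} matches in total")
```

Absolute timings from such a sketch say nothing about the C++ or Saxon
numbers discussed here; the point is only which steps fall inside the
timed region.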


> What I would actually expect is that for both technologies, the parsing
> cost (125ms in my case) is dominant in many real-world situations.

That would be the case if we ran only one query. But remember, we are
testing repetitive access to the data. So running 100 queries will
take 1.5s (100 x 15ms), which is 12x the time it takes to parse the
XML document.
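
The arithmetic behind this point can be written out as a small
calculation. It uses only the figures quoted in this thread (125ms
parse time, 15.0ms and 0.9ms per query); the constant and function
names are illustrative, and the break-even count is a derived value
rather than something measured.

```python
import math

# Figures quoted in this thread (milliseconds).
PARSE_MS = 125.0     # Saxon document parsing/build time
XQUERY_MS = 15.0     # average XQuery execution time per query
BINDING_MS = 0.9     # average data binding time per query


def total_query_ms(per_query_ms, n):
    """Cumulative query time for n queries against an already-parsed document."""
    return per_query_ms * n


n = 100
xquery_total = total_query_ms(XQUERY_MS, n)
print(f"{n} XQuery queries: {xquery_total / 1000:.1f} s "
      f"({xquery_total / PARSE_MS:.0f}x the parse time)")
print(f"{n} data binding queries: {total_query_ms(BINDING_MS, n) / 1000:.2f} s")

# Smallest number of queries whose cumulative time reaches the parse time.
break_even = math.ceil(PARSE_MS / XQUERY_MS)
print(f"XQuery query time reaches the parse time after {break_even} queries")
```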


> If the query execution cost turns out to be 12ms for XQuery vs 0.9ms
> for data binding, then I think very few applications are going to
> notice the difference. 

Surely applications that run multiple queries will notice a 15-fold
speedup.


> (Remember, I never said it would be faster: I was challenging Dennis's
> assertion that XQuery performance was not good enough to meet the user
> requirement.)

As far as I remember, I stated that in a scenario with repetitive 
access to most of the data, data binding will have an advantage. You
asked for evidence and I believe I have shown that it can certainly
be the case. Here are the relevant quotes:


Michael Kay <mike@saxonica.com> writes:

> Boris Kolpackov <boris@codesynthesis.com> writes:
>
> > I agree with Dennis here in that XQuery can be usable when
> > you need to access a small subset of an XML document.
> > However, when one needs to access most of the data, or,
> > worse, access the same data many times, data binding will
> > have speed/memory advantage.
>
> Evidence please! I don't see any reason why it should.


Boris

-- 
Boris Kolpackov, Code Synthesis Tools   http://codesynthesis.com/~boris/blog
Open source XML data binding for C++:   http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde



Copyright 1993-2007 XML.org. This site is hosted by OASIS