I was thinking, in the light of James' recent valuable comments, that it
would be interesting to get some benchmarks of validation with different
schema languages.

However, of course, we can only benchmark implementations, not languages (see
http://lists.w3.org/Archives/Public/xmlschema-dev/2000Dec/0158.html for an
interesting example of an innocent schema that causes FSM explosions,
although this can be optimized away), so I am wondering how to avoid
comparing apples with oranges.  Does anyone have any ideas?

I think the initial goal of benchmarking implementations of different
schema languages must be to identify pathological cases that implementors
should look out for, and structures that schema-writers in the particular
language or implementation should avoid. In other words, whether validation
normally takes 1 second or 10 is not so important, but whether there are
structures or conditions that will make it blow out to 100 seconds or 1000
is important.
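
As a rough illustration of that goal, here is a minimal sketch in Python of
a harness that looks for blow-outs rather than absolute speed. It times a
validator on inputs of increasing size and estimates the growth exponent
from a log-log fit. The names validate and make_input are hypothetical
placeholders for whatever implementation and test-document generator is
being examined, and the threshold of 1.5 is an arbitrary assumption.

```python
import math
import time


def growth_exponent(sizes, costs):
    """Least-squares slope of log(cost) against log(size).

    A slope near 1 suggests roughly linear behaviour; a much larger
    slope suggests the structure under test may blow out.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def time_validation(validate, make_input, sizes, repeats=3):
    """Time validate(make_input(n)) for each n, keeping the best run.

    Both callables are hypothetical stand-ins supplied by the caller.
    """
    costs = []
    for n in sizes:
        doc = make_input(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            validate(doc)
            best = min(best, time.perf_counter() - start)
        # Clamp to a tiny positive value so log() never sees zero.
        costs.append(max(best, 1e-9))
    return costs


def looks_pathological(sizes, costs, threshold=1.5):
    """Flag growth clearly worse than linear (threshold is an assumption)."""
    return growth_exponent(sizes, costs) > threshold
```

Because only the growth rate is compared, the result is less sensitive to
the apples-and-oranges problem than raw timings would be, though it still
describes an implementation and not a language.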

But now it occurs to me that perhaps there are some basic questions we can
ask (of a schema implementer or language designer) which may give a head
start in the absence of benchmarking:

 - Are there any innocent-looking structures that explode (or may explode)?
Perhaps this is the same as asking whether there are any constructs which,
when used, may have more than an O(n) effect on performance. What are the
workarounds (e.g. in XML Schema's case, detecting unbounded particles in
unbounded choice groups)?

 - How is schema evaluation affected by a slow network or unavailable
components, and what can be done about it (e.g. using local caching, giving
the user an option to progress with validation as far as possible even if
some components are missing, etc.)?
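
To make the workaround in the first question concrete, here is a small
Python sketch that walks a content model and reports unbounded choice
groups that directly contain an unbounded particle. The Particle class is
a hypothetical, much simplified stand-in for a real schema component
model, and UNBOUNDED is just a local marker for maxOccurs="unbounded".

```python
from dataclasses import dataclass, field
from typing import List, Optional

UNBOUNDED = None  # local marker standing for maxOccurs="unbounded"


@dataclass
class Particle:
    """Hypothetical, simplified content-model node."""
    kind: str  # "element", "sequence", or "choice"
    name: str = ""
    max_occurs: Optional[int] = 1
    children: List["Particle"] = field(default_factory=list)


def find_risky_choices(particle, path=""):
    """Return paths of unbounded choices holding an unbounded child."""
    label = particle.kind + (f"[{particle.name}]" if particle.name else "")
    here = f"{path}/{label}"
    found = []
    if particle.kind == "choice" and particle.max_occurs is UNBOUNDED:
        if any(child.max_occurs is UNBOUNDED for child in particle.children):
            found.append(here)
    # Recurse so that nested groups are checked as well.
    for child in particle.children:
        found.extend(find_risky_choices(child, here))
    return found


model = Particle("sequence", children=[
    Particle("choice", max_occurs=UNBOUNDED, children=[
        Particle("element", "a", max_occurs=UNBOUNDED),
        Particle("element", "b"),
    ]),
])
print(find_risky_choices(model))
```

An implementation could use a check like this either to warn the schema
writer or to rewrite the group before building its automaton.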

Rick Jelliffe