OASIS Mailing List Archives

 


RE: [xml-dev] Schematron Best Practice: A Schematron schema's area of responsibility?


> Mark Delaney asks:
>
>> Are there order-of-magnitude variations in efficiency, in either memory
>> use or time, between alternative languages? If so, are these variations
>> essential, or merely a quirk of the available implementations?

There are probably order-of-magnitude differences between alternative
implementations of the same language, let alone between languages!
(Certainly this is true of XSLT-based systems.)

The primary issue is whether a constraint can be tested in
 1) Streaming order with no state saved
 2) Streaming order with state (or values) saved (e.g. ID checking)
 3) Random access
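
To make the three classes concrete, here is a minimal Python sketch using
only the standard library. The sample document, the element names (item,
ref) and the constraints themselves are invented for illustration; they
are not taken from any schema language or implementation.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this sketch.
DOC = b"""<order>
  <item id="a" qty="2"/>
  <item id="b" qty="0"/>
  <item id="a" qty="1"/>
  <ref target="b"/>
  <ref target="zz"/>
</order>"""


def check_streaming(data):
    """Classes 1 and 2: a single pass over the parse events."""
    errors = []
    seen_ids = set()   # saved state for class 2 (ID uniqueness)
    targets = []       # saved values for class 2 (reference targets)
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "item":
            # Class 1: needs only the element in hand, no saved state.
            if int(elem.get("qty", "0")) < 1:
                errors.append("item %s: qty must be positive" % elem.get("id"))
            # Class 2: needs values remembered from earlier elements.
            if elem.get("id") in seen_ids:
                errors.append("duplicate id %s" % elem.get("id"))
            seen_ids.add(elem.get("id"))
        elif elem.tag == "ref":
            targets.append(elem.get("target"))
        elem.clear()  # discard content that has already been checked
    # References can only be resolved once every ID has been seen.
    errors.extend("dangling ref %s" % t for t in targets if t not in seen_ids)
    return errors


def check_random_access(data):
    """Class 3: build the whole tree first, then navigate it freely."""
    root = ET.fromstring(data)
    errors = []
    for ref in root.iter("ref"):
        # Jump from each ref to whichever item it names, anywhere in the tree.
        for item in root.iter("item"):
            if item.get("id") == ref.get("target") and item.get("qty") == "0":
                errors.append("ref %s points at an empty item" % ref.get("target"))
    return errors
```

The streaming checker only ever holds the ID set and the list of targets,
while the random-access checker needs the whole tree in memory before it
can start.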

What grammar-based schema languages do is limit themselves to 1), then
also allow some convenient number of 2) where some abstraction can be used
to explain coherently why the grammar model has been sidestepped (ALL, ID,
etc).  What the default Schematron implementation does is start from 3)
using XSLT1, then allow implementations to figure out optimizations if
random access is not allowed.  For example, an implementation could split
up a schema so that the streamable constraints are tested first (e.g. as
the DOM is being built), then the random-access constraints are checked
when the DOM is ready.
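
The split described above might look roughly like the following Python
sketch. It is not how any Schematron implementation actually works: the
rule table, the "streamable" flag and the element names (para, xref) are
all assumptions made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def para_has_class(elem, root):
    # Streamable: looks only at the element's own attributes.
    return "class" in elem.attrib


def xref_resolves(elem, root):
    # Random access: searches the finished tree for the target ID.
    return any(t.get("id") == elem.get("to") for t in root.iter())


# Hypothetical rule table: (streamable?, element tag, test, message).
RULES = [
    (True, "para", para_has_class, "para needs a class"),
    (False, "xref", xref_resolves, "xref target missing"),
]


def validate(data):
    streamable = [r for r in RULES if r[0]]
    deferred = [r for r in RULES if not r[0]]
    errors = []

    # Phase 1: test streamable rules while the tree is being built.
    # Attributes are already available on "start" events.
    parser = ET.iterparse(io.BytesIO(data), events=("start",))
    for _event, elem in parser:
        for _flag, tag, test, message in streamable:
            if elem.tag == tag and not test(elem, None):
                errors.append(message)

    # Phase 2: the tree is complete, so run the random-access rules.
    root = parser.root
    for elem in root.iter():
        for _flag, tag, test, message in deferred:
            if elem.tag == tag and not test(elem, root):
                errors.append(message)
    return errors
```

For instance, validating
b'<doc><para class="x" id="p1"/><para/><xref to="p1"/><xref to="p9"/></doc>'
reports the missing class during the first phase and the unresolved xref
only after the tree is ready.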

More than this, ISO DSDL looks likely to adopt the STX streaming XPath
language. When Schematron is used with this, you certainly get a
streaming implementation that would not have object creation overhead.

The other aspect is that you tend to express different things in
different schema languages: the grammars force you to pay a lot of
attention to sequencing issues and are not at all good with partial
orders. Ticking through a big state machine is very easy, but when the
state transitions don't reflect business requirements they may be a burden
and a cost. Furthermore, the grammars actively discourage separation of
concerns: each stakeholder, agent and process in the pipeline may have
different, uncoordinated and independent constraints. Grammars, notably
XSD, have proved themselves to be unattractive for validation: people
choose not to validate because with XSD and grammars they have to
over-validate (validate things they are not interested in, and omit to
validate things they are interested in) without getting usable
diagnostics. So when considering efficiency, are systems that promote, in
effect, no validation actually more "efficient" than systems that promote
effective partial validation?

Cheers
Rick Jelliffe


Copyright 1993-2007 XML.org. This site is hosted by OASIS