Re: [xml-dev] Schematron Best Practice: Embed Schematron into a Grammar-Based Language? Or Keep Separate?

I would certainly advocate separating the grammar-based and rules-based validation. For example, you might want to run some rules first to ensure that you don't get a false positive from the grammar checks: if the namespace bindings in the XML instance were incorrect, XSD validation may appear to have succeeded when in fact nothing was actually checked. We normally run some pre-schema checks beforehand.
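
A minimal sketch of what such a pre-schema check might look like, written in Python with the standard library rather than Schematron. The namespace URI and the function names are hypothetical, chosen only for illustration; the example instance later in this thread uses no namespace at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, for illustration only.
EXPECTED_NS = "http://example.org/document"

def root_namespace(xml_text):
    """Return the namespace URI of the root element, or '' if it has none."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree spells qualified names as '{uri}local'.
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def pre_schema_check(xml_text):
    """Return a list of problems; an empty list means grammar validation can proceed."""
    found = root_namespace(xml_text)
    if found != EXPECTED_NS:
        return [f"root element is in namespace {found!r}, expected {EXPECTED_NS!r}"]
    return []
```

Running this before the XSD validator means a mistyped namespace is reported as an error instead of being silently passed over.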
 
Fraser.

 
On 23/07/07, Costello, Roger L. <costello@mitre.org> wrote:
Hi Folks,

I would like to begin the next Schematron Best Practice issue.  Below
is the issue.  I have made a start on addressing the issue, including a
preliminary recommendation.  I invite you to add to the list of
advantages and disadvantages, and to enhance/modify the recommendation.
/Roger


ISSUE

You have a set of data validation requirements for your system.  You
have decided to implement the requirements using a combination of a
grammar-based language (e.g. RELAX NG or XML Schema) plus Schematron.
Should the Schematron implementation be embedded within the grammar
document, or should the Schematron implementation be in a separate
document from the grammar document?


EXAMPLE

Suppose this XML instance document is representative of the type of
data that your system exchanges:

       <?xml version="1.0"?>
       <Document classification="secret">
             <Para classification="unclassified">
                  One if by land; two if by sea.
             </Para>
       </Document>


And suppose your system's data requirements are:

1. The <Para> classification value cannot be more sensitive than the
<Document> classification value (top-secret is more sensitive than
secret, which is more sensitive than confidential, which is more
sensitive than unclassified).

2. The <Document> element must have a classification attribute, whose
value is either top-secret, secret, confidential, or unclassified.

3. The <Para> element must have a classification attribute, whose value
is either top-secret, secret, confidential, or unclassified.

The first requirement will be implemented using Schematron.  The other
two requirements will be implemented using XML Schema.
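
To make the division of labor concrete, here is a rough Python stand-in for the logic that the Schematron assertion for requirement 1 would express. It is not Schematron, and the function name is made up for this sketch. It assumes requirements 2 and 3 have already been enforced by the grammar, so every classification value is one of the four allowed ones.

```python
import xml.etree.ElementTree as ET

# Least to most sensitive, in the order given by requirement 1.
LEVELS = ["unclassified", "confidential", "secret", "top-secret"]

EXAMPLE = """<?xml version="1.0"?>
<Document classification="secret">
    <Para classification="unclassified">
        One if by land; two if by sea.
    </Para>
</Document>"""

def paras_exceeding_document(xml_text):
    """Return the classification of every <Para> that is more sensitive
    than its <Document>. Assumes all values are already known to be valid
    (requirements 2 and 3); an unknown value raises ValueError."""
    doc = ET.fromstring(xml_text)
    doc_rank = LEVELS.index(doc.get("classification"))
    return [
        para.get("classification")
        for para in doc.iter("Para")
        if LEVELS.index(para.get("classification")) > doc_rank
    ]

print(paras_exceeding_document(EXAMPLE))  # → []
```

The check compares two attribute values on different elements, which is the kind of co-constraint that the issue assigns to Schematron rather than to the grammar.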

There are two alternatives:

A. Create two documents: one document for the Schematron
implementation, and a second document for the XML Schema
implementation.

B. Create one document: the Schematron patterns, rules, and assertions
are embedded within <appinfo> elements in the XML Schema.
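
For alternative B, the sketch below shows what embedding can look like and one way a tool could pull the embedded rules back out into a standalone Schematron schema. The sample schema, the assertion in it, and the function name are illustrative assumptions, and the ISO Schematron namespace is assumed (older Schematron 1.5 schemas use a different one).

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SCH = "http://purl.oclc.org/dsdl/schematron"  # assumption: ISO Schematron

# Illustrative XML Schema with one Schematron pattern inside <appinfo>.
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <xs:annotation>
    <xs:appinfo>
      <sch:pattern>
        <sch:rule context="Para">
          <sch:assert test="@classification">Para needs a classification.</sch:assert>
        </sch:rule>
      </sch:pattern>
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="Document"/>
</xs:schema>
"""

def extract_embedded_patterns(xsd_text):
    """Collect every sch:pattern found directly inside an xs:appinfo
    and return them wrapped in a new sch:schema element."""
    root = ET.fromstring(xsd_text)
    schema = ET.Element(f"{{{SCH}}}schema")
    for appinfo in root.iter(f"{{{XS}}}appinfo"):
        for pattern in appinfo.findall(f"{{{SCH}}}pattern"):
            schema.append(pattern)
    return schema
```

Calling `extract_embedded_patterns(SAMPLE_XSD)` yields a `sch:schema` element holding the single embedded pattern, which could then be serialized with `ET.tostring` and handed to a Schematron processor.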


ADVANTAGES/DISADVANTAGES OF SEPARATE SCHEMATRON AND GRAMMAR DOCUMENTS

ADVANTAGES

1. The particular grammar language currently being used can be easily
replaced.  Thus, if XML Schema is currently being used, at a later date
you can easily replace it with RELAX NG without impacting the
Schematron schema.

2. Constraint checking can be done in stages, in a pipeline fashion.
It might be desirable for your system to implement constraint checks in
phases: first do grammar checking, then do something, then do
co-constraint checking (using Schematron), then do something, then do
data cardinality checking (using Schematron), then do something, then
do algorithmic checking (using Schematron).

3. There may be a performance improvement.  If grammar checking is
done first and it fails (i.e. outputs errors), then it may not be
necessary to execute the Schematron validation at all, which saves
time.
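
Advantages 2 and 3 can be sketched together as a small staged runner, shown here in Python. The stage names and the two stand-in checks are hypothetical; in a real system each stage would call an actual grammar or Schematron validator.

```python
def run_stages(document, stages):
    """Run (name, check) stages in order. Each check returns a list of
    error messages. Stop at the first stage that reports errors and
    return (failed_stage_name, errors), or (None, []) if all pass."""
    for name, check in stages:
        errors = check(document)
        if errors:
            return name, errors
    return None, []

# Stand-in checks, for illustration only.
def grammar_check(document):
    return [] if "classification=" in document else ["classification attribute missing"]

def co_constraint_check(document):
    return []

stages = [("grammar", grammar_check), ("co-constraints", co_constraint_check)]
failed, errors = run_stages("<Document/>", stages)
# The grammar stage fails here, so the co-constraint stage is never run.
```

Because the stages are an ordered list, inserting a processing step between two validations, or swapping the grammar stage for a different validator, does not touch the other stages.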

DISADVANTAGES

1. There may be a performance degradation.  Running several validations
rather than a single validation may be more expensive.


ADVANTAGES/DISADVANTAGES OF SCHEMATRON EMBEDDED WITHIN A GRAMMAR
DOCUMENT

ADVANTAGES

1. There may be a performance improvement.  Running one validation
rather than several validations may yield a savings in performance.

DISADVANTAGES

1. Swapping out the particular grammar language that is currently being
used and replacing it with a different grammar language may be
difficult since the two are tightly intertwined.

2. Constraint checking is a big-bang event.  All constraints --
grammar, co-constraints, cardinality, algorithmic -- are checked at
once.

3. There may be a performance degradation. It is not possible to take
advantage of omitting Schematron validation when grammar validation
fails.


RECOMMENDATION

For maximum flexibility and long-term maintainability, keep the
Schematron schema separate from the grammar schema.

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
