Schematron Best Practice: Embed Schematron into a Grammar-Based Language? Or Keep Separate?
- From: "Costello, Roger L." <costello@mitre.org>
- To: <xml-dev@lists.xml.org>
- Date: Mon, 23 Jul 2007 07:24:38 -0400
Hi Folks,
I would like to begin the next Schematron Best Practice issue. Below
is the issue. I have made a start on addressing the issue, including a
preliminary recommendation. I invite you to add to the list of
advantages and disadvantages, and to enhance/modify the recommendation.
/Roger
ISSUE
You have a set of data validation requirements for your system. You
have decided to implement the requirements using a combination of a
grammar-based language (e.g. RELAX NG or XML Schema) plus Schematron.
Should the Schematron implementation be embedded within the grammar
document, or should the Schematron implementation be in a separate
document from the grammar document?
EXAMPLE
Suppose this XML instance document is representative of the type of
data that your system exchanges:
<?xml version="1.0"?>
<Document classification="secret">
<Para classification="unclassified">
One if by land; two if by sea.
</Para>
</Document>
And suppose your system's data requirements are:
1. The <Para> classification value cannot be more sensitive than the
<Document> classification value (top-secret is more sensitive than
secret, which is more sensitive than confidential, which is more
sensitive than unclassified).
2. The <Document> element must have a classification attribute, whose
value is either top-secret, secret, confidential, or unclassified.
3. The <Para> element must have a classification attribute, whose value
is either top-secret, secret, confidential, or unclassified.
The first requirement is a co-constraint between two attribute values,
so it will be implemented using Schematron. The other two requirements
will be implemented using XML Schema.
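A minimal sketch of how the first requirement might be written as a
stand-alone Schematron schema. It assumes the ISO Schematron namespace
and that <Document> is the root element, as in the sample instance; the
pattern id and the messages are illustrative only. An unclassified
<Para> needs no rule, since it can never be more sensitive than its
<Document>.

```xml
<?xml version="1.0"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern id="classification-ordering">
    <!-- A top-secret Para is only allowed in a top-secret Document -->
    <sch:rule context="Para[@classification = 'top-secret']">
      <sch:assert test="/Document/@classification = 'top-secret'">
        A top-secret Para requires a top-secret Document.
      </sch:assert>
    </sch:rule>
    <!-- A secret Para needs a secret or top-secret Document -->
    <sch:rule context="Para[@classification = 'secret']">
      <sch:assert test="/Document/@classification = 'top-secret' or
                        /Document/@classification = 'secret'">
        A secret Para requires a secret or top-secret Document.
      </sch:assert>
    </sch:rule>
    <!-- A confidential Para is allowed everywhere except in an
         unclassified Document -->
    <sch:rule context="Para[@classification = 'confidential']">
      <sch:assert test="/Document/@classification = 'top-secret' or
                        /Document/@classification = 'secret' or
                        /Document/@classification = 'confidential'">
        A confidential Para cannot appear in an unclassified Document.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With this schema, the sample instance above passes: its only <Para> is
unclassified, so none of the three rules fires.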
There are two alternatives:
A. Create two documents: one document for the Schematron
implementation, and a second document for the XML Schema
implementation.
B. Create one document: the Schematron patterns, rules, and assertions
are embedded within <appinfo> elements in the XML Schema.
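As a rough illustration of alternative B, the XML Schema below covers
requirements 2 and 3 and carries a Schematron pattern for requirement 1
inside <xs:appinfo>. This is only a sketch: the content model (one or
more <Para> children holding text), the type name ClassificationType,
the pattern id, and the choice of the ISO Schematron namespace are all
assumptions, and the embedded pattern still has to be extracted or
recognized by whatever tool runs the Schematron checks.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://purl.oclc.org/dsdl/schematron">

  <!-- Requirements 2 and 3: the permitted classification values -->
  <xs:simpleType name="ClassificationType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="top-secret"/>
      <xs:enumeration value="secret"/>
      <xs:enumeration value="confidential"/>
      <xs:enumeration value="unclassified"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="Document">
    <xs:annotation>
      <xs:appinfo>
        <!-- Requirement 1, embedded as a Schematron pattern -->
        <sch:pattern id="classification-ordering">
          <sch:rule context="Para">
            <sch:assert test="@classification = 'unclassified'
              or /Document/@classification = 'top-secret'
              or (@classification != 'top-secret'
                  and /Document/@classification = 'secret')
              or (@classification = 'confidential'
                  and /Document/@classification = 'confidential')">
              A Para cannot be more sensitive than its Document.
            </sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Para" maxOccurs="unbounded">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="classification"
                              type="ClassificationType"
                              use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="classification"
                    type="ClassificationType" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

For alternative A, the same grammar would simply omit the
<xs:annotation> block, and the pattern would live in its own Schematron
document.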
ADVANTAGES/DISADVANTAGES OF SEPARATE SCHEMATRON AND GRAMMAR DOCUMENTS
ADVANTAGES
1. The grammar language currently being used can be easily replaced.
Thus, if XML Schema is currently being used, at a later date you can
replace it with RELAX NG without changing the Schematron schema.
2. Constraint checking can be done in stages, in a pipeline fashion.
Your system may need to check constraints in phases, with other
processing between the phases: first grammar checking, then
co-constraint checking (using Schematron), then data cardinality
checking (using Schematron), then algorithmic checking (using
Schematron).
3. There may be a performance improvement. If grammar checking is done
first and it fails (i.e. outputs errors), then it may not be necessary
to execute the Schematron validation at all, which saves time.
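One way the staged checking in advantage 2 might be organized is with
Schematron phases, which name subsets of patterns that can be run
separately. The sketch below assumes the ISO Schematron namespace; the
phase and pattern ids are made up, only one co-constraint rule is
shown, and the cardinality limit of 100 is purely hypothetical. How a
phase is selected at run time depends on the Schematron implementation
being used.

```xml
<?xml version="1.0"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">

  <!-- Stage: co-constraint checking -->
  <sch:phase id="co-constraints">
    <sch:active pattern="classification-ordering"/>
  </sch:phase>

  <!-- Stage: data cardinality checking -->
  <sch:phase id="cardinality">
    <sch:active pattern="para-count"/>
  </sch:phase>

  <sch:pattern id="classification-ordering">
    <sch:rule context="Para[@classification = 'top-secret']">
      <sch:assert test="/Document/@classification = 'top-secret'">
        A top-secret Para requires a top-secret Document.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

  <sch:pattern id="para-count">
    <sch:rule context="Document">
      <!-- Hypothetical limit, not one of the stated requirements -->
      <sch:assert test="count(Para) &lt;= 100">
        A Document may contain at most 100 Para elements.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```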
DISADVANTAGES
1. There may be a performance degradation. Running several validations
rather than a single validation may be more expensive, since each
validation may have to process the instance document again.
ADVANTAGES/DISADVANTAGES OF SCHEMATRON EMBEDDED WITHIN A GRAMMAR
DOCUMENT
ADVANTAGES
1. There may be a performance improvement. Running one validation
rather than several validations may yield a savings in performance.
DISADVANTAGES
1. Swapping out the particular grammar language that is currently being
used and replacing it with a different grammar language may be
difficult since the two are tightly intertwined.
2. Constraint checking is a big-bang event. All constraints --
grammar, co-constraints, cardinality, algorithmic -- are checked at
once.
3. There may be a performance degradation. Schematron validation cannot
be skipped when grammar validation fails.
RECOMMENDATION
For maximum flexibility and long-term maintainability, keep the
Schematron schema separate from the grammar schema.
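To illustrate the flexibility this recommendation aims for, here is a
sketch of requirements 2 and 3 expressed in RELAX NG (XML syntax)
instead of XML Schema. The content model (one or more <Para> children
holding text) and the definition name are assumptions. With separate
documents, a stand-alone Schematron schema for requirement 1 would not
need to change when the grammar is swapped like this.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="Document">
      <!-- Requirement 2: Document must carry a classification -->
      <ref name="classification"/>
      <oneOrMore>
        <element name="Para">
          <!-- Requirement 3: Para must carry a classification -->
          <ref name="classification"/>
          <text/>
        </element>
      </oneOrMore>
    </element>
  </start>

  <!-- The permitted classification values, shared by both elements -->
  <define name="classification">
    <attribute name="classification">
      <choice>
        <value>top-secret</value>
        <value>secret</value>
        <value>confidential</value>
        <value>unclassified</value>
      </choice>
    </attribute>
  </define>
</grammar>
```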