RE: [xml-dev] Schematron: Categories of Usage?

Thanks Stephen.  I have incorporated your thoughts in "Schematron
Features" below (please correct me if I have not correctly captured the
aspect of Schematron you are identifying):

Characterization of Schematron

A. Schematron Usage

Here are the ways that Schematron is being used today:

1. Co-constraint checking: co-constraints are constraints that exist
between data (element-to-element co-constraints, element-to-attribute,
attribute-attribute).  The co-constraints may be "within" an XML
document, or "across" XML documents.

2. Existence checking: existence constraints are constraints on the
presence or absence of data.  The existence constraints may apply over
the entire document, or to just portions of the document.

3. Algorithmic checking: the validity of data in an XML instance
document is determined not by mere examination or comparison of the
data, but requires performing an algorithm on the data.
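To make the first two categories concrete, here is a minimal sketch in Python rather than Schematron, using the sample document from the original post in this thread. The restricted keyword list and the function name `check` are assumptions made for illustration; the post does not say which words are restricted.

```python
import xml.etree.ElementTree as ET

# Sample instance from the original post in this thread.
SAMPLE = """<?xml version="1.0"?>
<Document>
     <Classification>unclassified</Classification>
     <Para>
          Lorem ipsum dolor sit amet,
          laoreet ac convallis dictumst
     </Para>
     <Classification>unclassified</Classification>
</Document>"""

# Hypothetical keyword list; in practice it could be read from another file.
RESTRICTED = {"secret", "noforn"}


def check(xml_text, restricted=RESTRICTED):
    """Return a list of author-written error messages (empty if valid)."""
    root = ET.fromstring(xml_text)
    errors = []

    # Co-constraint: the first and last Classification values must match.
    marks = [(c.text or "").strip() for c in root.findall("Classification")]
    if len(marks) < 2:
        errors.append("Document needs a Classification at the top and bottom.")
    elif marks[0] != marks[-1]:
        errors.append(
            f"Top classification '{marks[0]}' does not match bottom '{marks[-1]}'."
        )

    # Existence constraint: no restricted keyword may appear in any Para.
    for para in root.iter("Para"):
        text = "".join(para.itertext())
        words = {w.strip(".,;:").lower() for w in text.split()}
        for hit in sorted(words & restricted):
            errors.append(f"Para contains restricted keyword '{hit}'.")
    return errors


print(check(SAMPLE))  # []
```

The messages are written by the rule author, which is the same idea as the author-specified error messages listed under features.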

B. Schematron Features

1. Author specified error messages:  Schematron allows the schema
author to write the error messages, thus the errors can be reported at
a higher (operational/user) level. The schema author can thus
communicate with the user and explain the error in an understandable
way and direct the user on how to correct the problem.

2. External Data Mashups: Data used in Schematron assertions may be
dynamically obtained from external files.

Excellent!

Any others?

/Roger

-----Original Message-----
From: Stephen Green [mailto:stephen.green@bristol.gov.uk] 
Sent: Monday, January 22, 2007 8:57 AM
To: xml-dev@lists.xml.org
Subject: RE: [xml-dev] Schematron: Categories of Usage?

Hi Roger

Universal Business Language (UBL) is promoting the use of Schematron
for enumeration checking, due to limitations in W3C XML Schema on
extensibility by derivation in that area (inability to extend an
enumeration using substitution groups or redefine).
http://docs.oasis-open.org/ubl/os-UBL-2.0/val/defaultCodeList.xsl
It covers restriction of an enumeration list too, of course. The
enumerations are kept separate from the XSD schema files and validated
using Schematron.
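
The idea of keeping the enumeration apart from the structural schema can be sketched as follows. This is an illustration in Python only, not the UBL methodology itself; the element name `Currency`, the code values, and the function name `invalid_codes` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical code list, standing in for a file maintained apart from the XSD.
CURRENCY_CODES = {"DKK", "EUR", "GBP"}


def invalid_codes(xml_text, element, allowed):
    """Return the values of `element` that are not in the allowed code list."""
    root = ET.fromstring(xml_text)
    values = [(e.text or "").strip() for e in root.iter(element)]
    return [v for v in values if v not in allowed]


doc = "<Invoice><Currency>EUR</Currency><Currency>XXX</Currency></Invoice>"
print(invalid_codes(doc, "Currency", CURRENCY_CODES))            # ['XXX']
print(invalid_codes(doc, "Currency", CURRENCY_CODES | {"XXX"}))  # []
```

Extending or restricting the list changes only the set passed in; the structural schema is untouched.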

All the best

Stephen Green


>>> "Costello, Roger L." <costello@mitre.org> 22/01/07 13:15:08 >>>
Excellent!  Thanks Bryan.

Bryan has identified another way that Schematron may be used for
checking data in an XML instance document:

Algorithmic Checking: the validity of data in an XML instance document
is determined not by mere examination or comparison of the data, but
requires performing an algorithm on the data.

Here are the ways that Schematron is being used today:

1. Co-constraint checking
2. Existence checking
3. Algorithmic checking

Any others?

/Roger


-----Original Message-----
From: bryan rasmussen [mailto:rasmussen.bryan@gmail.com] 
Sent: Monday, January 22, 2007 7:14 AM
To: Costello, Roger L.
Cc: xml-dev@lists.xml.org 
Subject: Re: [xml-dev] Schematron: Categories of Usage?

Algorithmic checking:

The following checks EAN location numbers against the check-digit
algorithm found here:
http://www.ean.dk/EAN_Sys/helpdesk/faq/kntrlcif.htm#EAN%20Lokationsnummer
(sorry, it's in Danish):

<sch:rule context="*[@schemeID]">
<!-- Length check. -->
<sch:report test="@schemeID='EAN' and string-length(.) != 13">
WARNING: EAN numbers are 13 digits in length
</sch:report>
<!-- Numeric check: for a non-numeric value both sides are NaN,
     and NaN != NaN is true in XPath 1.0. -->
<sch:report test="@schemeID='EAN' and . != (. + 1) - 1">
WARNING: EAN numbers are 13 digits in length</sch:report>
<sch:report test="@schemeID='EAN' and substring(.,13,1)!=0 and ((((10
- substring((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) *
3),string-length((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) * 3)),1)) +
((substring(.,1,1) * 1 + substring(.,2,1) * 3) + (substring(.,3,1) * 1
+ substring(.,4,1) * 3) + (substring(.,5,1) * 1 + substring(.,6,1) *
3) + (substring(.,7,1) * 1 + substring(.,8,1) * 3) + (substring(.,9,1)
* 1 + substring(.,10,1) * 3) + (substring(.,11,1) * 1 +
substring(.,12,1) * 3))) - ((substring(.,1,1) * 1 + substring(.,2,1) *
3) + (substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1)
* 1 + substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1)
* 3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) * 3))) != substring(.,13,1)
)">
there is an improperly formatted EAN number.


</sch:report>
<sch:report test="@schemeID='EAN' and substring(.,13,1) =0 and
substring((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) *
3),string-length((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) * 3)),1) != 0">
there is an improperly formatted EAN number.
</sch:report>
</sch:rule>

don't worry, verbosity isn't a concern in XML.  :)
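
For readers who want to see what the long expressions compute, here is a compact sketch of the same check in Python. It assumes the standard EAN-13 weighting that the XPath uses (weights 1 and 3 alternating over the first twelve digits); the function name `ean13_is_valid` is made up for this illustration.

```python
def ean13_is_valid(value: str) -> bool:
    """Return True if `value` is 13 digits with a correct EAN-13 check digit."""
    # Mirrors the first two reports: exactly 13 characters, all digits.
    if len(value) != 13 or not (value.isascii() and value.isdigit()):
        return False
    digits = [int(ch) for ch in value]
    # Weights 1,3,1,3,... over positions 1 to 12, as in the XPath expressions.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The check digit brings the total up to the next multiple of ten;
    # the outer "% 10" covers the case the XPath handles in a separate report.
    expected = (10 - total % 10) % 10
    return digits[12] == expected


print(ean13_is_valid("4006381333931"))  # True
print(ean13_is_valid("4006381333932"))  # False
```

The two Schematron reports on the check digit correspond to the single `expected` line: one handles a non-zero check digit, the other a check digit of zero.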

The same principles can be used to implement a great number of
algorithms where the boundaries of the problem are known; in this
case I know that the sequence is exactly 13 characters long, not less
nor more.

Actually, because of the way Schematron's assert works, one can do
checks on sequences where the possible upper bound is known but it is
not known whether that bound is actually reached.

I did a proof of this recently (generated the code of course, it took
86 assertions to implement the check), the requirement was that for a
text string the space between each linefeed was no longer than 37
characters, and there could not be more than 45 linefeeds.

The generated assertions were of course that the string-length of the
string between line feed 1 and 2 was less than 38.
the string-length of the string between line feed 2 and 3 was less
than 38 and so forth.

If there were only two line feeds, the remaining assertions did not
fail, because of how they were worded.

It took 86 assertions because I split on whether the last line had to
end with a line feed. Unfortunately my laptop burnt out (nothing to do
with this example) and I hadn't backed it up because it was sort of
a fun experiment. Not for actual use.
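
Since the generated Schematron is lost, here is a rough sketch of the requirement itself in Python, where a loop takes the place of one assertion per line. The function name `check_text` and the message wording are assumptions; only the limits (37 characters between linefeeds, at most 45 linefeeds) come from the description above.

```python
def check_text(text, max_line=37, max_linefeeds=45):
    """Return a list of error messages for the line-length requirement."""
    errors = []
    lines = text.split("\n")
    linefeeds = len(lines) - 1  # n linefeeds split the text into n + 1 pieces
    if linefeeds > max_linefeeds:
        errors.append(
            f"Text has {linefeeds} linefeeds; at most {max_linefeeds} are allowed."
        )
    for number, line in enumerate(lines, start=1):
        if len(line) > max_line:
            errors.append(
                f"Line {number} is {len(line)} characters long; "
                f"at most {max_line} are allowed."
            )
    return errors


print(check_text("a" * 37 + "\n" + "b" * 37))  # []
print(len(check_text("a" * 38)))               # 1
```

As with the Schematron version, a text with only two linefeeds simply never triggers the checks for later lines.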

This was in Schematron 1.5, not ISO Schematron; it would be a lot
easier to write this stuff in ISO. Of course others out there could
probably optimize the code, but it has been checking EAN numbers for a
year and a half now and nobody has submitted an error yet. (fingers
crossed)

Cheers,
Bryan Rasmussen





On 1/22/07, Costello, Roger L. <costello@mitre.org> wrote:
> Hi Folks,
>
> I am putting together a list of ways that Schematron is being used. I
> seek your help in ensuring that the list is complete. (I will post the
> final list)
>
> Let me give an example to show what I mean by "ways that Schematron is
> being used".
>
> Consider this simple XML instance document:
>
> <?xml version="1.0"?>
> <Document>
>      <Classification>unclassified</Classification>
>      <Para>
>           Lorem ipsum dolor sit amet,
>           laoreet ac convallis dictumst
>      </Para>
>      <Classification>unclassified</Classification>
> </Document>
>
> Schematron can be used to specify, "The Classification value at the top
> and bottom of the document must match; the Para element must not
> contain any restricted keywords."
>
> Thus, we see Schematron being used to express these two types of data
> constraints:
>
> 1. Co-constraints: in the example the co-constraint is between the two
> Classification values; namely, the two values must be identical.  In
> general, co-constraints are constraints that exist between data
> (element-to-element co-constraints, element-to-attribute,
> attribute-attribute).  The co-constraints may be "within" an XML
> document, or "across" XML documents.
>
> Schematron is very well-suited to expressing co-constraints.
>
> 2. Existence: in the example the existence constraint is that the Para
> element must not contain any restricted keywords.  The keywords may be
> obtained dynamically from another file. In general, existence
> constraints are constraints on the presence or absence of data.  The
> existence constraints may apply over the entire document, or to just
> portions of the document.
>
> Schematron is very well-suited to expressing existence constraints.
>
> Categories of Schematron Usage
>
> Here are the ways that Schematron is being used today:
>
> 1. Co-constraint checking
> 2. Existence checking
>
> Are you using Schematron in ways not represented by these two
> categories?  I am particularly interested in identifying ways
> Schematron is being used which cannot be expressed by other schema
> languages - XML Schemas, Relax NG.
>
> /Roger
>
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/ 
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org 
> subscribe: xml-dev-subscribe@lists.xml.org 
> List archive: http://lists.xml.org/archives/xml-dev/ 
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php 
>
>


Copyright 1993-2007 XML.org. This site is hosted by OASIS