xml-dev - Re: [xml-dev] content model question

To: <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] content model question
From: "Rick Jelliffe" <ricko@allette.com.au>
Date: Mon, 8 Apr 2002 16:07:16 +1000
References: <Pine.SOL.4.21.0204051335570.15499-100000@sun8.loc.gov>

From: "Morgan V. Cundiff" <mcundiff@loc.gov>
 
> Thanks for your reply. I was afraid this might be the case. (It is a given
> that our project will use XML Schema and not one of the alternatives.)

The way to solve this kind of problem, if you must use XML Schemas, is
to embed a Schematron assertion in the <annotation><appinfo> elements.
Appinfo was provided to allow constraints that go beyond XML Schemas.
See Eddie Robertsson's article "Combining the power of W3C
XML Schema and Schematron" at
http://www.topologi.com/public/Schtrn_XSD/Paper.html

To validate, you can write a script using three XSLT transforms and
open source code (one to extract the constraints, one to compile the constraints,
one to run the constraints) or, if you are on Windows, download the free
Topologi Schematron Validator at http://www.topologi.com/
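
As a rough illustration of the embedding and of the first of those three steps
(extracting the constraints), here is a minimal sketch in Python using only the
standard library. The schema text, the pattern name, the assertion wording and the
function name extract_patterns are all made up for this example, and the Schematron
1.5 namespace is assumed. It only gathers the embedded patterns into a standalone
Schematron schema; compiling and running them would still need the Schematron
stylesheets or another validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SCH = "http://www.ascc.net/xml/schematron"  # assumed: Schematron 1.5 namespace

# Hypothetical schema: XML Schema fixes the order of the children,
# and the embedded Schematron rule restricts where text may appear.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://www.ascc.net/xml/schematron">
  <xs:element name="myelement">
    <xs:annotation>
      <xs:appinfo>
        <sch:pattern name="Text only before the first child">
          <sch:rule context="myelement">
            <sch:assert test="not(*/following-sibling::text()[normalize-space()])">
              Text may appear only before the first child element.
            </sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="subelement1" type="xs:string"/>
        <xs:element name="subelement2" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def extract_patterns(schema_text):
    """Collect Schematron patterns found in xs:appinfo into one sch:schema element."""
    root = ET.fromstring(schema_text)
    sch_schema = ET.Element(f"{{{SCH}}}schema")
    for appinfo in root.iter(f"{{{XS}}}appinfo"):
        for pattern in appinfo.findall(f"{{{SCH}}}pattern"):
            sch_schema.append(pattern)
    return sch_schema


if __name__ == "__main__":
    # Print the extracted, standalone Schematron schema.
    print(ET.tostring(extract_patterns(SCHEMA), encoding="unicode"))
```

An XML Schema processor ignores the content of appinfo, so the same file still
works as an ordinary schema.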

This kind of content model was available in SGML:
    <!ELEMENT myelement ( #PCDATA, subelement1, subelement2)>
However, that kind of content model is not available in XML DTDs or XML Schemas.
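
To make the intended restriction concrete, here is a small sketch in Python of the
rule that the text may come only before the first child element. It is not SGML or
Schematron validation, just the same check written against the standard library's
ElementTree, where text that follows a child element is stored in that child's
tail. The function name text_only_leads and the sample documents are made up for
this example, and the check does not look at the order of the children.

```python
import xml.etree.ElementTree as ET


def text_only_leads(element):
    """Return True if no non-whitespace text follows any child element."""
    return all(not (child.tail or "").strip() for child in element)


# Text before the first child only: allowed by the intended model.
ok = ET.fromstring(
    "<myelement>intro<subelement1/><subelement2/></myelement>"
)
# Text between the two children: not allowed by the intended model.
bad = ET.fromstring(
    "<myelement>intro<subelement1/>stray<subelement2/></myelement>"
)

if __name__ == "__main__":
    print(text_only_leads(ok), text_only_leads(bad))
```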

One good reason for leaving it out is that such a model usually means you have some
structure that you want to elide: the initial text blob has some significance, but
you don't want to tag it. This goes against one thrust in XML, that terseness is not
catered for: if you need terser markup, you need to move away from the W3C DBMS-style
approaches towards the publishing side of the family of markup standards (ISO, OASIS):
RELAX NG, DSDL, SGML etc.

Although it may be bad modeling, it can undoubtedly be idiomatic markup:
blocking PCDATA from between certain elements may fit in with the way we
like to think about things. And who says terseness is always of minimal importance
anyway? The XML goal about terseness is primarily concerned with documents being
transmitted over the WWW, where other layers are known, not with documents being
created, maintained or read. Two well-known document types which have some kind of
data restrictions are the original TEI and XSLT, AFAIK.

Cheers
Rick Jelliffe
