OASIS Mailing List Archives

   Re: [xml-dev] Announce: XML Schema, The W3C's Object-Oriented Descriptio

John Cowan wrote:
>> For example, at the ISO DSDL meeting we had a suggested requirement
>> from a large European publishing house that we can validate that a
>> mixed content element in the Dutch language should only contain
>> Dutch characters.
>
> An interesting idea. All schema languages known to me are weak in
> supporting mixed content; even RNG, which is the strongest, cannot
> express this constraint. (I don't really know Schematron: can it
> cope?)

Come XPath 2.0, and assuming that it supports regular expressions like
those in XML Schema, Schematron would be able to test this with
something like:

<sch:rule context="mixedContentElement[lang('nl')]">
  <sch:assert
    test="match(string(.),
                '^(\p{IsBasicLatin}|\p{IsLatin-1Supplement}|...)*$')">
    A Dutch-language <sch:name /> element must contain only Dutch
    characters.
  </sch:assert>
</sch:rule>

Before then, you could still use Schematron, but the test would be
pretty nasty -- create a string that contained the relevant characters
and then use translate() to test whether the content contains any
other characters.
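
As a rough sketch of that translate() approach in XPath 1.0 (the list
of allowed characters here is purely illustrative and certainly
incomplete; it is an assumption for the example, not a definition of
"Dutch characters"):

<sch:rule context="mixedContentElement[lang('nl')]">
  <!-- translate() with an empty third argument deletes every allowed
       character; anything left over is a disallowed character -->
  <sch:assert
    test="translate(normalize-space(.),
                    concat('abcdefghijklmnopqrstuvwxyz',
                           'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                           '0123456789 .,;:!?()-',
                           'éëïöü'),
                    '') = ''">
    A Dutch-language <sch:name /> element must contain only Dutch
    characters.
  </sch:assert>
</sch:rule>

The normalize-space() call folds line breaks and tabs into single
spaces, so the space character is the only whitespace that needs to
appear in the list.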

In general, Schematron's as good at mixed content as it is at
text-only content because XPath can give you access to the individual
text nodes between the elements in mixed content. For example, if you
had something like:

<length>12.5<unit>cm</unit></length>

in XML Schema you could state that the length element had mixed
content and had to contain a unit element:

<xs:element name="length">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="unit" type="lengthUnit" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

in RELAX NG you could additionally state that the text had to come
before the unit element:

<element name="length">
  <text />
  <element name="unit"><ref name="lengthUnit" /></element>
</element>

and in Schematron you could state that the text node child of the
length element had to be a number (here using the fact that NaN !=
NaN):

<sch:rule context="length/text()">
  <sch:assert test="number(.) = number(.)">
    The text within the length element must be a number.
  </sch:assert>
</sch:rule>
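
One caveat with a rule like this: if the length element is indented,
so that whitespace-only text nodes appear around the unit element,
those nodes match length/text() too and fail the test, because
number() of a whitespace-only string is NaN. A possible variant,
assuming you want such nodes ignored, adds a predicate to the context:

<sch:rule context="length/text()[normalize-space(.)]">
  <!-- the predicate skips text nodes that contain only whitespace -->
  <sch:assert test="number(.) = number(.)">
    The text within the length element must be a number.
  </sch:assert>
</sch:rule>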

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/





 

Copyright 2001 XML.org. This site is hosted by OASIS