John Cowan wrote:
>> For example, at the ISO DSDL meeting we had a suggested requirement
>> from a large European publishing house that we can validate that a
>> mixed content element in the Dutch language should only contain
>> Dutch characters.
>
> An interesting idea. All schema languages known to me are weak in
> supporting mixed content; even RNG, which is the strongest, cannot
> express this constraint. (I don't really know Schematron: can it
> cope?)
Come XPath 2.0, and assuming that it supports regular expressions like
those in XML Schema, Schematron would be able to test this with
something like:
<sch:rule context="mixedContentElement[lang('nl')]">
  <sch:assert
    test="matches(string(.),
          '^(\p{IsBasicLatin}|\p{IsLatin-1Supplement}|...)*$')">
    A Dutch-language <sch:name /> element must contain only Dutch
    characters.
  </sch:assert>
</sch:rule>
Before then, you could still use Schematron, but the test would be
pretty nasty -- create a string that contains all the permitted
characters, then use translate() to strip those characters from the
content and test whether any other characters are left over.
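As a rough sketch of that XPath 1.0 workaround, the rule might look
something like the following. The character list here is a short,
illustrative placeholder and an assumption, not a definitive Dutch
repertoire:

```xml
<sch:rule context="mixedContentElement[lang('nl')]">
  <!-- Assumption: the second argument of translate() is an
       illustrative, incomplete list of permitted characters. -->
  <sch:assert
    test="translate(normalize-space(.),
          'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .,;:!?()-éèëïöü',
          '') = ''">
    A Dutch-language <sch:name /> element must contain only Dutch
    characters.
  </sch:assert>
</sch:rule>
```

With an empty third argument, translate() deletes every character
listed in the second argument, so the comparison with '' succeeds only
when nothing outside the list remains. The normalize-space() call is
there so that tabs and line breaks are treated as ordinary spaces.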
In general, Schematron's as good at mixed content as it is at
text-only content because XPath can give you access to the individual
text nodes between the elements in mixed content. For example, if you
had something like:
<length>12.5<unit>cm</unit></length>
in XML Schema you could state that the length element had mixed
content and had to contain a unit element:
<xs:element name="length">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="unit" type="lengthUnit" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
in RELAX NG you could additionally state that the text had to come
before the unit element:
<element name="length">
  <text />
  <element name="unit"><ref name="lengthUnit" /></element>
</element>
and in Schematron you could state that the text node child of the
length element had to be a number (here using the fact that NaN !=
NaN):
<sch:rule context="length/text()">
  <sch:assert test="number(.) = number(.)">
    The text within the length element must be a number.
  </sch:assert>
</sch:rule>
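One caveat: if the processor does not strip whitespace-only text
nodes, an indented or line-broken length element has extra text node
children, and these fail the test because number() of a
whitespace-only string is NaN. A possible variant, assuming that such
nodes should simply be ignored, restricts the context with a
predicate:

```xml
<!-- Assumption: whitespace-only text nodes are to be ignored;
     normalize-space(.) returns an empty (false) string for them. -->
<sch:rule context="length/text()[normalize-space(.)]">
  <sch:assert test="number(.) = number(.)">
    The text within the length element must be a number.
  </sch:assert>
</sch:rule>
```

With this rule, <length>12.5<unit>cm</unit></length> passes, while
<length>about 12<unit>cm</unit></length> fails because
number('about 12') is NaN.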
Cheers,
Jeni
---
Jeni Tennison
http://www.jenitennison.com/