DTD vs XSD: No Duplicate Types in (Mixed) Content Models
- From: Maik Stührenberg <maik.stuehrenberg@uni-bielefeld.de>
- To: xml-dev@lists.xml.org, xmlschema-dev-request@w3.org
- Date: Mon, 31 Jan 2011 13:56:26 +0100
Hello,
first of all, apologies for cross-posting this question to both the
xml-dev and the xmlschema-dev list. But since the question is related
both to XML DTDs (defined in the XML spec) and XML Schema I hope it's ok.
One of my students asked me if it wouldn't be easy to simulate XML
Schema's minOccurs and maxOccurs by repeating the same element type
(element name) in the content model of its parent element, e.g.:
<!ELEMENT a (b, b, b)>
<!ELEMENT b EMPTY>
would correspond to
<xs:element name="a">
<xs:complexType>
<xs:sequence>
<xs:element ref="b" minOccurs="3" maxOccurs="3"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="b"/>
or, as a 1:1 implementation:
<xs:element name="a">
<xs:complexType>
<xs:sequence>
<xs:element ref="b"/>
<xs:element ref="b"/>
<xs:element ref="b"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="b"/>
All three solutions were tolerated by the parsers I've tried. For
simulating minOccurs="3", maxOccurs="5" one could use the following DTD
solution:
<!ELEMENT a (b, b, b, b?, b?)>
<!ELEMENT b EMPTY>
To my surprise, I had no problems with the parsers I've tried. However,
such a content model does not seem to be allowed in XSD (as expected,
because of the deterministic content model constraint):
<xs:element name="a">
<xs:complexType>
<xs:sequence>
<xs:element ref="b"/>
<xs:element ref="b"/>
<xs:element ref="b"/>
<xs:element ref="b" minOccurs="0" maxOccurs="1"/>
<xs:element ref="b" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="b"/>
Of course it would be possible to use <xs:element ref="b" minOccurs="3"
maxOccurs="5"/> to construct this content model.
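
The equivalence of the two formulations can be checked with a minimal Python sketch. It assumes each child element b is encoded as the character "b" and uses regular expressions in place of a real validator; the names DTD_STYLE, RANGE_STYLE, accepts and compare are hypothetical and only serve this illustration.

```python
import re

# Assumption: each child element b is encoded as the single character "b".
DTD_STYLE = re.compile(r"bbbb?b?")    # mirrors (b, b, b, b?, b?)
RANGE_STYLE = re.compile(r"b{3,5}")   # mirrors minOccurs="3" maxOccurs="5"


def accepts(pattern, count):
    """True if a parent with `count` b children matches the whole pattern."""
    return pattern.fullmatch("b" * count) is not None


def compare(max_count=8):
    """List (count, accepted by DTD style, accepted by range style)."""
    return [
        (n, accepts(DTD_STYLE, n), accepts(RANGE_STYLE, n))
        for n in range(max_count + 1)
    ]


for n, dtd_ok, range_ok in compare():
    print(n, dtd_ok, range_ok)
```

Both patterns describe the same set of child sequences; they differ only in how the model is written down, which is what matters for the determinism constraint.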
In addition, the XML spec defines that in mixed content the same element
name must not appear more than once (Validity Constraint: No Duplicate
Types, XML Spec 5th ed., 3.2.2); therefore the following DTD construct is
not allowed:
<!ELEMENT a (#PCDATA | b | b)*>
<!ELEMENT b EMPTY>
But it seems to be ok in XSD:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="a">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element ref="b"/>
<xs:element ref="b"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="b"/>
</xs:schema>
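
To make the No Duplicate Types constraint mentioned above concrete, here is a rough Python sketch that inspects a mixed-content model given as a string. It is not taken from any parser: the function name duplicate_types is hypothetical, the pattern only approximates XML names, and it only recognises the starred form (#PCDATA | name | ...)*.

```python
import re

# Approximation of an XML name; real XML names allow more characters.
NAME = r"[\w.:-]+"
MIXED = re.compile(r"\(\s*#PCDATA((?:\s*\|\s*" + NAME + r")*)\s*\)\*")


def duplicate_types(mixed_decl):
    """Return the element names listed more than once in a mixed model."""
    match = MIXED.fullmatch(mixed_decl.strip())
    if match is None:
        raise ValueError("not of the form (#PCDATA | name | ...)*")
    seen, dupes = set(), []
    for name in re.findall(NAME, match.group(1)):
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


print(duplicate_types("(#PCDATA | b | b)*"))
print(duplicate_types("(#PCDATA | b | c)*"))
```

A model that uses the sequence connector "," after #PCDATA does not match the mixed-content form at all, so the sketch rejects it with a ValueError instead of reporting duplicates.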
So, if I get it straight, it seems that the (non-normative) requirement
for deterministic content models is only partially implemented (only for
mixed content models) in both DTD and XSD, and that the two document
grammar formalisms differ in how they handle it (under the assumption
that the parsers are correct).
The interesting case arises when an instance contains exactly four
occurrences of the element b, so that my parser cannot know which b in
the model is being matched. Was this behaviour expected? Or is the
requirement for deterministic content models only applied when
dealing with different element names (such as in the a, (b|c) vs.
(a|b),(a|c) example)?
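
The four-b case can be illustrated with a small Python sketch. This is a simplified check for flat sequences of required and optional particles only, not the full determinism test of either specification; MODEL and the helper names successors, conflicts, can_end and assignments are hypothetical.

```python
from collections import Counter

# Each particle is (element name, optional?); the list models (b, b, b, b?, b?).
MODEL = [("b", False), ("b", False), ("b", False), ("b", True), ("b", True)]


def successors(model, pos):
    """Particle positions that may follow `pos` (-1 is the start state)."""
    result = []
    for nxt in range(pos + 1, len(model)):
        result.append(nxt)
        if not model[nxt][1]:  # a required particle cannot be skipped
            break
    return result


def conflicts(model):
    """Return (pos, name) pairs where one name can reach several particles."""
    found = []
    for pos in range(-1, len(model)):
        counts = Counter(model[n][0] for n in successors(model, pos))
        found.extend((pos, name) for name, c in counts.items() if c > 1)
    return found


def can_end(model, pos):
    """True if every particle after `pos` is optional."""
    return all(optional for _, optional in model[pos + 1:])


def assignments(model, children, pos=-1):
    """Enumerate all mappings of the child names onto particle positions."""
    if not children:
        return [[]] if can_end(model, pos) else []
    out = []
    for nxt in successors(model, pos):
        if model[nxt][0] == children[0]:
            rest = assignments(model, children[1:], nxt)
            out.extend([nxt] + tail for tail in rest)
    return out


print(conflicts(MODEL))
print(assignments(MODEL, ["b"] * 4))
```

With four b children the fourth one can be attributed to either optional particle, whereas three or five children leave only one mapping. The conflict is reported at the third particle, where the next b could match either the fourth or the fifth particle.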
Best regards,
Maik Stührenberg
--
Maik Stührenberg, M.A.
Universität Bielefeld
Fakultät für Linguistik und Literaturwissenschaft
Universitätsstraße 25
33615 Bielefeld
Telefon: +49 (0)521/106-2534
E-Mail: maik.stuehrenberg@uni-bielefeld.de
http://www.maik-stuehrenberg.de
http://www.xstandoff.net