DTD vs XSD: No Duplicate Types in (Mixed) Content Models

Hello,

First of all, apologies for cross-posting this question to both the 
xml-dev and the xmlschema-dev lists. Since the question relates to both 
XML DTDs (defined in the XML spec) and XML Schema, I hope it's OK.
One of my students asked me whether it would be easy to simulate XML 
Schema's minOccurs and maxOccurs by repeating the same element type 
(element name) in the content model of its parent element, e.g.:

<!ELEMENT a (b, b, b)>
<!ELEMENT b EMPTY>

would correspond to

<xs:element name="a">
   <xs:complexType>
     <xs:sequence>
       <xs:element ref="b" minOccurs="3" maxOccurs="3"/>
     </xs:sequence>
   </xs:complexType>
</xs:element>
<xs:element name="b"/>

or, as a 1:1 implementation:

<xs:element name="a">
   <xs:complexType>
     <xs:sequence>
       <xs:element ref="b"/>
       <xs:element ref="b"/>
       <xs:element ref="b"/>
     </xs:sequence>
   </xs:complexType>
</xs:element>
<xs:element name="b"/>

All three solutions were tolerated by the parsers I've tried. For 
simulating minOccurs="3", maxOccurs="5" one could use the following DTD 
solution:

<!ELEMENT a (b, b, b, b?, b?)>
<!ELEMENT b EMPTY>
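
As a quick sanity check of the occurrence counts, the content model can be read as a regular expression over the sequence of child element names. The following is only a sketch: the translation into Python's `re` syntax and the helper name `accepted_counts` are assumptions for illustration, not something a DTD parser does in this form.

```python
import re

# Hypothetical translation of the DTD content model (b, b, b, b?, b?)
# into a regular expression over a string of child element names.
DTD_MODEL = re.compile(r"bbbb?b?")

# The same language written with explicit bounds, in the spirit of
# minOccurs="3" maxOccurs="5".
BOUNDED = re.compile(r"b{3,5}")


def accepted_counts(pattern, limit=8):
    """Return every n below limit for which n consecutive b's match."""
    return [n for n in range(limit) if pattern.fullmatch("b" * n)]


print(accepted_counts(DTD_MODEL))  # [3, 4, 5]
print(accepted_counts(BOUNDED))    # [3, 4, 5]
```

Both patterns accept the same instances, so the repetition trick describes the same set of documents as the bounded form.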

To my surprise, I had no problems with the parsers I've tried. However, 
such a content model does not seem to be allowed in XSD (as expected 
because of the deterministic content models constraint):

<xs:element name="a">
   <xs:complexType>
     <xs:sequence>
       <xs:element ref="b"/>
       <xs:element ref="b"/>
       <xs:element ref="b"/>
       <xs:element ref="b" minOccurs="0" maxOccurs="1"/>
       <xs:element ref="b" minOccurs="0" maxOccurs="1"/>
     </xs:sequence>
   </xs:complexType>
</xs:element>
<xs:element name="b"/>

Of course it would be possible to use <xs:element ref="b" minOccurs="3" 
maxOccurs="5"/> to construct this content model.

In addition, the XML spec defines that in mixed content the same element 
name must not appear more than once (Validity Constraint: No Duplicate 
Types, XML Spec 5th ed., 3.2.2); therefore the following DTD construct is 
not allowed:

<!ELEMENT a (#PCDATA | b, b)*>
<!ELEMENT b EMPTY>

But it seems to be ok in XSD:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="a">
     <xs:complexType mixed="true">
       <xs:sequence>
         <xs:element ref="b"/>
         <xs:element ref="b"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
   <xs:element name="b"/>
</xs:schema>

So, if I understand it correctly, the (non-normative) requirement for 
deterministic content models seems to be only partially implemented 
(only for mixed content models) in both DTD and XSD, and the two document 
grammar formalisms differ in how they handle it (assuming that the 
parsers are correct).
The interesting case arises when an instance contains exactly four 
occurrences of the element b: the parser then cannot know which b in the 
model is being matched. Is this behaviour expected? Or is the 
requirement for deterministic content models only applied when 
dealing with different element names (as in the a, (b|c) vs. 
(a|b),(a|c) example)?
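
To make the "which b is being matched" problem concrete, here is a minimal sketch of a one-element-lookahead check. It assumes determinism means that no two particles with the same name may compete for the next element, and it only handles flat sequences whose particles are either required or optional. The `Particle` type and the function `ambiguous_positions` are hypothetical names for illustration, not the algorithm any particular parser uses.

```python
from typing import List, Tuple

# (element name, is_optional); covers only flat sequences such as
# (b, b, b, b?, b?), not choices or nested groups.
Particle = Tuple[str, bool]


def ambiguous_positions(model: List[Particle]) -> List[Tuple[int, int]]:
    """Return index pairs of same-named particles that compete for the
    next element when only one element of lookahead is available."""
    conflicts = set()
    # start == -1 stands for the state before any child has been read.
    for start in range(-1, len(model)):
        reachable = []
        for nxt in range(start + 1, len(model)):
            reachable.append(nxt)
            if not model[nxt][1]:
                break  # a required particle cannot be skipped
        for i, p in enumerate(reachable):
            for q in reachable[i + 1:]:
                if model[p][0] == model[q][0]:
                    conflicts.add((p, q))
    return sorted(conflicts)


fixed = [("b", False)] * 3                       # (b, b, b)
ranged = [("b", False)] * 3 + [("b", True)] * 2  # (b, b, b, b?, b?)

print(ambiguous_positions(fixed))   # []
print(ambiguous_positions(ranged))  # [(3, 4)]
```

Under these assumptions, (b, b, b) has no competing particles, while in (b, b, b, b?, b?) the fourth b of an instance could be attributed to either optional particle (indexes 3 and 4), which is the four-occurrence case described above.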

Best regards,

Maik Stührenberg

-- 

Maik Stührenberg, M.A.

Universität Bielefeld
Fakultät für Linguistik und Literaturwissenschaft
Universitätsstraße 25
33615 Bielefeld

Telefon: +49 (0)521/106-2534
E-Mail: maik.stuehrenberg@uni-bielefeld.de

http://www.maik-stuehrenberg.de
http://www.xstandoff.net


