Re: [xml-dev] Best Practice: constrain an element's content by (1) a run-time selection of alternate types or (2) a run-time selection of child elements using an XPath expression?

From: Oliver Hallam <oliver@xqsharp.com>
To: xml-dev@lists.xml.org
Date: Tue, 12 May 2009 11:22:55 +0100

The XPath expression in your second example makes me somewhat
uncomfortable.  There is a lot of redundancy between the content model
and the XPath expression.  For example, "empty(* except
(Title[1],Date[1],Author,ISBN[1],Publisher[1]))" is exactly equivalent
to the absence of an <xs:any> element in the schema, and so is
completely redundant.

If I were going with the second approach, I would simplify the schema
type as follows.

First, I would move the condition that every publication has a Title and
Date into the content model, rather than leaving it in the assertion.  I
would then replace the empty(* except ...) expressions with a list of
the elements that must be absent.

<xs:element name="Publication">
   <xs:complexType>
      <xs:sequence>
           <xs:element name="Title" type="xs:string" />
           <xs:element name="Author" type="xs:string" minOccurs="0"
                       maxOccurs="unbounded"/>
           <xs:element name="Date" type="xs:gYear" />
           <xs:element name="ISBN" type="xs:string" minOccurs="0"/>
           <xs:element name="Publisher" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="kind" type="xs:string" />
      <!-- Only the kind-specific rules remain in the assertion -->
      <xs:assert test="if (@kind eq 'book') then ISBN and Publisher
                       else if (@kind eq 'magazine') then
                          empty(Author | ISBN | Publisher)
                       else empty(ISBN | Publisher)" />
   </xs:complexType>
</xs:element>

The assertion then covers only the conditions that are not already
enforced by the content model, and this (at least to me) is a lot more
readable.
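
As a rough illustration, here is how I would expect the simplified type
to treat a few instances, reading the assertion as a three-way
if / else if / else.  The first instance is Roger's book example; the
others are made up for this note.  Each one is meant to be a separate
document, and none of this has been run through a validator.

```xml
<!-- Expected valid: kind="book" with both ISBN and Publisher present -->
<Publication kind="book">
    <Title>Everything is Miscellaneous</Title>
    <Author>David Weinberger</Author>
    <Date>2007</Date>
    <ISBN>0-8050-8811-3</ISBN>
    <Publisher>Henry Holt and Company, LLC</Publisher>
</Publication>

<!-- Expected invalid: kind="book" without an ISBN, so the assertion
     "ISBN and Publisher" evaluates to false -->
<Publication kind="book">
    <Title>Everything is Miscellaneous</Title>
    <Date>2007</Date>
    <Publisher>Henry Holt and Company, LLC</Publisher>
</Publication>

<!-- Expected invalid: kind="magazine" with an Author, so
     "empty(Author | ISBN | Publisher)" evaluates to false -->
<Publication kind="magazine">
    <Title>Science News</Title>
    <Author>Hypothetical Author</Author>
    <Date>2005</Date>
</Publication>

<!-- Expected invalid whatever the kind: Date is missing, which the
     content model rejects before the assertion is even considered -->
<Publication kind="magazine">
    <Title>Science News</Title>
</Publication>
```

The last case is the point of the exercise: the missing Date is caught
by the content model, so the assertion never has to mention it.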

Having said that, my preference would still be to go for option #1.  
There is no need to restate the arguments made by Michael Kay.
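
For completeness, here is roughly what option #1 might look like when
written out in full.  Roger's post gives only the element declaration,
so the three type definitions below are my own guess at what he had in
mind (in particular, deriving MagazineType by restriction and BookType
by extension from a common PublicationType is an assumption).  Treat it
as an untested sketch.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- Assumed base type: what any publication may contain -->
   <xs:complexType name="PublicationType">
      <xs:sequence>
         <xs:element name="Title" type="xs:string"/>
         <xs:element name="Author" type="xs:string" minOccurs="0"
                     maxOccurs="unbounded"/>
         <xs:element name="Date" type="xs:gYear"/>
      </xs:sequence>
      <xs:attribute name="kind" type="xs:string"/>
   </xs:complexType>

   <!-- Assumed: a magazine is a publication with Author removed -->
   <xs:complexType name="MagazineType">
      <xs:complexContent>
         <xs:restriction base="PublicationType">
            <xs:sequence>
               <xs:element name="Title" type="xs:string"/>
               <xs:element name="Date" type="xs:gYear"/>
            </xs:sequence>
         </xs:restriction>
      </xs:complexContent>
   </xs:complexType>

   <!-- Assumed: a book adds a required ISBN and Publisher -->
   <xs:complexType name="BookType">
      <xs:complexContent>
         <xs:extension base="PublicationType">
            <xs:sequence>
               <xs:element name="ISBN" type="xs:string"/>
               <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
         </xs:extension>
      </xs:complexContent>
   </xs:complexType>

   <!-- The declaration from Roger's post: pick the type from @kind -->
   <xs:element name="Publication" type="PublicationType">
      <xs:alternative test="@kind eq 'magazine'" type="MagazineType"/>
      <xs:alternative test="@kind eq 'book'" type="BookType"/>
   </xs:element>

</xs:schema>
```

When @kind is neither 'magazine' nor 'book', no alternative matches and
the declared type PublicationType applies, which allows Title, Author
and Date only.  Both alternative types are derived from PublicationType
because an alternative's type has to be validly substitutable for the
declared type.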


Oliver Hallam
http://www.xqsharp.com


Michael Kay wrote:
> I think the best advice is probably: if you can do it conveniently using CTA
> (as you can here), then do. Otherwise use assertions.
>
> There are a number of reasons for this.
>
> (1) What is sometimes called the "rule of least power": don't use a chainsaw
> to snap a twig.
>
> (2) More concretely:
>
> (2a) A schema validator is more likely to adopt a streaming implementation
> for CTA than for assertions
>
> (2b) The schema validator is likely to produce better diagnostics if you
> describe the constraint using CTA
>
> (2c) You are likely to get a more precise type annotation on the element if
> you use CTA, which gives benefits when writing schema-aware stylesheets and
> queries.
>
> Regards,
>
> Michael Kay
> http://www.saxonica.com/
> http://twitter.com/michaelhkay 
>
>
>   
>> -----Original Message-----
>> From: Costello, Roger L. [mailto:costello@mitre.org] 
>> Sent: 11 May 2009 19:53
>> To: 'xml-dev@lists.xml.org'
>> Subject: [xml-dev] Best Practice: constrain an element's 
>> content by (1) a run-time selection of alternate types or (2) 
>> a run-time selection of child elements using an XPath expression?
>>
>>
>> Hi Folks,
>>
>> Consider this book publication:
>>
>>     <Publication kind="book">
>>         <Title>Everything is Miscellaneous</Title>
>>         <Author>David Weinberger</Author>
>>         <Date>2007</Date>
>>         <ISBN>0-8050-8811-3</ISBN>
>>         <Publisher>Henry Holt and Company, LLC</Publisher>
>>     </Publication>
>>
>>
>> Next, consider this magazine publication:
>>
>>     <Publication kind="magazine">
>>         <Title>Science News</Title>
>>         <Date>2005</Date>
>>     </Publication>
>>
>>
>> Notice the *kind* attribute in both examples.
>>
>> If its value is 'book' then the content of <Publication> is:
>>
>>     - Title
>>     - Author
>>     - Date
>>     - ISBN
>>     - Publisher
>>
>> And if its value is 'magazine' then the content of <Publication> is:
>>
>>     - Title
>>     - Date
>>
>>
>>
>> PROBLEM STATEMENT
>>
>> What is best practice for constraining the content of Publication?
>>
>>
>>
>> XML SCHEMA 1.1 PROVIDES TWO APPROACHES
>>
>> XML Schema 1.1 provides two approaches to constraining the 
>> content of the <Publication> element.
>>
>>
>>
>> APPROACH #1: ALTERNATE TYPES
>>
>> Create a BookType and a MagazineType and then select one of 
>> them to be Publication's type depending on @kind:
>>
>>    if @kind = 'book' then select BookType
>>    else select MagazineType
>>
>>
>> Here's how it is expressed in XML Schema 1.1:
>>
>>    <xs:element name="Publication" type="PublicationType">
>>       <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
>>       <xs:alternative test="@kind eq 'book'" type="BookType" />
>>    </xs:element>
>>
>>
>> You see the (new) <alternative> element being used to select 
>> a type for Publication based on the value of @kind. 
>>
>>
>> (I don't show the complexType definition for MagazineType and 
>> BookType because it's the same as in XML Schema 1.0.)
>>
>>
>>
>> APPROACH #2: XPATH EXPRESSION
>>
>> Let the content of Publication be a collection of all the 
>> elements (both book elements and magazine elements) and set 
>> them optional: 
>>
>>     - Title (0,1)
>>     - Author (0, unbounded)
>>     - Date (0,1)
>>     - ISBN (0,1)
>>     - Publisher (0,1)
>>
>>
>> Then create an XPath expression that selects the set of 
>> children for Publication depending on the value of @kind:
>>
>>    if (@kind eq 'book') then
>>       Title and Date and ISBN and Publisher and 
>>           empty(* except (Title[1],Date[1],Author,ISBN[1],Publisher[1]))
>>    else
>>       if (@kind eq 'magazine') then
>>          Title and Date and 
>>            empty(* except (Title[1],Date[1]))
>>       else
>>          true()
>>
>>
>> Here's how it is expressed in XML Schema 1.1:
>>
>> <xs:element name="Publication">
>>     <xs:complexType>
>>        <xs:sequence>
>>             <xs:element name="Title" type="xs:string" minOccurs="0"/>
>>             <xs:element name="Author" type="xs:string" minOccurs="0"
>>                         maxOccurs="unbounded"/>
>>             <xs:element name="Date" type="xs:gYear" minOccurs="0"/>
>>             <xs:element name="ISBN" type="xs:string" minOccurs="0"/>
>>             <xs:element name="Publisher" type="xs:string" minOccurs="0"/>
>>        </xs:sequence>
>>        <xs:attribute name="kind" type="xs:string" />
>>        <xs:assert test="if (@kind eq 'book') then
>>                           Title and Date and ISBN and Publisher and
>>                           empty(* except (Title[1],Date[1],Author,ISBN[1],Publisher[1]))
>>                         else
>>                             if (@kind eq 'magazine') then
>>                                Title and Date and
>>                                empty(* except (Title[1],Date[1]))
>>                             else
>>                                Title and Date and
>>                                empty(* except (Title[1],Date[1],Author))" />
>>     </xs:complexType>
>> </xs:element>
>>
>>
>> You see that the content of Publication is all the book and 
>> magazine elements and they are optional. 
>>
>> You see an XPath expression within the (new) <assert> element 
>> being used to constrain which child elements are allowed 
>> within Publication based on the value of @kind.
>>
>>
>>
>> TWO APPROACHES
>>
>> You have seen two ways of solving the problem of constraining 
>> the content of Publication:
>>
>> (a) Run-time selection of alternate types
>>
>> (b) Run-time selection of child elements using XPath
>>
>>
>>
>> DEFINITION OF "RUN-TIME" 
>>
>> By "run-time" I mean that the content of Publication is not 
>> determined until an instance document is validated against a schema.
>>
>>
>>
>> WHICH IS BEST PRACTICE?
>>
>> Which approach is best practice? 
>>
>> What are the pros and cons of each approach?
>>
>>
>> /Roger
>> ______________________________________________________________
>> _________
>>
>> XML-DEV is a publicly archived, unmoderated list hosted by 
>> OASIS to support XML implementation and development. To 
>> minimize spam in the archives, you must subscribe before posting.
>>
>> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
>> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
>> subscribe: xml-dev-subscribe@lists.xml.org List archive: 
>> http://lists.xml.org/archives/xml-dev/
>> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>>
>>     
>
>

References:
- Best Practice: constrain an element's content by (1) a run-time selection of alternate types or (2) a run-time selection of child elements using an XPath expression?
  - From: "Costello, Roger L." <costello@mitre.org>
- RE: [xml-dev] Best Practice: constrain an element's content by (1) a run-time selection of alternate types or (2) a run-time selection of child elements using an XPath expression?
  - From: "Michael Kay" <mike@saxonica.com>
