
   Re: [xml-dev] Constrain the Number of Occurrences of Elements in your XM

Someone mentioned the word 'combinatorial', but I couldn't find the e-mail 
it related to, so it may refer to a different combinatorial problem from 
the one I'm thinking of.

The problem I've encountered with fixing limits arises when multiple nested 
structures each have non-unity cardinality.  For example, a book may have a 
number of volumes, which have a number of chapters, which have a number of 
sections, which have a number of sub-sections, which have a number of 
paragraphs, each of which has a limit on its maximum number of characters 
(e.g. a string constraint).

If you assign a reasonable number to each level, by the time you multiply it 
all out, you get a huge number that probably doesn't help you.  If you try 
to limit the maximum size to something useful, your 'local' limits end up 
too small to be useful.

This obviously isn't a problem in every case, but I find it surprising how 
quickly such combinatorial problems take effect.
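
To put rough numbers on the multiplication described above, here is a small 
Python sketch.  The per-level limits and the 10,000,000 character budget are 
made-up values chosen purely for illustration, and the function names 
worst_case_size and uniform_local_limit are hypothetical, not part of any 
schema tool.

```python
from math import prod

# Hypothetical per-level limits (illustrative only, not from any real schema).
limits = {
    "volumes per book": 10,
    "chapters per volume": 50,
    "sections per chapter": 20,
    "sub-sections per section": 20,
    "paragraphs per sub-section": 50,
    "characters per paragraph": 5000,
}


def worst_case_size(level_limits):
    """Multiply the per-level maxima to get the largest total the limits allow."""
    return prod(level_limits.values())


def uniform_local_limit(total_budget, levels):
    """Largest equal per-level limit whose product stays within total_budget."""
    n = int(round(total_budget ** (1.0 / levels)))
    # Correct for floating-point error in the root calculation.
    while n ** levels > total_budget:
        n -= 1
    while (n + 1) ** levels <= total_budget:
        n += 1
    return n


# Characters in the largest book these "reasonable" local limits would permit.
print(worst_case_size(limits))
# Equal per-level limit that keeps the total within 10,000,000 characters.
print(uniform_local_limit(10_000_000, len(limits)))
```

With these assumed figures the individually modest limits multiply out to 
50,000,000,000 characters, while capping the total at 10,000,000 characters 
with equal limits at all six levels leaves only 14 per level, which is 
clearly too small to be useful for chapters or paragraph lengths.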

On the other hand, a case for fixed limits might be a CPU state, such as the 
number of registers in an x86.

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
                         for XML to C++ data binding visit
                         http://www.tech-know-ware.com/lmx
                         (or http://www.xml2cpp.com)
=============================================

----- Original Message ----- 
From: "Roger L. Costello" <costello@mitre.org>
To: "'XML Developers List'" <xml-dev@lists.xml.org>
Sent: Friday, August 05, 2005 6:52 PM
Subject: [xml-dev] Constrain the Number of Occurrences of Elements in your 
XML Schema


> Hi Folks,
>
> Below I have jotted down a few thoughts regarding XML Schemas which permit
> an unbounded number of occurrences.  Namely, I recommend against using
> maxOccurs="unbounded" in an XML Schema.  I am interested in hearing your
> thoughts on this.  /Roger
>
>
> Constrain the Number of Occurrences of Elements in your XML Schema
>
> by Roger L. Costello
> August 5, 2005
>
> Constrain your Data!
>
>
> In this message I will argue that you should never create XML Schemas that
> permit an unbounded number of occurrences.
>
> There are two ways in XML Schemas to permit an unbounded number of
> occurrences. The first way is to explicitly state that you are permitting
> an unbounded number of occurrences. For example, this declaration says that
> Bookstore can contain an unbounded number of Book elements:
>
> <element name="Bookstore">
>    <complexType>
>        <sequence>
>            <element name="Book" type="..." maxOccurs="unbounded"/>
>        </sequence>
>    </complexType>
> </element>
>
>
>
> The second way of permitting an unbounded number of occurrences is less
> obvious. Unboundedness occurs implicitly when you create a recursive
> structure. In this example there is no limit to the depth of the Section
> elements. That is, a Section can contain a Section which contains a
> Section which contains a Section ...
>
> <element name="Section" type="SectionType"/>
>
> <complexType name="SectionType">
>    <sequence>
>        <element name="Title" type="..."/>
>        <element name="Section" type="SectionType" minOccurs="0"/>
>    </sequence>
> </complexType>
>
>
>
> Both of the above forms permit an unbounded number of occurrences. I
> recommend that you never use either form. That is, never declare an
> element with maxOccurs="unbounded", and never declare a recursive
> structure. Below I explain why.
>
>
> Writing a Journal Article? Your Word Count is Limited!
>
>
> The situation with specifying the number of occurrences of an element in
> an XML Schema is analogous to the situation with specifying the number of
> words authors can use in an article.
>
> Suppose that you want to write an article for a journal. How many words
> can you use in your article? All journals have an upper limit on the number
> of words that you can use. Why don't the journals set the word limit to
> unbounded? Answer: there are editors that have to check the articles for
> correctness, readability, etc. The editors have limited resources (i.e.,
> time). Thus, it is necessary to limit the length of the article. Perhaps
> at a later date the journal will increase the word limit (perhaps they hire
> some full-time editors). But they always have a definite upper limit. They
> never allow articles of unbounded length. The reason is because of limited
> resources.
>
>
> Error! Infinite Loop!
>
>
> The situation with specifying the number of occurrences of an element in
> an XML Schema is analogous to an infinite loop in programming languages.
> Why are infinite loops deemed "bad" in programming languages, yet unbounded
> occurrences are embraced in data?
>
> Let's see why infinite loops are bad in programming languages. Suppose
> that a program has a loop, and a computer begins to process the loop. It
> requires a certain amount of resources (memory, cpu cycles) for the
> computer to perform one iteration. Two iterations will require a bit more
> resources. Three iterations require still more. ... Infinite iterations
> require infinite resources. Thus, infinite loops are bad because they
> require infinite resources.
>
> The situation is analogous with data. Consider the Bookstore declaration
> above. It declares that an unbounded number of Book elements are permitted
> within Bookstore. A program that must process XML instances conforming to
> the declaration must have the necessary resources (memory, cpu cycles). To
> process one Book element will require a certain amount of resources. To
> process a second Book element will require a bit more resources. A third
> book will require still more resources. ... Infinite Books require
> infinite resources. Even though XML instance documents are always finite,
> the schema indicates that there is a "potential" for an infinite number of
> Book elements. A program that is designed to process "any" XML instance
> document that conforms to the schema must therefore have an infinite amount
> of resources.
>
>
> Okay, then what Value should I use for maxOccurs?
>
>
> "Suppose that I anticipate that Bookstore will never have more than 30,000
> Books, so I set maxOccurs='30000'. After some time the requirements change
> and Bookstore now needs to be able to hold 35,000 Books. Won't I have to
> change the Schema every time my needs change? Wouldn't it be easier if I
> simply declared maxOccurs='unbounded'?"
>
> Answer: yes, you will need to change the Schema whenever your requirements
> change. Yes, it is easier to simply declare maxOccurs='unbounded'. But
> don't do it! The number that you use for maxOccurs should be as big as your
> programs are willing and able to cope with, and no more. If at some point
> the number of actual books exceeds that number then you must either (1)
> extend your program's resources to handle the expanded number, or (2)
> refuse to allow more books.
>
>
> Recap
>
>
> 1. Don't use maxOccurs="unbounded"
>
> 2. Don't use recursive constructions
>
> 3. Set maxOccurs to a number no larger than the amount of resources you
> have available
>
> 





 
