   Re: [xml-dev] Constrain the Number of Occurrences of Elements in your XML


I'd argue that there are two kinds of constraint here.  One is the 
theoretical data model, and I'm inclined to agree that too much detail 
there is a bad thing.  The second is a statement of what the 
actual system is willing or able to do.  We need both kinds of 
constraint but are forced to express them using the same attributes, 
and that is where the problem arises: a conflict between system 
specification and data model.

The second type of constraint, the practical capacity constraint, is 
important if we are to distribute and/or enforce statements about 
performance and capacity.  For example, I may treat maxOccurs="3" 
differently from maxOccurs="unbounded".  "Unbounded" may be correct 
for the data model while 3 is the real operational limit.  The example 
of the number of <p> elements that Mozilla can handle illustrates this.  
If, in some hypothetical system, I have some hard browser UI performance 
requirement, that requirement will be on specified hardware with 
specified software and I will feel free to specify the number of <p>s 
based on that, regardless of the data model for the English language 
saying that maxOccurs is unbounded for paragraphs.  The idea that we 
have to handle anything that is thrown at us means that we have no 
performance constraints, that we have no software cost constraints, or 
that we must keep any amount of capacity sitting around waiting for 
arbitrarily large documents. 

There was an example of image processing given earlier, talking about 
TIFFs and file sizes, but in any production system there should be 
something that says that the file format is TIFF AND also something that 
says that the maximum size is 1GB (or whatever it is).  Businesses 
cannot undertake to process things that consume arbitrary amounts of 
system resource.

The problem of public standards is an interesting one and experience 
with EDI is worth looking at.  Typically the basic EDI message type is 
constrained further, outside of the EDI specification, by a local, 
partner specific profile, before it can be used for interchange.  A few 
months ago I was looking at the EDI profile for interchange with a very 
large retail chain.  This profile specified a subset of the formal 
standard: they specified exactly how many occurrences of what were 
acceptable and what literal values were acceptable where.  This wasn't 
theoretical data model stuff, this was "do it this way and we will do 
business with you".  Isn't this what we need to be able to support in  
XML?  If we don't put these limits in the schema, they just have to go 
somewhere else, somewhere less visible, less maintainable, and with less 
tool support. How SHOULD we do this if we aren't using the schema for 
validation of these constraints?
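
One possible answer, sketched here only as an illustration: keep the shared schema permissive and check a separate, partner-specific profile in a second pass.  Everything named below is hypothetical (the PROFILE table, the check_profile function, the <order>, <line> and <note> elements and their limits); none of it comes from a real EDI profile or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical partner profile: (parent tag, child tag) -> (min, max) occurrences.
# The limits are invented for illustration only.
PROFILE = {
    ("order", "line"): (1, 3),
    ("order", "note"): (0, 1),
}

def check_profile(xml_text, profile=PROFILE):
    """Return a list of human-readable violations of the occurrence limits."""
    root = ET.fromstring(xml_text)
    violations = []
    for parent in root.iter():  # visits the root and every descendant
        for (parent_tag, child_tag), (lo, hi) in profile.items():
            if parent.tag != parent_tag:
                continue
            # findall with a bare tag name matches direct children only
            count = len(parent.findall(child_tag))
            if not lo <= count <= hi:
                violations.append(
                    f"<{parent_tag}> has {count} <{child_tag}>, expected {lo}..{hi}"
                )
    return violations

doc = "<order><line/><line/><line/><line/></order>"
print(check_profile(doc))
```

The data model can still say "unbounded" while the profile carries the operational number, but the limits now live outside the schema, which is exactly the visibility and tooling cost raised above.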

Joe English wrote:

>Roger L. Costello wrote:
>>Below I have jotted down a few thoughts regarding XML Schemas which permit
>>an unbounded number of occurrences.  Namely, I recommend against using
>>maxOccurs="unbounded" in an XML Schema.  I am interested in hearing your
>>thoughts on this.
>My thoughts lead to the exact opposite conclusion:
>you should never use anything *except* maxOccurs="unbounded"
>(or maxOccurs="1") in a schema.
>With very few exceptions, any attempt to devise a suitable
>upper bound for any 'maxOccurs' value is bound to involve
>wild-ass-guessery.  How many paragraphs should one allow
>in an HTML document?  You can take this from a business logic
>standpoint ("what's the longest web page anyone is ever
>going to want to produce"), or a processing standpoint
>(how many <p> elements can Mozilla cope with?  What about
>MSIE?  Does your answer change depending on how old the
>user's computer is?), but you'll never be able to come up
>with a satisfactory number.  Whatever number you choose
>will either be too large as a meaningful resource constraint,
>or it will be too small for some existing or future document.
>What you advocate is reminiscent of the QUANTITY and CAPACITY
>sections of the SGML declaration.  These were a perpetual annoyance
>(the SGML declaration was the first thing that got dumped
>when XML was being designed), and as far as I know they
>never did anybody any good (i.e., they were never an accurate
>indication of how large a document any particular application
>could actually handle).
>>1.	Don't use maxOccurs="unbounded"
>>2.	Don't use recursive constructions
>>3.	Set maxOccurs to a number no larger than the amount of resources you
>>have available
>I'd argue the exact opposite, mostly because (3) is in
>practice impossible to answer, and rarely worth even trying
>to answer in the first place.
>There are only three sensible cardinalities: zero, one, and many.
>There are only four sensible cardinality constraints: mandatory,
>optional, mandatory+repeatable and optional+repeatable.
>(Corollary: "?", "+", and "*" operators as found in DTDs and Relax NG
>are far more appropriate than WXS' separate minOccurs= and maxOccurs=.)
>--Joe English
>  jenglish@flightlab.com
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://www.oasis-open.org/mlmanage/index.php>
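
For reference, the correspondence between the "?", "+" and "*" operators mentioned in the quoted message and minOccurs/maxOccurs pairs can be written down as a small table.  This is a minimal sketch; the OPERATORS table and the to_operator helper are names made up for this illustration, and the empty string is used here to stand for "no operator".

```python
# DTD / RELAX NG occurrence operator -> (minOccurs, maxOccurs) in W3C XML Schema.
# The empty string stands for "no operator", i.e. exactly one occurrence.
OPERATORS = {
    "":  ("1", "1"),          # mandatory
    "?": ("0", "1"),          # optional
    "+": ("1", "unbounded"),  # mandatory and repeatable
    "*": ("0", "unbounded"),  # optional and repeatable
}

def to_operator(min_occurs, max_occurs):
    """Return the matching operator, or None if the pair has no equivalent."""
    for op, bounds in OPERATORS.items():
        if bounds == (min_occurs, max_occurs):
            return op
    return None

print(to_operator("0", "unbounded"))
print(to_operator("1", "3"))
```

A pair such as minOccurs="1" maxOccurs="3" maps to None: it is precisely the kind of numeric bound that the operators cannot express and that this thread is arguing about.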



Copyright 2001 XML.org. This site is hosted by OASIS