OASIS Mailing List Archives

   Re: [xml-dev] Constrain the Number of Occurrences of Elements in your XML


No, I am arguing that practical capacity/performance limitations are 
different from the theoretical data model limitations, and that we need to 
express both.  The capacity limitations of significant systems are not 
transitory: they persist for a long time because they are based on SLAs 
established for those systems on well-defined hardware, and those SLAs 
cost money when they are not met.

The majority of XML standards activity and B2B XML application 
development is not aimed at things likely to be processed on 
my son's desktop machine or on my laptop; it is aimed at larger servers 
and higher transaction rates.  What I am saying is that the majority of 
XML standardisation activity and bespoke code is not dealing with HTML or 
RSS, so <p> elements and <item> elements are not the best examples. 
Better examples would be invoice line items, participant addresses, or 
criminal charges: things that may (and usually do) have strong 
performance criteria associated with them.  That said, let's run with 
your example for a minute.
Sure, the performance of internet client hardware and software rises 
over time, but usability guidelines for web pages still specify an upper 
limit on page size, and that limit is not changing at the rate Moore's 
law would suggest (the US is still around 50% dial-up internet access; I am 
extrapolating from comments by Nielsen last year).  In any case, 
internet client hardware is not really where the majority of business 
XML processing takes place.
When we design websites we (should) identify a target page size and 
communicate it to our developers, which then places an implicit 
limit on the number of elements in the document (the Microsoft homepage 
is about 20k of HTML).  The cost associated with web page access is 
largely network transit time, so physical page size is a good measure of 
performance for most pages; and in any case, if a page doesn't load fast, 
all it does is annoy the users.

The processing cost associated with XML documents in a business context 
is more likely to be tied to the number of elements in the document (who 
cares, within limits, how long an invoice takes to arrive? but businesses 
do care about getting through their 50k invoices per day), and taking 
too long to process them will cost someone money.  So why are we putting 
the page size/document element count outside of the system?  With HTML 
we have no choice, and in any case the page size is under our control 
because we construct each page and send it out, so we design and test 
our own pages.  With XML we need to tell our partners what is acceptable 
so that we can process what they send us as quickly as we say we will 
(see the example I gave of the EDI profile).  Those limits will not go 
away.  Major corporations will not change their behaviour.  The question 
is whether there will be tool support for those constraints, and whether 
I have to write code to structurally validate something that already has 
its structure partially specified by a schema.
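To be concrete, the kind of occurrence limit I have in mind is already expressible in W3C XML Schema via maxOccurs; a minimal sketch, assuming a hypothetical invoice vocabulary (the Invoice and LineItem names and the LineItemType declaration are invented for illustration, and the cap of 200 is an SLA-derived number, not part of the data model):

```xml
<!-- Fragment of a hypothetical schema: the cap of 200 comes from an
     agreed service level, not from the logical data model -->
<xs:element name="Invoice">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="LineItem" type="LineItemType"
                  minOccurs="1" maxOccurs="200"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Any validating parser that honours maxOccurs then enforces the limit for free; the question is whether trading partners agree to carry such limits in a shared or locally-profiled schema.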

If we don't have tool support for this (I am thinking of structural 
validation), then we will have the same problems that occur with EDI: 
lots of standards that are modified by their users in small ways without 
consistent tool support, which ultimately reduces the usability 
and interoperability of those standard data structures.

The question from my earlier email is still there:

If we don't put these limits in the schema, they just have to go 
somewhere else, somewhere less visible, less maintainable, and with less 
tool support. How SHOULD we do this if we aren't using the schema for 
validation of these constraints?
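If the answer is "not in the schema", then every receiver ends up writing ad hoc checks by hand. A minimal sketch of what that hand-rolled structural validation looks like (Python, using the same invented Invoice/LineItem vocabulary and SLA-derived cap of 200 as an assumption):

```python
import xml.etree.ElementTree as ET

# Hypothetical limit drawn from an SLA, not from the data model:
# we have promised to process invoices of at most 200 line items.
MAX_LINE_ITEMS = 200

def check_invoice_limits(xml_text: str) -> bool:
    """Return True if the invoice respects the agreed occurrence limit."""
    root = ET.fromstring(xml_text)
    # Count direct LineItem children of the Invoice root element.
    line_items = root.findall("LineItem")
    return len(line_items) <= MAX_LINE_ITEMS

small_invoice = "<Invoice>" + "<LineItem/>" * 3 + "</Invoice>"
big_invoice = "<Invoice>" + "<LineItem/>" * 500 + "</Invoice>"
```

The point is not that this code is hard to write, but that it lives outside the schema: it must be duplicated by every partner, kept in sync with the schema by hand, and it gets none of the tooling (editors, generators, validators) that schema-visible constraints enjoy.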


Dare Obasanjo wrote:

>You seem to be arguing that transient system limitations (memory constraints, CPU processing power, etc.) should end up making their way into schemas or formal specifications. Using that argument, what should have been the limit on <p> elements in HTML or <item> elements in RSS, based on the typical machine's processing power from the 1990s? After putting those limitations in their schemas, would we have had to rev them every 18 months to account for Moore's law? 



Copyright 2001 XML.org. This site is hosted by OASIS