Hi Folks,
Scenario: You are designing an XML Schema for validating XML instances that contain Book data. Each Book element contains Title, Author, Date of Publication, and ISBN. Some Books have multiple Authors. In your current environment, in your current worldview, no Book has more than 10 Authors. So you constrain the Author element to maxOccurs="10":
<xs:element name="Author" maxOccurs="10" type="…"/>
But what if, in the future, there are Books with 100 Authors? XML instances containing those Books will fail validation. Should you set maxOccurs to "unbounded"?
No!
Here’s why:
You want to be informed when the world has changed, when the choices you made are no longer relevant. A world in which Books have 10 times more Authors than you thought they would is a different world, and you want to know about it. With maxOccurs="unbounded", a Book with 100 Authors would validate silently and you would never find out. Your XML Schema was originally written with one worldview; if validation starts failing, that means you have to rethink your initial assumptions.
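To make the scenario concrete, here is a minimal sketch of how the bounded Author declaration might sit inside a complete Book declaration. The simple types (xs:string, xs:date) and the element name DateOfPublication are assumptions for illustration only; the scenario above does not specify them.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <!-- Types below are assumed for illustration -->
        <xs:element name="Title" type="xs:string"/>
        <!-- Deliberate limit: minOccurs defaults to 1, so 1 to 10 Authors are allowed -->
        <xs:element name="Author" type="xs:string" maxOccurs="10"/>
        <!-- Element name and type are assumptions -->
        <xs:element name="DateOfPublication" type="xs:date"/>
        <xs:element name="ISBN" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, a Book with an 11th Author fails validation. That failure is the notification you want: it tells you the world no longer matches the assumptions the schema was built on.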
There is an analogous situation in programming. Should you constrain the size of an array or make it variable length? Here’s what John Carmack says (https://youtu.be/I845O57ZSy4?t=4005):
I'm kind of fond in a lot of cases of static array size declarations. I went through this period where we should just make everything variable length because I had this history in the early days where Doom had some fixed limits on it and then everybody started making crazier and crazier things and they kept bumping up the different limits -- this many lines, this many sectors -- and it seemed like a good idea that we should just make it completely generic so it can go up to whatever. There are cases where that's the right thing to do, but the other aspect of the world changing around you is it's good to be informed when the world has changed more than you thought it would. If you've got a continuously growing collection, you're never going to find out. You might have this quadratic slowdown on something where you thought “Oh, I'm only ever going to have a handful of these,” but something changes and there's a new design style and all of a sudden you've got 10,000 of them. So I kind of like in many cases picking a number, some nice round power of two, and setting it up in there and having an assert saying “Hey, if you hit this limit, I need to know.” When that occurs, you should probably think: “Are the choices that I've made around all of this still relevant if somebody's using 10 times more than I thought they would? This code was originally written with this kind of world view, with this kind of set of constraints, and I was thinking of the world in this way.” If something breaks that means I’ve got to rethink the initial stuff.