There's a lot of good content in this thread, and I think it's important to continue, because it susses out one of the other aspects of such big waterfall productions (even when the shop involved is ostensibly using scrum :-).
"Unfortunate things happened."
In corporate speak, this is the closest you'll ever see to an admission of guilt. A simple assertion that errors occurred, without any subsequent explanation of who was responsible, what those errors were, why they were made, or why they weren't fixed. Admission of liability means that you assume that liability, which in turn means that your shareholders don't get their dividend checks this month and they stop investing in your company. That's also human nature, but one consequence of it is that people never learn from their mistakes.
As I said earlier, I have to be very circumspect myself in what I'm saying right now, as this has the potential to get into legal territory, so if it seems like I'm speaking in generalities, it's because I am trying to pick my words carefully.
Building a common data model framework at the outset is critical. The details of the individual components don't need to be worked out at this point, but understanding the relationships between components is a necessary first step before you can even begin architecting a solution, let alone start writing code. What's more, agreeing on how this information is going to get expressed - what set of rules you're going to use to determine the characteristics of that data model - is even more important. If one part of an organization is using NIEM, another is using EDIFACT standards, and a third is just making up models on the fly (using schemas they found doing Google searches on the web), then you will have trouble when these silos have to communicate, because the framework of how you establish the models can dictate how the model is put together (and what you can express).
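To make that concrete (the classes and field names below are purely illustrative, not taken from any actual system), here is a minimal sketch of what happens when two silos model the same applicant under different framework conventions and something in the middle has to reconcile them:

```java
// Hypothetical sketch: two silos modelling the same "applicant" concept
// under different framework conventions. All names are illustrative only.
public class SiloModels {

    // Silo A: NIEM-style, structured name plus a role-based identifier.
    record NiemPerson(String givenName, String surName, String personNationalId) {}

    // Silo B: ad-hoc model found "on the web" - single name field, local row key.
    record AdHocApplicant(String fullName, int rowId) {}

    // A broker between the silos is forced to guess at semantics:
    // how to split a name, which identifier is authoritative, and so on.
    static NiemPerson toNiem(AdHocApplicant a) {
        String[] parts = a.fullName().split(" ", 2);              // lossy guess
        String given = parts[0];
        String sur   = parts.length > 1 ? parts[1] : "";
        return new NiemPerson(given, sur, "LOCAL-" + a.rowId());  // invented identifier
    }

    public static void main(String[] args) {
        System.out.println(toNiem(new AdHocApplicant("Pat Q. Example", 42)));
    }
}
```

Every such bridge encodes guesses, and those guesses are exactly where the silos stop agreeing on what the data means.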
Agreement on global standards also applies to resource identification. This was not a complex project, but it included some critical wrinkles. It is distributed, meaning that you cannot count on internal identifiers generated by one database being uniform across the whole system. It involved creating proxies - I want to shop for insurance that fits my price range and eligibility requirements, but I do not want the government, my employer, or my neighbor to know who I am until I have committed to a plan (surprising how big a factor this one was). You have to identify plans and policies and providers and agencies and exchanges and children and employers and ... indeed, this was largely a resource management problem. Treat each of these as separate binaries turning their own cogs and you will run into problems when two different vendors implement these classes in different ways and then fail to communicate those differences in behavior to the other members of the exchange. This means that you have to keep your code modeled as a dynamic system, and your resources need to be treated abstractly. Good XSLT developers know this problem intrinsically, and also understand that unless you have a consistent global identification mechanism, you have no context on which to build.
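A minimal sketch of what a consistent global identification mechanism can look like (the urn:exchange: scheme here is an assumption for illustration, not anything prescribed): mint location-independent identifiers up front, rather than leaning on whatever keys each vendor's database happens to generate:

```java
import java.util.UUID;

// Minimal sketch: mint a globally unique, location-independent identifier
// for a resource, rather than relying on a database's local auto-generated key.
// The "urn:exchange:" scheme is purely illustrative.
public class ResourceIds {

    static String mintResourceUri(String resourceType) {
        return "urn:exchange:" + resourceType + ":" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        String planUri   = mintResourceUri("plan");
        String personUri = mintResourceUri("person");  // can be proxied without exposing identity
        System.out.println(planUri);
        System.out.println(personUri);
    }
}
```

Once every plan, policy, provider and person has a handle that means the same thing in every silo, proxying becomes manageable: you can pass around the handle without passing around the identity.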
Information is never complete, and this means that you have to persist resources over time. Income needs to be verified with the IRS, immigration status needs to go to DHS, eligibility needs to be determined. You have to know whether a person already applied for a program and was rejected, you need to know when that person becomes eligible for programs they weren't eligible for before, and so forth - and this means that over time you end up building a picture of that person in your data system. You can hide certain information, of course, and for privacy reasons you have to, but that makes the ability to maintain some persistent handle to that person more critical, not less. Building a messaging system without understanding that those messages are basically changing the state of knowledge about a given person, plan or requirement is similarly questionable. It's much like building railroad tracks and a network without worrying about little things like gauge widths or whether you're transporting shipping containers vs. nuclear waste.
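One way to picture that (again, a sketch with invented event names, not the actual design): treat each incoming message as a recorded change to the state of knowledge about a persistent resource, keyed by its stable identifier, so that eligibility can be re-derived as new facts arrive:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch: each incoming message is recorded as a state change against a
// persistent person resource, so eligibility can be re-evaluated as new
// facts (income verified, status confirmed, ...) accumulate over time.
// Event names and fields are illustrative, not from any actual system.
public class PersonStateLog {

    record StateChange(String personUri, String fact, Instant at) {}

    private final List<StateChange> log = new ArrayList<>();

    void record(String personUri, String fact) {
        log.add(new StateChange(personUri, fact, Instant.now()));
    }

    List<StateChange> historyFor(String personUri) {
        return log.stream().filter(c -> c.personUri().equals(personUri)).toList();
    }

    public static void main(String[] args) {
        PersonStateLog store = new PersonStateLog();
        store.record("urn:exchange:person:123", "income-verified");
        store.record("urn:exchange:person:123", "eligibility-denied:medicaid");
        store.historyFor("urn:exchange:person:123").forEach(System.out::println);
    }
}
```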
Understanding the distinction between data and application layers (and what those things really mean) is also important. I think one reason that programmers (especially Java programmers) have so much trouble with MarkLogic (or eXist, or increasingly a lot of NoSQL systems) is that there has been a steady shift in thinking over the last twenty years, away from the notion that databases only hold content and toward the notion that certain tasks can be better accomplished by a dedicated application layer sitting above the actual data storage system than by ad-hoc programming processes outside the system. Data validation, transformations, content processing pipelines, rules-based action systems, reporting, auditing and versioning, notification, translation, packaging and depackaging, inferencing, content enrichment, serving web application content, and so forth ... all of these things are implemented in MarkLogic, and are increasingly becoming "standard" features for NoSQL systems in particular.
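Here is what "letting the data layer do the work" can look like from the client side. The endpoint shape is modeled loosely on the kind of REST interface MarkLogic exposes, but treat the host, document URI and transform name as assumptions for the sake of the example: the server applies a named, server-side transform as the document is fetched, rather than the client hauling raw XML into Java and reprocessing it there.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: ask the data layer to apply a named, server-side transform when a
// document is retrieved, instead of pulling raw XML into Java and reprocessing
// it there. Host, document URI and transform name are illustrative assumptions.
public class ServerSideTransform {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8000/v1/documents"
                        + "?uri=/plans/plan-123.xml"
                        + "&transform=summarize-plan"))  // transform runs inside the database
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());  // already transformed when it arrives
    }
}
```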
These are "application layer" things in the sense that they are not actually involved with the database activities of queries or indexing, but because they are within a "database" server, there is a tendency among Java developers to want to ignore this facility and "roll" their own, because everyone knows that Java is better at all these things (not taking into account the fact that any Java process has to automatically add the expensive tasks of data serialization and parsing via a JAXB representation into objects that are heavily requirement upon an extant schema that often is not written with object serialization in mind and that may change daily in development realities.
Indeed, the worst possible way to use MarkLogic or most NoSQL data systems is unfortunately the way that most developers use them - for storing and retrieving static XML. If you never query your XML, if you never need to transform your XML, if you never need to worry about versioning or archiving or validating it, then you're better off just storing the XML in a file system. Of course, this puts the onus of doing all of this on you as the developer, and after a while you get complex Rube Goldberg-like systems that spring up, because there are always exceptions and because you treat this XML differently from that XML (and you treat it differently than the developer in the next cubicle does). You place a high reliance upon schemas, despite the fact that schema creation is itself a remarkably fine art, the number of people working with XSD 1.1 and constraint modeling is still vanishingly small, and most effective schemas are polyphasic - validation changes based upon workflow context.
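To spell out what "polyphasic" means in practice (the phase names and schema file names below are made up for illustration), the same application document gets held to different constraints depending on where it sits in the workflow:

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import java.io.File;
import java.util.Map;

// Sketch of polyphasic validation: the same application document is checked
// against different constraints at intake, at eligibility determination, and
// at enrollment. Phase names and schema file names are illustrative only.
public class PhasedValidation {

    static final Map<String, String> SCHEMA_FOR_PHASE = Map.of(
            "intake",      "application-intake.xsd",       // minimal required fields
            "eligibility", "application-eligibility.xsd",  // income, household size, ...
            "enrollment",  "application-enrollment.xsd");  // plan selection, payment

    static void validate(File document, String phase) throws Exception {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new File(SCHEMA_FOR_PHASE.get(phase)));
        schema.newValidator().validate(new StreamSource(document));
    }

    public static void main(String[] args) throws Exception {
        validate(new File("application-42.xml"), "eligibility");
    }
}
```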
By the way, I would say that the same thing applies to JSON. JSON has a different core data model, but from a processing standpoint, the value of a JSON database comes in its ability to query, validate and transform its content. MarkLogic is a pretty decent JSON store as well, though I'd like to see a CoffeeScript layer built into MarkLogic at some point, just so that it's more accessible to JavaScript developers.
Add to this the fact that when you take the above pieces out of the mix of Java development requirements, you remove a lot of the real need for Java developers in the first place. When your business model is predicated on employing large numbers of developers who can be charged at inflated rates, because it is a time-and-materials government contract that you believe will not be heavily scrutinized because it advances an administration priority, MarkLogic can be seen as a real threat.
Now, I'm a consultant, but I believe that you create value (and get more work) by delivering value quickly and thoroughly, something that my consultancy generally practices as well. It's why I look for tools that allow me to provide that value, so that I can concentrate on the harder problems of developing cohesive information strategies for my clients. MarkLogic for me is one of those tools, and for all that I can occasionally be critical of specific decisions MarkLogic makes, I would still recommend them unreservedly, because they have solved so many of the problems that are, frankly, data-centric or data-stream-centric, including a lot of the system integration problems that seem endemic to every organization I've worked with.
I think on this list in particular there is a tendency to ask: was XML at fault in a project like this? If you do not understand data modeling or data design or effective requirements gathering, if you don't understand XML or data interchange or NoSQL databases, and if you have a vested interest in not knowing these things, then yes, XML was at fault - though by that same logic, Java was at fault too.
Kurt