Re: [xml-dev] Re: XML As Fall Guy

On Thu, Nov 28, 2013 at 12:51 PM, <cbullard@hiwaay.net> wrote:

An excellent response, Kurt, and thank you.

Well, friends, what lessons are there here?

len

Quoting Kurt Cagle <kurt.cagle@gmail.com>:

There's a lot of good content in this thread, and I think it's important to
continue, because it susses out one of the other aspects of such big
waterfall productions (even when the shop involved is ostensibly using
scrum :-).

"Unfortunate things happened."

In corporate speak, this is the closest you'll ever see to an admission of
guilt. A simple assertion that errors occurred, without any subsequent
explanation about who was responsible, what those errors were, why they
were made, why they weren't fixed and so forth. Admission of liability
means that you assume that liability, which in turn means that your
shareholders don't get their dividend checks this month and they stop
investing in your company. That's also human nature, but one consequence of
it is that people never learn from the mistakes.

As I said earlier, I have to be very circumspect myself in what I'm saying
right now, as this has the potential to get into legal territory, so if it
seems like I'm speaking in generalities, it's because I am trying to pick my
words carefully.

Building a common data model framework at the outset is critical. The
details of the individual components don't need to be worked out at this
point, but understanding the relationships between components is a
necessary first step before you can even begin architecting a solution, let
alone start writing code. What's more, *agreeing* on how this information
is going to get expressed, what set of rules you're going to use to
determine the characteristics of that data model, is even more important.
If one part of an organization is using NIEM, another using EDIFACT
standards, a third just making up models on the fly (using schemas that
they found doing Google searches on the web), then you will have trouble
when these silos have to communicate, because the framework of how you
establish the models can dictate how the model is put together (and what
you can express).

Agreement on global standards also applies to resource identification. This
was not a complex project, but it included some critical wrinkles - it is
distributed, meaning that you cannot count on internal identifiers
generated by one database being uniform across the whole system. It
involved creating proxies - I want to shop for insurance that fits my price
range and eligibility requirements, but I do not want the government, my
employer, my neighbor to know who I am until I have committed to a plan
(surprising how big a factor this one was). You have to identify plans and
policies and providers and agencies and exchanges and children and
employers and ... indeed, this was largely a resource management problem.
Treat each of these as separate binaries working in their own cogs and you
will run into problems when two different vendors implement these classes
in different ways and then fail to communicate those differences in behavior
to other members of that exchange. This means that you have to keep your code
modeled in a dynamic system and your resources need to be treated
abstractly. Good XSLT developers know this problem intrinsically, and also
understand that unless you have a consistent global identification
mechanism, you have no *context* on which to build.
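
As a rough illustration of the identification point above, here is a
minimal Python sketch of turning silo-local keys into system-wide
identifiers. The base URI, the authority names and both function names are
assumptions made up for this example, not anything from the actual project,
and the derived handle is only a stable label, not a privacy mechanism.

```python
import uuid

# Hypothetical base URI for the exchange (an assumption for illustration).
BASE = "https://exchange.example.org/id"


def global_id(authority: str, kind: str, local_id: str) -> str:
    """Qualify a silo-local key with the authority that issued it and the
    kind of resource it names, so keys from different databases cannot
    collide."""
    return f"{BASE}/{authority}/{kind}/{local_id}"


def stable_handle(authority: str, kind: str, local_id: str) -> str:
    """Derive a deterministic UUID from the global identifier. The same
    inputs always give the same handle; anyone who knows the inputs can
    recompute it, so this is not an anonymization scheme."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, global_id(authority, kind, local_id)))


# The same local key issued by two different (hypothetical) authorities
# maps to two different global identifiers.
a = global_id("state-a", "plan", "1042")
b = global_id("state-b", "plan", "1042")
print(a)
print(b)
```

The design choice is simply that every identifier carries its issuing
context with it, which is what gives downstream consumers something to
build on.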

Information is never complete, and this means that you have to persist
resources over time. Income needs to be verified with the IRS, immigration
status needs to go to DHS, eligibility needs to be determined. You have to
know whether a person already applied for a program and was rejected, need
to know when that person becomes eligible for programs they weren't
eligible for before and so forth, and this means that over time you end up
building a painting of that person in your data system. You can hide
certain information, of course, and for privacy reasons you have to, but
that makes the ability to maintain some handle to that person and
persistence on that person more critical, not less. Building a messaging
system without understanding that those messages basically are changing the
state of knowledge about a given person, plan or requirement is just as
questionable. It's much like building railroad tracks and a
network without worrying about little things like gauge widths or whether
you're transporting shipping containers vs. nuclear waste.
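
One way to picture messages as changes to the state of knowledge about a
resource is an append-only log that is folded into a current view. The
sketch below is a minimal Python illustration under that assumption; the
class name, the resource identifier and the fact names are all hypothetical.

```python
from collections import defaultdict


class KnowledgeStore:
    """Keeps every message received about a resource and derives the
    current state by replaying them in arrival order."""

    def __init__(self):
        self._log = defaultdict(list)

    def record(self, resource_id, source, facts):
        """Append one message (for example a verification result)."""
        self._log[resource_id].append({"source": source, "facts": dict(facts)})

    def current(self, resource_id):
        """Fold all messages into one view; later facts override earlier ones."""
        state = {}
        for message in self._log.get(resource_id, []):
            state.update(message["facts"])
        return state

    def history(self, resource_id):
        """Return a copy of the message log for this resource."""
        return list(self._log.get(resource_id, []))


store = KnowledgeStore()
# Hypothetical handle and fact names, for illustration only.
store.record("person/abc", "income-check", {"income_verified": True})
store.record("person/abc", "eligibility", {"eligible": False})
store.record("person/abc", "eligibility", {"eligible": True})
print(store.current("person/abc"))
```

Because nothing is overwritten in the log, the earlier rejection is still
available from history() even after the person becomes eligible.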

Understanding the distinction between data and application layers (and what
those things really mean) is also important. I think one reason that
programmers (especially Java programmers) have so much trouble with
MarkLogic (or eXist, or increasingly a lot of NoSQL systems) is that there
has been a steady shift in thinking over the last twenty years away from
the notion that databases only hold content to the notion that there are
certain tasks that can be better accomplished by a dedicated application
layer sitting above the actual data storage system than can be done by
ad-hoc programming processes outside the system. Data validation,
transformations, content processing pipelines, rules based action systems,
reporting, auditing and versioning, notification, translation, packaging
and depackaging, inferencing, content enrichment, serving web application
content, and so forth ... all of these things are implemented in MarkLogic,
and are increasingly becoming "standard" features for NoSQL systems in
particular.

These are "application layer" things in the sense that they are not
actually involved with the database activities of queries or indexing, but
because they are within a "database" server, there is a tendency among Java
developers to want to ignore this facility and "roll" their own, because
everyone knows that Java is better at all these things (not taking into
account the fact that any Java process has to automatically add the
expensive tasks of data serialization and parsing via a JAXB representation
into objects that are heavily dependent upon an extant schema that often
is not written with object serialization in mind and that may change daily
in development realities).

Indeed, the worst possible way to use MarkLogic or most NoSQL data systems
is unfortunately the way that most developers use them - for storing and
retrieving static XML. If you never query your XML, if you never need to
transform your XML, if you never need to worry about versioning or
archiving or validating it, then you're better off just storing the XML in
a file system. Of course, this puts the onus of doing all of this on you as
the developer, and after a while you get complex Rube Goldberg-like systems
that spring up because there are always exceptions and because you treat
this XML differently from that XML (and you treat it differently than the
developer in the next cubicle). You place a high reliance upon schemas,
despite the fact that schema creation is itself a remarkably fine art, the
number of people working with XSD 1.1 and constraint modeling is still
vanishingly small, and most effective schemas are polyphasic - validation
changes based upon workflow context.
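
To make "polyphasic" concrete: the same record can be valid at one point in
a workflow and invalid at another. The following is a minimal Python sketch
under that reading; the stage names and field names are hypothetical, and
real constraint modeling (XSD 1.1 assertions, Schematron phases and so on)
is far richer than a required-field check.

```python
# Hypothetical workflow stages and the fields each one requires.
REQUIRED_BY_STAGE = {
    "draft": {"applicant_id"},
    "submitted": {"applicant_id", "household_size", "income"},
    "enrolled": {"applicant_id", "household_size", "income", "plan_id"},
}


def missing_fields(record, stage):
    """Return the sorted names of fields the given stage requires but the
    record lacks. An empty list means the record is valid for that stage."""
    return sorted(REQUIRED_BY_STAGE[stage].difference(record))


record = {"applicant_id": "a1"}
print(missing_fields(record, "draft"))      # valid as a draft
print(missing_fields(record, "submitted"))  # not yet valid for submission
```

Keeping the rules in one table keyed by stage means two developers cannot
quietly validate "the same" document in two different ways.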

By the way, I would say that the same thing applies to JSON. JSON has a
different core data model, but from a processing standpoint, the value of a
JSON database comes in its ability to query, validate and transform its
content. MarkLogic is a pretty decent JSON store as well, by the way,
though I'd like to see a CoffeeScript layer built into MarkLogic at some
point just so that it's more accessible to JavaScript developers.

Add to this the fact that when you take the above pieces out
of the mix of Java development requirements, you remove a lot of the real
need for Java developers in the first place. When your business model is
predicated on employing large numbers of developers who can be charged at
inflated rates because it is a time-and-materials government contract that
you believe will not be heavily scrutinized because it advances an
administration priority, MarkLogic can be seen as a real threat.

Now, I'm a consultant, but I believe that you create value (and get more
work) by delivering value quickly and thoroughly, something that I believe
my consultancy generally follows as well. It's why I look for tools that
allow me to provide that value, so that I can concentrate on the harder
problems of developing cohesive information strategies for my clients.
MarkLogic for me is one of those tools, and for all that I can be
occasionally critical of specific decisions MarkLogic makes, I still would
recommend them unreservedly because they have solved so many of the
problems that frankly are data-centric or data-stream-centric, including a
lot of the system integration problems that seem endemic to every
organization I've worked with.

I think on this list in particular there is a tendency to ask - Was XML at
fault in a project like this? If you do not understand data modeling,
data design or effective requirements, if you don't understand
XML or data interchange or NoSQL databases, and if you have a vested
interest in not knowing these things, then yes, XML was at fault, though by
that same logic Java was at fault.

Kurt

Kurt Cagle
Invited Expert, XForms Working Group, W3C
Managing Editor, XMLToday.org
kurt.cagle@gmail.com
443-837-8725

On Wed, Nov 27, 2013 at 10:28 AM, <cbullard@hiwaay.net> wrote:

I look for low hanging fruit. For example, we see this big IT project
where they "say" they want to integrate systems. What is the n of
integration? IOW, start with existing forms. Do they really mean to put
all of the data these capture into one honking relational system OR do
they want to capture the same data and put it somewhere?

1. They really want the big bad relational system. Big dollars. Much
business rule capture, much sorting and crunching down the names to fields
and data types, much discovery of when is B = A OR B == A, much how do I
decide when a name is a person or a person is a name, etc. We know the
drills. Charge X.

2. They only want to get this data into the same place and get rid of the
paper. AHA! Write XML docs that replicate forms, output PDF, tell PDF it
is a form and let Acrobat Pro do its voodoo that it does so well, clean up
the names and check the XML export (just in case I need it but I might
not), write a web page to download forms/PDFs, write a web page to upload
forms. Write a few more pages for finding forms. (Never said a word about
validating did they? Good. If they do, I'll renegotiate and start working
on the names in those data bags they've been collecting.)
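
The "XML docs that replicate forms" step in option 2 can be about this
small. The sketch below uses Python's standard xml.etree.ElementTree; the
element names, the form name and the field labels are invented for
illustration, and the PDF and web page steps are left out.

```python
import xml.etree.ElementTree as ET


def form_to_xml(form_name, fields):
    """Serialize a filled-in form as a flat XML document: one <field>
    element per box on the paper form, with no validation and no attempt
    to normalize the labels."""
    root = ET.Element("form", {"name": form_name})
    for label, value in fields.items():
        field = ET.SubElement(root, "field", {"label": label})
        field.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical flight-line form, for illustration only.
doc = form_to_xml("inspection", {"Tail Number": "N123", "Inspector": "J. Smith"})
print(doc)
```

Cleaning up the names and adding validation can then be layered on later
without changing how the data was captured.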

IOW, if you don't have to do it all upfront, don't. Plan to do some of it
later and make sure it can be done. Some people don't want to replace
paper; they want to quit killing as many trees. They don't want to
reengineer all of their processes. They want to quit carrying paper back
and forth from a flight line and stuffing it in a desk and trying to find it
later. They have these neat little tablets but they don't need them to
think for them. They just don't want to scribble.

The 99% problem is a problem of requirements. The better you are at this
the better you can talk to the folk who don't understand the details by
talking to them about what they do know and then making damn sure you can
turn around and tell the folks in the basement precisely what to build.

And if the customer begins to play wrong rock right rock, take a contract
lawyer to the meeting.

len

Quoting Rick Jelliffe <rjelliffe@allette.com.au>:

The enquiry into the Queensland debacle, where a $4 million project
spiralled into maybe a $400 million mess, found that apart from
personalities and capabilities and execution, the problem was the "shared
services" myth.
This says that if you have a few dozen disparate systems doing much the
same thing, you should centralize. The trouble being, in reality, that if
the disparate systems all represent variant functionality, with not much in
common, your customization effort can be much bigger than your platform
effort.
You still have to find and replicate all those differences: to me it is a
classic case where a little bit of waterfall would be prudent: reverse
engineer the current system before you design the next! At least then you
know where you are up to...
The government "shared services" efforts here in Australia have all been
pulled back: the premise so often being wrong that the failure was
guaranteed.
As Gareth says, the US problem sounds similar: the problem being to cope
with multiplicity not commonality, if I can paraphrase Kurt's early post.

Rick
On 27/11/2013 4:32 PM, "Gareth Oakes" <goakes@gpslsolutions.com> wrote:

> On 27/11/2013 7:37 am, "cbullard@hiwaay.net" <cbullard@hiwaay.net> wrote:
>
> Simon sez: "Whatever the underlying story, I suspect we'll be dealing
> with the reverberations from this for a while."

Agree with this. The one thing for sure is that MarkLogic now has a big
headache and a lot of damage control to worry about :)

I think you'll find that now the spotlight is on those responsible for the
Healthcare.gov project, they'll be grasping at any straw they can to spread
the blame around. I'll bet that in the application they are using it for,
MarkLogic is a sound technology choice.

My conclusion: this smacks of a typical "big business" technology
acquisition. We had an analogous problem over here in Queensland, leaving a
$1.2B hole (search: Queensland Health payroll disaster). The somewhat amusing
upshot being that IBM is currently banned from doing new business with the
Queensland government.

I wonder how much better these types of projects could go if a more
incremental (Agile) approach was taken. The beauty of XML is that if you set
the systems up right it can give you amazing flexibility and power combined.

Cheers,
Gareth Oakes
Chief Architect, GPSL
www.gpslsolutions.com

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
