Re: [xml-dev] [Summary] Should Subject Matter Experts Determine XML Data

Hi, Simon!

Yes, absolutely a mix is required - as well as an expectation on the part of both sides that data modelling is a political process, albeit one that requires some technical understanding of the context that the data model is used in. All too often, the SMEs want to create "universal" models, something that describes their space so completely that they can conceivably do anything with it ... and then wonder why the development process takes so long. I attribute this to a couple of factors. One of the most egregious is the inventor syndrome - if they build the universal model, then they'll get credit for that model (even if done in conjunction with a TE, who is only the engineer who worked to the SME's spec). I've had to talk down more than a couple of people who could already see themselves giving a paper at a prestigious symposium about a widely used specification that they authored.

The second factor is the belief that you must build your model as comprehensively as possible because it's much harder to change things if you don't have this information. I think this is actually more of an issue for traditional applications than it is for XML, because most traditional applications try to create a primary evaluation loop in order to handle as many potential situations as possible, and see exception handling as synonymous with error handling. On the other hand, XML programming (especially XSLT) takes the underlying assumption that you start with a primary use case, then build up additional exception templates that can be added in over time - everything except the most basic cases is exception handling. In this particular approach, it is in fact better to start out with a comparatively simple model, then apply successive iterations to refine the model.
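
To make the "primary case first, exception templates later" idea a bit more concrete, here is a rough sketch in Python rather than XSLT. It only imitates the dispatch pattern; it is not how any XSLT processor is actually implemented. The names (TEMPLATES, template, default_rule, apply_templates) and the sample order document are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: element name -> handler function.
TEMPLATES = {}


def template(tag):
    """Register a handler for one element name (an 'exception template')."""
    def register(func):
        TEMPLATES[tag] = func
        return func
    return register


def default_rule(elem):
    """Primary case: emit the element's own text, then process its children."""
    text = (elem.text or "").strip()
    parts = [text] if text else []
    parts.extend(apply_templates(child) for child in elem)
    return " ".join(p for p in parts if p)


def apply_templates(elem):
    """Use a specific template if one is registered, else the default rule."""
    handler = TEMPLATES.get(elem.tag, default_rule)
    return handler(elem)


# Made-up sample document, assumed purely for the example.
DOC = ET.fromstring(
    "<order><item>widget</item><note>rush</note><item>gadget</item></order>"
)

# First iteration: only the default rule exists.
before = apply_templates(DOC)


# Later iteration: add one exception template without touching the rest.
@template("note")
def render_note(elem):
    return "[" + (elem.text or "").strip().upper() + "]"


after = apply_templates(DOC)

print(before)
print(after)
```

The point of the sketch is that the second pass changes how note elements are rendered without any edit to the default rule or to the handling of item, which is roughly the property that lets a simple model be refined iteratively.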

However, explaining the rationale for this approach is not always easy, especially in those situations where a schema already exists that has some political relevance - which seems to happen more often than not. In my experience, TEs are very seldom called into the design process at the very beginning, when it would make most sense to incorporate them. Instead, TEs are brought in as often as not to validate a given person's approach rather than suggest the best possible model, even though the SMEs have gotten into trouble because they didn't understand some critical design ramifications of a particular approach.

Perhaps another factor that contributes to friction between the SME and TE is that most SMEs do not understand data modeling, but because the elements in that model involve terms from the SMEs' area of expertise, they feel that they should understand the modeling in that particular domain. This fallacy is a lot like saying that programming in C++ should be easy for English speakers because all of the terms are in English. Again, I can think of more than a few domain experts I've dealt with who were absolutely adamant that they knew exactly how things should be set up when they didn't have a clue.

On the other hand, the TEs aren't completely blameless here either. The role of a good ontologist is to determine what in fact needs to be modeled, then, once that model's development process is underway, to figure out how such a model can be extended in a reasonable fashion without adding significantly to the complexity of the model. In short, the TE needs to understand the domain reasonably well in order to model it, needs to be able to fashion intermediate applications that let people see how closely the model actually approximates the real need, and needs to know when to say no to a client in order to keep the model from running outside the scope of the application it was intended for.

-- Kurt

On Sat, Oct 4, 2008 at 6:11 PM, Simon St.Laurent <simonstl@simonstl.com> wrote:

Kurt Cagle wrote:

In my experience designing ontologies for different groups, one thing that I find keeps cropping up is that SMEs tend to create data structures that most closely approximate their understanding of a subject, not necessarily ones that provide the best representation of that data model. Certainly SMEs should be involved at all stages of the ontology process, but I've also found that if left up to the SME alone, the models are often awkward to implement, tend to be overspecified, and as often as not contain sometimes bizarre assumptions that can significantly limit these models when translated into a computing environment.

I have to agree with Kurt. Subject matter experts may well be experts at the subject, but making 'expert' data models work is not always a good idea. I'd probably go further, though.

Frankly, when building data models, I'd much rather have a mix of skill levels and perspectives involved. Different participants have different views on the data, but there's often more than just views on the same data model - there are often different internalized data models.

Combining those different models with data structures requires more than just careful data design. I'd argue it involves programming, transformations at minimum, that ensure that the data presented meets local expectations. That's never easy, but I don't think it's avoidable.

It's been a long time since I've been involved with this in an XML context (though I'm starting to work with it in a database context again), so I'm a bit cautious about saying this. Nonetheless, it seems so obviously true to me that I might as well.

Thanks,
Simon St.Laurent
XML retiree

--
Kurt Cagle
Managing Editor, xml.com
O'Reilly
kurt@oreilly.com