Peter Hunsberger wrote:
>
>Let me put it this way, if someone needs an XML schema we can generate
>one. In this particular application for 99% of the current needs we
>really don't need an XML schema at all. That will change as things
>open up across more organizational boundaries.
>
As I said before, half of your application sits across organizational
boundaries...
Although it is a use case for the one who publishes the data, I am not
sure there is a way to write a program that reacts to such a schema
change and adapts its behavior automatically.
>>Where does that leave the receiver of your data? Two options:
>>
>>1) Either he cannot rely on any schema, because it may be subject to
>>complete change.
>>2) Or the schema changes are actually very, very restricted to a few
>>backwards-compatible details.
>>
>>Assuming the latter, I start seeing things clearer now: namely, that if
>>you add a new complex type by derivation, you are effectively building a
>>new schema. Hence there is indeed a need to build new schemas if it is
>>possible to "continuously specialize".
>>
>>Does this cover your requirement? If not, can you give a concrete
>>example like the one above?
>>
>Not really; the dynamic generation occurs at well-defined points: the
>introduction of a new clinical trial or the revision of a medical
>protocol.
>
This discussion is on the process level (interacting with humans),
whereas my initial question was on the level of interacting software.
It seems to be a case of schema evolution.
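To make the derivation point quoted above concrete, here is a minimal
sketch (all element and type names are invented) of how adding a complex
type by extension effectively yields a new schema:

  <xs:complexType name="Trial">
    <xs:sequence>
      <xs:element name="protocol" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- added later by "continuous specialization"; a receiver that
       validates against the original schema has never seen it -->
  <xs:complexType name="OncologyTrial">
    <xs:complexContent>
      <xs:extension base="Trial">
        <xs:sequence>
          <xs:element name="dosage" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

An instance can then label itself with xsi:type="OncologyTrial", which
is exactly the point at which old consumers stop understanding the data.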
<snip/>
>>I am aware of XML Schema pitfalls that prevent typed programming
>>languages (e.g. XSLT, XQuery) from using the specialized data, yet it's
>>hard to really grasp the need for "continuous
>>specialization/extension/adaptation".
>>
>I think in some ways it's part of the problem domain: we're doing
>research; by definition we don't have well-defined business rules that
>can be evenly applied across all of the researchers. Nonetheless
>the researchers will wish to exchange data with each other in some
>well-defined way.
>
The only constant is change; business rules are not cast in stone
either.
>Instead of proceeding top-down from business rules to a schema, we have
>to build many possible solutions and dynamically search the solution
>space to see what fits at any given moment. In a way it's a recursive
>data mining project to find what schema works to describe the data.
>Alternatively, perhaps it's a genetic algorithm for determining the
>fitness of the schema to the data. (Both of those characterizations
>are unfair; we actually have a better understanding of the data than
>they imply.)
>
Both data mining and genetic algorithms operate on machines; you add
humans to the equation.
The point I tried to make was more or less that if you generate a schema
dynamically, then humans have to rewrite software, meaning that the old
software will no longer work. Your problem seems to go beyond that; you
never claimed that the old software would keep working.
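A trivial illustration of the kind of breakage I mean (the element
names are made up): take a stylesheet written against yesterday's
schema,

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- assumes the old layout, where the dose sat at /trial/dose -->
    <xsl:template match="/trial">
      <xsl:value-of select="dose"/>
    </xsl:template>
  </xsl:stylesheet>

If the newly generated schema moves the element to /trial/arm/dose, the
select simply comes back empty. No amount of re-validation fixes the
expression; a human has to edit it.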
>>>>
>>>Yes and no. We have a meta-schema. It's so abstract and so
>>>generalized that it's difficult to use for specific instance data.
>>>The problem is, understanding of the schema is often local to the
>>>schema writer. Not everyone "gets" 5th normal form, and 5th normal
>>>form doesn't work when the data hits the data warehouse.
>>>
>>Does it happen that you need to change that one as well?
>>
>>Or is it a "parameterized" schema (like Java generics)?
>>
>It is largely a parameterized schema, though it is still being revised
>as we figure out what works best. The biggest changes are a constant
>evolution to make it more granular. It's becoming less and less like
>a conventional relational database schema (not that it ever was) and
>more and more like a graph management system.
>
[OT] Sounds like tricky stuff. Reminds me of a "professor for software
engineering" whose only fascinations were Ada, Mercedes-Benz (as an
ever-repeating example of plain old industry in need of new software)
and general graph replacement systems. I would never spend my time on a
general approach to graph systems; for special purposes they can make a
lot of sense.
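Back to the "parameterized schema" point: the closest XSD itself comes
to generics is, as far as I can tell, an abstract type whose concrete
"parameter" is supplied per instance via xsi:type. A sketch, again with
invented names:

  <xs:complexType name="Payload" abstract="true"/>

  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- the "type parameter": each instance must pick a concrete
             subtype with xsi:type="SomeDerivedPayload" -->
        <xs:element name="payload" type="Payload"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

Your generation step would then correspond to fixing that parameter once
and for all in a generated schema.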
>>>>What is a use case for dynamically generated schemas?
>>>>
>>>For one, you need different schemas for different stages in the life
>>>of the data. I know of no technology that lets you adequately describe
>>>all possible transformations of the schema over time from within the
>>>schema itself. As a specific example (discussed previously on the
>>>list), you need a way to match versions of the schema to workflow.
>>>
>>In my understanding of the problem, this drifts away from "dynamic
>>generation". Schema evolution (or just backwards-incompatible change)
>>makes configuration management, versioning, and many other things
>>necessary.
>>
>>But having a meta-schema and generating schemas is of no use for the
>>problem at hand, because the receiver of your data cannot write
>>software that deals with the meta-schema, and hence with all versions
>>of the schema.
>>
>I guess this depends on your perspective: are the schemas the starting
>point or the end point? Do you negotiate from the schema or do you
>document the negotiations with the schema? If it's the latter, how do
>you model and document before you have the schema?
>
Negotiation would mean "reconfiguration", and I doubt precisely that
such a thing is possible (in the absence of a Meta-XSLT :-)
>If you have a good system for capturing the modelling and
>documentation when you are working at the business knowledge capture
>level then the schema can become an after-the-fact documentation
>artifact. Yes, you still need version management, but the audit trail
>that documents the negotiations isn't based on the schema, it's
>external to it (and yes, maybe you have a schema for exchanging that
>data also).
>
Surely, schemas do evolve, and having a documentation artifact is better
than having none.
What I get out of the description is that probably no schema language
and no fixed program would help here.
>>>>Why does one need to use XSL for it ?
>>>>
>>>You don't, but in our case, we've got about 8 different pieces of
>>>source metadata that have to be combined and transformed in order to
>>>derive a specific schema. XSL is the best match to the problem I know
>>>of.
>>>
>>Unless I have misunderstood, I think your problem seems rather
>>different, because you could also get away with not generating any
>>schema at all, if it can change in unanticipated ways. Your problem and
>>its solution (which may be elegant) do not take receivers into account
>>- they may have to hand-patch their code to deal with the new data.
>>
>The changes are anticipated; they occur at well-documented points in the
>life cycle of the protocol. When the changes happen the receivers do
>indeed have to change the systems that accept the data. We're working
>on ways to automate the process. The solutions are, in part, based on
>the exchange of schemas that document the changes... :-)
>
Automating "fixing a program for the new schema" is precisely what I
think does not exist anywhere.
More specifically, even a statically typed bunch of XSLT stylesheets and
XQuery programs cannot deal with a dynamic schema change.
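For instance (a sketch with invented names, assuming an XQuery
processor that does static typing against an imported schema):

  import schema namespace t = "http://example.org/trial"
         at "trial-v1.xsd";

  declare function local:dose($p as element(t:patient)) as xs:decimal
  {
    data($p/t:dose)
  };

This type-checks against the v1 schema only. Regenerate the schema so
that t:dose repeats or changes its type, and the compiler rejects the
program; someone still has to sit down and rewrite it.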
--
cheers,
Burak Emir
http://lamp.epfl.ch/~buraq