Peter Hunsberger wrote:
>
>Let me put it this way, if someone needs an XML schema we can generate
>one. In this particular application for 99% of the current needs we
>really don't need an XML schema at all. That will change as things
>open up across more organizational boundaries.
>
As I said before, half of your application sits across organizational
boundaries...
Although it is a use case for the one who publishes the data, I am not
sure there is a way to write a program that reacts to such a schema
change and adapts its behavior automatically.
>>Where does that leave the receiver of your data? Two options:
>>
>>1) Either he cannot rely on any schema, because it may be subject to
>>complete change.
>>2) Or the schema changes are actually very, very restricted to a few
>>backwards-compatible details.
>>
>>Assuming the latter, I start seeing things clearer now: namely, that if
>>you add a new complex type by derivation, you are effectively building a
>>new schema. Hence there is indeed a need to build new schemas if it is
>>possible to "continuously specialize".
>>
>>Does this cover your requirement? If not, can you give a concrete
>>example like the one above?
>>
>Not really; the dynamic generation occurs at well-defined points: the
>introduction of a new clinical trial or the revision of a medical
>protocol.
>
This discussion is on the process level (interacting with humans),
whereas my initial question was on the level of interacting software.
It seems to be a case of schema evolution.
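To make the derivation point quoted above concrete, here is a minimal
sketch (all element and type names are invented) of how adding a complex
type by extension effectively yields a new schema:

  <xs:complexType name="Trial">
    <xs:sequence>
      <xs:element name="protocol" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- added later by "continuous specialization"; a receiver that
       validates against the original schema has never seen it -->
  <xs:complexType name="OncologyTrial">
    <xs:complexContent>
      <xs:extension base="Trial">
        <xs:sequence>
          <xs:element name="dosage" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

An instance can then label itself with xsi:type="OncologyTrial", which
is exactly the point at which old consumers stop understanding the data.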
<snip/>
>>I am aware of XML Schema pitfalls that prevent typed programming
>>languages (e.g. XSLT, XQuery) from using the specialized data, yet it's
>>hard to really grasp the need for "continuous
>>specialization/extension/adaptation".
>>
>I think in some ways it's part of the problem domain: we're doing
>research; by definition we don't have well-defined business rules that
>can be evenly applied across all of the researchers. Nonetheless
>the researchers will wish to exchange data with each other in some
>well-defined way.
>
The only constant is change; business rules are not cast in stone
either.
>Instead of proceeding top-down from business rules to a schema, we have
>to build many possible solutions and dynamically search the solution
>space to see what fits at any given moment. In a way it's a recursive
>data mining project to find what schema works to describe the data.
>Alternatively, perhaps it's a genetic algorithm for determining the
>fitness of the schema to the data. (Both of those characterizations
>are unfair; we actually have a better understanding of the data than
>they imply.)
>
Both data mining and genetic algorithms operate on machines; you add
humans to the equation.
The point I tried to make was more or less that if you generate a schema
dynamically, then humans have to rewrite software, meaning that the old
software will no longer work. Your problem seems to go beyond that; you
never claimed that the old software would keep working.
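A trivial illustration of the kind of breakage I mean (the element
names are made up): take a stylesheet written against yesterday's
schema,

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- assumes the old layout, where the dose sat at /trial/dose -->
    <xsl:template match="/trial">
      <xsl:value-of select="dose"/>
    </xsl:template>
  </xsl:stylesheet>

If the newly generated schema moves the element to /trial/arm/dose, the
select simply comes back empty. No amount of re-validation fixes the
expression; a human has to edit it.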
>>>>
>>>Yes and no. We have a meta-schema. It's so abstract and so
>>>generalized that it's difficult to use for specific instance data.
>>>The problem is, understanding of the schema is often local to the
>>>schema writer. Not everyone "gets" 5th normal form, and 5th normal
>>>form doesn't work when the data hits the data warehouse.
>>>
>>Does it happen that you need to change that one as well?
>>
>>Or is it a "parameterized" schema (like Java generics)?
>>
>It is largely a parameterized schema, though it is still being revised
>as we figure out what works best. The biggest changes are a constant
>evolution to make it more granular. It's becoming less and less like
>a conventional relational database schema (not that it ever was) and
>more and more like a graph management system.
>
[OT] Sounds like tricky stuff. Reminds me of a "professor for software
engineering" whose only fascinations were Ada, Mercedes-Benz (as an
ever-repeating example of plain old industry in need of new software)
and general graph replacement systems. I would never spend my time on a
general approach to graph systems; for special purposes they can make a
lot of sense.
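Back to the "parameterized schema" point: the closest XSD itself comes
to generics is, as far as I can tell, an abstract type whose concrete
"parameter" is supplied per instance via xsi:type. A sketch, again with
invented names:

  <xs:complexType name="Payload" abstract="true"/>

  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- the "type parameter": each instance must pick a concrete
             subtype with xsi:type="SomeDerivedPayload" -->
        <xs:element name="payload" type="Payload"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

Your generation step would then correspond to fixing that parameter once
and for all in a generated schema.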
>>>>What is a use case for dynamically generated schemas?
>>>>
>>>For one, you need different schemas for different stages in the life
>>>of the data. I know of no technology that lets you adequately describe
>>>all possible transformations of the schema over time from within the
>>>schema itself. As a specific example (discussed previously on the
>>>list), you need a way to match versions of the schema to workflow.
>>>
>>In my understanding of the problem, this drifts away from "dynamic
>>generation". Schema evolution (or just backwards-incompatible change)
>>makes configuration management, versioning, and many other things
>>necessary.
>>
>>But having a meta-schema and generating schemas is of no use for the
>>problem at hand, because the receiver of your data cannot write
>>software that deals with the meta-schema, and hence with all versions
>>of the schema.
>>
>I guess this depends on your perspective: are the schemas the starting
>point or the end point? Do you negotiate from the schema or do you
>document the negotiations with the schema? If it's the latter, how do
>you model and document before you have the schema?
>
Negotiation would mean "reconfiguration", and I doubt precisely that
such a thing is possible (in the absence of a Meta-XSLT :-)
>If you have a good system for capturing the modelling and
>documentation when you are working at the business knowledge capture
>level then the schema can become an after-the-fact documentation
>artifact. Yes, you still need version management, but the audit trail
>that documents the negotiations isn't based on the schema, it's
>external to it (and yes, maybe you have a schema for exchanging that
>data also).
>
Surely, schemas do evolve, and having a documentation artifact is better
than having none.
What I get out of the description is that probably no schema language
and no fixed program would help here.
>>>>Why does one need to use XSL for it ?
>>>>
>>>You don't, but in our case, we've got about 8 different pieces of
>>>source metadata that have to be combined and transformed in order to
>>>derive a specific schema. XSL is the best match to the problem I know
>>>of.
>>>
>>Unless I have misunderstood, I think your problem seems rather
>>different, because you could also get away with not generating any
>>schema at all, if it can change in unanticipated ways. Your problem and
>>its solution (which may be elegant) do not take receivers into account
>>- they may have to hand-patch their code to deal with the new data.
>>
>The changes are anticipated; they occur at well-documented points in the
>life cycle of the protocol. When the changes happen the receivers do
>indeed have to change the systems that accept the data. We're working
>on ways to automate the process. The solutions are, in part, based on
>the exchange of schemas that document the changes... :-)
>
Automating "fixing a program for the new schema" is precisely what I
think does not exist anywhere.
More specifically, even a statically typed bunch of XSLT stylesheets and
XQuery programs cannot deal with a dynamic schema change.
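For instance (a sketch with invented names, assuming an XQuery
processor that does static typing against an imported schema):

  import schema namespace t = "http://example.org/trial"
         at "trial-v1.xsd";

  declare function local:dose($p as element(t:patient)) as xs:decimal
  {
    data($p/t:dose)
  };

This type-checks against the v1 schema only. Regenerate the schema so
that t:dose repeats or changes its type, and the compiler rejects the
program; someone still has to sit down and rewrite it.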
--
cheers,
Burak Emir
http://lamp.epfl.ch/~buraq