At 03:00 24/07/2006, David Lyon wrote:
>On Sun, 2006-07-23 at 23:48 +0100, peter murray-rust wrote:
> > Is there anything essentially different between business data and
> > genomic data? They both need to be created, stored, transmitted,
> > processed and perhaps repurposed.
>I would not know. Business data is usually strongly typed these days
>into strings, numbers, booleans, currency values and so forth. I don't
>know if those are pressing issues for genomic data.
CML has been developed in exactly this way (apart from the currency
values). It is a collection of what might be called strongly-typed
microformats (though components is a better word for CML). It
supports generic numeric and scientific computing as well as chemistry.
> > At a general level they both
> > require a formal specification (Schema), maybe an ontology,
> > domain-specific tools for processing them.
>Often Business data doesn't need a formal specification or schema.
>This requirement has really held back xml or at least kept it in the
>domain of tightly coupled systems. We need to go loose-coupling in
>future, not insist that a programmer has sat down beforehand and worked
>out the schema for every single document.
I agree completely. I now really only use the Schema to define the
components, not how they are put together.
>Let me give you a real world example situation.
>Receptionist wants to type a shopping list. Must get schema created by
>IT. Loaded on a web server. Schema loaded on the web server. Validated.
>It is so complicated and requires so many resources that it just doesn't
>An easier way is to just embed all the type information and have no
>schema, no web server, nothing else. This can be typed:
> <Shopping List>
> Customer_Name&="Mr Fred Parker"
> <Product Item> PLU&="A256" Qty#=5 Rate$=4.56</>
> </Shopping List>
Somewhere, I assume, you have to encode the fact that Rate is in
dollars of some sort. Otherwise you cannot reliably convert it.
Therefore you need at least a partial schema or ontology. CML
does this by having an extensible dictionary system and an extensible
system of units (which does not but could include currency).
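
To make the point concrete, here is a rough Python sketch. It is not CML, and it is not a real parser for the notation quoted above: the meaning of the sigils (& for string, # for integer, $ for currency) is only my reading of the one example, and the FIELD_UNITS dictionary and the exchange rate are hypothetical values invented for illustration. It shows that the sigil alone gives a type, while conversion still needs a dictionary entry saying which currency is meant.

```python
import re
from decimal import Decimal

# Sigil-to-type mapping inferred from the single example in this thread
# (an assumption, not a published specification).
SIGILS = {"&": str, "#": int, "$": Decimal}

# A field is: name, sigil, "=", then a quoted string or a bare token.
FIELD = re.compile(r'(\w+)([&#$])=(?:"([^"]*)"|([^\s<]+))')

def parse_fields(text):
    """Return {name: (sigil, typed_value)} for every field found in text."""
    fields = {}
    for name, sigil, quoted, bare in FIELD.findall(text):
        raw = quoted if quoted else bare
        fields[name] = (sigil, SIGILS[sigil](raw))
    return fields

# Hypothetical dictionary: the currency each currency-typed field is in.
FIELD_UNITS = {"Rate": "USD"}
# Hypothetical exchange rates, made up purely for illustration.
TO_GBP = {"USD": Decimal("0.50"), "GBP": Decimal("1")}

def to_gbp(name, fields):
    """Convert a currency field to GBP, failing if its unit is undeclared."""
    sigil, value = fields[name]
    if sigil != "$":
        raise ValueError(f"{name} is not a currency field")
    unit = FIELD_UNITS.get(name)
    if unit is None:
        raise LookupError(f"no unit declared for {name}; cannot convert")
    return value * TO_GBP[unit]

item = parse_fields('PLU&="A256" Qty#=5 Rate$=4.56</>')
print(item["Qty"], to_gbp("Rate", item))
```

A currency field with no entry in FIELD_UNITS raises LookupError, which is the situation I am describing: the type is known but the value cannot be converted reliably.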
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road, Cambridge CB2 1EW, UK