The war between complex and simple and re-use leads to componentization. Data model components are, in effect, the sub-routines of data exchange. One big model or 10 component models... it is possible that the 10 components may be less efficient than the One True Model, but it is certain that when things change it will be easier to drop one component and to define and add another to the mix. But of course, solving that, or better yet avoiding that evolutionary complexity, leads back to namespaces, much despised on this list.

"If something is not worth doing, it's not worth doing well." -- Peter Drucker
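A minimal sketch of that componentization, with each component model in its own namespace (the namespace URIs and element names here are invented for illustration): dropping the address component and defining another touches only that one namespace, not the rest of the document.

    <!-- Each component vocabulary can evolve or be replaced independently. -->
    <order xmlns="urn:example:order"
           xmlns:p="urn:example:party"
           xmlns:a="urn:example:address">
      <p:buyer>
        <p:name>ACME Corp</p:name>
      </p:buyer>
      <a:shipTo>
        <a:street>1 Main St</a:street>
        <a:city>Springfield</a:city>
      </a:shipTo>
    </order>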
From: Peter Hunsberger [mailto:peter.hunsberger@gmail.com]

Hi Rick,

I think if you have a data model in mind then you have at least one application that you are expecting will use that data model. So, yeah, I'd agree, but with the caveat that data models may have broader coverage than single applications, and the trade-off between generalization for use with many systems and optimization for use with a few systems isn't always as straightforward as one might hope.

Again, most of my focus is at the "enterprise" level where reuse is emphasized, but my continuing guidance when someone attempts to justify an application-specific interchange format is always "avoid premature optimization". In particular, the human factors of having to maintain many simple specs versus a few more complex specs may be a one-time cost or an ongoing tax on the organization. The one-time implementation for the generalized, but complex, spec might win or lose when it turns out it can't be reused after all...

Ideally, we aim for broad scope and low complexity (at least relative to the problem domain), but generalized data specifications with low complexity (and the systems that handle them well) are rare beasts indeed. When they do show up they have quick uptake. That's sort of the war between XML and JSON, though at the syntax level as opposed to the interchange of specific sets of information...

Peter Hunsberger

On Tue, Nov 19, 2013 at 8:00 PM, Rick Jelliffe <rjelliffe@allette.com.au> wrote:

I'd go further than that: I'd say always optimize for at least one application (or source or platform, etc.) where it is known. At least you first look at this to see if it is appropriate.

Cheers

On 20/11/2013 2:08 AM, "Peter Hunsberger" <peter.hunsberger@gmail.com> wrote:

David, that's a good point. If the data interchange is between two points where you have knowledge of the requirements of both then, yeah, optimize for that interchange. And again, that may be multiple programs or applications... I got distracted by Roger's use of the word "models" and, being stuck on data modelling a lot these days, that's the only thing I focused on. Time for more coffee...

Peter Hunsberger

On Tue, Nov 19, 2013 at 8:52 AM, David Lee <dlee@calldei.com> wrote:

I argue that regardless of whether the XML is for "data" or "documents" (whose intersection is not the empty set, IMHO), the XML model can in fact be quite coupled to the application and may need transformations even if the data is generic or abstract.

A simple example. Very frequently, large datasets are stored in XML as horrendously large single documents with vast replication of one or more child elements, like

    <root>
      <data> ... </data>
      <data> ... </data>
      ... a million times over ...
    </root>

This is very convenient for some applications but horrible for others. For example, it may produce a file simply too big to read into some applications, or one that is non-ideal for an XML database. But it may be ideal for streaming processing, file transfer, and packaging. Transforming this file into different formats (say, for example, splitting it into a million smaller docs) may be better for some applications. Similarly, simple transformations may help with some applications, such as combining fields or moving attributes to elements (or vice versa) due to peculiarities of the application. You may even be able to translate "Proprietary Schema A" into "Universal Schema B" so that Application B can read it.
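A minimal XSLT 2.0 sketch of that split, reusing the <root>/<data> names from the example above (the output file names are invented for illustration):

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Write each <data> record out as its own small document. -->
      <xsl:template match="/root">
        <xsl:for-each select="data">
          <xsl:result-document href="data-{position()}.xml">
            <xsl:copy-of select="."/>
          </xsl:result-document>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

(A non-streaming XSLT processor would still load the whole million-record file to do this; for truly huge inputs a streaming splitter would be needed, which is rather the point of the example.)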
So I conclude that it is not universally the case that "data XML" should be tuned for a single application and never translated. Not all applications (in both senses of the word) of data are alike, even if the underlying data is alike.

----------------------------------------
David A. Lee

From: Peter Hunsberger [mailto:peter.hunsberger@gmail.com]

If XML is being used for document interchange then your XML design is predicated on the document formatting and content you wish to capture in your document. However, you seem to be aiming this at the data interchange world? If so, the question you seem to be circling around is: should XML data interchange formats directly reflect the data models they are transporting?

Given that this is data and not, therefore, an end product intended for humans, I'd vote that the XML design should come after the data models are optimized for their various business purposes. The XML will then hopefully be as efficient a serialization of those models as possible. Note that, in my opinion, good data models are also not optimized for, or specific to, any one program. As I've noted before, _good_ data models span enterprises, not merely individual programs...

As to your last question, I certainly don't think applications should transform XML into forms that make it inefficient to process (duh)!

Peter Hunsberger

On Tue, Nov 19, 2013 at 4:41 AM, Costello, Roger L. <costello@mitre.org> wrote:

Hi Folks,