Re: [xml-dev] RE: Dealing with lots of chunks of highly interrelated data?

Roger,

I'd like to add a comment at a much higher level, if I may.

There is always a temptation to ask questions such as "What is the best way to handle X?" or "What is the best tool for Y?" Those questions are almost always the wrong questions unless you are dealing with very small problems or projects.

There are many influences that will affect the possible answers to such questions, and most of them are not obvious. Off the top of my head, here are a few of the more important:

* Is the project a one-off or a product to be sold commercially?

* What human resources are available in the near term and the long term, and what training, skill sets, and temperaments do they have?

* What is the deep underlying purpose of the data involved, and what is the long-term goal for use (manipulation, gathering, distribution, reporting) of that data?

* What is the growth rate of the data, the semantics, and the requirements?

* How thin is the skin of the customer (i.e., will operating at the bleeding edge of technology cause them serious harm)?

* How stable are the financial resources?

* What are the risks if the approach chosen is found to be completely inadequate?

Without knowing a great deal about all of those influences (and many more), I would personally be very reluctant to start off by asking "What programming language should I use?" or "What data model is best?" I have watched helplessly as a $100 million bet-the-corporation project started off trying to answer the programming language question instead of any of the more fundamental questions. The corporation no longer exists and over a hundred thousand people were put out of work. And all I got was this tee-shirt saying "I told you so."

Jim


On 6/11/2016 5:22 AM, Costello, Roger L. wrote:
Thank you for your excellent responses! Indeed, some brilliant observations were made that I will reflect upon for a long time.

I found this statement by Eliot to be particularly profound:

** Understanding can be embodied in code. **

Here are some of the key points that I learned:

1. [Dealing with lots of chunks of highly interrelated data] is handled today -- and has been for years -- by relational database systems.

2. There are no simple rules for deciding which kind of tool or representation is best for a given situation.

3. "Vanilla XML" has no (useful) mechanism for representing relationships among XML elements other than hierarchical ones. (I'm ignoring the ID/IDREF mechanism in XML because (a) it has no built-in semantics and (b) it's insufficiently complete as an addressing mechanism to even be the basis for more general link representation.) So basically any addressing mechanism will necessarily be a layer added on top of the base XML syntax, whether it's XPointer or URIs or something else.

4. Any use of XML to represent relationships among things will necessarily be a semantic application on top of the base XML content. But that is true of *any* non-trivial XML application.

5. Any non-trivial XML application always becomes a system of interrelated things. Whether you call it a hyper-document or a database is really just a matter of perspective or emphasis: fundamentally it is a set of things linked together for some reason.

6. If you look at something like RDF or Topic Maps, they let you represent relationships and you can *state* any set of relationships you like to any degree of precision you want. But they say nothing about how you *understand* the statements: understanding is the domain of knowledge processing; that is, intelligence, artificial or otherwise.

7. Understanding can be embodied in code. That is, I can write data processing applications that understand a given set of relationships as applied to a known set of object types for a given purpose.

8. If the goal for a particular data set is to enable interchange and interoperation of the data *and the processing* then somebody will need to standardize the data model abstractions, the governing taxonomies and ontologies, the relationship type models, the population constraints, the business rules, etc. That is, they will need to define the application to a sufficient degree of detail that it is realistic to expect different implementations to provide compatible behavior.

9. XML Schema is not sufficient for validating this data (i.e., lots of chunks of highly interrelated data) because XML Schema has no mechanism (beyond XSD 1.1 assertions) to validate the semantics of relationships (e.g., Is a relationship of type X correctly applied to objects of type A and B with specific property sets K and J?). One can imagine such a language but it would start to look like a programming language...
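
As a rough illustration of points 3, 7, and 9, here is a minimal Python sketch of "understanding embodied in code": application logic that resolves id-style references and checks that each relationship type connects the kinds of objects it is supposed to. Everything in it is an assumption made for illustration and does not come from the thread: the element names (person, company, project, rel), the attribute names (id, type, from, to), and the RULES table.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: element and attribute names are illustrative only.
DOC = """
<data>
  <person id="p1"/>
  <company id="c1"/>
  <project id="j1"/>
  <rel type="employs" from="c1" to="p1"/>
  <rel type="employs" from="p1" to="j1"/>
  <rel type="employs" from="c1" to="missing"/>
</data>
"""

# Assumed rule table: relationship type -> (required source tag, required target tag).
RULES = {"employs": ("company", "person")}


def check_relationships(xml_text, rules):
    """Return a list of problems found among the <rel> elements."""
    root = ET.fromstring(xml_text)
    # Index identified elements by id; the meaning of a reference is supplied
    # by this code, not by the XML syntax itself.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    problems = []
    for rel in root.iter("rel"):
        rtype = rel.get("type")
        src = by_id.get(rel.get("from"))
        dst = by_id.get(rel.get("to"))
        if src is None or dst is None:
            problems.append(f"{rtype}: dangling reference")
            continue
        if rtype not in rules:
            problems.append(f"{rtype}: unknown relationship type")
            continue
        if (src.tag, dst.tag) != rules[rtype]:
            problems.append(f"{rtype}: {src.tag} -> {dst.tag} not allowed")
    return problems


problems = check_relationships(DOC, RULES)
for line in problems:
    print(line)
```

The rule table is the part that would have to be standardized (point 8) before two independent implementations could be expected to agree on what counts as a valid relationship.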

/Roger

--
========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
  Chair, ISO/IEC JTC1/SC32 and W3C XML Query WG    Fax : +1.801.942.3345
Oracle Corporation        Oracle Email: jim dot melton at oracle dot com
1930 Viscounti Drive      Alternate email: jim dot melton at acm dot org
Sandy, UT 84093-1063 USA  Personal email: SheltieJim at xmission dot com
========================================================================
=  Facts are facts.   But any opinions expressed are the opinions      =
=  only of myself and may or may not reflect the opinions of anybody   =
=  else with whom I may or may not have discussed the issues at hand.  =
========================================================================


