OASIS Mailing List Archives
RE: Dealing with lots of chunks of highly interrelated data?

Thank you for your excellent responses! There were some brilliant observations made that I will reflect on for a long time.

I found this statement by Eliot to be particularly profound:

** Understanding can be embodied in code. **

Here are some of the key points that I learned:

1. [Dealing with lots of chunks of highly interrelated data] is handled today -- and has been for years -- by 
relational database systems.
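As a minimal sketch of the relational approach (the table and column names here are my own invention), the interrelated chunks become rows, and the relationships become a join table of typed links that can be queried declaratively:

```python
import sqlite3

# Sketch: "chunks" as rows, relationships as a join table with foreign
# keys. Schema and data are illustrative, not from any real system.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE chunk (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("""CREATE TABLE relation (
    src INTEGER REFERENCES chunk(id),
    dst INTEGER REFERENCES chunk(id),
    type TEXT)""")
conn.execute("INSERT INTO chunk VALUES (1, 'Requirements'), (2, 'Design')")
conn.execute("INSERT INTO relation VALUES (1, 2, 'informs')")

# Walk the relationship graph with an ordinary join.
rows = conn.execute("""
    SELECT a.title, r.type, b.title
    FROM relation r
    JOIN chunk a ON a.id = r.src
    JOIN chunk b ON b.id = r.dst""").fetchall()
print(rows)  # [('Requirements', 'informs', 'Design')]
```

The point is that referential integrity and ad hoc traversal come for free here, which is exactly what vanilla XML does not give you.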

2. There are no simple rules for deciding which kind of tool or representation is best for a given situation.

3. "Vanilla XML" has no (useful) mechanism for representing relationships among XML elements other than hierarchical ones. (I'm ignoring the ID/IDREF mechanism in XML because (a) it has no built-in semantics and (b) it's insufficiently complete as an addressing mechanism to even be the basis for more general link representation.) So basically any addressing mechanism will necessarily be a layer added on top of the base XML syntax, whether it's XPointer or URIs or something else.
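To illustrate what "a layer added on top" means in practice, here is a small Python sketch (element and attribute names are invented): the parser hands us id and ref as plain strings, and resolving the reference is entirely code we write ourselves.

```python
import xml.etree.ElementTree as ET

# ID/IDREF gives you tokens, not semantics: nothing in the parser
# follows the reference for you.
doc = ET.fromstring(
    '<doc>'
    '<term id="t1">schema</term>'
    '<see ref="t1"/>'
    '</doc>'
)

# The addressing "mechanism" is this dictionary -- a layer we built,
# not something the base XML syntax provides.
by_id = {el.get("id"): el for el in doc.iter() if el.get("id")}
target = by_id[doc.find("see").get("ref")]
print(target.text)  # schema
```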

4. Any use of XML to represent relationships among things will necessarily be a semantic application on top of the base XML content. But that is true of *any* non-trivial XML application.

5. Any non-trivial XML application always becomes a system of interrelated things. Whether you call it a hyper-document or a database is really just a matter of perspective or emphasis: fundamentally it is a set
of things linked together for some reason.

6. If you look at something like RDF or Topic Maps, they let you represent relationships and you can *state* any set of relationships you like to any
degree of precision you want. But they say nothing about how you *understand* the statements: understanding is the domain of knowledge
processing; that is, intelligence, artificial or otherwise. 

7. Understanding can be embodied in code. That is, I can write data processing applications that understand a given set of relationships as applied to a known set of object types for a given purpose.
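A toy illustration of points 6 and 7 (the relationship names and the rule below are my own invention): the triples can *state* anything at all, but the meaning of "supersedes" exists only in the code that interprets it.

```python
# RDF-style triples state relationships; the "understanding" lives in
# code that knows what a given relationship type implies.
triples = [
    ("widget-spec", "supersedes", "widget-spec-draft"),
    ("widget-spec", "author", "eliot"),
]

def current_version(subject, triples):
    """Embodied understanding: 'supersedes' means the object is
    obsolete and the subject is the one to use."""
    superseded = {o for s, p, o in triples if p == "supersedes"}
    if subject in superseded:
        # Follow the supersedes chain forward to the replacement.
        for s, p, o in triples:
            if p == "supersedes" and o == subject:
                return current_version(s, triples)
    return subject

print(current_version("widget-spec-draft", triples))  # widget-spec
```

Nothing in the data itself says that a superseded document should be avoided; that judgment is the application's, which is exactly the boundary between stating and understanding.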

8. If the goal for a particular data set is to enable interchange and interoperation of the data *and the processing* then somebody will need to standardize the data model abstractions, the governing taxonomies and ontologies, the relationship type models, the population constraints, the business rules, etc. That is, they will need to define the application to a sufficient degree of detail that it is realistic to expect different implementations to provide compatible behavior.

9. XML Schema is not sufficient for validating this data (i.e., lots of chunks of highly interrelated data) because XML Schema has no mechanism (beyond XSD 1.1 assertions) to validate the semantics of relationships (e.g., is a relationship of type X correctly applied to objects of type A and B with specific property sets K and J?). One can imagine such a language, but it would start to look like a programming language...
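To make the point concrete, here is the kind of check that falls outside XML Schema's reach, sketched in Python with an invented registry of object types and allowed relationship signatures:

```python
# Sketch of a semantic check a grammar-based schema can't express:
# is a relationship of type X applied to objects of the right types?
# The registry and rule table are illustrative assumptions.
objects = {
    "p1": {"type": "person", "props": {"name"}},
    "d1": {"type": "document", "props": {"title"}},
}
# Which (source type, target type) pair each relationship type allows:
allowed = {"authored": ("person", "document")}

def valid(rel, src, dst):
    """Check the relationship against the type rules, not the syntax."""
    src_t, dst_t = objects[src]["type"], objects[dst]["type"]
    return allowed.get(rel) == (src_t, dst_t)

print(valid("authored", "p1", "d1"))  # True
print(valid("authored", "d1", "p1"))  # False: a document can't author
```

Once the rule table grows conditions on property sets and cardinalities, you are effectively writing a program, which is the point above.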




Copyright 1993-2007 XML.org. This site is hosted by OASIS