xml-dev - RE: [xml-dev] Data Oriented and Document Oriented Defintions


To: 'Mike Champion' <mc@xegesis.org>, Xml-Dev <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] Data Oriented and Document Oriented Defintions
From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
Date: Thu, 18 Sep 2003 10:31:04 -0500

The problem with Paul's reply is that, although 
experientially relevant, it isn't formally 
true in any sense.  Neither you nor Paul says 
it is, but that is what is being asked for.  

If data vs document were
true polarities, one might make a case for 
them, but they aren't because any document 
can have a mixture of features.  Really, the 
terms say more about how the input shapes 
the markup in the output, and then only weakly 
and usually in context of the processor.

I suppose the best one can say is 'oriented' 
but not 'determined'.  Otherwise, it is a polar 
distinction that should go away.

One at a time:

>"Document" XML is used to mark up narrative text
>intended to be read by humans; "data" XML is used to
>exchange database records intended to be processed by
>machines.

If I pull the tables out of an HTML file, I can give 
them to the machine without much modification.  That 
is why applications like Access can use them as data 
sources.  Again, the shape of the markup is 'oriented' 
by the source or destination.
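
As a rough illustration of the point about tables, here is a 
minimal Python sketch that lifts table rows out of an HTML page 
using only the standard library.  The class name, the function 
name and the sample page are made up for the example, and it 
assumes simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, if any
        self._cell = None   # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell (the narrative part) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_rows(html):
    """Return the rows of all tables in `html` (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up page mixing narrative text with a record-like table.
page = (
    "<html><body><p>Quarterly <b>narrative</b> text.</p>"
    "<table><tr><th>part</th><th>qty</th></tr>"
    "<tr><td>bolt</td><td>40</td></tr></table></body></html>"
)
rows = table_rows(page)
```

The same page is a 'document' to the browser and a 'data' source 
to this parser; only the processor changed.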

>Instances of "document" XML are readable without the
>markup; instances of "data" XML are meaningless
>without the markup.

Not true in any sense.  It depends on the tag names 
to some extent and on the background knowledge of 
the human in any case.

>"Document" XML generally allows mixed content;
>"data" XML generally does not

As close as it gets, but again, a habit of the 
processor, not the markup.
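
To make 'mixed content' concrete, a small sketch with Python's 
xml.etree.ElementTree follows.  The function name and the two 
sample fragments are invented for the example; the test is 
simply whether an element has child elements and also 
non-whitespace text directly inside it.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(elem):
    """True if elem has child elements plus non-whitespace text of its own."""
    children = list(elem)
    if not children:
        return False
    # Text before the first child lives in elem.text.
    if elem.text and elem.text.strip():
        return True
    # Text after each child lives in that child's .tail.
    return any(child.tail and child.tail.strip() for child in children)


# Made-up fragments: one narrative, one record-like.
doc = ET.fromstring("<p>Order <qty>40</qty> bolts.</p>")
rec = ET.fromstring("<order><qty>40</qty><part>bolt</part></order>")

doc_is_mixed = has_mixed_content(doc)
rec_is_mixed = has_mixed_content(rec)
```

The check runs the same way on either fragment.  Whether mixed 
content is tolerated is a choice the consuming code makes.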

>The order of sub-elements almost always matters in
>document-oriented XML; in data-oriented XML it
>generally matters only for elements specifically
>identified as "lists" or something similar.

This is revealing, but of what?  A relational database 
doesn't use ordering for addressing.  A technical 
manual can be written to be read by name or topical 
addressing, but isn't always a strictly ordered read. 
A novel typically is, James Joyce notwithstanding.  A 
haiku is. 

>Document-oriented applications can generally deal
>with unknown markup by removing the markup and keeping
>the content; data-oriented applications generally deal
>with unknown markup by ignoring the unknown markup and
>its content.

Which is a binding or coupling property of the markup 
to the application processor.
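
A sketch of that coupling, in Python: one function, one input, 
and a flag that selects either policy for tags outside a known 
vocabulary.  The function name, the flag, the vocabulary and the 
sample element are all hypothetical, and the output is not 
escaped or attribute-aware, so treat it as an illustration only.

```python
import xml.etree.ElementTree as ET


def render(elem, known, keep_unknown_content):
    """Serialize elem's content, applying one policy to tags not in `known`.

    Sketch only: attributes are dropped and text is not re-escaped.
    """
    parts = [elem.text or ""]
    for child in elem:
        inner = render(child, known, keep_unknown_content)
        if child.tag in known:
            parts.append(f"<{child.tag}>{inner}</{child.tag}>")
        elif keep_unknown_content:
            # "Document" habit: drop the tag, keep what it wraps.
            parts.append(inner)
        # Otherwise, "data" habit: drop the tag and everything inside it.
        parts.append(child.tail or "")
    return "".join(parts)


# Made-up input; <blink> stands in for markup the processor does not know.
root = ET.fromstring("<p>Use <em>only</em> <blink>grade 8</blink> bolts.</p>")
known = {"em"}

as_document = render(root, known, keep_unknown_content=True)
as_data = render(root, known, keep_unknown_content=False)
```

Nothing in the markup says which branch to take; the policy 
lives entirely in the caller's flag.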

>[quoting directly from Prescod] "Data-oriented
>systems tend to prefer object types to be detectable 
>independent of context (thus namespaces) whereas
>document processing is typically done top-down
>recursively so relying on context is natural."

Again, a property of the implementation.  A 
human-readable document can be strongly typed, 
which is ontological.
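
For what it is worth, both styles can be run over the same 
instance.  The Python sketch below assumes a made-up namespace 
URI and a made-up book fragment: one function finds elements by 
qualified name wherever they sit, the other walks top-down and 
reports each text-bearing element with its ancestry.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "urn:example:book"  # hypothetical namespace, for illustration

XML = f"""<book xmlns="{BOOK_NS}">
  <title>Manual</title>
  <chapter><title>Setup</title></chapter>
</book>"""


def by_name(root, ns, local):
    """Context-free: match the qualified name wherever it appears."""
    return [e.text for e in root.iter(f"{{{ns}}}{local}")]


def by_context(elem, path=()):
    """Top-down: yield (path, text) so the caller can type by ancestry."""
    local = elem.tag.split("}")[-1]   # strip the {namespace} prefix
    here = path + (local,)
    if elem.text and elem.text.strip():
        yield "/".join(here), elem.text.strip()
    for child in elem:
        yield from by_context(child, here)


root = ET.fromstring(XML)
titles = by_name(root, BOOK_NS, "title")
paths = dict(by_context(root))
```

The first call cannot tell a book title from a chapter title; 
the second can, but only because the code carries the path.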

The categories have so many holes, again, it is a 
distinction without much value apart from the 
context of the processor.   So the distinction 
I look for is how independent of the processor 
can the markup enable the content to be and 
still have semantic value.

len
