Re: [xml-dev] The impact of data format selection on application development

Roger, hello.

On 12 Jul 2022, at 14:57, Roger L Costello wrote:

>> as has already been discussed in this thread, there's significantly
>> more to CSV or TSV than meets the eye (escapes, line-endings,
>> and so on), so this is 'simpler' only for a recipient who has seen
>> this before and knows what to do.  _In that context_, the data
>> description is short, and appears simple.
>>
>> So 'simple' data formats are actually 'high-context' data formats (compare [1]).
>>
>> Note that in that description I didn't mention that I'd expect the integer
>> _not_ to include a comma (which is useful only for display, and which
>> would conventionally be regarded as hostile in a transmission format),
>> and I did choose to add a little explicit context in mentioning the units
>> of the second column (you _did_ mean km, didn't you,... hmm?).  So
>> I've thoughtfully chosen what context to make explicit, and expected
>> that the recipient of the description will know the Right Thing To Do.
>
>  I think that Norman is asserting that XML provides more contextual information and therefore is a superior data format. XML is a high-context data format. I would like to challenge that assertion. Consider this highly plausible XML representation of the data:

I expressed myself badly, I think: I certainly didn't intend to say 'superior' there.

The high/low-context culture idea (which is well known but not necessarily uncontroversial as an idea) is that 'high-context' cultures _share_ more context between individuals, so that less needs to be said in a particular interaction, because more can be taken for granted via the context; whereas 'low-context' cultures are typically more explicit in conversation (or transactional, or insert adjectives to taste).

So I'm partly disagreeing with the 'simple' label applied to TSV, on the grounds that I think that simplicity is only identifiable from the point of view of those who share the context.  Thus TSV is simple only in a high-context shared culture which may or may not be identified with the 'unix philosophy'.

On this view, XML is a 'low-context' thing because, as you illustrate, a lot more of the context has to be spelled out. XML is a good way of spelling that context out, even though doing so requires a significant up-front investment of time and understanding.

So (I think) I'm saying two things:

  1. the simplicity of TSV is not uncomplicated, and in some senses might be an illusion; and separately
  2. 'simple' structures break sooner.

The 'unix philosophy' is perhaps a way of expressing, and teaching, the shared context that makes TSV seem a 'simple' format, and your 'plan to throw tools away' is one way of dealing with point 2.

> But what is different about the super simple data format is that the explanation of its data format (a series of lines, each line containing two fields) is immediately discerned.

...by those who share a context.
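
As a rough illustration of that shared context (a sketch only: the sample data and the function names `naive_rows` and `quoted_rows` are made up for this example, and it assumes the sender used CSV-style quoting, which nothing in 'a series of lines, each line containing two fields' actually tells you), here are two readers of the same two-line file who disagree about how many fields the second line has:

```python
import csv
import io

# Hypothetical sample: two fields per line (a name and an integer).
# The second record uses CSV-style quoting to carry a tab inside the name.
SAMPLE = 'alpha\t10\n"beta\tgamma"\t20\n'


def naive_rows(text):
    """Split on newlines and tabs, assuming nothing else needs saying."""
    return [line.split("\t") for line in text.splitlines()]


def quoted_rows(text):
    """Read the same text, assuming the sender used CSV-style quoting."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


naive = naive_rows(SAMPLE)
quoted = quoted_rows(SAMPLE)

# Field counts per line under each reading of the "same" format.
print([len(row) for row in naive])
print([len(row) for row in quoted])
```

Neither reader is wrong about 'lines of two tab-separated fields'; they differ only in the unstated conventions they bring to it.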

This is a long way of saying that, to the extent that the original question is 'which format is better?', that question is unanswerable, because the answer 'simple tools are better' brings in a _lot_ of freight with the word 'simple'.

Best wishes,

Norman


-- 
Norman Gray  :  https://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK

