OASIS Mailing List Archives

 



Re: SAX Filters for Namespace Processing



8/4/01 9:52:24 PM, Tom Bradford <bradford@dbxmlgroup.com> wrote:

>Exactly... Tools.  It's no longer easily human editable, and no longer
>easily human readable.  But these were some of the promises of XML.  A

Wait a minute.  Does "easily" mean that somebody ought to be able to edit or read an arbitrary 
fragment of a document without understanding the document itself?  I don't think *anyone* ever 
promised that sort of "ease," as if it were even achievable.  Can one meaningfully edit a document 
written in a natural language that one doesn't understand?  I resent the implication that "ease of 
use" for a tool refers to the ability of someone who doesn't understand the *task* to use it.  
There's an old saying that if you design a system that an idiot can use, only an idiot would *want* 
to use it.  We all accept, for example, that the user of a word processor needs to understand that 
if he defines some terms in one paragraph, uses them in a subsequent paragraph, and then 
cuts-and-pastes the latter paragraph to a point before the one that contains the definitions, he's messed up 
the document.  We don't declare a document "hard to edit" or "hard to read" simply because it's not 
robust to an arbitrary permutation of its contents (that is, unless we're academic postmodernist 
rebels looking for a cause).

>few years ago, we sold a bill of goods to the IT community.  We've
>delivered on it, but we've also introduced many unforeseen maintenance
>costs, that the buyers never planned on, and would have caused them to
>think twice, had they known in advance.  In that sense, the W3C isn't
>much better than a used car dealer.

If the IT community thought that XML would enable clue-free processing of data, that's its fault, 
not the XML community's.  What the XML community promised and what business journalists wrote about 
XML are two completely different things.

>It's even worse with XML Schemas because an element isn't just valid in
>the context of its parent.  Potentially the entire tree up to the root
>is the validating context.  In this case, based on the schema, one
>element named 'blah' might mean something completely different from
>another element named 'blah', even if they have the exact same parent
>element, and even if they're in the same namespace.  Quite the exception
>from DTDs, eh?  

Once again, meaning is determined by context in *any* language.  One of the characteristics of XML 
is that position-in-hierarchy conveys meaning.  If you don't like that, then remember that dots are 
acceptable name characters, so nothing's stopping you from using fully-path-qualified element type 
names.  You can write "MS Hungarian" XML if you so desire.  With enough effort, you could represent 
hierarchical data as a "flat" set (only looking like a list) of elements that could be arbitrarily 
permuted.  But then you'd have to manually renumber/rewrite all the names if you wanted to add or 
delete an item.  IMHO that would be "harder to edit" than a document where hierarchy conveys 
meaning.  And which is easier to read:

<contacts>
  <contact>
    <name>
      <first>Eric</first>
      <last>Bohlman</last>
    </name>
    <address>
      <number>3631</number>
      <street>S. Wallace Ave.</street>
      <city>Chicago</city>
      <US_state>IL</US_state>
      <US_ZIP>60609</US_ZIP>
    </address>
  </contact>
</contacts>

or, liberating the data from a linear reading,

<com.omsdev.contacts>
<com.omsdev.contacts.contact_1.address.city>Chicago</com.omsdev.contacts.contact_1.address.city>
<com.omsdev.contacts.contact_1.name.last>Bohlman</com.omsdev.contacts.contact_1.name.last>
<com.omsdev.contacts.contact_1.address.number>3631</com.omsdev.contacts.contact_1.address.number>
<com.omsdev.contacts.contact_1.name.first>Eric</com.omsdev.contacts.contact_1.name.first>
<com.omsdev.contacts.contact_1.address.US_ZIP>60609</com.omsdev.contacts.contact_1.address.US_ZIP>
<com.omsdev.contacts.contact_1.address.US_state>IL</com.omsdev.contacts.contact_1.address.US_state>
<com.omsdev.contacts.contact_1.address.street>S. Wallace Ave.</com.omsdev.contacts.contact_1.address.street>
</com.omsdev.contacts>

?
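
For anyone who wants to try the comparison, here is a rough Python sketch of the flattening, using only the standard library's xml.etree.ElementTree.  The function name flatten, the "com.omsdev" prefix argument, the "repeatable" parameter and the second contact ("Ann Example") are all made up for illustration; this is just one way it might be done.  It also shows the renumbering problem: delete the first contact and every name belonging to the second one changes.

```python
import xml.etree.ElementTree as ET

# Nested form. The second contact is invented purely for this demo.
NESTED = """<contacts>
  <contact>
    <name><first>Eric</first><last>Bohlman</last></name>
    <address><city>Chicago</city><US_state>IL</US_state></address>
  </contact>
  <contact>
    <name><first>Ann</first><last>Example</last></name>
    <address><city>Springfield</city><US_state>IL</US_state></address>
  </contact>
</contacts>"""


def flatten(root, prefix="com.omsdev", repeatable=("contact",)):
    """Return (path_qualified_name, text) pairs, one per leaf element.

    Children whose tag is listed in `repeatable` get a positional
    suffix (_1, _2, ...) so that the flat names stay unique.
    """
    pairs = []

    def walk(elem, path):
        children = list(elem)
        if not children:
            # Leaf: record its full dotted path and its text content.
            pairs.append((path, (elem.text or "").strip()))
            return
        seen = {}
        for child in children:
            name = child.tag
            if name in repeatable:
                seen[name] = seen.get(name, 0) + 1
                name = f"{name}_{seen[name]}"
            walk(child, f"{path}.{name}")

    walk(root, f"{prefix}.{root.tag}")
    return pairs


def to_flat_xml(pairs, wrapper):
    """Build a one-level document from the flat pairs."""
    root = ET.Element(wrapper)
    for name, text in pairs:
        ET.SubElement(root, name).text = text
    return root


root = ET.fromstring(NESTED)
before = dict(flatten(root))

# Delete the first contact: the remaining contact is now contact_1,
# so all of its flat names differ from the ones it had before.
root.remove(root.find("contact"))
after = dict(flatten(root))

flat = to_flat_xml(sorted(after.items()), "com.omsdev.contacts")
print(ET.tostring(flat, encoding="unicode"))
```

Because each flat name carries its whole path, the order of the pairs doesn't matter (they are sorted above only to make the point), but the price is that a single deletion rewrites the name of every element that came after it.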

A common characteristic of language is the use of shorthand, in which one accepts the need to 
understand context in exchange for not having to deal with extensive verbiage.  Spelling everything out often 
makes a text *harder* to read rather than easier (though of course it depends on how you read it; 
if you're forced to read through a tiny peephole, spelling everything out is probably easier).