Re: SAX Filters for Namespace Processing
- From: Eric Bohlman <ebohlman@earthlink.net>
- To: Tom Bradford <bradford@dbxmlgroup.com>,Richard Tobin <richard@cogsci.ed.ac.uk>
- Date: Sat, 04 Aug 2001 11:59:59 -0500
8/4/01 9:52:24 PM, Tom Bradford <bradford@dbxmlgroup.com> wrote:
>Exactly... Tools. It's no longer easily human editable, and no longer
>easily human readable. But these were some of the promises of XML. A
Wait a minute. Does "easily" mean that somebody ought to be able to edit or read an arbitrary
fragment of a document without understanding the document itself? I don't think *anyone* ever
promised that sort of "ease," as if it were even achievable. Can one meaningfully edit a document
written in a natural language that one doesn't understand? I resent the implication that "ease of
use" for a tool refers to the ability of someone who doesn't understand the *task* to use it.
There's an old saying that if you design a system that an idiot can use, only an idiot would *want*
to use it. We all accept, for example, that the user of a word processor needs to understand that
if he defines some terms in one paragraph, uses them in a subsequent paragraph, and then cuts-and-
pastes the latter paragraph to a point before the one that contains the definitions, he's messed up
the document. We don't declare a document "hard to edit" or "hard to read" simply because it's not
robust to an arbitrary permutation of its contents (that is, unless we're academic postmodernist
rebels looking for a cause).
>few years ago, we sold a bill of goods to the IT community. We've
>delivered on it, but we've also introduced many unforeseen maintenance
>costs, that the buyers never planned on, and would have caused them to
>think twice, had they known in advance. In that sense, the W3C isn't
>much better than a used car dealer.
If the IT community thought that XML would enable clue-free processing of data, that's its fault,
not the XML community's. What the XML community promised and what business journalists wrote about
XML are two completely different things.
>It's even worse with XML Schemas because an element isn't just valid in
>the context of its parent. Potentially the entire tree up to the root
>is the validating context. In this case, based on the schema, one
>element named 'blah' might mean something completely different from
>another element named 'blah', even if they have the exact same parent
>element, and even if they're in the same namespace. Quite the exception
>from DTDs, eh?
Once again, meaning is determined by context in *any* language. One of the characteristics of XML
is that position-in-hierarchy conveys meaning. If you don't like that, then remember that dots are
acceptable name characters, so nothing's stopping you from using fully-path-qualified element type
names. You can write "MS Hungarian" XML if you so desire. With enough effort, you could represent
hierarchical data as a "flat" set (only looking like a list) of elements that could be arbitrarily
permuted. But then you'd have to manually renumber/rewrite all the names if you wanted to add or
delete an item. IMHO that would be "harder to edit" than a document where hierarchy conveys
meaning. And which is easier to read:
<contacts>
  <contact>
    <name>
      <first>Eric</first>
      <last>Bohlman</last>
    </name>
    <address>
      <number>3631</number>
      <street>S. Wallace Ave.</street>
      <city>Chicago</city>
      <US_state>IL</US_state>
      <US_ZIP>60609</US_ZIP>
    </address>
  </contact>
</contacts>
or, liberating the data from a linear reading,
<com.omsdev.contacts>
<com.omsdev.contacts.contact_1.address.city>Chicago</com.omsdev.contacts.contact_1.address.city>
<com.omsdev.contacts.contact_1.name.last>Bohlman</com.omsdev.contacts.contact_1.name.last>
<com.omsdev.contacts.contact_1.address.number>3631</com.omsdev.contacts.contact_1.address.number>
<com.omsdev.contacts.contact_1.name.first>Eric</com.omsdev.contacts.contact_1.name.first>
<com.omsdev.contacts.contact_1.address.US_ZIP>60609</com.omsdev.contacts.contact_1.address.US_ZIP>
<com.omsdev.contacts.contact_1.address.US_state>IL</com.omsdev.contacts.contact_1.address.US_state>
<com.omsdev.contacts.contact_1.address.street>S. Wallace Ave.</com.omsdev.contacts.contact_1.address.street>
</com.omsdev.contacts>
?
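For anyone who wants to try the comparison on their own data, here is a minimal Python sketch of the flattening, not a recommended practice. It assumes the standard-library xml.etree.ElementTree module; the "com.omsdev" prefix and the "_1" numbering follow the example above, while the function name flatten and the repeatable parameter are hypothetical names invented for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A trimmed copy of the nested example from this message.
NESTED = """\
<contacts>
  <contact>
    <name><first>Eric</first><last>Bohlman</last></name>
    <address><city>Chicago</city><US_state>IL</US_state></address>
  </contact>
</contacts>"""


def flatten(root, prefix="com.omsdev", repeatable=frozenset({"contact"})):
    """Return (dotted_name, text) pairs, one per leaf element.

    `repeatable` (an assumption of this sketch) names the element types
    that may occur more than once and therefore need a _N suffix to keep
    the flat names unique.
    """
    pairs = []

    def walk(elem, path):
        children = list(elem)
        if not children:
            # Leaf element: its full path becomes its name.
            pairs.append((path, (elem.text or "").strip()))
            return
        seen = Counter()
        for child in children:
            step = child.tag
            if step in repeatable:
                seen[step] += 1
                step = f"{step}_{seen[step]}"
            walk(child, f"{path}.{step}")

    walk(root, f"{prefix}.{root.tag}")
    return pairs


flat_pairs = flatten(ET.fromstring(NESTED))
for name, text in flat_pairs:
    print(f"<{name}>{text}</{name}>")
```

Because every name carries its whole path, the pairs can be permuted freely without losing information (dict(reversed(flat_pairs)) equals dict(flat_pairs)). The price is the one described above: deleting the first of several contacts means rewriting every contact_N name that follows it.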
A common characteristic of language is the use of shorthand, in which one trades the need to
deal with extensive verbiage for the need to understand context. Spelling everything out often
makes a text *harder* to read rather than easier (though of course it depends on how you read it;
if you're forced to read through a tiny peephole, spelling everything out is probably easier).