   Re: [xml-dev] Xml is _not_ self describing


Well said Leigh.

Jonathan

> 
> 
> > -----Original Message-----
> > From: Bullard, Claude L (Len) [mailto:clbullar@ingr.com]
> > Sent: 15 January 2002 20:46
> > To: 'Elliotte Rusty Harold'; 'xml-dev@lists.xml.org'
> > Subject: RE: [xml-dev] Xml is _not_ self describing
> > 
> > 
> > I can't wait to see the XML.COM condensed 
> > version of this thread. :-)
> 
> And me, because hopefully then I can read it and 
> understand what the real issue is.
> 
> (...pauses...)
> 
> D'oh!
> 
> --
> 
> Seriously though, I gave a talk recently, introducing Markup 
> and XML to some Medical Informatics students. I outlined the 
> overheads of writing custom parsers for custom formats; 
> suggested that providing additional rules to structure 
> data formats could improve the situation; then explained 
> why CSV is fragile and limited; and then introduced 
> labelled formats as the best solution.
> 
> I also made it clear that introducing grammatical rules 
> such as labelling doesn't necessarily say anything about 
> the meaning of the data following those rules 
> (cf: Edward Lear). That's for a higher layer.
> 
> They seemed to accept the benefits of this, and 
> understood where the limitations were. 
> 
> So aside from the philosophy (interesting as it is) it seems to me 
> there's a fairly simple message to get across. Is there any real 
> evidence that there's been a failure to communicate it, beyond 
> the existing marketing-technology disconnects?
> 
> Personally I'm not sure I've seen it. Most developers I've worked 
> with just approach XML as syntax, and don't expect a whole 
> lot more.
> 
> Cheers,
> 
> L.
> 
> -- 
> Leigh Dodds, Research Group, Ingenta | "Pluralitas non est ponenda
> http://weblogs.userland.com/eclectic |    sine necessitate"
> http://www.xml.com/pub/xmldeviant    |     -- William of Ockham
> 
> 
> > 
> > Is it there?  We can split some fine hairs here, but 
> > often meaning has to be discovered from clues found 
> > elsewhere and then projected onto the text.  Worse, 
> > the translations into an understanding readily shared 
> > can vary enormously such that any such original meaning 
> > is distorted or not provable as original until some 
> > acceptable number of texts are translated.  There are 
> > linear markings from the Mystery Hill site (American 
> > Stonehenge) which some claim are Phoenician but are 
> > hotly contested otherwise.  Before accepted, both 
> > the decipherers and the archaeologists have to 
> > find mutually reinforcing but quite separate 
> > evidence (previous examples of the text types and 
> > artifacts attributable to some past civilization). 
> > 
> > It may not be random but be meaningless:  see the 
> > problems of assuming some astronomical signals 
> > were meaningful because they were regular (rotating 
> > and emitting).  Non-randomness isn't meaningful 
> > per se.  One can assume that a wedge-shaped tablet 
> > found in a collection of such is meaningful if other 
> > evidence indicates the site is a library, then start 
> > building up example sets until the key is discovered or 
> > a dictionary is created that is self-consistent to a 
> > tolerable degree.  Otherwise, a Rosetta Stone is 
> > required.
> > 
> > So it isn't that cut and dried.  As I said in my 
> > reply to Mike, you can be looking for math only 
> > to discover belatedly, possibly by accident, that 
> > they were just saying Hi: Cheops Slept Here.  Once 
> > you know about star alignments, some aspects of 
> > pyramid layouts make sense.  Unfortunately, 
> > so does Stonehenge, Mystery Hill and a myriad 
> > of other sites - but it can't be proved and 
> > may not be true in each or every case.
> > 
> > "Documents written in natural languages have meaning even if you don't 
> > speak those languages. They do carry information."
> > 
> > That is so but until you learn them or someone who has tells you, 
> > you don't know what they mean.  We are quite close to the 
> > "if the tree falls in the forest.." argument.  The best I can 
> > do is say, yes it has meaning to someone and yes, strictly 
> > speaking, by establishing the non-randomness is purposeful, not 
> > a side effect of another regular process, we can agree there 
> > is information there.  Shannon built modern communications 
> > by saying reproducibility, not semantics, is the key to 
> > designing communication systems.
> > 
> > That said, we of course agree about the value of tagging regardless 
> > of whether we have the descriptions.  XML is self-describing to 
> > the extent one understands the Rosetta Stone that is the 
> > XML 1.0 specification, then acquires by some evidence, a 
> > workable set of descriptions for the tag names.  Doctor Goldfarb 
> > often points to glossing as the original modern form of hypertext
> > and markup.
> > 
> > All other things being equal, given some XML instance, I sure 
> > do prefer a well-documented schema or DTD to reading someone 
> > else's code to discover what I am supposed to expect and 
> > what to do about it.  Or just Hide The XML and give me 
> > the stinkin' compiled application to install.
> > 
> > len
> > 
> > -----Original Message-----
> > From: Elliotte Rusty Harold [mailto:elharo@metalab.unc.edu]
> > 
> > At 12:17 PM -0600 1/15/02, Bullard, Claude L (Len) wrote:
> > 
> > >A label is not a name unless it is meaningful.
> > >Natural language is not self-describing unless
> > >you were taught it.
> > 
> > I guess it depends on what exactly you mean by "self-describing". I 
> > think a book about the English language written in English is 
> > self-describing in and of itself, whether anybody speaks English or 
> > not. However, leaving that aside there's a deeper assumption I want 
> > to cut off before it becomes too embedded in the debate.
> > 
> > Documents written in natural languages have meaning even if you don't 
> > speak those languages. They do carry information. They are not random 
> > strings of characters. I've been reading a lot about the theory and 
> > history of cryptography lately, and it's amazing just how much 
> > information you can pull out of ciphered text, because, in fact it 
> > isn't random. It's harder to read ciphered text than unciphered text, 
> > but it's not impossible. And that's a world of difference.
> > 
> > Reading text in a language you don't speak, but which has not been 
> > deliberately encrypted, is a similar problem; and in fact some of the 
> > same techniques were applied to languages like Linear B and 
> > hieroglyphics that are used to break ciphers.
> > 
> > When a document is marked up, the information of the markup is there, 
> > whether we recognize it or not. It is a property of the text itself, 
> > not a property of our perception of the text. With appropriate work, 
> > experience, intelligence, and luck that markup can be understood. Can 
> > unmarked up text be understood as well? Yes, certainly; but markup 
> > adds to the information content of the text. It makes it easier to 
> > decipher its meaning in a very practically useful way. This is a 
> > question of degree, and text+markup is easier to understand than text 
> > alone.
> > 
> > Language is certainly important, but it is an orthogonal issue.  Given 
> > the choice of data marked up in Ugaritic vs. the same data marked up 
> > in English, I pick English. But given the choice of data marked up in 
> > Ugaritic vs. the same data not marked up at all, I pick the data 
> > marked up in Ugaritic.
> > 
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> > 
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> > 
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
> > 
> 
> 
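Leigh's point in the thread, that CSV is fragile while a labelled format holds up better, can be sketched roughly as follows in Python. This is only an illustration: the patient records, the element names (name, pulse, systolic) and both function names are invented here and do not come from any message in the thread.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records, invented for illustration only.
CSV_V1 = "Smith,72,120\n"   # name, pulse, systolic
CSV_V2 = "Smith,120,72\n"   # same data, but the sender reordered the columns

XML_V1 = "<patient><name>Smith</name><pulse>72</pulse><systolic>120</systolic></patient>"
XML_V2 = "<patient><name>Smith</name><systolic>120</systolic><pulse>72</pulse></patient>"


def pulse_from_csv(text):
    """Read the pulse by position: column 1 is assumed to hold it."""
    row = next(csv.reader(io.StringIO(text)))
    return int(row[1])


def pulse_from_xml(text):
    """Read the pulse by label, wherever the element sits in the record."""
    root = ET.fromstring(text)
    return int(root.findtext("pulse"))


print(pulse_from_csv(CSV_V1), pulse_from_csv(CSV_V2))
print(pulse_from_xml(XML_V1), pulse_from_xml(XML_V2))
```

The positional reader silently returns the systolic value once the columns move, while the labelled reader still finds the right field. Note that the label only tells the code where the value is; it says nothing about what "pulse" means, which, as Leigh puts it, is for a higher layer.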
Copyright 2001 XML.org. This site is hosted by OASIS