xml-dev - RE: [xml-dev] The triples datamodel -- was Re: [xml-dev] SemanticWeb per

To: "Howard Katz" <howardk@fatdog.com>
Subject: RE: [xml-dev] The triples datamodel -- was Re: [xml-dev] SemanticWeb permathread, iteration n+1
From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Sat, 5 Jun 2004 06:37:49 -0400
Cc: "XML Developers List" <xml-dev@lists.xml.org>
In-reply-to: <IKEOLCDFPBBPPAHGNKKOKEAAEMAA.howardk@fatdog.com>
References: <IKEOLCDFPBBPPAHGNKKOKEAAEMAA.howardk@fatdog.com>

At 10:21 AM -0700 6/4/04, Howard Katz wrote:
>I don't understand this last point, Elliotte. How can a properly designed
>application ask whether a document contains the information it needs without
>knowing about the document's structure? If you add information, you're most
>likely changing the structure, and consequently the schema. How can an
>application cope with ad hoc changes like that w/out looking at the schema,
>ie without doing validation?

Let me answer with an example. Suppose you want to extract today's 
news from Cafe con Leche, an invalid XHTML document. The following 
XPath will do it:

//html:today

(assuming the html prefix has been bound to the XHTML namespace in 
whatever environment you're using). You need to know nothing else 
about what surrounds the today element, where it's positioned in the 
document, or even how many today elements there are. You don't care 
what the today elements contain. You don't care what contains them. 
It is a very robust solution, much more so than solutions based on 
explicit knowledge that the today element is the seventh child of a 
td element that is the first child of a tr element that is the 
first child of a table element that is a child of the only table 
element that is the second child of a body element that is the only 
body child of an html element which is the root element of the 
document.
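
To make that concrete, here is a minimal sketch in Python using the 
standard library's xml.etree.ElementTree. The two sample pages are 
made up for illustration; they are not the real Cafe con Leche markup. 
ElementTree does not accept an absolute path like //html:today on an 
element, so the sketch uses the relative form .//html:today, which 
selects the same descendants when evaluated from the root element.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page: well-formed XML, but not valid XHTML because
# <today> is not declared in any XHTML DTD.
PAGE_IN_TABLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table><tr><td>
      <p>Older news</p>
      <today><p>First item</p><p>Second item</p></today>
    </td></tr></table>
  </body>
</html>"""

# The same information after a hypothetical redesign that drops the table.
PAGE_REDESIGNED = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div><today><p>First item</p><p>Second item</p></today></div>
  </body>
</html>"""


def extract_today(xml_text):
    """Return every today element in the XHTML namespace, wherever it sits."""
    root = ET.fromstring(xml_text)
    # Bind the html prefix to the XHTML namespace for this query only.
    return root.findall(".//html:today", {"html": XHTML_NS})


def text_of(element):
    """Concatenate all text inside an element, ignoring its child markup."""
    return "".join(element.itertext()).strip()


in_table = extract_today(PAGE_IN_TABLE)
redesigned = extract_today(PAGE_REDESIGNED)

for el in in_table + redesigned:
    print(text_of(el))
```

The same query finds the today element in both layouts. Neither call 
consults a schema or performs validation, and neither would survive 
validation against an XHTML DTD anyway.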

At no point do you need to know the schema for the page in order to 
extract information from it. Indeed if you tried to do that, you'd 
fail because the page is invalid and the relevant information is 
found in elements that don't even exist in the schema.
-- 

   Elliotte Rusty Harold
   elharo@metalab.unc.edu
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
