xml-dev - Re: [xml-dev] Statistical vs "semantic web" approaches to making sense o


To: "Mike Champion" <mc@xegesis.org>, <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] Statistical vs "semantic web" approaches to making sense of the Net
From: "Jonathan Borden" <jonathan@openhealth.org>
Date: Sat, 26 Apr 2003 16:17:01 -0400
References: <oprn3z6mj7ezizxn@localhost>

Mike Champion wrote:
>
> This raises a question, for me anyway:  If it will take a "better Google
> than Google" (or perhaps an "Autonomy meets RDF") that uses Bayesian or
> similar statistical techniques to create the markup that the Semantic Web
> will exploit, what's the point of the semantic markup?  Why won't people
> just use the "intelligent" software directly?  Wearing my "XML database
> guy" hat, I hope that the answer is that it will be much more efficient
> and programmer-friendly to query databases generated by the 'bots
> containing markup and metadata to find the information one needs.  But I
> must admit that 5-6 years ago I thought the world would need standardized,
> widely deployed XML markup before we could get the quality of searches
> that Google allows today using only raw HTML and PageRank heuristic
> algorithm.
>
> So, anyone care to pick holes in my assumptions, or reasoning?  If one
> does accept the hypothesis that it will take smart software to produce
> the markup that the Semantic Web will exploit, what *is* the case for
> believing that it will be ontology-based logical inference engines rather
> than statistically-based heuristic search engines that people will be
> using in 5-10 years?  Or is this a false dichotomy?

Yes, this is an entirely false dichotomy, but you've asked an extremely
important question.

Forget all the hype that we've been hearing about the SW, AI, etc., and
let's look at the current reality: OWL is *fundamentally* about
classification. OWL "reasoners" are rightly termed "classifiers", but OWL
doesn't employ statistics: a thing either is or isn't a member of a class.

To link OWL-type classifiers with real-world data, there must be a leap that
puts something into a class in the first place, and this is where
statistical-type processors might function. Let's use the following example:
suppose we have a bunch of noisy binary data about a group of people, some of
whom, let's say, have SARS. Some of the data might be audio, some video, some
text, etc.

Now suppose we have a statistical process that is able to cluster
individuals together in groups. This processor might emit the following
class:

<owl:Class rdf:ID="Foo">
    <owl:oneOf rdf:parseType="Collection">
        <ex:person rdf:about="#Bill"/>
        <ex:person rdf:about="#Dave"/>
        <ex:person rdf:about="#Sue"/>
        <ex:person rdf:about="#Nancy"/>
        <ex:person rdf:about="#Freddy"/>
    </owl:oneOf>
</owl:Class>
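
As a rough sketch of how such a processor might hand its result to OWL: the Python below takes per-person scores from some statistical step, keeps those at or above a cutoff, and writes the survivors out as an enumerated class. Everything specific here is an assumption for illustration: the scores, the cutoff, the function names and the `http://example.org/ex#` namespace are made up, and a plain threshold stands in for whatever clustering or Bayesian method would really be used. The output spells the enumeration with rdf:parseType="Collection" and rdf:about, which is how RDF/XML expects an owl:oneOf list to be written.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/ex#"  # hypothetical namespace for ex:person


def threshold_cluster(scores, cutoff):
    """Toy stand-in for the statistical step: keep names scoring >= cutoff."""
    return sorted(name for name, score in scores.items() if score >= cutoff)


def emit_enumerated_class(class_id, members):
    """Serialize the member list as an owl:Class defined by owl:oneOf."""
    for prefix, uri in (("rdf", RDF), ("owl", OWL), ("ex", EX)):
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{RDF}}}RDF")
    cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}ID": class_id})
    one_of = ET.SubElement(
        cls, f"{{{OWL}}}oneOf", {f"{{{RDF}}}parseType": "Collection"}
    )
    for name in members:
        ET.SubElement(one_of, f"{{{EX}}}person", {f"{{{RDF}}}about": f"#{name}"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up scores; in practice these would come from the noisy data.
    scores = {"Bill": 0.93, "Dave": 0.88, "Sue": 0.91, "Alice": 0.12}
    print(emit_enumerated_class("Foo", threshold_cluster(scores, cutoff=0.5)))
```

The point of the split is that the fuzzy decision happens once, in `threshold_cluster`; what comes out the other side is a crisp class that a classifier can work with.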

Our reasoner might then be able to derive that

<owl:Class rdf:ID="Bar">
    <owl:intersectionOf rdf:parseType="Collection">
         <owl:Class rdf:about="#hasCough"/>
         <owl:Class rdf:about="#hasFever"/>
         <owl:Class rdf:about="#hasVirus.x233444"/>
...

#Foo rdfs:subClassOf #Bar

and even, in the proper circumstances, that...

#Bar owl:equivalentClass #SARS

So Bayesian/statistical processes might very well be used to jump-start a
logical classification process that tells us something quite useful.
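
To make the two derivations concrete, here is a minimal Python sketch that treats each class as a plain set of individuals. It is only an illustration: a real OWL reasoner works from class definitions under open-world semantics, whereas this checks set containment over a small, closed, made-up table of findings. The table, the extra person "Alice" and the function names are all hypothetical.

```python
# Hypothetical findings per person; every entry here is invented.
ALL_THREE = {"hasCough", "hasFever", "hasVirus.x233444"}
FINDINGS = {
    "Bill": set(ALL_THREE),
    "Dave": set(ALL_THREE),
    "Sue": set(ALL_THREE),
    "Nancy": set(ALL_THREE),
    "Freddy": set(ALL_THREE),
    "Alice": {"hasCough"},  # cough only, so not in the intersection
}

# The enumerated class emitted by the statistical step.
FOO = {"Bill", "Dave", "Sue", "Nancy", "Freddy"}


def members_of_intersection(findings, conditions):
    """Individuals that fall into every one of the given classes."""
    return {person for person, found in findings.items() if conditions <= found}


def is_subclass(sub, sup):
    """Every member of sub is also a member of sup."""
    return sub <= sup


def is_same_class(a, b):
    """Both classes have exactly the same members."""
    return a == b


bar = members_of_intersection(FINDINGS, ALL_THREE)
print("Foo subClassOf Bar:", is_subclass(FOO, bar))
print("Foo and Bar have the same members:", is_same_class(FOO, bar))
```

Identifying Bar with SARS is a further step that the sketch does not attempt; that is the "in the proper circumstances" part.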

Jonathan
