Mike Champion wrote:
> This raises a question, for me anyway: If it will take a "better Google
> than Google" (or perhaps an "Autonomy meets RDF") that uses Bayesian or
> similar statistical techniques to create the markup that the Semantic Web
> will exploit, what's the point of the semantic markup? Why won't people
> just use the "intelligent" software directly? Wearing my "XML database
> guy" hat, I hope that the answer is that it will be much more efficient and
> programmer-friendly to query databases generated by the 'bots containing
> markup and metadata to find the information one needs. But I must admit
> that 5-6 years ago I thought the world would need standardized, widely
> deployed XML markup before we could get the quality of searches that
> Google allows today using only raw HTML and the PageRank heuristic algorithm.
> So, anyone care to pick holes in my assumptions, or reasoning? If one
> accepts the hypothesis that it will take smart software to produce the
> markup that the Semantic Web will exploit, what *is* the case for believing
> that it will be ontology-based logical inference engines rather than
> statistically-based heuristic search engines that people will be using in
> 5-10 years? Or is this a false dichotomy?
Yes this is an entirely false dichotomy but you've asked an extremely
Forget all the hype that we've been hearing about the SW/AI etc and let's
look at what the current reality is -- OWL is *fundamentally* about
classifications. OWL "reasoners" are rightly termed "classifiers" but OWL
doesn't employ statistics -- a thing is or isn't a member of a class.
To link OWL type classifiers with real world data, there must be a leap that
puts something into a class in the first place and this is where
statistical-type processors might function. Let's use the following example:
Suppose we have a bunch of noisy binary data about a group of people, some of
whom, let's say, have SARS. Some of the data might be audio, some video, some
text, and so on.
Now suppose we have a statistical process that is able to cluster
individuals together in groups. This processor might emit the following
our reasoner might be able to derive that
#Foo rdfs:subClassOf #Bar
and even, in the proper circumstances that...
#Bar owl:sameClassAs #SARS
So the Bayesian/statistical processes might very well be used to jumpstart a
logical classification process that tells us something quite useful.
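
To make the division of labour concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the people, the membership probabilities, the 0.8 cut-off and the set of confirmed SARS cases are all invented, and the "reasoner" is only a set comparison over the sampled individuals, not a real OWL classifier (which works from class axioms, not from whichever members happen to be in the sample). It shows just the two steps: a statistical score is turned into a crisp "is or isn't a member" decision, and subclass and same-class relations are then read off the resulting class extensions.

```python
from itertools import permutations

# Hypothetical output of a statistical clusterer: for each person, the
# estimated probability of belonging to each cluster (invented numbers).
scores = {
    "alice": {"Foo": 0.95, "Bar": 0.91},
    "bob":   {"Foo": 0.10, "Bar": 0.88},
    "carol": {"Foo": 0.05, "Bar": 0.07},
}

# Hypothetical set of individuals already known to have SARS.
confirmed_sars = {"alice", "bob"}

# Assumed cut-off for the "leap" from a probability to crisp membership.
THRESHOLD = 0.8


def crisp_extensions(scores, threshold):
    """Turn per-person cluster probabilities into crisp class extensions."""
    ext = {}
    for person, by_cluster in scores.items():
        for cluster, p in by_cluster.items():
            ext.setdefault(cluster, set())
            if p >= threshold:
                ext[cluster].add(person)
    return ext


def derive(ext):
    """Compare class extensions pairwise and emit relation triples."""
    facts = []
    for a, b in permutations(sorted(ext), 2):
        if ext[a] == ext[b]:
            if a < b:  # report each equivalent pair only once
                facts.append((a, "sameClassAs", b))
        elif ext[a] < ext[b]:  # proper subset
            facts.append((a, "subClassOf", b))
    return facts


ext = crisp_extensions(scores, THRESHOLD)
ext["SARS"] = set(confirmed_sars)
facts = derive(ext)

for subject, relation, obj in facts:
    print(f"#{subject} {relation} #{obj}")
```

With these invented numbers, Foo ends up containing only alice while Bar and SARS both contain alice and bob, so the sketch reports Foo as a subclass of Bar and Bar as the same class as SARS. Lowering the threshold changes which individuals make the leap into a class, and with it everything derived afterwards, which is exactly where the statistical and the logical parts meet.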