--- "Roger L. Costello" <costello@mitre.org> wrote:
> A key point:
> 1. In an open system where there are interactions
> that cannot be predicted
> a priori, it is unreasonable to expect a "standard
> format". This is a key
> point that I argue in "Living in a Schemaless Web":
>
>
> http://www.xfront.com/LivingInASchemalessWeb.html
[from that document] "The above is not meant to imply
that there are no uses for schemas, only that they are
not particularly useful in a large, decentralized
system such as the Web. In fact, schemas are
well-suited on the fringes, where there exists a
closed system and all parties involved can agree to
exchange data in a specific format. "
I certainly agree with this, but defer judgement on
whether OWL-based semantic transformations will prove
to be any easier to implement, deploy, maintain,
evolve, etc. than XSLT (or XQuery) data model
transformations.
"If you wish to interact broadly across the Web then
ontologies provide a high return on your investment.
Ontologies are the basis for coping with diversity. "
That's a pretty strong assertion. What kind of real
experience is there to back it up? I would tend to
agree for fairly well-understood domains (e.g. medical
terminology, as Jonathan Borden has educated me on
this list over the years!) and, uhh, somewhat
contrived examples like cameras.
But how about the messy real world most of us must
operate in, where there is an intent to deceive
(spammers, virus writers, software companies with
patents on common sense, politicians starting wars [or
questioning the definition of "is"], ad nauseam)? How
about in pop culture contexts where meanings of words
are changed literally for the fun of it?
Even in less pathological or frivolous situations,
I'm not at all sure that starting by building an
ontology is a wise investment. As many of you are
aware, I've spent the last 1 1/2 years working on the
Web Services Architecture group at W3C, which might be
characterized as an attempt to define a [relatively
informal, although we are toying with an OWL
representation] ontology for web services concepts,
terminology, and concrete instantiations. It is, to
put it bluntly, a bitch: what appears to be common
sense to one person or organization is heresy to
another; the meaning of simple words such as "service"
tends to lead to infinitely recursive definitions; and
just when you think you start to understand things
with some degree of rigor, a new
analyst/pundit/consultant fad comes out of left field
to confuse things all over again. Imposing ontologies
up front is as politically/economically impossible as
imposing the One True Schema, and defining them post
hoc is difficult and inevitably incomplete/partially
inaccurate.
What really intrigues me is that for all the
theoretical interest in semantic approaches to
search/discovery/analysis over the past few years, the
actual advances in practical applications seem to come
from metadata generation and pattern matching
(Google), dirt-simple fuzzy or Bayesian classifiers
(e.g. Spam Bayes), and brute force "kitchen sink"
combinations of it all (e.g. IBM WebFountain, AFAIK:
http://www.almaden.ibm.com/WebFountain/). I'm willing
to bet that there is some good synergy between
ontologies and the brute-force stuff -- for example I
would like to be able to give Spam Bayes some
knowledge of my world, e.g. I never spam myself, or a
message with no recognizable words in it is almost
certainly spam. Still, I see the "dumb" approaches
working every minute of every day (about how often I
get spam!) and I'm not seeing the real-world success
stories for the "smart" approach.
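
The kind of synergy suggested above might look roughly like the following: a tiny naive Bayes scorer with two hand-written "knowledge of my world" rules layered in front of it. This is only an illustrative sketch, not how Spam Bayes itself works; the names `TinyBayes`, `classify`, `my_addresses` and `known_words` are hypothetical, and the sender rule assumes the From address can be trusted, which real mail does not guarantee.

```python
import math
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return WORD_RE.findall(text.lower())


class TinyBayes:
    """Minimal naive Bayes spam scorer with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def vocabulary(self):
        return set(self.counts["spam"]) | set(self.counts["ham"])

    def spam_probability(self, text):
        vocab_size = len(self.vocabulary())
        spam_total = sum(self.counts["spam"].values())
        ham_total = sum(self.counts["ham"].values())
        # Start from the (smoothed) prior odds, then add per-word evidence.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for word in tokenize(text):
            p_spam = (self.counts["spam"][word] + 1) / (spam_total + vocab_size)
            p_ham = (self.counts["ham"][word] + 1) / (ham_total + vocab_size)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


def classify(model, sender, text, my_addresses, known_words):
    """Apply the hand-written rules first, then fall back to the statistics."""
    # Rule 1 (assumption: the sender address is trustworthy): I never spam myself.
    if sender.lower() in my_addresses:
        return "ham"
    # Rule 2: a message with no recognizable words is almost certainly spam.
    if not any(word in known_words for word in tokenize(text)):
        return "spam"
    return "spam" if model.spam_probability(text) > 0.5 else "ham"
```

The rules here are just a few lines of explicit knowledge sitting in front of the "dumb" statistical layer; whether a full ontology would buy more than that is exactly the open question.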
Anyway, I'm interested in hearing more about the real
experience base of people who start with ontologies to
wrestle with these kinds of problems at the semantic
level as opposed to those who basically approach it as
a querying/matching problem on the surface structure
of the data.