[len] Coals to Newcastle, Andrew.
The following "poetic gem" is said to have been
discovered at the base of the piles on which the Tay Bridge was built.
[len] An excellent bit of doggerel. Compliments. But seriously, why
are you down on SemWeb?
1. They are still working out the technology. This will take a while
   to do right at scale.
2. It is difficult to construct reliable, applicable ontologies. It is
   tedious work to sort out class relationships and get them defined
   in such a way that all, or even many, experts on the subject will
   accept them. In HumanML, after a year of research, we ended up with
   a Primary Base of abstract types, most of which have definitions
   but few inheritable properties. Why? The domain of semiotics has
   lots of theories and two main camps (Saussurean and Peircean), and
   the differences range from mundane to profound, although overall
   they agree on the categories of properties that form the context
   for and affect human communication. So as good markup weenies, on
   the first pass, all we can safely declare is categories. After
   that, secondary derived schemas have to handle the details, each
   from the point of view of its particular camp. What I am saying
   here is that it takes a lot of expertise to do a good ontology, so
   someone has to become expert at sorting out the bad ones.
3. It won't grow fast without automated harvesting tools. You see the
   problem, of course. It takes lots of work to get good markup
   structures for any one category in one domain, and as the
   autopoiesis experts point out, the structure tends to preserve
   itself regardless of the environmental pressure. So for the SemWeb,
   until the harvesting tools are working and delivering acceptable
   results, the data grows at a snail's pace. Metadata built by hand
   doesn't have the same payoffs as HTML built by hand. HTML doesn't
   require assertion consistency. Very different games.
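
To make point 2 a bit more concrete, here is a rough Python sketch of
the layering: a base of abstract categories that carry only a name and
a definition, with camp-specific schemas adding the properties later.
The class names (Category, DerivedSchema) and the property key
"parts" are invented for illustration and are not taken from the
HumanML schemas; treat it as one possible reading, not the actual
design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Category:
    """Abstract base type: a name and a definition, no properties."""
    name: str
    definition: str


@dataclass
class DerivedSchema:
    """Hypothetical camp-specific schema that refines base categories."""
    camp: str
    properties: dict[str, dict[str, str]] = field(default_factory=dict)

    def refine(self, category: Category, **props: str) -> None:
        # Attach camp-specific properties to a shared base category.
        self.properties.setdefault(category.name, {}).update(props)

    def describe(self, category: Category) -> dict[str, str]:
        # The shared definition plus whatever this camp has added.
        extra = self.properties.get(category.name, {})
        return {"definition": category.definition, **extra}


# First pass: all we safely declare is the category itself.
SIGN = Category("Sign", "Something that stands for something else")

# Second pass: each camp supplies its own details.
saussurean = DerivedSchema("Saussurean")
saussurean.refine(SIGN, parts="signifier, signified")

peircean = DerivedSchema("Peircean")
peircean.refine(SIGN, parts="representamen, object, interpretant")
```

Both camps agree on the category and its definition; they differ only
in what they hang on it, which is why the base ends up with so few
inheritable properties.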
Still, we know that inferencing engines work tolerably well as long as
we are not religiously or superstitiously attached to the results. As
long as we know how to frame questions well and completely understand
that our framing of the question determines the results, we can use
these tools and the SemWeb data for productive tasks. If, on the other
hand, we use the results the way political wonks use polling results,
we are likely to get into trouble (The Golem Problem). I don't say
that won't happen; it will. But when it comes time to take after the
monster, pitchforks aloft and torches blazing, we have to remember
that the user of the information is the bad guy, not the technology.
My experience is that for every bad guy, there are at least ten good
guys doing productive and creative work with that technology and those
ontologies. And in the Spy vs. Spy world, telling good from bad isn't
as easy as one might think.
Remember the Third Reich archaeologists. They were scientists
expecting to be taken seriously. Outside the Third Reich, they were
mostly laughed at by their peers. On the other hand, one has the
Japanese researchers in bio-warfare who, after the war, went on the
payrolls of many large nations and retired with honor and pensions.
Applications usually are as good or bad as the hands that apply them
and the uses they are put to, but also as those they are used for.
Knowledge is defined as the ability to select the right choice given a
set of equally probable choices. Seek to reduce Boltzmann entropy in
the world of human understanding. Choose wisely.
len