- To: <firstname.lastname@example.org>
- Subject: RE: [xml-dev] Statistical vs "semantic web" approaches to making sense of the Net
- From: "James Governor" <email@example.com>
- Date: Fri, 25 Apr 2003 07:46:06 -0500
- Thread-index: AcMKwSs/+DBmnkGRR8Km+663SWtMBwAZnugw
- Thread-topic: [xml-dev] Statistical vs "semantic web" approaches to making sense of the Net
Here is a post that didn't stick last time for some reason, which Mike
Champion responded to here -
Thinking about Emergence and adaptive behavior and how it relates to
semantic versus AI
-- Steven Johnson has some interesting ideas about meaning in networked
While not directly referring to the semantic web, if I remember right,
he has a pretty neat theory about the role of computers when it comes to
an emerging connected intelligence.
It is of course humans that provide semantics and understanding, not
computers. On the other hand, computers can, and already do, play an
intriguing role in helping us to identify what is of importance and what
is not.
That is, through monitoring our behavior: tracking what individuals do
on the web and scoring information according to the importance they
attach to it.
It is only by interfacing with the resource that meaning is
generated---if computers are tracking our interactions, then
statistically they can usefully help to analyze what has "meaning" and
what does not. This potentially creates meaning when these systems
include feedback loops.
Thus complex semantic behaviour emerges from simple rules, while IT
plays the role of keeping score and feeding the results back.
Slashdot is one example Johnson uses. As he points out, however, at the
moment Slashdot is scored according to the tyranny of the masses---that
is, the folks that the majority agree with end up being moderators.
However, Johnson posits a simple rule change which would add more
diversity to the meaning of the dialogues within the Slashdot
system--that is, choosing moderators based on the diversity of
opinion they generate. "Dissenters" must have a voice--because quite
often that voice is saying something useful in the wilderness.
That is not to say "whackos" should have a voice---scoring based on
quality is still crucial to the overall system. The folks that make
quality responses that also arouse passions in the listener are likely
to generate the most "quality" in terms of ideas and feedback.
Suffice to say that meaning can emerge from behavior. Let's look at
Amazon rankings and recommendations, for example. Does Amazon know which
books or records or whatever are good? Absolutely not. But WE do. It is
by monitoring the behavior of large numbers of people that we get a
sense for what is good or not.
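
A minimal sketch of that idea, assuming nothing about how Amazon actually
does it: the program never judges whether an item is good, it only counts
which items people chose together. The basket data and the function names
(co_purchase_counts, recommend) are hypothetical, made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # set() so an item repeated within one basket is counted once
        for a, b in combinations(sorted(set(basket)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, item, n=3):
    """Return up to n items most often chosen together with `item`."""
    return [other for other, _ in counts[item].most_common(n)]


# Hypothetical purchase histories, one list per customer.
baskets = [
    ["book_a", "book_b", "book_c"],
    ["book_a", "book_b"],
    ["book_a", "book_d"],
    ["book_b", "book_c"],
]

counts = co_purchase_counts(baskets)
print(recommend(counts, "book_a", n=1))
```

The "meaning" here comes entirely from the customers; the code is just
keeping score.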
What I am trying to say is that we perhaps need a more sophisticated
understanding of the role of computers and networks in helping us with
decisions of meaning. Johnson provides some interesting potential
pointers in that direction.
In a way--sorry to get on the hobbyhorse again, folks--this brings us
back to Wittgenstein's theory of "meaning as use" - the meaning of a
word or concept is related to its use by communities. Rather than to any
What is perhaps extraordinary and fundamentally different about the web
and connected networks is that now we begin to have a mechanism for
understanding and auditing how these concepts are "used" by different
sets of communities. That is, we can see "meaning as use" in action--by
seeing who is listened to, what data sources are "trusted".
I hope the above rant doesn't seem too tangential, but I really wanted
to at least point to an alternative conception of "meaning" and how it
relates to computers. Computers are not good at making subjective
judgments and/or semantic distinctions, but they are (the web is)
perhaps the only way to track the subjective judgments of vast numbers
of people---from this analysis meaning and semantics can arise.
I am not sure that AI or semantic web approaches are the only ones--for
one thing, as someone else has pointed out--relying on folks to do
decent semantic markup is most unlikely, if not impossible. I mean--how
many of you do a decent job documenting the code you write?
However - the net can be used to identify what IS well marked up--by
seeing who, and how many, use it and rank it highly--and in the end
that could be said to be what has "meaning".
Then there is the potential to introduce new rules based on this--wait
for it folks--metadata. That is right, it turns out the computers in
this scenario generate useful metadata---data about data. But that is
based on how, when, and why the data itself is used, rather than by
identifying the "meaning" of specific elements. The relationships
between resources, elements and people, are where we find meaning.
Computers are very useful tools for monitoring these interactions.
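
As a rough illustration of metadata derived from use rather than from
markup, here is a sketch that turns a log of interactions into
per-resource metadata. The event format (user, resource, rating), the
resource names and the usage_metadata function are all assumptions made
for the example, not any existing system.

```python
from collections import defaultdict


def usage_metadata(events):
    """Summarise (user, resource, rating) events into per-resource metadata.

    rating may be None when a user used a resource without ranking it.
    """
    users = defaultdict(set)
    ratings = defaultdict(list)
    for user, resource, rating in events:
        users[resource].add(user)
        if rating is not None:
            ratings[resource].append(rating)

    meta = {}
    for resource, who in users.items():
        scores = ratings[resource]
        meta[resource] = {
            "distinct_users": len(who),
            "mean_rating": sum(scores) / len(scores) if scores else None,
        }
    return meta


# Hypothetical interaction log.
events = [
    ("alice", "schema_a.xsd", 5),
    ("bob", "schema_a.xsd", 4),
    ("carol", "schema_b.xsd", None),
    ("alice", "schema_a.xsd", None),
]

meta = usage_metadata(events)
# Rank resources by how many different people actually use them.
ranked = sorted(meta, key=lambda r: meta[r]["distinct_users"], reverse=True)
print(ranked)
```

Nothing in this says what any element "means"; it only records who used
what, and how highly they ranked it.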
Obviously this is a form of statistical analysis, but I figured it was
relevant and differentiated enough to be worth mentioning in this