RE: Trusting the Semantic Web: Facts and Points of View
- From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
- To: Jay Zhang <jayzhangsj@hotmail.com>
- Date: Thu, 17 May 2001 11:02:46 -0500
Hi Jay:
The approach is good. The challenges are:
o having a trusted RDF system that represents
relationships among notions necessary for
understanding the facts (the so called
concept map)
o using news archives or other sources that
can be falsified or may contain erroneous
information
o determining credibility.
Again, we are back to quality of source issues; can it
be gamed? We deal with this a lot in
public safety systems. The change audit
controls are fundamental. Otherwise,
the local biker gang gets one of their
girlfriends on the IT staff and she
does the dirty deed. It happens.
Who says the SW has to do a better job
than human beings:
1. Operational issues. What transactions
are automatically committed by results returned?
2. Stability issues: How far and how fast does a commitment
broadcast (it's an amplifier)? Or in other
words, what is the effective range? Can propagation
destabilize the reasoning by introducing false
or superstitious facts?
We can do the logic but again and again
we come back to authority, legitimacy,
and quality. The high dollar sources
for SW knowledge bases will use standard
published procedures to create these (vetted) and
they will protect the contents. That raises
some other bugaboos of who owns which information
and who protects them. You own the public
safety data; you pay to maintain it; you
have limited access and you cannot update
it without going through the courts. Even
then, the expungement and purging rules vary among
states, districts and courts. Authority,
legitimacy, quality of source: all before
the first if selects the first else. For example,
a very large portion of your police records
move through the local, state and federal food
chains to reside with the Feds (FBI). How good
are they at getting you the records you need
when you need them? (Roll on McVeigh). We
could assume XML might improve that situation
but they don't use it for NIBRS.
***THEY SHOULD.*** But they should also pull
instead of push that kind of data. Maybe someday.
Goebbels is a known evil and a dead one.
He took a lot of folks with him. For more
mundane sources, Michael Jackson will do:
"The lie becomes the truth. Billie Jean's
not my lover..." That is the broadcast
problem. Using the semantic web to check
a reference will encounter all of the known
problems of open text indexing and searching.
So the only recourse is the authoritative history
that all the suspects have to agree on. Otherwise,
the answer should come back:
"It can't be determined, but these are the
published opinions..." and then it becomes a
footnote.
The term "XML" doesn't show up until the
SGML On The Web group is well underway.
That ain't 1994, so the assertion based
on the term would return false. The problem
is figuring out exactly what the assertion
is, perhaps using NLP techniques, but it
would be easier to ask a human being first.
Who sez?
Len
http://www.mp3.com/LenBullard
Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h
-----Original Message-----
From: Jay Zhang [mailto:jayzhangsj@hotmail.com]
For this particular example, the semantic web is probably a
good solution.
Suppose we have a trusted RDF system that represents the
relationships among notions necessary for understanding
these facts related to the origin of XML: mark-up,
XML/SGML/HTML, newsgroup, mailing list, start,
subset/restriction, etc. Suppose also that my machine is
powerful enough to parse a large number of English sentences
into a simple structure like:
<assertion date="05-23-99" tone="firm">
<subject>XML</subject>
<verb>is a subset of</verb>
<object>SGML</object>
</assertion>
then I can run this system through all newsgroup archives
and Web pages to verify assertions such as "XML started
in May 1994 at CERN".
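As a rough illustration of the structure above, here is a minimal Python sketch that reads one such <assertion> element into a record. The Assertion class, the parse_assertion function and the MM-DD-YY reading of the date attribute are all assumptions made for this example; the message does not define a schema. The hard part, turning English sentences into these elements, is not attempted here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Assertion:
    """One parsed statement (hypothetical record type for this sketch)."""
    subject: str
    verb: str
    obj: str
    stated_on: date
    tone: str


def parse_assertion(xml_text: str) -> Assertion:
    """Parse a single <assertion> element shaped like the example above."""
    elem = ET.fromstring(xml_text)
    # Assumption: the date attribute is MM-DD-YY, as in date="05-23-99".
    stated_on = datetime.strptime(elem.attrib["date"], "%m-%d-%y").date()
    return Assertion(
        subject=elem.findtext("subject", default="").strip(),
        verb=elem.findtext("verb", default="").strip(),
        obj=elem.findtext("object", default="").strip(),
        stated_on=stated_on,
        tone=elem.attrib.get("tone", ""),
    )


SAMPLE = """<assertion date="05-23-99" tone="firm">
<subject>XML</subject>
<verb>is a subset of</verb>
<object>SGML</object>
</assertion>"""

parsed = parse_assertion(SAMPLE)
print(parsed.subject, parsed.verb, parsed.obj, parsed.stated_on.isoformat())
```

Once assertions are records like this, they can be grouped by subject and compared by date, which is what a check of a claim such as the CERN one would need.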
When contradictory statements are encountered, the system
should be able to determine the level of credibility. When
a CERN-affiliated individual (again easy to check the
affiliation on Web) is talking about CERN greatness, we
should assign a lower weight. A voting system weighted
with credibility should do.
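A credibility-weighted vote could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the Vote record, the weigh function, the 0-to-1 credibility scale, the margin threshold and the made-up sources and weights. When the leading claim does not win by a clear margin, the sketch declines to pick a winner and returns only the tallies, in the spirit of "it can't be determined, but these are the published opinions".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One source backing one claim (hypothetical record for this sketch)."""
    claim: str
    source: str
    credibility: float  # assumed scale: 0.0 (no trust) to 1.0 (full trust)


def weigh(votes, margin=0.2):
    """Return (winning claim or None, per-claim weight totals).

    The winner must lead the runner-up by at least `margin` of the total
    weight; otherwise the result is undetermined (None).
    """
    totals = defaultdict(float)
    for vote in votes:
        totals[vote.claim] += vote.credibility
    total = sum(totals.values())
    if total == 0:
        return None, dict(totals)
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    best_claim, best_weight = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if (best_weight - runner_up) / total < margin:
        return None, dict(totals)
    return best_claim, dict(totals)


# Made-up sources and weights: the interested party gets a lower weight.
clear_case = [
    Vote("claim A", "interested-party", 0.3),
    Vote("claim B", "archive-1", 0.9),
    Vote("claim B", "archive-2", 0.8),
]
tied_case = [
    Vote("claim A", "source-x", 0.5),
    Vote("claim B", "source-y", 0.5),
]

clear_winner, clear_totals = weigh(clear_case)
tied_winner, tied_totals = weigh(tied_case)
print(clear_winner, tied_winner)
```

The sketch says nothing about where the credibility numbers come from, which is the quality-of-source question raised in the reply.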
The infamous motto of Hitler's propaganda chief Goebbels
(not Goedel) is: repeating a lie 1000 times makes it
the truth. Who says that the semantic web has to make better
judgement than human beings?
Is it a good thesis project for someone to do a reference
checking system over the Web? I expect it to conclude that:
"The term XML was first used in May 1994 at CERN", but
"subsetting SGML was discussed (by someone not related to
CERN prior to 1994)".