RE: Trusting the Semantic Web: Facts and Points of View

From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
To: Jay Zhang <jayzhangsj@hotmail.com>
Date: Thu, 17 May 2001 11:02:46 -0500
Hi Jay:

The approach is good.  The challenges are:

o  having a trusted RDF system that represents 
relationships among notions necessary for 
understanding the facts (the so-called 
concept map)

o using news archives or other sources that 
can be falsified or may contain erroneous 
information

o determining credibility.  

Again, we are back to quality of source issues; can it 
be gamed?  We deal with this a lot in 
public safety systems.  The change audit 
controls are fundamental.  Otherwise, 
the local biker gang gets one of their 
girlfriends on the IT staff and she 
does the dirty deed.  It happens.

Who says the SW has to do a better job 
than human beings:

1.  Operational issues.  What transactions 
are automatically committed by results returned?

2.  Stability issues: How far and how fast does a commitment 
broadcast (it's an amplifier)?  Or in other 
words, what is the effective range?  Can propagation 
destabilize the reasoning by introducing false 
or superstitious facts?

We can do the logic but again and again 
we come back to authority, legitimacy, 
and quality.   The high dollar sources 
for SW knowledge bases will use standard 
published procedures to create these (vetted) and 
they will protect the contents.  That raises 
some other bugaboos of who owns which information 
and who protects them.  You own the public 
safety data; you pay to maintain it; you 
have limited access and you cannot update 
it without going through the courts.  Even 
then, the expungement and purging rules vary among 
states, districts and courts.  Authority, 
legitimacy, quality of source:  all before 
the first if selects the first else.  For example, 
a very large portion of your police records 
move through the local, state and federal food 
chains to reside with the Feds (FBI).  How good 
are they at getting you the records you need 
when you need them? (Roll on McVeigh).  We 
could assume XML might improve that situation 
but they don't use it for NIBRS.  

***THEY SHOULD.***  But they should also pull 
instead of push that kind of data.  Maybe someday.

Goebbels is a known evil and a dead one. 
He took a lot of folks with him.  For more 
mundane sources, Michael Jackson will do: 

"The lie becomes the truth.  Billie Jean's 
not my lover..."  That is the broadcast 
problem.  Using the semantic web to check 
a reference will encounter all of the known 
problems of open text indexing and searching. 
So the only recourse is the authoritative history 
that all the suspects have to agree on.  Otherwise, 
the answer should come back:

"It can't be determined, but these are the 
published opinions..." and then it becomes a
footnote.

The term "XML" doesn't show up until the 
SGML On The Web group is well underway. 
That ain't 1994, so the assertion based 
on the term would return false.  The problem 
is figuring out exactly what the assertion 
is using perhaps NLP techniques, but it 
would be easier to ask a human being first. 

Who sez?

Len 
http://www.mp3.com/LenBullard

Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h


-----Original Message-----
From: Jay Zhang [mailto:jayzhangsj@hotmail.com]

For the particular example, the semantic web is probably a
good solution.

If we have a trusted RDF system that represents the
relationships among notions necessary for understanding
these facts related to the origin of XML (mark-up,
XML/SGML/HTML, newsgroup, mailing list, start,
subset/restriction, etc.), and if my machine is powerful
enough to parse a large number of English sentences into
a simple structure like:

<assertion date="05-23-99" tone="firm">
<subject>XML</subject>
<verb>is a subset of</verb>
<object>SGML</object>
</assertion>

then I can run this system through all newsgroup archives
and Web pages to verify assertions such as "XML started
in May 1994 at CERN".
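
A minimal sketch of the easy half of this idea: reading one such
assertion record into a structure a program can compare. It assumes
the records already exist in exactly the form shown above; the hard
part, turning English sentences into those records, is not attempted.
The Assertion class and parse_assertion function are hypothetical
names made up for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One subject/verb/object statement plus its metadata (hypothetical type)."""
    subject: str
    verb: str
    obj: str
    date: str
    tone: str


def parse_assertion(xml_text: str) -> Assertion:
    """Parse a single <assertion> element in the layout shown in the message."""
    root = ET.fromstring(xml_text)
    if root.tag != "assertion":
        raise ValueError(f"expected <assertion>, got <{root.tag}>")

    def text(tag: str) -> str:
        # findtext returns None when the child is missing; treat that as empty.
        return (root.findtext(tag) or "").strip()

    return Assertion(
        subject=text("subject"),
        verb=text("verb"),
        obj=text("object"),
        date=root.get("date", ""),
        tone=root.get("tone", ""),
    )


SAMPLE = """<assertion date="05-23-99" tone="firm">
<subject>XML</subject>
<verb>is a subset of</verb>
<object>SGML</object>
</assertion>"""

parsed = parse_assertion(SAMPLE)
print(parsed)
```

Keeping the verb as free text, as the sample record does, means two
records only match when they use the same wording; a real system
would need a controlled vocabulary for the verb.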

When contradictory statements are encountered, the system
should be able to determine the level of credibility. When
a CERN-affiliated individual (again easy to check the
affiliation on Web) is talking about CERN greatness, we
should assign a lower weight. A voting system weighted
with credibility should do.
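
A rough sketch of what such a credibility-weighted vote could look
like. Everything specific here is an assumption made for
illustration: the weighted_verdict function, the source labels, the
weights, and the margin parameter. The margin is one possible way to
let the system decline to pick a winner when the weighted tallies are
close, and hand back the competing opinions instead.

```python
from collections import defaultdict


def weighted_verdict(votes, margin=0.2):
    """Tally (claim, source, weight) votes and pick a winner if it is clear.

    Returns (winning_claim, totals). winning_claim is None when the lead of
    the top claim over the runner-up, as a share of all weight cast, is
    smaller than `margin` (an assumed, tunable threshold).
    """
    totals = defaultdict(float)
    for claim, _source, weight in votes:
        totals[claim] += weight

    grand_total = sum(totals.values())
    if grand_total <= 0:
        return None, dict(totals)

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    best_claim, best_weight = ranked[0]
    runner_up_weight = ranked[1][1] if len(ranked) > 1 else 0.0

    if (best_weight - runner_up_weight) / grand_total < margin:
        return None, dict(totals)
    return best_claim, dict(totals)


# Hypothetical sources and weights; the interested party is weighted lower.
votes = [
    ("XML started in May 1994 at CERN", "CERN-affiliated poster", 0.3),
    ("the term XML appeared after 1994", "mailing list archive", 1.0),
    ("the term XML appeared after 1994", "working group member", 1.0),
]
verdict, totals = weighted_verdict(votes)
print(verdict, totals)

# Two nearly equal camps: no winner is declared.
close_call = [("claim A", "source 1", 1.0), ("claim B", "source 2", 0.9)]
undecided, opinions = weighted_verdict(close_call)
print(undecided, opinions)
```

The scheme is only as good as the weights, and nothing in it says
who assigns them or how they are protected from tampering.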

The infamous motto of Hitler's propaganda chief Goebbels
(not Goedel) is: repeating a lie 1000 times makes it
truth. Who says that the semantic web has to make better
judgement than human beings?

Is it a good thesis project for someone to do a reference
checking system over the Web? I expect it to conclude that:
"The term XML was first used in May 1994 at CERN", but
"subsetting SGML was discussed (by someone not related to
CERN) prior to 1994".