RE: First Order Logic and Semantic Web RE: NPR, Godel, Semantic Web
- From: "Bullard, Claude L (Len)" <firstname.lastname@example.org>
- To: Joel Rees <email@example.com>, Jeff Lowery <firstname.lastname@example.org>
- Date: Wed, 16 May 2001 11:09:20 -0500
Sort of. We call it XML. Namespaces make it
a little goofy but that is because they still
can't figure out why the lawyers are laughing
at them. The lawyers know what the sw engineers
want to avoid because, sort of like the X-Files,
they understand the concept of higher authority
better than the geeks at MIT.
The aspect of XML so innate that we tend to
overlook it after a while (fades into ye
olde gestalt background) is that we use
markup to *precisely* annotate text. Otherwise,
HTML would do the job. This goes back to
the Quality of Service and Quality of Source
issues. Someone using a standard vocabulary
to mark up text has a rather good chance of
increasing the precision of indexing engines.
When you look at all the nine yards of
statistical junk (proximity, frequency,
co-occurrence, etc.) that an automated indexing
engine needs to classify a document, then
compare that to the single datum of knowing
its DOCTYPE, you see that things improve a
lot with regard to assertion checking about
the content. For a system that answers questions,
the same quality properties are there.
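
To make the "single datum" concrete, here is a minimal sketch in Python,
assuming the standard library's xml.dom.minidom. The memo document and
the name declared_type are made up for illustration. It only reads the
declaration; it does not validate against the DTD.

```python
from xml.dom import minidom

# Hypothetical sample document; "memo" and "memo.dtd" are illustrative only.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE memo SYSTEM "memo.dtd">\n'
    '<memo><to>Joel</to></memo>'
)

def declared_type(xml_text):
    """Return (name, public id, system id) from the DOCTYPE, or None if absent."""
    doctype = minidom.parseString(xml_text).doctype
    if doctype is None:
        # Well-formed only: the document makes no claim about its vocabulary.
        return None
    return (doctype.name, doctype.publicId, doctype.systemId)

if __name__ == "__main__":
    print(declared_type(SAMPLE))
    print(declared_type("<memo/>"))
```

One lookup replaces the guesswork, provided the author declared the
type honestly.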
When it was suggested early in the XML rhubarb
that DTDs would go away (well-formed only),
I laughed. It removes the biggest advantage
of SGML: standard vocabularies for focused
domains, the easy means to annotate a text with inline
metainformation for interpretation. Now people
are defending DTDs against the next new thing
and so it goes, but the principle remains: once
you get beyond a simple message, well-formedness
isn't enough. You need the metadata to get around
the outrageous and inefficient noise reduction
techniques of open text searching.
IOW, a well-marked document source is a primary
key to the use of the source, particularly with
regard to interpretation. As in the example
I pointed out earlier, it is a heckuva lot
better to know something is marked as a point-of-view
vs a fact. The URIness of it might be used to
tell you who did that. You may have a history
with terms originating from that URI and over
time, you may develop trust or distrust of the
source. This system can still be 'gamed' but
it is hard to sustain. There will be questions
it can't answer because the facts don't close
the query. Rumors depend on anonymity.
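
A rough sketch of the fact vs point-of-view idea, assuming Python's
xml.etree.ElementTree. The element names, the example.org namespace
URIs, and the TRUST table are all hypothetical. The table stands in
for whatever history you have built up with a source.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two sources, each identified by its namespace URI.
SAMPLE = """<report xmlns:a="http://example.org/agency-a"
        xmlns:b="http://example.org/agency-b">
  <a:fact>The valve was replaced on 12 May.</a:fact>
  <b:point-of-view>The valve was never the problem.</b:point-of-view>
</report>"""

# Hypothetical record of past experience with each source URI.
TRUST = {"http://example.org/agency-a": "trusted"}

def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag

def annotated_claims(xml_text, trust):
    """List (kind, source URI, trust level, text) for each marked claim."""
    claims = []
    for elem in ET.fromstring(xml_text).iter():
        uri, local = split_tag(elem.tag)
        if local in ("fact", "point-of-view"):
            claims.append((local, uri, trust.get(uri, "unknown"), elem.text))
    return claims

if __name__ == "__main__":
    for claim in annotated_claims(SAMPLE, TRUST):
        print(claim)
```

A source with no history comes back as "unknown", which is where an
anonymous rumor ends up.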
Mission critical operations aren't committed to
rumor-filled transactions. So again, we are
back to operational solutions, choosing sources
well, rules to disregard non-closing queries
("no" and "I don't know" are perfectly good answers).
No magic but experience.
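
For illustration only, a tiny sketch of a query that is allowed not to
close. The fact sets and the answer function are hypothetical, not
anything the SW designs specify.

```python
# Hypothetical fact base: claims asserted true and claims asserted false.
KNOWN_TRUE = {("valve-7", "replaced")}
KNOWN_FALSE = {("valve-7", "leaking")}

def answer(subject, predicate):
    """Answer yes or no only when the facts close the query."""
    claim = (subject, predicate)
    if claim in KNOWN_TRUE:
        return "yes"
    if claim in KNOWN_FALSE:
        return "no"
    # Nothing asserted either way: do not guess.
    return "I don't know"

if __name__ == "__main__":
    print(answer("valve-7", "replaced"))
    print(answer("valve-7", "leaking"))
    print(answer("valve-9", "replaced"))
```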
There were lots and lots of genCoded languages
before HTML, some much better done. It thrived
on free software, colonization, and the naivete
of the users. That is a historical occurrence
like the Beatles, sweet, cute, right place at
the right time and unlikely to happen again.
XML is a distillation of all the work done
in markup to date. It also won't be reproduced.
It still requires skill to apply well. The
semantic web designs partake of all the AI
work done since the fifties and all of the
work in bibliographic systems since the middle
ages. We have the experience. We don't have
practice at this scale and for that reason if
no other, I suggest that local domains based
on common vocabularies will initially do the
heavy lifting. Standard vocabularies, concept
maps (eg, topic maps) etc. improve the situation
immeasurably because the system can know whether
the term "instrument" in a query is asking
about financial institutions or music stores.
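
As a toy sketch of that lookup, assuming each local domain publishes
its own vocabulary: the domains, the definitions, and the resolve
function below are made up for illustration.

```python
# Hypothetical per-domain vocabularies; a real one would be a published standard.
VOCABULARIES = {
    "finance": {"instrument": "a tradable financial contract"},
    "music": {"instrument": "a device played to make music"},
}

def resolve(term, domain):
    """Look a term up in one domain's vocabulary; None if it is not defined there."""
    return VOCABULARIES.get(domain, {}).get(term.lower())

if __name__ == "__main__":
    print(resolve("instrument", "finance"))
    print(resolve("instrument", "music"))
    print(resolve("instrument", "aviation"))
```

The same term resolves differently per domain, and a domain with no
vocabulary gives no answer rather than a wrong one.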
When fly-by-wire guidance systems were first
introduced (an expert system for airliners)
they scared the designers witless. In fact,
some of them did fly jets into the tarmac
and there were horrendous accidents (chaos
outs complexity and real time systems courses
don't treat chaos theory lightly). However,
every time you get on an Airbus and cross the
Atlantic, a bot is at the controls with a
human pilot manager. So, with experience,
it can be done.
Just don't fly on the first one.
Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h
From: Joel Rees [mailto:email@example.com]
> The web is an amplifier. Deal with it accordingly.
Brings up another question. Has the SW team produced any concrete means of
dealing with the authority issues?