- From: Joshua Allen <joshuaa@microsoft.com>
- To: 'Dave Winer' <dave@userland.com>,"Bullard, Claude L (Len)" <clbullar@ingr.com>,Rick JELLIFFE <ricko@geotempo.com>
- Date: Mon, 23 Oct 2000 11:53:42 -0700
Speaking of inference rules, many of you have seen Princeton's WordNet
(http://www.cogsci.princeton.edu/~wn/). Its relationship types
(synonym/antonym, hyponym/hypernym, meronym/holonym) seem pretty useful for
semantic tagging.
WordNet and MSR's MindNet are both pretty massive databases. I have written
a bunch of code that screen-scrapes dictionary.com and parses out the
definitions of every word (and pointers to synonyms, etc.) to form such a
database. I used this to create a lemmatizer, but synonym information was
also easy to extract. I bring this up simply to point out that I thrashed
around quite a bit to produce a lexicon that was rather inferior to
WordNet, and there are some things that could be learned from WordNet:
* a set of inference rules taken from classic object-oriented programming is
not necessarily going to give good results. The inference rules should be
bent to match the way we understand things, not bent to make it easy for
a computer to process. WordNet has "sense of", which is key to making the
semantics actually useful (at least I think so).
* lots of human scrubbing
* straight hierarchical (as in node-labeled graphs) relationships can give
strange outcomes (at least in the lexicon I built!). WordNet is more like
an edge-labeled graph with weights (I think they based weights on frequency
of occurrence with some human scrubbing).
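
To make the last point concrete, here is a minimal sketch of an edge-labeled,
weighted graph keyed by word sense. It is not WordNet's actual data model or
API; the names Sense, Lexicon, add and related are hypothetical, and the
sample words and weights are invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One meaning of a word, e.g. bank/1 (money) vs. bank/2 (river)."""
    word: str
    number: int


class Lexicon:
    """Edge-labeled, weighted graph whose nodes are word senses."""

    def __init__(self):
        # source sense -> list of (label, target sense, weight)
        self._edges = defaultdict(list)

    def add(self, source, label, target, weight):
        self._edges[source].append((label, target, weight))

    def related(self, source, label):
        """Return (target, weight) pairs for one edge label, strongest first."""
        hits = [(t, w) for (l, t, w) in self._edges.get(source, []) if l == label]
        return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Illustrative data only; the weights are invented for this sketch.
bank_money = Sense("bank", 1)
bank_river = Sense("bank", 2)
lexicon = Lexicon()
lexicon.add(bank_money, "hypernym", Sense("institution", 1), 0.9)
lexicon.add(bank_money, "synonym", Sense("depository", 1), 0.4)
lexicon.add(bank_river, "hypernym", Sense("slope", 1), 0.7)

# The two senses of "bank" lead to different parents.
print(lexicon.related(bank_money, "hypernym"))
print(lexicon.related(bank_river, "hypernym"))
```

Because the relationship type lives on the edge and the node is a sense
rather than a bare word, "bank" does not have to sit in exactly one place in
a tree.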
Of course, I know that in the OPML/UserLand sense of outlines, outlines can
link into one another, so outlines don't always imply a tree structure. In
fact, I am not even close to being an expert on the topic of semantic
relationships. I also concur that building a lexicon for lemmatization is
not the same as defining semantic rules for a limited test domain. Just
from my own limited experience, if I were building inference rules in the
future, I would try to borrow WordNet's ideas of "sense of" and some fuzzy
weightings at a bare minimum.
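
As a rough illustration of what "fuzzy weightings" could mean for an
inference rule, the sketch below chains IS-A edges between senses and
multiplies the edge weights, dropping conclusions whose combined confidence
falls below a cutoff. This is one possible scheme, not something WordNet
prescribes; the function name infer_is_a, the "word#n" sense labels, the
weights, and the 0.5 cutoff are all assumptions made up for the example, and
weights are assumed to lie in (0, 1].

```python
def infer_is_a(edges, start, min_confidence=0.5):
    """Follow IS-A edges transitively from `start`.

    edges maps a sense to a list of (parent sense, weight) pairs.
    Confidence in an inferred parent is the product of the weights
    along the path; the best path wins, weak paths are discarded.
    """
    best = {}
    stack = [(start, 1.0)]
    while stack:
        node, confidence = stack.pop()
        for parent, weight in edges.get(node, []):
            combined = confidence * weight
            if combined >= min_confidence and combined > best.get(parent, 0.0):
                best[parent] = combined
                stack.append((parent, combined))
    return best


# Invented sample data: each key is a word sense, not a bare word.
edges = {
    "dog#1": [("canine#1", 0.9)],
    "canine#1": [("animal#1", 0.8)],
    "animal#1": [("organism#1", 0.6)],
}

result = infer_is_a(edges, "dog#1")
for sense, confidence in sorted(result.items()):
    print(sense, round(confidence, 2))
```

With this data the chain from dog#1 reaches canine#1 and animal#1, while
organism#1 is dropped because 0.9 * 0.8 * 0.6 is below the 0.5 cutoff.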
> -----Original Message-----
> From: Dave Winer [mailto:dave@userland.com]
> Sent: Monday, October 23, 2000 6:19 AM
> To: Bullard, Claude L (Len); Rick JELLIFFE
> Cc: xml-dev@lists.xml.org
> Subject: Re: Soft Landing
>
>
> Wow, this has already been worth it. I suspected that a Semantic network
> could be modeled as a hierarchy. I have been doing hierarchy editors (also
> known as outliners) for a long long time. This is one of the reasons I'm
> interested in getting a model up and running. I can think of it as a
> thesaurus, one of my favorite writing tools, but what are the inference
> rules about? How would we use inference rules on a semantic web of members
> of the XML-DEV list? Dave
>
>
> ----- Original Message -----
> From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
> To: "Dave Winer" <dave@userland.com>; "Rick JELLIFFE" <ricko@geotempo.com>
> Cc: <xml-dev@lists.xml.org>
> Sent: Monday, October 23, 2000 6:22 AM
> Subject: RE: Soft Landing
>
>
> > Semantic network: IS-A hierarchy. Think of it as a thesaurus
> > with inference rules.
> >
> > Are you really interested in a classification system like
> > that for list members? Modus Operandi?
> >
> > Len
> > http://www.mp3.com/LenBullard
> >
> > Ekam sat.h, Vipraah bahudhaa vadanti.
> > Daamyata. Datta. Dayadhvam.h
> >
> >
> > -----Original Message-----
> > From: Dave Winer [mailto:dave@userland.com]
> > Sent: Friday, October 20, 2000 4:52 PM
> > To: Rick JELLIFFE
> > Cc: xml-dev@lists.xml.org
> > Subject: Soft Landing
> >
> >
> > I wonder if anyone is interested in trying to set up a mini-Semantic Web
> > of content describing the people on this list, what their interests are,
> > what software they use, who else they know, etc. My poor little mind
> > needs to try something pragmatic to figure out what all this stuff
> > means. Dave
>