xml-dev - Re: [xml-dev] Something altogether different?

To: "Murali Mani" <mmani@cs.wpi.edu>
Subject: Re: [xml-dev] Something altogether different?
From: "Ken North" <kennorth@sbcglobal.net>
Date: Mon, 25 Apr 2005 21:54:12 -0700
Cc: "'XML Developers List'" <xml-dev@lists.xml.org>
References: <15725CF6AFE2F34DB8A5B4770B7334EE07206E84@hq1.pcmail.ingr.com> <a06020400be92acb49acf@[192.168.1.101]> <001a01c549d6$09c6ce20$1601a8c0@DURANTE> <Pine.LNX.4.58.0504251642300.3409@cs.wpi.edu>

> I believe we can use vector-space model only when the document collection
> is "homogeneous" in some manner.. and has repetitive words etc.
>
> Also note -- vector space model, you have to obtain rank of documents in
> real-time given a query.

Cohen's '99 WHIRL paper discusses the ranking heuristics, the storing of
similarities instead of computing them in real-time, and the use of views to
persist information about the highest-scoring answers:

"Fortunately, in most cases, it is not necessary to compute all answers to a
query, as only the high-scoring answers will be of interest. WHIRL's inference
algorithms are thus designed to find a few good answers to a query, without
generating all possible answers. The operations most commonly performed by a
user (or program) interacting with WHIRL are to define and r-materialize views.
To r-materialize a view, WHIRL finds the "r" highest-scoring ground atoms "a"
associated with a view, and stores those facts in the EDB (extensional database)
for later use."
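
To make the "keep only the r best answers" idea concrete, here is a rough
sketch in Python. It is not WHIRL's inference algorithm (this version scores
every pair by brute force, which is exactly what WHIRL is designed to avoid);
it only shows the end result of materializing the r highest-scoring matches so
they can be stored and reused. The names r_materialize and tfidf_vectors, and
the particular TF-IDF weighting, are my own assumptions for illustration.

```python
import heapq
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build unit-length TF-IDF vectors (as dicts) for a list of token lists.

    The weighting used here is one common variant, chosen for illustration.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (1 + math.log(c)) * math.log(1 + n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def r_materialize(left, right, r):
    """Return the r highest-scoring (score, i, j) pairs between two text lists.

    Hypothetical helper: scores all pairs, then keeps only the top r.
    """
    vecs = tfidf_vectors(left + right)
    lv, rv = vecs[:len(left)], vecs[len(left):]
    scored = (
        (cosine(a, b), i, j)
        for i, a in enumerate(lv)
        for j, b in enumerate(rv)
    )
    return heapq.nlargest(r, scored)


left = [["acme", "corp"], ["globex", "inc"]]
right = [["acme", "corporation"], ["globex", "incorporated"], ["initech"]]
top = r_materialize(left, right, 2)  # the stored "view" of the 2 best matches
```

The list returned in top plays the role of the stored facts: later queries can
read it instead of recomputing similarities.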

> For other metrics such as say pagerank, rank of documents can be
> pre-computed, and we can use better algorithms based on this property.

In the "Recommending Music by Crawling The Web" paper, Cohen and Fan researched
music preferences by spidering the web and using four different scoring
algorithms: popularity, K-nearest neighbor, weighted majority and an extended
direct Bayesian prediction.

In a 1998 paper, Cohen, Schapire and Singer discussed the use of a preference
function when determining ranking (excerpt below):
http://citeseer.ist.psu.edu/cache/papers/cs/17244/http:zSzzSzdnkweb.denken.or.jpzSzboostingzSzpaperszSzCohSchSin98.pdf/cohen98learning.pdf

Learning to Order Things
There are many applications in which it is desirable to order rather than
classify instances. Here we consider the problem of learning how to order,
given feedback in the form of preference judgments, i.e., statements to the
effect that one instance should be ranked ahead of another. We outline a
two-stage approach in which one first learns by conventional means a
preference function, of the form PREF ... Nevertheless, we describe a simple
greedy algorithm that is guaranteed to find a good approximation. We then
discuss an on-line learning algorithm, based on the "Hedge" algorithm, for
finding a good linear combination of ranking "experts."
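
For anyone who wants to see the shape of those two pieces, here is a small
Python sketch written from my reading of the abstract and my recollection of
the approach, not taken from the paper's code. It assumes pref(u, v) returns a
number in [0, 1], with values near 1 meaning u should be ranked ahead of v.
The greedy step repeatedly picks the item whose net preference over the
remaining items is largest. The hedge_update function shows a multiplicative
weight update in the style of Hedge; the function names and the default beta
are assumptions for illustration.

```python
def greedy_order(items, pref):
    """Order distinct, hashable items greedily using a pairwise preference.

    pref(u, v) is assumed to return a value in [0, 1]; values near 1 mean
    u should be ranked ahead of v.
    """
    remaining = list(items)
    # Net preference of v over every other remaining item.
    potential = {
        v: sum(pref(v, u) - pref(u, v) for u in remaining if u != v)
        for v in remaining
    }
    ordering = []
    while remaining:
        best = max(remaining, key=potential.__getitem__)
        ordering.append(best)
        remaining.remove(best)
        # Drop the contribution of the item that was just ranked.
        for v in remaining:
            potential[v] += pref(best, v) - pref(v, best)
    return ordering


def hedge_update(weights, losses, beta=0.5):
    """One Hedge-style round: scale each expert's weight by beta ** loss.

    Losses are assumed to lie in [0, 1]; the weights are renormalized to sum to 1.
    """
    updated = [w * beta ** loss for w, loss in zip(weights, losses)]
    total = sum(updated)
    return [w / total for w in updated]


# Toy preference: larger numbers should be ranked first.
ranking = greedy_order([3, 1, 2], lambda u, v: 1.0 if u > v else 0.0)

# Two ranking "experts"; the second one incurred the full loss this round.
new_weights = hedge_update([0.5, 0.5], [0.0, 1.0])
```

With a consistent preference like the toy one above, the greedy pass simply
recovers the sorted order; the interesting case is when the learned
preferences conflict and only an approximation is possible.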
