- From: "Borden, Jonathan" <email@example.com>
- To: <firstname.lastname@example.org>
- Date: Thu, 5 Nov 1998 14:23:35 -0500
Let me rephrase that: word/character proximity searching has been done for
decades and its utility is well known. The last time I addressed this in
detail was when I spent some time on the Hearsay project, an early
speech recognition system, during the early 1980s. The problem of German or
oriental words/phonemes/sentences etc. is fairly similar (perhaps identical)
to the problem of English-language speakers who slur their words together.
Speech processing programs have made great strides recently, yet this has been
a difficult nut to crack.
There are many people who believe that further refinements of these
well-known techniques are unlikely to yield dramatic improvements. Instead there
are avenues of attack which operate at higher levels of the information food
chain, namely at the word-phrase, syntactic and semantic levels. These
levels are well represented as grove structures, and XML/SGML search
techniques will likely yield significant results. Natural language
processing algorithms naturally express their output in groves, and
intelligent search is at this crossroads.
For example, suppose I am searching for big apples:
"This is a little green apple. Big deal."
Will "Big near apple" match?
How about "Big applied to apple"?
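
To make the contrast concrete, here is a minimal sketch in Python. It is
only an illustration: the markup, the element names (text, s, np, det, adj,
n, v) and the two helper functions near() and applied_to() are all
assumptions of mine, standing in for whatever a real parser and a real
grove-aware query engine would provide. The first query looks only at word
distance; the second asks whether the adjective and the noun sit in the
same noun phrase.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup for the sample text, as a parser might emit it.
# The element names are assumptions made for this sketch only.
MARKED_UP = """
<text>
  <s>This <v>is</v> <np><det>a</det> <adj>little</adj> <adj>green</adj> <n>apple</n></np>.</s>
  <s><np><adj>Big</adj> <n>deal</n></np>.</s>
</text>
"""

ROOT = ET.fromstring(MARKED_UP.strip())


def near(root, a, b, window=3):
    """Flat proximity: True if a and b occur within `window` words of each other.

    All structure (sentences, phrases) is ignored; only word order counts.
    """
    words = re.findall(r"\w+", "".join(root.itertext()).lower())
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def applied_to(root, modifier, head):
    """Structural match: True if `modifier` is an adj and `head` an n in the same np."""
    for np in root.iter("np"):
        adjs = {(e.text or "").lower() for e in np.findall("adj")}
        nouns = {(e.text or "").lower() for e in np.findall("n")}
        if modifier.lower() in adjs and head.lower() in nouns:
            return True
    return False


if __name__ == "__main__":
    print("Big near apple:         ", near(ROOT, "big", "apple"))
    print("Big applied to apple:   ", applied_to(ROOT, "big", "apple"))
    print("green applied to apple: ", applied_to(ROOT, "green", "apple"))
```

On this sample the flat proximity query matches, because "Big" directly
follows "apple" across the sentence boundary, while the structural query
does not, because "Big" modifies "deal". That is the kind of false hit I
have in mind.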
> Borden, Jonathan wrote:
> > As you say Word/Character proximity searching is not that interesting, and
> > if this is desired, XML doesn't have much to add to the current equation
> I beg to disagree twice. a) proximity search is very important for anyone
> searching any reasonably-sized database with a variety of texts; b) XML can
> help a lot, even though most non-XML capable search engines can already
> offer proximity
> We have been able to solve quite a number of problems using proximity. If we
> did not have it we could still be able to solve those problems, albeit
> spending much more effort, time, intelligence and CPU cycles.
> - fernando
xml-dev: A list for W3C XML Developers. To post, mailto:email@example.com
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/