   RE: XML Search Engine

  • From: "Borden, Jonathan" <jborden@mediaone.net>
  • To: <xml-dev@ic.ac.uk>
  • Date: Thu, 5 Nov 1998 14:23:35 -0500

Let me rephrase that: word/character proximity searching has been done for
decades and its utility is well known. The last time I addressed this in
detail was when I spent some time on the Hearsay project, an early speech
recognition system, in the early 1980s. The problem of German or Oriental
words/phonemes/sentences etc. is fairly similar (perhaps identical) to the
problem of English-language speakers who slur their words together. Speech
processing programs have made great strides recently, yet this remains a
difficult nut to crack.

Many people believe that further refinements of these well-known techniques
are unlikely to yield dramatic improvements. Instead, there are avenues of
attack that operate at higher levels of the information food chain, namely
the word-phrase, syntactic, and semantic levels. These levels are well
represented as grove structures, so XML/SGML search techniques will likely
yield significant results here. Natural language processing algorithms
naturally express their output as groves, and intelligent search is at this
crossroads.

For example, suppose I am searching for big apples:

"This is a little green apple. Big deal."

Will "Big near apple" match?
How about "Big applied to apple"?
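
To make the question concrete, here is a minimal sketch in Python. It is not taken from any particular search engine: the function names `near` and `near_within_sentence`, the three-word window, and the use of a crude sentence split as a stand-in for real markup or grove structure are all assumptions made for illustration. It contrasts a flat word-proximity test with one that respects sentence boundaries. It does not attempt "Big applied to apple", which would need syntactic analysis.

```python
import re

TEXT = "This is a little green apple. Big deal."


def tokens(text):
    """Lowercased alphabetic words, ignoring punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def near(text, a, b, window=3):
    """Flat proximity: a and b occur within `window` words of each other."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def near_within_sentence(text, a, b, window=3):
    """Same test, but both words must fall inside one sentence.

    The split on ., ! and ? is a hypothetical stand-in for structure
    that markup (or a parser) would supply explicitly.
    """
    sentences = re.split(r"[.!?]+", text)
    return any(near(s, a, b, window) for s in sentences)


flat_hit = near(TEXT, "big", "apple")
structured_hit = near_within_sentence(TEXT, "big", "apple")
print(flat_hit, structured_hit)
```

The flat test reports a match because "apple" and "Big" are adjacent once punctuation is dropped; the sentence-aware test does not, because the two words sit in different sentences.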

Jonathan Borden

> Borden, Jonathan wrote:
> > As you say, word/character proximity searching is not that interesting,
> > and if this is desired, XML doesn't have much to add to the current
> > equation.
> I beg to disagree twice. a) Proximity search is very important for anyone
> searching any reasonably sized database with a variety of texts; b) XML can
> help a lot, even though most non-XML-capable search engines can already
> offer proximity searching.
> We have been able to solve quite a number of problems using proximity. If
> we did not have it we could still solve those problems, albeit spending
> much more effort, time, intelligence and CPU cycles.
> - fernando

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

