- From: Fernando Cabral <fernando@pix.com.br>
- To: xml-dev@ic.ac.uk
- Date: Thu, 05 Nov 1998 18:13:30 +0200
Borden, Jonathan wrote:
> For example, suppose I am searching for big apples:
>
> "This is a little green apple. Big deal."
>
> will "Big near apple" match?
> how about "Big applied to apple"
This will not be a problem with any "decent" text retrieval engine because:
a) Proximity search can be performed either "ordered" or "non-ordered". This is
quite powerful because it allows you to search for "big near potato" in the
sentence
"This is a small potato, big brother"
and find either both "potato, big" and "big, potato", or only one
of the two.
Some search engines, like Stairs (the grandfather of all text-retrieval
engines)
and BRS, have two operators, such as "near" (or "prox") and "ADJacent", the
first one being unordered, the second one being ordered.
b) Search engines usually know what sentences and paragraphs are. I don't think
proximity should go beyond a period or any other punctuation that ends
a sentence. If you want to search in larger units, like a paragraph, then
you could always define something like "apple SAME PARAGRAPH big"
or "apple SAME SENTENCE big", both of which extend the idea
of "nearness", providing a more logical view of the terms.
c) Finally, growing from the very close vicinity (near/adjacent) to a little
further (same sentence/same paragraph), you can go to the whole
"universe" with AND, OR, XOR, etc. What this means is that
you have very good control not only over which words you
want, but also over where they occur, how far apart they can be, which one
comes first...
d) XML lets you use all of the above operators and adds a very
useful feature: tag qualification, that is, restricting a search to
the content of a given element.
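
To make points a), b) and d) concrete, here is a minimal Python sketch of an
unordered/ordered proximity test that never crosses a sentence boundary, plus
a tag-qualified variant. It is only an illustration under stated assumptions:
the function names (sentences, words, proximity, proximity_in_tag), the
max_gap parameter and the naive sentence splitter are all hypothetical, and
this is not how Stairs, BRS or any particular engine implements its operators.

```python
import re
import xml.etree.ElementTree as ET


def sentences(text):
    """Split text at '.', '!' or '?' (a deliberately naive rule)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def words(sentence):
    """Lower-cased word tokens with punctuation dropped."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def proximity(text, first, second, max_gap=3, ordered=False):
    """True if both terms occur within max_gap word positions of each
    other inside a single sentence.

    ordered=False: either term may come first (an unordered "near").
    ordered=True:  `first` must precede `second`.
    """
    first, second = first.lower(), second.lower()
    for sentence in sentences(text):
        toks = words(sentence)
        pos_first = [i for i, w in enumerate(toks) if w == first]
        pos_second = [i for i, w in enumerate(toks) if w == second]
        for a in pos_first:
            for b in pos_second:
                dist = b - a
                if ordered and 0 < dist <= max_gap:
                    return True
                if not ordered and 0 < abs(dist) <= max_gap:
                    return True
    return False


def proximity_in_tag(xml_text, tag, first, second, **kwargs):
    """Tag-qualified search: apply proximity() only to the text content
    of elements named `tag` (hypothetical helper for illustration)."""
    root = ET.fromstring(xml_text)
    return any(
        proximity("".join(el.itertext()), first, second, **kwargs)
        for el in root.iter(tag)
    )
```

With this sketch, proximity("This is a little green apple. Big deal.", "big",
"apple") is false because the two words sit in different sentences, while
"This is a small potato, big brother" matches "big near potato" unordered but
not when "big" is required to come first.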
- fernando
--
Fernando Cabral Padrao iX Sistemas Abertos
mailto:fernando@pix.com.br http://www.pix.com.br
mailto:Pix@Pix.com.br
Fone: +55 61 321-2433 Fax: +55 61 225-3082
15º 45' 04.9" S 47º 49' 58.6" W
19º 37' 57.0" S 45º 17' 13.6" W
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)