   Re: xml search engine?

  • From: Dongwook Shin <dwshin@nlm.nih.gov>
  • To: xml-dev@xml.org
  • Date: Mon, 03 Apr 2000 15:04:35 -0400


Stemming is known to improve retrieval quality to some degree
in terms of precision/recall measurement, chiefly recall, since
inflected forms of a word are matched to the same root. At the
same time, it helps reduce the index size.
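
To make the index-size point concrete, here is a minimal sketch. The
suffix list, the naive_stem and index_terms helpers, and the three
sample documents are all made up for illustration; this is a toy,
not a real stemming algorithm such as Porter's.

```python
def naive_stem(word):
    """Strip one common English suffix (toy rule, assumed for illustration)."""
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(docs, stem=None):
    """Collect the distinct terms an index would store for these documents."""
    terms = set()
    for doc in docs:
        for token in doc.lower().split():
            terms.add(stem(token) if stem else token)
    return terms


docs = [
    "searching indexed documents",
    "search index document",
    "searches indexes",
]

raw = index_terms(docs)
stemmed = index_terms(docs, naive_stem)

print(len(raw))      # 8 distinct surface forms
print(len(stemmed))  # 3 distinct stems
```

Every inflected form collapses to one of three stems, so the term
dictionary shrinks, and a query for "search" also reaches documents
that only contain "searching" or "searches".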

But it depends entirely on what kind of stemming algorithm
you use. For instance, the National Library of Medicine does not
rely on a simple stemming algorithm, but usually develops more
accurate ones. A simple stemmer does not work for complicated
medical terms.
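
As a rough illustration of why a simple stemmer falls short here,
the sketch below runs a toy suffix stripper over a few singular and
plural pairs. The suffix rules and the word pairs are my own
assumptions for the example; this is not NLM's algorithm, only a
stand-in for "a simple stemmer".

```python
def naive_stem(word):
    """Strip one common English suffix (toy rule, assumed for illustration)."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Singular/plural pairs that a searcher would expect to match each other.
pairs = [
    ("infection", "infections"),  # regular English plural
    ("diagnosis", "diagnoses"),   # Greek-style plural
    ("bacterium", "bacteria"),    # Latin-style plural
    ("vertebra", "vertebrae"),    # Latin-style plural
]

# Pairs whose two forms end up under different index terms.
missed = [(a, b) for a, b in pairs if naive_stem(a) != naive_stem(b)]

for a, b in missed:
    print(a, "->", naive_stem(a), "|", b, "->", naive_stem(b))
```

Only the regular plural is conflated. The three Latin- and
Greek-derived pairs stay apart, so a query on one form misses
documents that use the other. Handling them needs rules or a
lexicon aimed at that vocabulary.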

Walter Underwood wrote:

> Probably due to phrase inference and phrase searching, since most
> of the words in the query are already in root form (exceptions:
> "got", "causes"). Stemming certainly helps, as do good ranking
> algorithms, page quality metrics, HTML parsers, etc.


Dongwook Shin
Visiting Scholar
Lister Hill National Center for Biomedical Communications
National Library of Medicine,
8600 Rockville Pike Bethesda 20894, MD
E-mail: dwshin@nlm.nih.gov
Tel: (301) 435-3257
FAX: (301) 480-3035
URL: http://dlb2.nlm.nih.gov/~dwshin

This is xml-dev, the mailing list for XML developers.
List archives are available at http://xml.org/archives/xml-dev/