Indexing XML
- From: Phil Ruelle <philr@iplbath.com>
- To: xml-dev@lists.xml.org
- Date: Mon, 21 May 2001 09:27:49 +0100
All,
Thanks for the help with my question on searching. I am now the
proud owner of a copy of fgrep and the dbxml source, which makes
for great bedtime reading ;).
Having looked into things in more detail, I see that rather than just
searching through documents for instances of specific text strings, I
would like to index documents with keywords. More than that, the
keywords need to be hierarchical and the indexes
importable/exportable between systems.
For example:
Suppose I have a large number of cooking recipes in the form of
XML documents (it makes a change from the ubiquitous 'customer
orders' example :)) and I want to categorise them according to
place of origin. This would allow a user to search for all Italian
recipes or all British recipes.
A particular user may be a bit of a connoisseur and want to be
more specific with his indexing, so he adds categories for Naples,
Rome and Sicily. He can now search for Neapolitan recipes
specifically, but if he searches on Italy the Naples recipes will still be
found (i.e. the categories/keywords are hierarchical).
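
A minimal sketch of that hierarchical lookup, in Python. The category
tree, the file names and the function names are all made up for
illustration; this is one possible approach under those assumptions,
not a recommendation of any particular tool.

```python
# Hypothetical category tree: each keyword maps to its broader
# keyword (None for a top-level category).
PARENT = {
    "Italy": None,
    "Britain": None,
    "Naples": "Italy",
    "Rome": "Italy",
    "Sicily": "Italy",
}

# Hypothetical index: recipe document -> keywords assigned to it.
INDEX = {
    "pizza-margherita.xml": {"Naples"},
    "carbonara.xml": {"Rome"},
    "risotto.xml": {"Italy"},
    "trifle.xml": {"Britain"},
}


def ancestors(keyword, parent=PARENT):
    """Yield the keyword itself, then each broader keyword above it."""
    while keyword is not None:
        yield keyword
        keyword = parent.get(keyword)


def search(keyword, index=INDEX, parent=PARENT):
    """Return documents tagged with the keyword or any narrower one."""
    return sorted(
        doc
        for doc, tags in index.items()
        if any(keyword in ancestors(tag, parent) for tag in tags)
    )
```

Here search("Naples") returns only the document tagged Naples, while
search("Italy") also picks up everything tagged Naples, Rome or Sicily.
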
Furthermore, his friend wants to borrow his recipe for Neapolitan
ice cream (bad example, but I can't think of any other Neapolitan
dishes!) but he doesn't have the category/keyword for Naples, so
the indexing information needs to be exported as well. Another
possibility is that the friend doesn't have the keyword Naples but
does have the keyword/category Southern Italy, so there is a
question about merging user-defined categories (although I see
this as requiring user input).
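
The export/merge step could be sketched roughly as follows, again in
Python. The assumption here is that each exported document carries the
full broadest-to-narrowest path of every keyword, so the receiving
system can recreate missing categories. The function names and the
dictionary layout are hypothetical. Disagreements about where a keyword
belongs are only reported, not resolved, since that is where user input
would come in.

```python
def export_entry(doc, tags, parent):
    """Bundle a document's keywords with the path down to each one."""
    paths = []
    for tag in sorted(tags):
        chain = []
        while tag is not None:
            chain.append(tag)
            tag = parent.get(tag)
        # Broadest keyword first, e.g. ["Italy", "Naples"].
        paths.append(list(reversed(chain)))
    return {"doc": doc, "paths": paths}


def import_entry(entry, parent):
    """Merge exported paths into a local tree (modified in place).

    Returns a list of (keyword, sender_parent, local_parent) tuples for
    keywords that both sides have but place under different parents.
    """
    conflicts = []
    for path in entry["paths"]:
        broader = None
        for keyword in path:
            if keyword not in parent:
                # Unknown keyword: adopt the sender's placement.
                parent[keyword] = broader
            elif parent[keyword] != broader:
                conflicts.append((keyword, broader, parent[keyword]))
            broader = keyword
    return conflicts
```

Note that a friend who has Southern Italy but not Naples would simply
get Naples added under Italy by this sketch; deciding that Naples
really belongs under Southern Italy is still left to the user.
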
There are a number of issues here, including whether to store the
indexing information with the document or in a separate file (and
recombine them for exporting) and whether to use single or multiple
elements to implement the 'hierarchical' indexing scheme.
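
To make the "separate file" option concrete, here is a rough sketch
using Python's standard xml.etree.ElementTree module. The element and
attribute names (index, category, doc, name, href) are invented for the
example and not taken from any existing schema; the hierarchy is
expressed by nesting category elements rather than by a single element
holding a path.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-alone index file: nesting of <category> elements
# carries the hierarchy, and <doc> elements point at recipe files
# stored elsewhere.
INDEX_XML = """
<index>
  <category name="Italy">
    <doc href="risotto.xml"/>
    <category name="Naples">
      <doc href="pizza-margherita.xml"/>
    </category>
  </category>
  <category name="Britain">
    <doc href="trifle.xml"/>
  </category>
</index>
"""


def docs_under(root, name):
    """Return hrefs of every <doc> at or below the named category."""
    hrefs = set()
    for category in root.iter("category"):
        if category.get("name") == name:
            # iter() walks all descendants, so nested categories count.
            hrefs.update(doc.get("href") for doc in category.iter("doc"))
    return sorted(hrefs)


root = ET.fromstring(INDEX_XML.strip())
```

With this layout, docs_under(root, "Italy") includes the Naples
document because it sits inside the Italy element, and exporting a
recipe would mean copying the relevant branch of this file along with
it.
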
Any tips/hints/ideas/resources/etc for implementation schemes will
be gratefully received.
Many thanks,
Phil Ruelle