xml-dev - RE: xml search engine?

From: Aleksi Niemelä <aleksi.niemela@cinnober.com>
To: xml-dev@xml.org
Date: Wed, 29 Mar 2000 17:28:23 +0200
> There are so many ways this could work out. Any opinions? 

I'm no visionary or technology guru, but your call for opinions
encouraged me to write. Here's mine.

>There is a problem I see for xml search engines. How are they going to
>cope with all the various DTD's? 

I'm not entirely sure search engines have to cope with them. I don't
know the current solutions, but I think you could specify some simple
form of search like this (never mind that my examples mangle and
extend XPath syntax):

company a:
//drug_component/text()=~!benometr.*sinytodi!
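
A minimal sketch of what query A asks for, written in Python with only
the standard library. The sample document, the drug names and the
function name search_components are assumptions made up for
illustration; only the element name and the pattern come from the
query above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; only drug_component comes from the query.
SAMPLE_A = """
<drugs>
  <drug name="example-1">
    <drug_component>benometryl sinytodi</drug_component>
  </drug>
  <drug name="example-2">
    <drug_component>aspirin</drug_component>
  </drug>
</drugs>
"""


def search_components(xml_text, pattern):
    """Return the text of every drug_component that matches the regex."""
    root = ET.fromstring(xml_text)
    regex = re.compile(pattern)
    # root.iter() walks all descendants, like the leading // in the query.
    return [
        el.text
        for el in root.iter("drug_component")
        if el.text and regex.search(el.text)
    ]


matches_a = search_components(SAMPLE_A, r"benometr.*sinytodi")
print(matches_a)
```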

> Will we have lots of small search engines searching for
> information in all reinforced_concrete_supplier.dtd xml files it can
> find and another for all medicine.dtd info? Will there be a few
> standard elements in most DTD's to comply to some emerging behaviour
> of all search engines? 

Hopefully not, although I can see that small search engines might be
more customizable for tricky tasks or automation.

Since you raised the drug business as an example, I think it won't
take long before some DrugRepositoriesAreUs.com proposes a set of
common tags (probably a superset of the most widely used ones). They
would then present a Schema with equivalence classes ("equivclasses")
for DTDs that do not conform to their solution. So some other person
hunting for the same information could write

company b:
//drug_components/chemical/*[@longname.contains(|sinytodi|)]

Equivclasses would then map the longname attribute of the children of
company B's chemical element so that it is comparable with the text
content of company A's drug_component. Both queries A and B would then
show the same (or about the same) result set. My guess is that engines
will need an interface through which DTD producers or registries can
feed and update schemas.
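
One way to picture such equivclasses is a small lookup table that
tells the engine where each vocabulary keeps the chemical name. This
is only a sketch in Python: the EQUIV table, the vocabulary keys, the
compound element and the sample documents are all hypothetical, not
part of any existing schema language.

```python
import xml.etree.ElementTree as ET

# Hypothetical equivalence table: where each vocabulary keeps the name.
# attr None means "use the element's text content".
EQUIV = {
    "company_a": {"path": ".//drug_component", "attr": None},
    "company_b": {"path": ".//drug_components/chemical/*", "attr": "longname"},
}

# Hypothetical sample documents for the two vocabularies.
DOC_A = "<drug><drug_component>benometryl sinytodi</drug_component></drug>"
DOC_B = (
    "<drug><drug_components><chemical>"
    '<compound longname="benometryl sinytodi"/>'
    "</chemical></drug_components></drug>"
)


def chemical_names(xml_text, vocabulary):
    """Extract chemical names using the rule registered for a vocabulary."""
    rule = EQUIV[vocabulary]
    root = ET.fromstring(xml_text)
    names = []
    for el in root.findall(rule["path"]):
        value = el.get(rule["attr"]) if rule["attr"] else el.text
        if value:
            names.append(value.strip())
    return names


def contains(xml_text, vocabulary, needle):
    """True if any chemical name in the document contains the needle."""
    return any(needle in name for name in chemical_names(xml_text, vocabulary))


print(contains(DOC_A, "company_a", "sinytodi"))
print(contains(DOC_B, "company_b", "sinytodi"))
```

Feeding and updating schemas would then amount to adding or changing
entries in that table.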

And if it's not possible to develop a powerful enough XPath and
schema, one can write some XSL to transform company B's document into
a specialized form designed for good and easy searching.
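
The same idea can be sketched without XSL. The following is a plain
Python stand-in for that transformation step, not an XSLT stylesheet:
it rewrites a company B document into company A's shape so that one
query form can search both. The function name normalize_b, the
compound element and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical company B document; compound is an invented element name.
DOC_B = (
    "<drug><drug_components><chemical>"
    '<compound longname="benometryl sinytodi"/>'
    "</chemical></drug_components></drug>"
)


def normalize_b(xml_text):
    """Rewrite a company B document into company A's searchable form."""
    root = ET.fromstring(xml_text)
    out = ET.Element("drug")
    for el in root.findall(".//drug_components/chemical/*"):
        longname = el.get("longname")
        if longname:
            # Each longname attribute becomes a drug_component text node.
            component = ET.SubElement(out, "drug_component")
            component.text = longname
    return ET.tostring(out, encoding="unicode")


normalized = normalize_b(DOC_B)
print(normalized)
```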

I don't believe we'll see consensus on these issues any time soon. And
even my short examples show quite clearly that such queries are well
beyond the average user. So I could imagine a good use for wizard
technology that generates the queries behind the scenes. No one writes
SQL to current search engines either, though that might provide
interesting possibilities. For some particular, very common search,
like finding all drugs containing a particular chemical, one could
just present a text box where the user can type chemical names.
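
A minimal sketch of such a wizard in Python: it takes whatever the
user typed into the text box and generates a query in the mangled
syntax of example A. The function name build_query is an assumption,
and the output is the made-up query notation from this mail, not
standard XPath.

```python
import re


def build_query(user_input):
    """Turn free text from a search box into a query in example A's form."""
    if "!" in user_input:
        # "!" delimits the pattern in the query notation, so reject it.
        raise ValueError("'!' is not allowed in chemical names")
    terms = user_input.split()
    if not terms:
        raise ValueError("please enter at least one chemical name")
    # Escape regex metacharacters so user input is matched literally,
    # then allow anything between consecutive terms.
    pattern = ".*".join(re.escape(term) for term in terms)
    return "//drug_component/text()=~!" + pattern + "!"


query = build_query("benometr sinytodi")
print(query)
```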

I guess producing simple yet powerful user interfaces for average
users will be an interesting and hard thing to do. As long as XML
searching is seen as adding advanced features over pure text-based
HTML indexes, nobody will be disappointed. Those who think XML search
will turn present engines into an incarnation of a net oracle,
replying to every question with the correct answer backed up by a list
of source documents, will have some blue moments.

	- Aleksi        the future is under construction


-----Original Message-----
From: Reinout van Rees [mailto:rr@cti036.citg.tudelft.nl]
Sent: 29 March 2000 11:43
To: xml-dev@xml.org
Subject: xml search engine?


On Tue, 28 Mar 2000, Jean Marc VANEL wrote:

> I know the Xyleme project at www.inria.fr that is starting to
> develop an XML-aware Web search engine.

I couldn't find it on their website, do you have a pointer to
information? This is *very* interesting!
 
> But it is clear that such products would be of the uttermost
> importance to XML architectures, because of the flexibility that it
> provides: no need to register services and data, the XML search
> engine finds them anyway.

There is a problem I see for xml search engines. How are they going to
cope with all the various DTD's? They ARE going to cope, but what will
be the result? Will we have lots of small search engines searching for
information in all reinforced_concrete_supplier.dtd xml files it can
find and another for all medicine.dtd info? Will there be a few
standard elements in most DTD's to comply to some emerging behaviour
of all search engines? There are so many ways this could work out. Any
opinions? 

greetings,

Reinout

--
Reinout van Rees => R.vanRees@ct.tudelft.nl  +31-15-278-5456
"There's good and evil in all of us.
 It's up to you alone which to follow" - Geoff Mann


***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************