   RE: xml search engine?

  • From: Walter Underwood <wunder@infoseek.com>
  • To: <xml-dev@xml.org>
  • Date: Mon, 3 Apr 2000 10:18:56 -0700

Trying to keep this on XML issues, since search is a niche field ...

At 1:14 PM -0500 4/1/00, Didier PH Martin wrote:
>Didier replies:
>What is the DTD or schema of the returned XML document? Is it RDF based?
>xlink based? (probably not xlink based since it is too recent). If you
>return an RDF based XML document, then what are the properties included in
>each <rdf:description about="...."> elements?

Take a look. The DTD is inline. It also returns some statistics
about the terms and the database. We use those to merge results
from different collections with proper ranking (search for "patent"
on our site for more details).

It uses XLink, according to an XLink draft (the one that didn't change
for over a year). Unfortunately, XLink later changed incompatibly for
simple links ("href=" became "xlink:href=").
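A client that has to survive that change can accept both spellings.
Here is a minimal sketch in Python, not our actual code: the <hit>
element name and the example URLs are made up for illustration, and
the namespace URI is the one from the XLink 1.0 Recommendation.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by XLink 1.0 for the xlink: prefix.
XLINK_NS = "http://www.w3.org/1999/xlink"

def link_target(element):
    """Return the link target from either attribute form, or None."""
    # Prefer the namespaced attribute, fall back to the draft-era bare href.
    return element.get(f"{{{XLINK_NS}}}href") or element.get("href")

# Hypothetical result elements, one per attribute style.
old_style = ET.fromstring('<hit href="http://example.com/a"/>')
new_style = ET.fromstring(
    '<hit xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="http://example.com/b"/>'
)

print(link_target(old_style))
print(link_target(new_style))
```

Checking the namespaced attribute first means a document that carries
both forms resolves to the newer one.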


>Walter said:
>Surprisingly, the XML format has almost no advantages over text/plain
>in practice (and it was my idea).
>Didier replies:
>Why are you saying that? If I am receiving the result of a request packaged
>as an RDF or xlink document I can use an XSLT style sheet to transform it
>into a rendition language.

This format is only for use by programs. The HTML results page is
designed for browsers (and people) and has a rather different set
of information. For the program-targeted page, the actual XML structure
isn't very interesting, as long as it has all the info. An XML results
page targeted for browsers would have LOTS of additional stuff,
like query forms, search tips, thesaurus matches, flags to keep
track of simple vs. advanced query mode, pre-composed URLs for 
"find similar", etc.

For program-targeted information, the real interest is the underlying
data model. The biggest advantage of XML for this (over text/plain)
is that character encodings and escaping are already specified.
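To make the escaping point concrete, here is a small Python sketch
(an illustration only: the <hit>, <title>, and <url> names and the
sample values are assumptions, not our real result format). Markup
characters and non-ASCII text survive a round trip without any
home-grown quoting rules, which a text/plain format would have to
invent and document.

```python
import xml.etree.ElementTree as ET

def encode_hit(title, url):
    """Serialize one hypothetical search hit to UTF-8 encoded XML bytes."""
    hit = ET.Element("hit")
    ET.SubElement(hit, "title").text = title
    ET.SubElement(hit, "url").text = url
    # The serializer escapes &, <, > and writes an encoding declaration.
    return ET.tostring(hit, encoding="utf-8")

def decode_hit(data):
    """Parse the bytes back into a (title, url) tuple."""
    hit = ET.fromstring(data)
    return hit.findtext("title"), hit.findtext("url")

# Sample values chosen to contain characters that need escaping or encoding.
title = 'Café <b>AT&T</b> "patent" results'
url = "http://example.com/search?q=patent&n=10"

data = encode_hit(title, url)
print(data.decode("utf-8"))
print(decode_hit(data) == (title, url))
```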

Finally, for this purpose, RDF is mostly an epicycle in the cosmology.
It represents the same things, but with excess mechanism. The original
formulation of Ockham's razor is "do not multiply entities needlessly".
He wasn't talking about XML entities, but I find the coincidence
really amusing.

If there were common RDF schemas to choose from, that would change.

Walter R. Underwood
Senior Staff Engineer
Infoseek Software, part of go.com
wunder@infoseek.com (work)

This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/

