OASIS Mailing List Archives

   RE: [xml-dev] indexing and querying XML (not XQuery)

Index what?  Ideas, ideas emerging from conversations, the conversations?
So far, what you are describing seems to be Google.  Can you out-Google Google?

A semantic aggregator is a topical query engine that automatically 
synthesizes topics and arranges them by a set of meta topics sometimes 
known as 'annotations':  opposes, in contrast to, supports, etc.  The 
topics are links and the annotations are links, usually out-of-line. This 
is an old idea from the pre-web days sometimes found in the context of 
researchers capturing corporate expertise.  In its older but less 
robust form, it is an inverted index as found at the back of any decent 
text (which is why this field was called bibliographic linking). Instead 
of returning links, it returns a fully-formatted report.
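
To make the inverted index and the out-of-line annotations concrete, here 
is a minimal sketch in Python.  It is only an illustration: the message 
ids, the sample texts, the annotation triples and the function names 
(build_inverted_index, report) are all made up for the example, and a 
real aggregator would synthesize topics rather than match single words.

```python
import re
from collections import defaultdict

# Hypothetical sample messages, keyed by a made-up message id.
messages = {
    "msg1": "REST interface to full text search",
    "msg2": "Full text search depends on structure",
    "msg3": "Google already does search",
}

# Out-of-line annotations: (source, relation, target) triples that live
# apart from the messages they link. The relations are illustrative.
annotations = [
    ("msg2", "supports", "msg1"),
    ("msg3", "opposes", "msg1"),
]


def build_inverted_index(msgs):
    """Map each lower-cased term to the sorted ids of messages containing it."""
    index = defaultdict(set)
    for msg_id, text in msgs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(msg_id)
    return {term: sorted(ids) for term, ids in index.items()}


def report(term, index, links):
    """Return a plain-text report for a term instead of a bare list of links."""
    lines = ["Topic: " + term]
    for msg_id in index.get(term.lower(), []):
        lines.append("  " + msg_id)
        for source, relation, target in links:
            if source == msg_id:
                lines.append("    " + relation + " " + target)
    return "\n".join(lines)


index = build_inverted_index(messages)
print(report("search", index, annotations))
```

The index is the back-of-the-book part; the triples are the meta topics, 
kept out of line so they can be added without touching the messages.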

So with that bit of insight from the WayBack Machine, Sherman, here 
is a thought experiment:

XML-Dev is regularly harvested for ideas, some attributed, some not, by 
readers, some lurkers, some contributors.   These ideas might get 
implemented or not, might get rephrased or reformulated to mimic invention, 
or not.  How would you index them to:

o  Prove a source is THE source.
o  Diagram the emergence of an idea
o  Create permathread links for any idea that recurs
o  Automatically derive proofs for propositions expressed
o  Provide QOS metrics in the face of determined gamers

Use REST if you like, SOA if you like; remember, I don't care about 
religious technical convictions, just the results.  Think of it as 
the autoDrill (a typical Google search is an exercise in drilling not 
for a reference, but for an insight).

Remember we don't all speak English if you want to expand to blogs 
from XML-Dev.  XML-Dev is the easy test.  Blogs are much harder.


From: Alan Gutierrez [mailto:alan-xml-dev@engrm.com]

    Len was in a thread a while back, on Web 2.0, where I posited
    the notion of a REST interface to full text search of syndicated
    feeds, or blogs.

    While we're at it, Len, did you think about that any further?

    Reading through the article, the thing that strikes me is that
    full text search of an XML document depends so much on
    the structure of the document. If that document can be divided
    into chapters, messages, articles, pages, etc, then it's best to
    create a full-text index with application specific documents.

    So, perhaps, the scalable solution is a full-text engine that
    is fed XML documents and a full-text indexing schema.

    The existing schema languages like to atomize documents, while a
    full-text indexing schema might group their elements into
    concepts, like paths, links, articles, and clues for ranking
    articles based on conditions specified in XPath.

    I've wanted to explore the use of Lucene in my document object
    model, so I'd like to hear more about this.
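
The full-text indexing schema idea in the quoted message could be 
sketched roughly as follows.  Everything here is an assumption made for 
illustration: the SCHEMA layout, the element names (archive, message, 
subject, body, link), the boost factor and the function name 
split_into_units are invented, and the paths use only the limited XPath 
subset that Python's xml.etree.ElementTree supports.  Lucene is not 
involved; plain dicts stand in for the documents a full-text engine 
would be fed.

```python
import xml.etree.ElementTree as ET

# Hypothetical indexing schema: which element is one indexable unit,
# which child paths become fields, and which path conditions raise rank.
SCHEMA = {
    "unit": ".//message",
    "fields": {"title": "subject", "text": "body"},
    "boosts": [("link", 1.5)],  # boost a unit that contains a <link> child
}

# Made-up sample input.
SAMPLE = """<archive>
  <message>
    <subject>indexing XML</subject>
    <body>Lucene and XPath</body>
    <link href="http://example.org/"/>
  </message>
  <message>
    <subject>Web 2.0</subject>
    <body>REST search of feeds</body>
  </message>
</archive>"""


def split_into_units(xml_text, schema):
    """Cut one XML document into application-specific index documents."""
    root = ET.fromstring(xml_text)
    units = []
    for node in root.findall(schema["unit"]):
        doc = {
            name: (node.findtext(path) or "").strip()
            for name, path in schema["fields"].items()
        }
        boost = 1.0
        for path, factor in schema["boosts"]:
            # The ranking clue applies when the path matches inside this unit.
            if node.find(path) is not None:
                boost *= factor
        doc["boost"] = boost
        units.append(doc)
    return units


units = split_into_units(SAMPLE, SCHEMA)
for unit in units:
    print(unit)
```

Each dict could then be handed to whatever engine does the actual 
indexing; the schema, not the engine, decides what counts as a document.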



Copyright 2001 XML.org. This site is hosted by OASIS