Index what? Ideas, ideas emerging from conversations, the conversations?
So far, what you are describing seems to be Google. Can you out Google
Google?
A semantic aggregator is a topical query engine that automatically
synthesizes topics and arranges them by a set of meta topics sometimes
known as 'annotations': opposes, in contrast to, supports, etc. The
topics are links and the annotations are links, usually out-of-line. This
is an old idea from the pre-web days sometimes found in the context of
researchers capturing corporate expertise. In its older but less
robust form, it is an inverted index as found at the back of any decent
text (which is why this field was called bibliographic linking). Instead
of returning links, it returns a fully-formatted report.
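A minimal sketch of the idea above, assuming a toy archive: an inverted
index over messages, plus out-of-line annotation links between topics,
used to return a small report instead of a list of links. The message
ids, the sample text, and the function names are all hypothetical and
only illustrate the shape of the thing.

```python
import re
from collections import defaultdict


def build_inverted_index(messages):
    """Map each lowercase term to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, text in messages.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(msg_id)
    return index


def topical_report(topic, index, annotations):
    """Return a formatted report for a topic rather than bare links."""
    lines = [f"Topic: {topic}", f"  found in: {sorted(index.get(topic, ()))}"]
    # Annotations are kept out-of-line as (source, relation, target) triples.
    for source, relation, target in annotations:
        if source == topic:
            lines.append(f"  {relation}: {target} {sorted(index.get(target, ()))}")
    return "\n".join(lines)


# Hypothetical sample data, not taken from the list archive.
messages = {
    "msg-1": "REST is enough for full text search of feeds",
    "msg-2": "SOA handles this better than REST",
}
annotations = [("rest", "in contrast to", "soa")]

index = build_inverted_index(messages)
print(topical_report("rest", index, annotations))
```

The hard part in practice is synthesizing the topics and annotations
automatically; here they are simply supplied by hand.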
So with that bit of insight from the WayBack Machine, Sherman, here
is a thought experiment:
XML-Dev is regularly harvested for ideas, some attributed, some not, by
readers, some lurkers, some contributors. These ideas might get
implemented or not, might get rephrased or reformulated to mimic invention,
or not. How would you index them to:
o Prove a source is THE source
o Diagram the emergence of an idea
o Create permathread links for any idea that recurs
o Automatically derive proofs for propositions expressed
o Provide QOS metrics in the face of determined gamers
Use REST if you like, SOA if you like; remember, I don't care about the
religious technical convictions, just the results. Think of it as
the autoDrill (a typical Google search is an exercise in drilling not
for a reference, but for an insight).
Remember we don't all speak English if you want to expand to blogs
from XML-Dev. XML-Dev is the easy test. Blogs are much harder.
len
From: Alan Gutierrez [mailto:alan-xml-dev@engrm.com]
Len was in a thread a while back, on Web 2.0, where I posited
the notion of a REST interface to full text search of syndicated
feeds, or blogs.
While we're at it, Len, did you think about that any further?
Reading through the article, the thing that strikes me is that
full-text search of an XML document depends so much on
the structure of the document. If that document can be divided
into chapters, messages, articles, pages, etc., then it's best to
create a full-text index with application-specific documents.
So perhaps the scalable solution is a full-text engine that
is fed XML documents and a full-text indexing schema.
The existing schema languages like to atomize documents, while a
full-text indexing schema might group their elements into
concepts, like paths, links, articles, and clues for ranking
articles based on conditions specified in XPath.
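One way such an indexing schema might look, as a rough sketch only: a
small dictionary that says which elements count as a document, which
child paths become searchable fields, and how much each field counts
toward ranking. The schema keys, the sample archive markup, and the
function names are assumptions for illustration. It uses the limited
XPath subset in Python's xml.etree.ElementTree, not Lucene.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical indexing schema: group elements into "documents" and fields.
SCHEMA = {
    "document": ".//message",       # each match is one indexable document
    "id_attribute": "id",           # attribute holding the document id
    "fields": {"subject": "subject", "body": "body/para"},
    "boost": {"subject": 2.0},      # ranking clue: subject hits count double
}

# Hypothetical sample archive.
ARCHIVE = """
<archive>
  <message id="m1">
    <subject>Lucene and XML</subject>
    <body><para>Indexing syndicated feeds.</para></body>
  </message>
  <message id="m2">
    <subject>REST search</subject>
    <body><para>Lucene could sit behind it.</para></body>
  </message>
</archive>
"""


def index_archive(xml_text, schema):
    """Build postings keyed by (field, term) for every schema document."""
    root = ET.fromstring(xml_text)
    postings = defaultdict(set)
    for doc in root.findall(schema["document"]):
        doc_id = doc.get(schema["id_attribute"])
        for field, path in schema["fields"].items():
            for node in doc.findall(path):
                text = "".join(node.itertext())
                for term in re.findall(r"[a-z0-9]+", text.lower()):
                    postings[(field, term)].add(doc_id)
    return postings


def search(term, postings, schema):
    """Rank documents by the summed boost of the fields containing the term."""
    scores = defaultdict(float)
    for field in schema["fields"]:
        for doc_id in postings.get((field, term.lower()), ()):
            scores[doc_id] += schema["boost"].get(field, 1.0)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


postings = index_archive(ARCHIVE, SCHEMA)
print(search("lucene", postings, SCHEMA))
```

The same XML could be fed in with a different schema (say, one document
per thread instead of per message) without touching the engine.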
I've wanted to explore the use of Lucene in my document object
model, so I'd like to hear more about this.