   Re: [xml-dev] indexing and querying XML (not XQuery)

  • To: "Bullard, Claude L (Len)" <len.bullard@intergraph.com>
  • Subject: Re: [xml-dev] indexing and querying XML (not XQuery)
  • From: 'Alan Gutierrez' <alan-xml-dev@engrm.com>
  • Date: Tue, 23 Aug 2005 12:01:06 -0400
  • Cc: ElektonikaMail@izzy.net, 'XML Developers List' <xml-dev@lists.xml.org>
  • In-reply-to: <000001c5a7f7$88def140$0115a8c0@Elektonika.local>
  • Mail-followup-to: "Bullard, Claude L (Len)" <len.bullard@intergraph.com>, ElektonikaMail@izzy.net, 'XML Developers List' <xml-dev@lists.xml.org>
  • References: <000001c5a7f7$88def140$0115a8c0@Elektonika.local>
  • User-agent: Mutt/1.4.1i

* Bullard, Claude L (Len) <len.bullard@intergraph.com> [2005-08-23 11:34]:
> From: Robert Koberg [mailto:rob@koberg.com]
> Bullard, Claude L (Len) wrote:
>> > Index what?  Ideas, ideas emerging from conversations, the conversations?
>> > So far, what you are describing seems to be Google.  Can you out Google 
>> > Google?
>> It is not like Google. Google indexes HTML and it gives better rankings 
>> to well marked up (according to Google) HTML (which is why small 
>> companies like us can get page rankings as high or higher than much 
>> larger companies).
>> With an XML indexer, you can index glossentries, faqs, quizzes, whatever 
>> and keep them separate so if you want to run a query against just faqs, 
>> you can.
>> You can do a search to get all external links (we distinguish between 
>> external, internal and whatever other kind of links there might be) and 
>> validate them.
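
A minimal sketch of the kind of indexer described above, not the
poster's actual system. It assumes a hypothetical document where faq,
glossentry and quiz are element names and where links carry `kind` and
`href` attributes; none of these names come from the thread. Each
element type gets its own bucket, so a query can be limited to faqs,
and external links are pulled out separately. The link check here is
syntactic only; real validation would have to fetch each URL.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical sample; element and attribute names are assumptions.
SAMPLE = """
<site>
  <faq id="f1">How do I reset my password?</faq>
  <faq id="f2">Where is the index stored?</faq>
  <glossentry id="g1">Indexer: a program that builds a lookup table.</glossentry>
  <quiz id="q1">Which element holds a link?</quiz>
  <page id="p1">
    <link kind="external" href="http://example.com/docs">docs</link>
    <link kind="internal" href="/about">about</link>
    <link kind="external" href="not a url">broken</link>
  </page>
</site>
"""

# Element types that each get their own bucket in the index.
INDEXED_TAGS = {"faq", "glossentry", "quiz"}


def build_index(root):
    """Map each indexed tag name to a list of (id, text) pairs."""
    index = defaultdict(list)
    for elem in root.iter():
        if elem.tag in INDEXED_TAGS:
            text = "".join(elem.itertext()).strip()
            index[elem.tag].append((elem.get("id"), text))
    return index


def search(index, tag, term):
    """Case-insensitive substring search restricted to one element type."""
    term = term.lower()
    return [doc_id for doc_id, text in index.get(tag, [])
            if term in text.lower()]


def external_links(root):
    """Collect the href of every link marked kind="external"."""
    return [e.get("href") for e in root.iter("link")
            if e.get("kind") == "external"]


def looks_valid(url):
    """Syntactic check only: an http(s) scheme and a host are present."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


root = ET.fromstring(SAMPLE)
index = build_index(root)
faq_hits = search(index, "faq", "index")        # searches faqs only
links = external_links(root)
link_report = {url: looks_valid(url) for url in links}
```

Searching the faq bucket for "index" skips the glossary entry even
though it also contains the word, which is the separation the post is
describing.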

> So your index is as good as the markup?  Fine.  That's what markup
> was created to provide (the extensibility AS meaning theory).  The
> human does the intelligent analysis when they tag and the engine
> dutifully records that.  That is just another indexer, not a
> semantic aggregator.  Tagging makes searching easier by leveraging
> the author's intelligence. 

    I need to pull some of this into a blog entry, quote it. I'd be
    frustrated with you for being so dismissive, if I weren't confident
    that you're speaking from hard-won experience.

    I'd like to create a semantic aggregator, or something like it,
    through human intervention, but making it easier to specify the
    structure of indices, or by allowing for the adjustment of
    ranking by participants in a social network.

> What about content in non-tagged sources (say XML-Dev) and gamed
> content?

    Aren't these two very different problems?
    My solution to gamed content is accountability. That's where you
    depart from the hosted indices, and have personal indices that
    have an individual's endorsement. 
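
One possible reading of "accountability", sketched under loud
assumptions: the function name, the data shapes and the scoring
formula below are all hypothetical and do not come from the thread. A
hosted index supplies a base score per URL, named people endorse
URLs, and the searcher decides how much to trust each person. An
endorsement from someone the searcher does not trust adds nothing.

```python
from collections import defaultdict


def rank(base_scores, endorsements, trust):
    """Order URLs by base score boosted by trusted endorsements.

    base_scores:  url -> score from a hosted index (hypothetical)
    endorsements: list of (person, url) pairs
    trust:        person -> weight the searcher gives that person
    """
    boost = defaultdict(float)
    for person, url in endorsements:
        # Unknown endorsers contribute a weight of zero.
        boost[url] += trust.get(person, 0.0)
    scored = {url: score * (1.0 + boost[url])
              for url, score in base_scores.items()}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda u: (-scored[u], u))


base_scores = {"a": 1.0, "b": 0.8, "spam": 1.2}
endorsements = [("alice", "b"), ("bob", "b"), ("mallory", "spam")]
trust = {"alice": 0.5, "bob": 0.25}   # mallory is not trusted

ranked = rank(base_scores, endorsements, trust)
```

Here "b" moves ahead of "spam" because two trusted people vouch for
it, while mallory's endorsement of "spam" carries no weight.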

> How about correlation of hidden couplers?

    Hidden couplers? Just Googled [hidden couplers] and the lucky
    spot was one of your "Is Web 2.0 the new XML?" postings.

> Show me an engine that can intelligently index because it can
> drill for insights and provide those to the user.  In other words,
> a deep analysis system can find the root causes for failures
> rather than superficial causes as might be tagged incorrectly,
> something humans do consistently well:  promote superstition to
> knowledge.

> HTML scales because it is somewhat 'opinion free'.  Although
> layout is a form of opinionated expression, it can be ignored.  

    That's a nice engine. I'm not sure how to extract knowledge from
    opinion, or how to approach an algorithm that would eliminate
    bias. I think through accountability, you could begin to at
    least identify the bias.

Alan Gutierrez - alan@engrm.com
    - http://engrm.com/blogometer/index.html
    - http://engrm.com/blogometer/rss.2.0.xml

