   Re: [xml-dev] indexing and querying XML (not XQuery)


'Alan Gutierrez' wrote:
> * Robert Koberg <rob@koberg.com> [2005-08-23 10:42]:
>>Bullard, Claude L (Len) wrote:
>>>Index what?  Ideas, ideas emerging from conversations, the conversations?
>>>So far, what you are describing seems to be Google.  Can you out Google 
>>It is not like Google. Google indexes HTML and it gives better rankings 
>>to well marked up (according to Google) HTML (which is why small 
>>companies like us can get page rankings as high or higher than much 
>>larger companies).
>>With an XML indexer, you can index glossentries, faqs, quizzes, whatever 
>>and keep them separate so if you want to run a query against just faqs, 
>>you can.
>>You can do a search to get all external links (we distinguish between 
>>external, internal and whatever other kind of links there might be) and 
>>validate them.
>>You can also use the searches to do things you might do with XQuery 
>>(again, I don't know XQuery...). For example, in our CMS we have the 
>>concept of page regions. Content pieces are assigned to folder/page 
>>regions. Say I want to find out where a content piece has been assigned. 
>>I can run a query on all assignments to return references to the 
>>pages/folders where it has been assigned. You can do searches for all 
>>users in a particular group, all projects that a user has access to, 
>>etc.. etc...
>     Which is why I'd propose defining a full-text schema language,
>     so XML content can be described to a full-text search engine.

It does sound very interesting. How would it work? What would it look 
like? I have tried doing this with XML Schema but gave up. I had tried 
to use annotations to give weight to different things, then I tried to 
make a type system. For me, it was just easier to write Java to handle 
it. Now I write org.xml.sax.ext.DefaultHandler2 subclasses that suit my 
needs. I know, not very scalable or user friendly.
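
To give a rough idea of the handler approach, here is a minimal sketch, not 
the actual code from our CMS. The class name TypedTextHandler, the INDEXED 
set and the element names in it are all made up for illustration. It just 
collects the text of each chosen element type into its own bucket, so a 
query can be run against faqs alone; a real indexer would hand the text to 
a search engine instead of a map.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: gathers text per element type so each type
// can be searched separately.
public class TypedTextHandler extends DefaultHandler2 {
    // Element names treated as separate entries (assumed, illustrative only).
    private static final Set<String> INDEXED = Set.of("faq", "glossentry", "quiz");

    private final Map<String, List<String>> index = new HashMap<>();
    private StringBuilder text; // non-null while inside an indexed element
    private String current;     // name of the indexed element being collected
    private int depth;          // nesting depth inside that element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (text != null) {
            depth++; // nested element: keep collecting into the outer entry
        } else if (INDEXED.contains(qName)) {
            current = qName;
            text = new StringBuilder();
            depth = 1;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && --depth == 0) {
            // Outermost indexed element closed: store its text under its type.
            index.computeIfAbsent(current, k -> new ArrayList<>()).add(text.toString().trim());
            text = null;
            current = null;
        }
    }

    public Map<String, List<String>> getIndex() {
        return index;
    }

    // Parses an XML string with the default (non-namespace-aware) JAXP SAX parser.
    public static Map<String, List<String>> index(String xml) throws Exception {
        TypedTextHandler handler = new TypedTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getIndex();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<site><faq>How do I log in?</faq>"
                + "<glossentry>SAX: a streaming API</glossentry></site>";
        System.out.println(index(xml));
    }
}
```

Hard-coding the element names in the handler is exactly the part that does 
not scale; that is the piece a declarative description would replace.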


>     The language would permit ranking based on markup, define what
>     constitutes a document, what constitutes a document collection, etc.
> --
> Alan Gutierrez - alan@engrm.com
>     - http://engrm.com/blogometer/index.html
>     - http://engrm.com/blogometer/rss.2.0.xml



Copyright 2001 XML.org. This site is hosted by OASIS