xml-dev - Re: [xml-dev] indexing and querying XML (not XQuery)

To: Jason Hunter <jhunter@xquery.com>
Subject: Re: [xml-dev] indexing and querying XML (not XQuery)
From: Wolfgang Hoschek <whoschek@lbl.gov>
Date: Tue, 23 Aug 2005 21:23:25 -0700
Cc: XML Developers List <xml-dev@lists.xml.org>, Robert Koberg <rob@koberg.com>
In-reply-to: <430BC513.5010800@xquery.com>
References: <430B1F36.5030607@koberg.com> <DBEE72B5-8BB9-4855-81D6-17F7C3AFC311@lbl.gov> <430B7BE7.8080008@koberg.com> <B617C771-8F2D-45CC-8B22-999F3D6379A8@lbl.gov> <430BC513.5010800@xquery.com>

On Aug 23, 2005, at 5:53 PM, Jason Hunter wrote:

> Wolfgang Hoschek wrote:
>
>> If all you need is to index the document's flat text instead of XML
>> documents with their structure, that's straightforward, yes.
>> But presumably you'll want to combine structured search (e.g. XPath
>> navigation and predicates) with unstructured fulltext search, and now
>> you're in database terrain, wrt. choosing cost-effective persistent
>> index data structures and execution plans for a mix of the main
>> expected queries/data types/access patterns/read&write frequency,
>> plus static and/or dynamic query optimization, including materialized
>> view maintenance, transactional updates, etc. All well-known
>> problems, now in the context of XML and fulltext search, but without
>> easy solutions nonetheless.
>> Wolfgang.
>
> In case people aren't aware, Mark Logic is doing exactly this. They
> (er, we, as they're my current employer) combine indexed XPath
> evaluation with full text search, scale beyond the limits of memory,
> do the management of transactional updates, and so on. I'm glad to
> see discussion about this idea here because it's a very cool one and
> something that I hope catches on widely as a meme. More info and a
> free low-end version at http://xqzone.marklogic.com.
>
>

Jason, one question is how far "beyond the limits of [main] memory",
and under what circumstances? How expressive is the fulltext search
language? How *exactly* is it doing it? I find it difficult to
believe that any given scheme (including Mark Logic's) cannot easily
be driven into resource exhaustion given the huge parameter space,
unless it works within a very carefully planned set of simplifying
constraints and assumptions. In other words, an easy general-purpose
solution doesn't exist in this area.

Wolfgang.
