OASIS Mailing List Archives

RE: [xml-dev] What is the general direction you are seeing these days to store and query lots of large complex XML?

Hi Folks,

This is an outstanding discussion. Many questions have been raised. It would be good to collect together all the questions: 

>	How many XML files are to be stored and queried? How big are they?

There are 50 million XML files, each 50MB in size.
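
To put those two figures in perspective, here is a rough back-of-the-envelope calculation. It assumes decimal units (1 PB = 10^9 MB), which gives a corpus of roughly 2.5 PB. The scan rate used below is a purely hypothetical number chosen for illustration, not a measurement from this system.

```python
# Back-of-the-envelope size estimate from the figures quoted above.
FILE_COUNT = 50_000_000      # 50 million files
FILE_SIZE_MB = 50            # 50 MB each (decimal megabytes assumed)

total_mb = FILE_COUNT * FILE_SIZE_MB
total_pb = total_mb / 1_000_000_000   # 1 PB = 10**9 MB in decimal units

# Hypothetical aggregate read rate, for illustration only.
ASSUMED_SCAN_MB_PER_SEC = 1_000       # i.e. 1 GB/s across all readers


def full_scan_days(size_mb, mb_per_sec):
    """Days needed to read size_mb once at a sustained mb_per_sec."""
    return size_mb / mb_per_sec / 86_400   # 86,400 seconds per day


scan_days = full_scan_days(total_mb, ASSUMED_SCAN_MB_PER_SEC)
print(f"total: {total_pb} PB, one full scan: {scan_days:.1f} days")
```

Under that assumed rate, any query that has to read every file would take weeks, so the storage choice probably matters less than whether queries can avoid touching most of the data.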

>	What's the complexity of the XML: is there deep nesting or is it flat?

The files are mostly flat (not deeply nested).

>	Are the XML files volatile or static?

The XML files are relatively static - a few are updated for errors but most stay the same.

>	Are there requirements for further processing or consuming them as XML 
>	elsewhere or are they just a query source?

The XML files are just a query source. The results of the queries on the XML documents are used as input to SAS and SPSS analytics.

>	What type of queries, with what frequency?

We want multiple people to query multiple times a day. Right now the query frequency is low because the queries take days to run.

>	What kind of queries will you need to perform? Full text queries? XPath? XQuery?

The queries are done using XPath and XQuery.
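
For a sense of what such a query looks like against a mostly flat file, here is a minimal sketch using Python's standard-library ElementTree, which supports only a limited subset of XPath. The element and attribute names (records, record, region, value, id) are invented for illustration; the real vocabulary is not shown in this thread.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical flat document; all element and attribute names are invented.
SAMPLE = b"""<records>
  <record id="1"><region>north</region><value>10</value></record>
  <record id="2"><region>south</region><value>25</value></record>
  <record id="3"><region>north</region><value>40</value></record>
</records>"""


def select_values(source, region):
    """Stream <record> elements and yield (id, value) for one region."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "record":
            continue
        # findtext() takes a path in ElementTree's limited XPath subset.
        if elem.findtext("region") == region:
            yield elem.get("id"), int(elem.findtext("value"))
        elem.clear()  # drop the record's children once it has been handled


rows = list(select_values(io.BytesIO(SAMPLE), "north"))
print(rows)
```

Streaming with iterparse keeps memory use low on a 50MB file, and the resulting rows could be written out as CSV for SAS or SPSS. It still reads each file in full, so it illustrates the query shape rather than a way to make the queries fast.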

>	Do you know or care what the document vocabularies are?

The XML elements and attributes are very well known. The structure of the XML is well known.

Question: What is your recommendation for storing and querying this huge amount of XML?



Copyright 1993-2007 XML.org. This site is hosted by OASIS