RE: [xml-dev] What is the general direction you are seeing these days to store and query lots of large complex XML?
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Thu, 5 Mar 2015 16:59:44 +0000
Hi Folks,
This is an outstanding discussion. Many questions have been raised, so it would be good to collect them in one place, along with my answers:
> How many XML files are to be stored and queried? How big are they?
There are 50 million XML files, each 50MB in size.
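For a sense of scale, here is a back-of-the-envelope calculation. It assumes decimal units (1 MB = 10^6 bytes) and treats every file as exactly 50 MB, which is only an approximation:

```python
# Rough estimate of the total collection size from the figures above.
# Assumptions: decimal units (1 MB = 10**6 bytes), uniform 50 MB per file.
FILE_COUNT = 50_000_000
FILE_SIZE_BYTES = 50 * 10**6


def total_size_petabytes(file_count: int, file_size_bytes: int) -> float:
    """Return the total size in petabytes (1 PB = 10**15 bytes)."""
    return file_count * file_size_bytes / 10**15


total_pb = total_size_petabytes(FILE_COUNT, FILE_SIZE_BYTES)
print(f"{total_pb} PB")
```

Under those assumptions the collection is on the order of 2.5 petabytes.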
> What's the complexity of the XML: is there deep nesting or is it flat?
The files are mostly flat (not deeply nested).
> Are the XML files volatile or static?
The XML files are relatively static - a few are updated to correct errors, but most stay the same.
> Are there requirements for further processing or consuming them as XML
> elsewhere or are they just a query source?
The XML files are just a query source. The results of the queries on the XML documents are used as input to SAS and SPSS analytics.
> What type of queries, with what frequency?
We want multiple people to query multiple times a day. Right now the query frequency is low because the queries take days to run.
> What kind of queries will you need to perform? Full text queries? XPath? XQuery?
The queries are done using XPath and XQuery.
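To illustrate the general shape of such a query, here is a minimal sketch. It is not one of the actual queries: the element names (`records`, `record`, `region`, `amount`) and the sample data are made up, and it uses the limited XPath subset in Python's standard library rather than a full XPath/XQuery engine. It selects matching records from a flat document and writes them as CSV, a format that analytics tools such as SAS and SPSS can typically import.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical flat document; the real vocabulary is not shown in this thread.
SAMPLE = """<records>
  <record id="1"><region>east</region><amount>10.5</amount></record>
  <record id="2"><region>west</region><amount>7.0</amount></record>
  <record id="3"><region>east</region><amount>3.25</amount></record>
</records>"""


def query_to_csv(xml_text: str, region: str) -> str:
    """Select records whose <region> equals `region` and return them as CSV text."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "region", "amount"])
    # ElementTree supports a subset of XPath, including [tag='text'] predicates.
    for rec in root.findall(f"./record[region='{region}']"):
        writer.writerow([rec.get("id"), rec.findtext("region"), rec.findtext("amount")])
    return out.getvalue()


result = query_to_csv(SAMPLE, "east")
print(result)
```

At the sizes described here, loading a whole document into memory as this sketch does may not be practical; it is only meant to show the kind of selection being performed.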
> Do you know or care what the document vocabularies are?
The XML elements and attributes are very well known, as is the overall structure of the XML.
Question: What is your recommendation for storing and querying this huge amount of XML?
/Roger