RE: [xml-dev] XML Database Decision Tree?
- From: Eric Lemond <elemond@neocore.com>
- To: xml-dev@lists.xml.org
- Date: Wed, 31 Oct 2001 10:10:44 -0700
I can give a real example of loading data into NeoCore XMS. This was a
project one of our engineers did to prove that we could handle huge data
sets. It's not meant as a benchmark, but to illustrate data loading
simplicity.
We got a copy of the GenBank genomics research database, 44.1 GB in
total. We converted the documents to XML with a small Perl script. The
documents average 200 MB each.
We loaded them with a single command:
neoxmlutils import [config dir location] [import dir]
The resulting database footprint was 34.4 GB (about 78% of the size of
the original data). You don't have to create any indexes. With our
pattern processing technology, the database is fully indexed.
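As a quick check on the footprint figure, the arithmetic can be sketched in a few lines of Python. The variable names are mine and have nothing to do with NeoCore XMS itself; the two sizes are the ones reported in this message.

```python
# Sizes as reported in this message, in gigabytes.
original_gb = 44.1
footprint_gb = 34.4

# Fraction of the original data size that the loaded database occupies.
ratio = footprint_gb / original_gb
print(f"footprint is {ratio:.1%} of the original")
```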
Just wanted to confirm how easy it is to load data into an XML Database.
Eric Lemond
-----Original Message-----
From: Tom Bradford [mailto:bradford@dbxmlgroup.com]
Sent: Tuesday, October 30, 2001 5:38 PM
To: Champion, Mike
Cc: PaulT; xml-dev@lists.xml.org
Subject: Re: [xml-dev] XML Database Decision Tree?
On Tuesday, October 30, 2001, at 05:21 PM, Champion, Mike wrote:
> If by "efficiently" you mean human time rather than computer time, this
> can be demonstrated by comparing what it takes to load something like the
> Shakespeare plays into various DBs of one flavor or another and performing
> some XPath queries. With Tamino (the only one I know how to do this in
> offhand) the steps are:
>
> 1 - load the DTD (or schema) into the Schema editor (tweak content model to
>     allow variations and evolution and define indexes if you must)
> 2 - Define a DB collection based on that schema (2 mouse clicks or so)
> 3 - Use a simple HTML form or a loader script to load the XML data into the
>     DB
> 4 - Enter the URL of the database + "_xql=" + an XPath expression
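Step 4 of the quoted list amounts to string concatenation, which might look roughly like the Python sketch below. Only the "_xql=" parameter name comes from the message. The base URL, the sample XPath, the "?" separator, and the percent-encoding are all assumptions made for illustration, not confirmed Tamino behavior.

```python
from urllib.parse import quote

def build_query_url(database_url: str, xpath: str) -> str:
    """Append an XPath expression to a database URL via the _xql parameter."""
    # Assumption: the parameter is passed as a query string after "?", and
    # characters such as brackets and quotes need percent-encoding.
    return f"{database_url}?_xql={quote(xpath, safe='/')}"

# Hypothetical database URL and query, for illustration only.
url = build_query_url(
    "http://localhost/tamino/mydb/plays",
    "//SPEECH[SPEAKER='HAMLET']",
)
print(url)
```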
You big companies and your silly GUI tools.
I'll follow this up with how it would be done in a dbXML scenario (all
of these are typed from the shell):
1> dbxmladmin ac -c /db -n newcollection              # Creates 'newcollection'
2> dbxml addmultiple -c /db/newcollection -f ./       # Adds the documents
3> dbxml xpath -c /db/newcollection -q <some xpath>   # Queries the collection
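If you wanted to script the three steps rather than type them, a minimal Python sketch could assemble the same command lines and print them, running them only when asked. The subcommands and flags are copied from the steps above. The function name, the `run` switch, and the sample XPath are hypothetical, and running for real assumes `dbxml` and `dbxmladmin` are on the PATH.

```python
import shlex
import subprocess

def load_and_query(collection, source_dir, xpath, run=False):
    """Build (and optionally run) the create, load, and query commands."""
    parent = "/db"
    target = f"{parent}/{collection}"
    commands = [
        ["dbxmladmin", "ac", "-c", parent, "-n", collection],    # create collection
        ["dbxml", "addmultiple", "-c", target, "-f", source_dir],  # add documents
        ["dbxml", "xpath", "-c", target, "-q", xpath],           # query collection
    ]
    for cmd in commands:
        print(shlex.join(cmd))  # show the exact shell-quoted command line
        if run:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return commands

# Dry run only: prints the commands without executing anything.
cmds = load_and_query("newcollection", "./", "/PLAY/TITLE")
```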
From a user/admin point of view, the process is brain-dead simple. The
only other steps you might want to take are to add indexes to the
collection (best done after a load, just as with RDBMSes):
dbxmladmin ai -c /db/newcollection -n index1 -p elementName
dbxmladmin ai -c /db/newcollection -n index2 -p elementName@attrName
dbxmladmin ai -c /db/newcollection -n index3 -p *@attrName
dbxmladmin ai -c /db/newcollection -n index4 -p elementName@*
etc...
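For readers new to the pattern syntax, here is a small Python sketch of how I read the four -p patterns above: an element name, optionally followed by "@" and an attribute name, with "*" as a wildcard on either side. This is an interpretation of the examples, not dbXML's documented grammar, and `describe_pattern` is a hypothetical helper.

```python
def describe_pattern(pattern: str) -> str:
    """Describe an index pattern of the form element or element@attribute."""
    # Assumption: "@" separates the element part from the attribute part,
    # and "*" on either side matches any name.
    element, sep, attribute = pattern.partition("@")
    elem_text = "any element" if element == "*" else f"element '{element}'"
    if not sep:
        return elem_text
    attr_text = "any attribute" if attribute == "*" else f"attribute '{attribute}'"
    return f"{attr_text} on {elem_text}"

for p in ["elementName", "elementName@attrName", "*@attrName", "elementName@*"]:
    print(f"{p:22} -> {describe_pattern(p)}")
```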
-- Tom
-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>
The list archives are at http://lists.xml.org/archives/xml-dev/
To subscribe or unsubscribe from this elist use the subscription
manager: <http://lists.xml.org/ob/adm.pl>