RE: [xml-dev] XML Database Decision Tree?

I can give a real example of loading data into NeoCore XMS.  This was a
project one of our engineers did to prove that we could handle huge data
sets.  It's not meant as a benchmark, but to illustrate data loading.

We got a copy of the 44.1 GB GenBank genomics research database.  We
converted the documents to XML with a small Perl script.  The documents
average 200 MB each.

Using the command:  neoxmlutils import [config dir location] [import

The resulting database footprint was 34.4 GB (under 80% of the size of the
original data).  You don't have to create any indexes.  With our pattern
processing technology, the database is fully indexed.
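
As a quick check of the figures above, here is a minimal sketch of the
arithmetic.  It uses only the two sizes quoted in this message; the function
name is made up for illustration.

```python
# Sizes quoted in the message, in GB.
SOURCE_GB = 44.1      # GenBank data before loading
FOOTPRINT_GB = 34.4   # resulting NeoCore XMS database footprint


def footprint_ratio(footprint: float, source: float) -> float:
    """Return the database footprint as a fraction of the source size."""
    return footprint / source


ratio = footprint_ratio(FOOTPRINT_GB, SOURCE_GB)
print(f"footprint is {ratio:.1%} of the original data")
```

The ratio comes out at roughly 78%, which is consistent with the "under 80%"
claim.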

Just wanted to confirm how easy it is to load data into an XML Database.

Eric Lemond

-----Original Message-----
From: Tom Bradford [mailto:bradford@dbxmlgroup.com] 
Sent: Tuesday, October 30, 2001 5:38 PM
To: Champion, Mike
Cc: PaulT; xml-dev@lists.xml.org
Subject: Re: [xml-dev] XML Database Decision Tree?

On Tuesday, October 30, 2001, at 05:21 PM, Champion, Mike wrote:
> If by "efficiently" you mean human time rather than computer time, it
> can be demonstrated by comparing what it takes to load something like
> the Shakespeare plays into various DBs of one flavor or another and
> performing some XPath queries.  With Tamino (the only one I know how to
> do this offhand) the steps are:
> 1 - load the DTD (or schema) into the Schema editor (tweak content
>     model to allow variations and evolution and define indexes if you
>     must)
> 2 - Define a DB collection based on that schema  (2 mouse clicks or
> 3 - Use a simple HTML form or a loader script to load the XML data into
>     the DB
> 4 - Enter the URL of the database + "_xql=" + an XPath expression
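
Step 4 of the quoted list amounts to building a query URL.  A rough sketch
of that concatenation follows, under stated assumptions: the host, database
path and XPath expression are hypothetical, and treating "_xql=" as a
query-string parameter introduced by "?" is an assumption, not something
the message spells out.

```python
from urllib.parse import quote


def tamino_xql_url(database_url: str, xpath: str) -> str:
    """Append an XPath expression to a database URL as an _xql parameter.

    Assumption: "_xql=" follows a "?" separator, as in an ordinary query
    string.  The XPath is percent-encoded so that characters such as
    brackets and quotes survive inside the URL.
    """
    return f"{database_url}?_xql={quote(xpath, safe='/')}"


# Hypothetical database URL and XPath, for illustration only.
url = tamino_xql_url(
    "http://localhost/tamino/mydb/plays",
    "//SPEECH[SPEAKER='HAMLET']",
)
print(url)
```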

You big companies and your silly GUI tools.

I'll follow this up with how it would be done in a dbXML scenario (all 
of these are typed from the shell):

1> dbxmladmin ac -c /db -n newcollection                       # Creates

2> dbxml addmultiple -c /db/newcollection -f ./                # Adds the documents
3> dbxml xpath -c /db/newcollection -q <some xpath>  # Queries the 
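
The three commands above could be assembled from a script roughly like
this.  It is only a sketch: it uses just the subcommands and flags shown in
this message, the function name is made up, and "//SPEAKER" is a
hypothetical stand-in for <some xpath>.  It prints the commands rather than
running them, since dbXML may not be installed.

```python
import shlex
from typing import List


def dbxml_commands(parent: str, name: str, doc_dir: str, xpath: str) -> List[List[str]]:
    """Build the create / load / query command lines shown in the message."""
    collection = f"{parent.rstrip('/')}/{name}"
    return [
        ["dbxmladmin", "ac", "-c", parent, "-n", name],             # create the collection
        ["dbxml", "addmultiple", "-c", collection, "-f", doc_dir],  # add the documents
        ["dbxml", "xpath", "-c", collection, "-q", xpath],          # run an XPath query
    ]


# "//SPEAKER" is a hypothetical query used for illustration.
commands = dbxml_commands("/db", "newcollection", "./", "//SPEAKER")
for argv in commands:
    print(shlex.join(argv))
```

Each list could be handed to subprocess.run() to execute it for real.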

From a user/admin point of view, the process is brain-dead simple.  The
only other steps you might want to take are to add indexes to the
collection (best done after a load, just as with RDBMSes):

dbxmladmin ai -c /db/newcollection -n index1 -p elementName
dbxmladmin ai -c /db/newcollection -n index2 -p elementName@attrName
dbxmladmin ai -c /db/newcollection -n index3 -p *@attrName
dbxmladmin ai -c /db/newcollection -n index4 -p elementName@*
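
The four -p patterns above appear to follow an element@attribute shape.  A
small sketch of how one might read them is given below.  This is an
interpretation of the pattern strings as written in this message, not
dbXML's own parser; in particular, treating "*" as "any" is an assumption.

```python
from typing import NamedTuple, Optional


class IndexPattern(NamedTuple):
    element: str               # element name, or "*"
    attribute: Optional[str]   # attribute name, "*", or None if absent


def parse_index_pattern(pattern: str) -> IndexPattern:
    """Split a pattern such as 'elementName@attrName' at the '@' sign."""
    element, sep, attribute = pattern.partition("@")
    return IndexPattern(element, attribute if sep else None)


def describe(pattern: str) -> str:
    """Render a pattern in words, assuming '*' is a wildcard."""
    p = parse_index_pattern(pattern)
    elem = "any element" if p.element == "*" else f"element '{p.element}'"
    if p.attribute is None:
        return f"values of {elem}"
    attr = "any attribute" if p.attribute == "*" else f"attribute '{p.attribute}'"
    return f"{attr} on {elem}"


for pat in ["elementName", "elementName@attrName", "*@attrName", "elementName@*"]:
    print(f"{pat:22} -> {describe(pat)}")
```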

-- Tom

The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/