OASIS Mailing List Archives

 



RE: [xml-dev] XML Database Decision Tree?





> -----Original Message-----
> From: Leigh Dodds [mailto:ldodds@ingenta.com]
> Sent: Thursday, October 18, 2001 8:54 AM
> To: Magick, Brian; Ranjeet Sonone; xml-dev@lists.xml.org
> Subject: RE: [xml-dev] XML Database Decision Tree?
> 
> 
> 
> > -----Original Message-----
> > From: Magick, Brian [mailto:Brian.Magick@compaq.com]
> > Sent: 17 October 2001 20:41
> > To: Ranjeet Sonone; xml-dev@lists.xml.org
> > Subject: RE: [xml-dev] XML Database Decision Tree?
> >


> It's a fairly common XML design pattern to have a head-body structure
> to your schema [2]. e.g. metadata (author, title, etc) and
> content (paras, markup, etc). Processing the XML to pull out the
> information in the head into standard relational tables, whilst leaving
> the content as a BLOB or referenced file meets most requirements.
>
> Generally speaking it's the head information that needs
> indexing, will be most queried on, etc. It's usually least likely to
> include mixed content which makes the decomposition easier.

That's a good point.  Twisting it around to reflect my biases <grin>, I'd
put it something like: If you have "document" XML, the schema separates
cleanly into easily normalizable metadata and arbitrarily structured text
content, and most queries will be on the metadata, then an XML-enabled RDBMS
that extracts some elements into tables and other elements into CLOBs may be
the most appropriate.  If you anticipate frequent queries on the actual
document structure and content, a native XML DBMS may be more appropriate.
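
To make the "tables plus CLOB" split concrete, here is a rough sketch in
Python using the standard sqlite3 and xml.etree.ElementTree modules.  The
document layout (doc/head/body), the table name "docs" and the store()
helper are all made up for illustration; a real XML-enabled RDBMS would do
the shredding through its own mapping facility rather than hand-written code.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical head-body document; the element names are assumptions.
DOC = """<doc>
  <head><title>XML Storage Notes</title><author>A. Writer</author></head>
  <body><para>Some <em>mixed</em> content.</para></body>
</doc>"""

def store(conn, xml_text):
    """Shred the head into columns; keep the body as one opaque text value."""
    root = ET.fromstring(xml_text)
    head = root.find("head")
    body = ET.tostring(root.find("body"), encoding="unicode")
    conn.execute(
        "INSERT INTO docs (title, author, body) VALUES (?, ?, ?)",
        (head.findtext("title"), head.findtext("author"), body),
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, author TEXT, body TEXT)"
)
# Only the metadata is indexed; the body column is never parsed by SQL.
conn.execute("CREATE INDEX docs_author ON docs (author)")
store(conn, DOC)

# A metadata query is ordinary SQL against the extracted columns.
rows = conn.execute(
    "SELECT title FROM docs WHERE author = ?", ("A. Writer",)
).fetchall()
print(rows)
```

The point of the sketch is the asymmetry: anything in the head is a plain
indexed column, while anything inside the body can only be reached by
pulling the text back out and parsing it again.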

For example, if you want to know the name of a play by Shakespeare that has
a character named "Puck", the RDBMS+CLOB approach will handle it; if you
need to know which scenes in Shakespeare plays have lines in which the
character Puck says something about Oberon, you'll probably need the
features of a native XML DBMS.
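
As a sketch of what the second kind of query looks like, assume markup
along the lines of PLAY/ACT/SCENE/SPEECH with SPEAKER and LINE children
(an assumption, as are the placeholder lines below, which are not real
Shakespeare).  A full XPath 1.0 engine could express it in one go,
roughly as

    //SCENE[SPEECH[SPEAKER='PUCK']/LINE[contains(., 'Oberon')]]

Python's xml.etree.ElementTree only supports a subset of XPath (no
contains()), so the version below uses a path predicate for the speaker
and does the text match in plain Python.  The function name is made up.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a marked-up play; the LINE text is placeholder only.
PLAY = """<PLAY>
  <TITLE>A Midsummer Night's Dream</TITLE>
  <ACT><TITLE>ACT II</TITLE>
    <SCENE><TITLE>SCENE I</TITLE>
      <SPEECH><SPEAKER>PUCK</SPEAKER>
        <LINE>Placeholder line that mentions Oberon.</LINE></SPEECH>
      <SPEECH><SPEAKER>FAIRY</SPEAKER>
        <LINE>Placeholder line about Oberon, not spoken by Puck.</LINE></SPEECH>
    </SCENE>
    <SCENE><TITLE>SCENE II</TITLE>
      <SPEECH><SPEAKER>PUCK</SPEAKER>
        <LINE>Placeholder line about something else.</LINE></SPEECH>
    </SCENE>
  </ACT>
</PLAY>"""

def scenes_where_speaker_mentions(root, speaker, word):
    """Titles of scenes in which `speaker` has a line containing `word`."""
    hits = []
    for scene in root.iter("SCENE"):
        # Structural part: speeches in this scene by the given speaker.
        for speech in scene.findall(f"SPEECH[SPEAKER='{speaker}']"):
            # Content part: does any of that speech's lines mention the word?
            if any(word in (line.text or "") for line in speech.findall("LINE")):
                hits.append(scene.findtext("TITLE"))
                break
    return hits

root = ET.fromstring(PLAY)
print(scenes_where_speaker_mentions(root, "PUCK", "Oberon"))  # ['SCENE I']
```

Note that the query needs the scene, the speech, the speaker and the line
text all at once, which is exactly the information that is locked inside
the CLOB in the RDBMS+CLOB layout.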

Which reminds me of another "rule" I'd suggest:  If you need to do queries
that exploit the recursive structure of your data, a native XML DBMS will
tend to be more appropriate than a "raw" RDBMS.  For example, if you need to
know which products contain assemblies, sub-assemblies, or
sub-sub-assemblies (and so on down) that contain a particular part,
XPath-based query languages will handle it better than SQL-based languages.
(The "bill of materials" query is a well-known challenge for SQL, but a
piece of cake in XPath.)  I guess to be fair, I'd have to say "if you need
to do queries that involve joins across metadata in different collections,
an RDBMS-based approach will tend to be more appropriate than a native XML
DBMS, at least until some flavor of XQuery is widely supported."
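
A small illustration of the bill-of-materials case, again in Python with
xml.etree.ElementTree.  The catalog/product/assembly/part vocabulary and
the part ids are invented for the example.  In full XPath the whole query
would be something like

    /catalog/product[.//part/@id='bearing-7']

and the sketch below does the same thing with the descendant step (".//")
that ElementTree supports, so the nesting depth never has to be spelled out.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; assemblies nest to arbitrary depth.
CATALOG = """<catalog>
  <product name="bike">
    <assembly name="wheel">
      <assembly name="hub">
        <part id="bearing-7"/>
      </assembly>
      <part id="spoke-2"/>
    </assembly>
  </product>
  <product name="scooter">
    <assembly name="deck"><part id="grip-1"/></assembly>
  </product>
</catalog>"""

def products_containing(root, part_id):
    """Names of products that contain `part_id` at any nesting depth."""
    return [
        product.get("name")
        for product in root.findall("product")
        # './/part' searches all descendants, however deeply nested.
        if product.find(f".//part[@id='{part_id}']") is not None
    ]

root = ET.fromstring(CATALOG)
print(products_containing(root, "bearing-7"))  # ['bike']
print(products_containing(root, "grip-1"))     # ['scooter']
```

The same question against a parent/child table needs either a recursive
query or one self-join per level of nesting, which is where the trouble
with SQL comes from.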

Ya know, I'm beginning to see the outline of a decision tree in here after
all!