RE: [xml-dev] XML Database Decision Tree?
- From: "Champion, Mike" <Mike.Champion@SoftwareAG-USA.com>
- To: xml-dev@lists.xml.org
- Date: Thu, 18 Oct 2001 09:50:51 -0400
> -----Original Message-----
> From: Leigh Dodds [mailto:ldodds@ingenta.com]
> Sent: Thursday, October 18, 2001 8:54 AM
> To: Magick, Brian; Ranjeet Sonone; xml-dev@lists.xml.org
> Subject: RE: [xml-dev] XML Database Decision Tree?
>
>
>
> > -----Original Message-----
> > From: Magick, Brian [mailto:Brian.Magick@compaq.com]
> > Sent: 17 October 2001 20:41
> > To: Ranjeet Sonone; xml-dev@lists.xml.org
> > Subject: RE: [xml-dev] XML Database Decision Tree?
> >
> It's a fairly common XML design pattern to have a head-body structure
> to your schema [2]. e.g. metadata (author, title, etc) and
> content (paras, markup, etc). Processing the XML to pull out the
> information in the head into standard relational tables, whilst leaving
> the content as a BLOB or referenced file meets most requirements.
>
> Generally speaking it's the head information that needs
> indexing, will be most queried on, etc. It's usually least likely to
> include mixed content which makes the decomposition easier.
That's a good point. Twisting it around to reflect my biases <grin>, I'd
put it something like: If you have "document" XML whose schema separates
cleanly into easily normalizable metadata and arbitrarily structured
text content, and most queries will be on the metadata, an XML-enabled RDBMS
that extracts some elements into tables and other elements into CLOBs may be
the most appropriate. If you anticipate frequent queries on the actual
document structure and content, a native XML DBMS may be more appropriate.
For example, if you want to know the name of a play by Shakespeare that has
a character named "Puck", the RDBMS+CLOB approach will handle it; if you
need to know which scenes in Shakespeare plays have lines in which the
character Puck says something about Oberon, you'll probably need the
features of a native XML DBMS.
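To make the second kind of query concrete, here is a minimal sketch in Python. It assumes PLAY/ACT/SCENE/SPEECH/SPEAKER/LINE element names and uses a tiny made-up sample document; the function name is hypothetical. The standard library's ElementTree only supports a subset of XPath (no contains()), so the text test is done in Python, with the full XPath 1.0 form given in a comment.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative sample; the element names are an assumption.
SAMPLE = """
<PLAY>
  <TITLE>A Midsummer Night's Dream</TITLE>
  <ACT>
    <SCENE><TITLE>SCENE I. A wood near Athens.</TITLE>
      <SPEECH><SPEAKER>PUCK</SPEAKER>
        <LINE>The king doth keep his revels here to-night:</LINE>
        <LINE>For Oberon is passing fell and wrath,</LINE>
      </SPEECH>
    </SCENE>
    <SCENE><TITLE>SCENE II. Another part of the wood.</TITLE>
      <SPEECH><SPEAKER>TITANIA</SPEAKER>
        <LINE>Come, now a roundel and a fairy song;</LINE>
      </SPEECH>
    </SCENE>
  </ACT>
</PLAY>
"""

def scenes_where_speaker_mentions(play, speaker, word):
    """Return titles of scenes in which `speaker` has a line containing `word`.

    Full XPath 1.0 equivalent for the Puck/Oberon case:
      //SCENE[SPEECH[SPEAKER='PUCK']/LINE[contains(., 'Oberon')]]
    """
    hits = []
    for scene in play.iter("SCENE"):
        # ElementTree supports the [child='text'] predicate.
        for speech in scene.findall("SPEECH[SPEAKER='%s']" % speaker):
            lines = speech.findall("LINE")
            if any(word in (line.text or "") for line in lines):
                hits.append(scene.findtext("TITLE"))
                break  # one hit per scene is enough
    return hits

play = ET.fromstring(SAMPLE)
print(scenes_where_speaker_mentions(play, "PUCK", "Oberon"))
```

The query depends on where a line sits inside a speech inside a scene, which is exactly the structure that is lost when the document body is stored as an opaque CLOB.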
Which reminds me of another "rule" I'd suggest: If you need to do queries
that exploit the recursive structure of your data, a native XML DBMS will
tend to be more appropriate than a "raw" RDBMS. For example, if you need to
know which products contain assemblies or sub-assemblies, or
sub-sub-assemblies ... that contain a particular part, XPath-based query
languages will handle it better than SQL-based languages. (The "bill of
materials" query is a well-known challenge for SQL, but a piece of cake in
XPath.) I guess, to be fair, I'd have to say "if you need to do queries that
involve joins across metadata in different collections, an RDBMS-based
approach will tend to be more appropriate than a native XML DBMS, at least
until some flavor of XQuery is widely supported."
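A rough sketch of the "bill of materials" case, again in Python with ElementTree. The catalog/product/assembly/part vocabulary, the part ids, and the function name are all invented for illustration. The point is that the descendant axis (`.//part`) reaches a part at any nesting depth without the query knowing how deep the assemblies go.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; assemblies nest to arbitrary depth.
CATALOG = """
<catalog>
  <product name="bicycle">
    <assembly name="wheel">
      <assembly name="hub">
        <part id="bearing-7"/>
      </assembly>
      <part id="spoke-2"/>
    </assembly>
  </product>
  <product name="scooter">
    <assembly name="deck">
      <part id="grip-tape"/>
    </assembly>
  </product>
</catalog>
"""

def products_containing(catalog, part_id):
    """Return names of products that contain `part_id` at any depth.

    Full XPath 1.0 equivalent for one part:
      /catalog/product[.//part[@id='bearing-7']]/@name
    """
    return [
        product.get("name")
        for product in catalog.findall("product")
        # './/part' searches all descendants, however deeply nested.
        if product.find(".//part[@id='%s']" % part_id) is not None
    ]

catalog = ET.fromstring(CATALOG)
print(products_containing(catalog, "bearing-7"))
```

With the same data flattened into a parent/child table, the SQL version would typically need either one self-join per nesting level or a recursive query, which is where the difficulty comes from.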
Ya know, I'm beginning to see the outline of a decision tree in here after
all!