Re: [xml-dev] XML aggregation question?
- From: Robert Koberg <rob@xxxxxxxxxx>
- To: xml-dev@xxxxxxxxxxxxx
- Date: Sun, 27 Aug 2006 11:30:50 -0400
>>>> So I have been sticking with the filesystem,
>>> Yes - I am a great believer in the filesystem as a simple database engine.
>>>
>>>> Apache's Lucene for indexing
>>> Yes
>>>
>>>> and CVS or Subversion for version control.
>>> Yes
>> From the code point of view, it works so nicely. Easy to develop on a
>> local machine just by doing a server commit and local update. Easy to
>> change on the server by doing a local commit and a server update (and
>> probably an Ant build). Same for content/metadata if necessary to change
>> in different locations -- just need to run a lucene update to keep the
>> UI experience in sync.
>
> Well, herein lies the crux of Mike's comments about if you don't start
> with a DB that's likely what you'll end up building to some degree.
I guess 'to some degree' is the main qualifier. Let's look at it:
- filesystem: this is perhaps the simplest of them all and probably does
not need comment.
----
- Lucene: this library is one of the cleanest and easiest to
use/implement. If you are dealing with XML (i.e. not wanting to index
tags/attribute names), though, you might need to write some code (SAX
preferably or even some DOMish thing) to handle indexing appropriately.
This can be used to tune the index as well. For example, you have some
description-type fields that allow em(phasis) and strong (emphasis) -
you can weight those terms/phrases more heavily than the
not-applicable-to-search and/or phatic text. Can you even do this type
of thing in an XML DB or RDB? (unless you use something like Lucene?)
You will also want to determine whether you store the indexed data in
the index (for quick/easy retrieval) or store references to pick back up
from the filesystem. (this is where an XML DB has its most charm to me)
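
To make the weighting idea concrete, here is a rough sketch of the kind of SAX pre-processing I mean. It is Python rather than Lucene code, and it only produces per-term weights that you could then feed to whatever indexer you use. The BOOSTS values, the handler name and the weigh_terms helper are all made up for illustration; nothing here is a Lucene API.

```python
import xml.sax
from collections import Counter

# Hypothetical boosts: text inside <em> or <strong> counts more than plain text.
BOOSTS = {"em": 2.0, "strong": 3.0}


class WeightedTextHandler(xml.sax.ContentHandler):
    """Collects character data only, so tag and attribute names are never indexed."""

    def __init__(self):
        super().__init__()
        self.stack = []           # names of the currently open elements
        self.buffer = []          # text chunks seen since the last tag boundary
        self.weights = Counter()  # term -> accumulated weight

    def _flush(self):
        # SAX may deliver one text node in several chunks, so join them first.
        text = "".join(self.buffer)
        self.buffer = []
        # Use the strongest boost among the open ancestors, defaulting to 1.0.
        boost = max((BOOSTS.get(name, 1.0) for name in self.stack), default=1.0)
        for raw in text.lower().split():
            term = raw.strip(".,;:!?()")
            if term:
                self.weights[term] += boost

    def startElement(self, name, attrs):
        self._flush()             # pending text belongs to the enclosing element
        self.stack.append(name)

    def endElement(self, name):
        self._flush()             # pending text belongs to the element being closed
        self.stack.pop()

    def characters(self, content):
        self.buffer.append(content)


def weigh_terms(xml_text):
    """Return a dict of term -> weight for one XML document given as a string."""
    handler = WeightedTextHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return dict(handler.weights)
```

For a document like `<item id="x1"><description>A <strong>fast</strong> index</description></item>`, "fast" ends up weighted above "a" and "index", and neither "item" nor "x1" is indexed at all. Whether you then store the text itself or just a path back to the file is a separate decision made at indexing time.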
----
- CVS/Subversion (oh, and when I say subversion, I mean using the
filesystem backend, not the Berkeley DB backend (licensing again)): this
is pretty much a no-brainer as well (I assume?). This allows you to
check out anywhere you give access and create an instant work, QA,
gold-master or runtime environment.
If you need something like version control in your app it will be much
easier with something that is filesystem based rather than held in some
binary format.
------------------------
Since this is an XML list, I will assume we are talking about XML. I will
also assume the main benefit of a DB is transactions (I don't think
eXist has that yet). But that can be handled by code at the level of a
well-formedness check and a validity check.
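
As a sketch of what "handled by code" could look like: parse the new content, run a validity check, and only then swap the file into place atomically. This is Python using only the standard library. The is_valid function is a hypothetical stand-in for real DTD or schema validation, and commit_xml is just a name I picked. It is not a real transaction either (no isolation, nothing spanning multiple files); it only guarantees a reader never sees a half-written or broken file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def is_valid(root):
    """Hypothetical validity rule; a real setup would validate against a DTD or schema."""
    return root.tag == "item" and root.find("description") is not None


def commit_xml(path, xml_text):
    """Write xml_text to path only if it is well-formed and passes is_valid.

    Returns True if the file was replaced, False if the content was rejected.
    """
    try:
        root = ET.fromstring(xml_text)   # well-formedness check
    except ET.ParseError:
        return False
    if not is_valid(root):               # validity check
        return False

    # Write to a temp file in the same directory, then rename over the target.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(xml_text)
        os.replace(tmp_path, path)       # atomic when both are on one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

A rejected write leaves the previous version of the file untouched, and the version control commit and the index update can then run only after commit_xml returns True.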
How does using a (XML)DB from the start compare?
best,
-Rob
> If
> I was managing "real" documents or providing more of a content
> management system, this would more than likely work. However, I want to
> be able to "slice and dice" my XML instances to provide different views
> or ways of accessing the instances based on values of specific
> attributes or elements.
>
> As I said originally, if I didn't care about wanting to keep the data in
> XML as the "native" format of the system for easy editing by hand (in
> particular for me, using vi on Linux) as well as providing more GUI
> based view/edit capabilities via a Web-based interface (most probably
> using XForms), I'd just forget about the XML aspect and it would be a
> "traditional" RDBMS application.
>
> Based on all the comments thus far as well as reading some of the
> articles/documentation on eXist, it would seem that an XML database is
> really the only viable choice if I want to keep my data as XML and still
> provide aggregated views across the instances based on values of
> attributes (or other expressions using XPath and/or XQuery).
>
> If I went with the "traditional" RDBMS approach, I'd be spending most of
> my application's CPU cycles going to and from XML, so the benefits of
> being able to use SQL to pull the list of instances really doesn't seem
> worth it. At the moment, I'm leaning towards trying eXist to see how
> well it'll work for what I want to do.
>
> Again, thanks for all the discussion so far. There's likely to be some
> additional comments during the week, so my decision's far from set in
> stone yet.
>
> Cheers,
>
> ast