Jeff Rafter wrote:
> I am curious if anyone has done any work on an XML enabled repository.
> The main features I am looking for are an XmlDiff and/or possibly
> canonicalization. The idea is that I am working with a group of
> distributed developers all plugging away on different editors-- some of
> which add in their own comments when you open a file or add editor
> specific info (especially when working with XML Schemas). So my thought
> was that it might be useful if XML was stored in the repository in a
> canonical form so that diff'ing it would be more reliable.
>
> Bonus features would be XQuery/XPath support built in... so that queries
> could be run against either all or portions of the repository.
>
> I am sure others on this list have experience in this area of holding
> XML in a repository... any suggestions?
(random thought)
How about this: use Apache's Lucene (Java,
http://jakarta.apache.org/lucene/docs/index.html) to store your XML.
There are at least two ways of doing this:
- Simply run all the content through a SAX ContentHandler to initially
populate the search index.
- Create a serializable XML object and serialize it to the search
index. (Of course, this or another index could also be used to store
other serializable objects.)
When indexing, you could add a field like 'xpath' at the element level
that stores the path to that element. One problem is that to update a
document you have to delete it from the index and add it again :(
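To make the 'xpath' field idea a bit more concrete, here is a minimal sketch
using only the JDK's SAX parser. It collects a (path, text) pair for each
element; in a real setup each pair would be handed to the Lucene indexer as
fields, which is deliberately left out here. The names PathCollector and
collect are made up for illustration, and the paths carry no positional
index, so repeated siblings end up sharing a path.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: gathers {path, text} pairs that an indexer could store as fields.
class PathCollector extends DefaultHandler {
    // Names of the currently open elements, root first.
    private final Deque<String> open = new ArrayDeque<>();
    // Direct text content of each open element, in the same order.
    private final Deque<StringBuilder> text = new ArrayDeque<>();
    // Collected {path, text} pairs, in the order the elements close.
    private final List<String[]> entries = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        open.addLast(qName);
        text.addLast(new StringBuilder());
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!text.isEmpty()) {
            text.peekLast().append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Build a simple slash-separated path, e.g. /doc/title.
        String path = "/" + String.join("/", open);
        entries.add(new String[] { path, text.removeLast().toString().trim() });
        open.removeLast();
    }

    // Parses the XML string and returns the collected {path, text} pairs.
    static List<String[]> collect(String xml) throws Exception {
        PathCollector handler = new PathCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.entries;
    }
}
```

Calling PathCollector.collect on a small document gives one entry per
element, child elements before their parent, since entries are recorded when
an element closes.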
Version the search index manually or automatically.
Run a search query to get document matches and shoot each matching
document out as SAX events.
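For that last step, a rough sketch of sending a stored document back out as
SAX events. It assumes the search hit gives you the document's XML as a
string; how that string comes back from the index is not shown. SaxReplay
and its methods are hypothetical names, and only JDK classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: re-parses stored XML and pushes the events to a consumer.
class SaxReplay {
    // Feeds the stored document to any ContentHandler as a stream of SAX events.
    static void replay(String storedXml, ContentHandler consumer) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(consumer);
        reader.parse(new InputSource(new StringReader(storedXml)));
    }

    // Example consumer: records element names in the order they are opened.
    static List<String> elementNames(String storedXml) throws Exception {
        List<String> names = new ArrayList<>();
        replay(storedXml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        });
        return names;
    }
}
```

Any existing ContentHandler (a serializer, a transformer, another indexer)
could be passed to replay in place of the example consumer.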
What do you think?
best,
-Rob
>
> Thanks,
> Jeff Rafter