- To: "Rick Marshall" <firstname.lastname@example.org>,"Michael Champion" <email@example.com>
- Subject: RE: [xml-dev] Designing XML to Support Information Evolution
- From: "Hunsberger, Peter" <Peter.Hunsberger@STJUDE.ORG>
- Date: Fri, 21 May 2004 10:30:45 -0500
- Cc: "xml-dev DEV'" <firstname.lastname@example.org>
- Thread-index: AcQ+xelEkEHx6dVfQre49U9ysOiT0wAgJt1A
- Thread-topic: [xml-dev] Designing XML to Support Information Evolution
Rick Marshall <email@example.com> writes:
> hierarchies fail, and this is my struggle with xml at the moment, when
> they have to support multiple hierarchies simultaneously. and they
> largely fail because of a) the update problem, and b) the new
> problem. reverse bill of materials is a case in point.
> having said that xml works really well where neither of these are an
> issue - documents where the "semantics" don't change only the
> and as i said before moving transactions between systems.
> even relational systems have problems because the semantics is embedded
> in the sql select statements. most so called post relational systems
> (not really sure that's a legitimate term, even though it's used a lot)
> basically embed semantics back into the structure.
> things like owl and to a lesser extent name spaces try to express the
> semantics as a meta model. imho a far superior approach. i just don't
> like naming relationships - prefer to acknowledge they exist and what it
> takes to define them, but not necessarily name them.
> now to xml and the cinderella id tag. the same effect as the
> hierarchical xml could be achieved by allowing a name/value pairing to
> store the structure as attributes in the xml tag and they should be
> treated as elements as well.
> the id tag is the required unique key, while special associate elements
> store structure. this has the advantage of flattening the xml and
> allowing the parsers to create structure on the fly to suit
> the translators.
> <home id="456"><home_elements/></home>
> <person id="123"><associate
> which would be approximately
> <home id="456">
> <person id="123">
> early days, but something like this would be much better for data
> modelling. perhaps we can have post-xml? ;)
Interesting, this is essentially the structure I was comparing to a
structured hierarchy in the "Parallel tree traversal" thread. It turns out
that once I fixed all my XSLT bugs and cleaned up the code, the
version that used the structured hierarchy runs about an order of
magnitude faster than the version that attempts to stitch the hierarchy
together from flat data using id/idref.
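For readers who want to see what "stitching" involves, here is a minimal Python sketch. It is not the XSLT from that thread. The `parent` attribute is a hypothetical stand-in for the id/idref link (the quoted associate example is cut off, so its real shape is unknown), and the sample ids simply reuse the ones quoted above. The point is only that the flat form needs an index build plus a lookup per record before any tree exists, whereas a structured hierarchy can be walked directly.

```python
import xml.etree.ElementTree as ET

# Flat records; "parent" is a hypothetical reference attribute (an assumption).
FLAT = """<data>
  <home id="456"/>
  <person id="123" parent="456"/>
  <person id="124" parent="456"/>
</data>"""


def stitch(flat_root):
    """Build nested trees from flat records linked by id / parent references."""
    # Pass 1: index a fresh copy of every record by its unique id.
    by_id = {}
    for rec in flat_root:
        by_id[rec.get("id")] = ET.Element(rec.tag, {"id": rec.get("id")})

    # Pass 2: attach each record to the record it references, if any.
    roots = []
    for rec in flat_root:
        node = by_id[rec.get("id")]
        parent_id = rec.get("parent")
        if parent_id is None:
            roots.append(node)
        else:
            by_id[parent_id].append(node)
    return roots


roots = stitch(ET.fromstring(FLAT))
for root in roots:
    print(root.tag, root.get("id"), [child.get("id") for child in root])
```

A dictionary makes each lookup cheap here; a transform that rescans the flat list for every reference would pay far more, which is one plausible way a stitched version ends up much slower.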
I need a little more testing on the insert/update side, but I expect I'm
going to proceed with a version of our code that can spit out multiple
hierarchies cutting across our relationship lattice on demand instead of
trying to glue this together on the XML side. More XML output
(redundant trees), but at least in our case normalization costs too much
in terms of performance and the extra space consumption can be handled:
the redundant data is generated only as needed from a normalized
database and not persisted anywhere. It chews up app server memory, but
we're talking at most maybe 100 MB (if every model gets cached, which in
our case will happen over time). A GB of memory is cheap enough that
once more, throwing hardware at an XML problem trumps trying to spend
too much time optimizing it.
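As a rough illustration of generating redundant hierarchies on demand from normalized data and caching them in memory, here is a small Python sketch. Everything in it is assumed for the example: the `EDGES` relation, the node names, and the use of `functools.lru_cache` as the cache. It also assumes the relation has no cycles.

```python
from functools import lru_cache

# Hypothetical normalized relation: (parent, child) pairs forming a lattice.
# person:123 is reachable from two different parents.
EDGES = (
    ("home:456", "person:123"),
    ("home:456", "person:124"),
    ("employer:9", "person:123"),
)


@lru_cache(maxsize=None)
def subtree(root):
    """Return the hierarchy under root as a nested (node, children) tuple.

    Results are cached in memory and never written back to the store.
    Assumes the relation is acyclic.
    """
    children = tuple(subtree(child) for parent, child in EDGES if parent == root)
    return (root, children)


# Two hierarchies cut across the same lattice; person:123 appears in both.
print(subtree("home:456"))
print(subtree("employer:9"))
```

The store stays normalized, and each requested view is a denormalized tree that is built once and then served from the cache.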
More and more, I'm seeing that XML application optimization comes down
to explicitly exploiting the known algorithms for fast tree traversal
and generation and not trying to re-invent normalization from within
XSLT (or Java transforms for that matter)...