xml-dev - RE: [xml-dev] After XQuery, are we done?


To: "Gavin Thomas Nicol" <gtn@rbii.com>,"XML Developers List" <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] After XQuery, are we done?
From: "Hunsberger, Peter" <Peter.Hunsberger@STJUDE.ORG>
Date: Wed, 27 Oct 2004 09:28:35 -0500
Thread-index: AcS7z3olkHgpkEF5QIyJTqDNzmbqqwAWdzuw
Thread-topic: [xml-dev] After XQuery, are we done?

Gavin Thomas Nicol <gtn@rbii.com> writes:
> 
> On Oct 26, 2004, at 10:02 AM, Hunsberger, Peter wrote:
> > Well, you have to serialize and de-serialize, yes, but you may have 
> > better ways of portraying graph structure.  In particular, id and 
> > idref gets a little painful if you're trying to do a lot of many to 
> > many mappings; you really want to normalize out the groupings of 
> > idrefs and use some explicit form of sub-graphs.  XML gets fragile
> > very quickly when you've got multiple paths through the network, 
> > picking the right path for any given context requires extra 
> > meta-metadata that is hard to manage.
> 
> I wonder how much of the pain is XML and how much is the encoding of 
> your data? For example, if ID/IDREF becomes painful, couldn't you use 
> something else? One way or another your application will have to 
> interpret the data after all? Just wondering...
> 

We do use something else instead of id/idref.  In particular, we use
loose matching against element names, attribute names, attribute values
and sometimes text values.  E.g.:

<layout type="grid">
	<a type="string"/>
	<b type="string"/>
	<c type="key"/>
</layout>
<data>
	<a dataId="1">some value 1</a>
	<a dataId="2">some value 2</a>
	<a dataId="3">some value 3</a>
	<b dataId="1">some value 4</b>
	<b dataId="2">some value 5</b>
	<b dataId="3">some value 6</b>
	<c dataId="1">123456</c>
	<c dataId="2">123457</c>
	<c dataId="3">123458</c>
</data>

If you do the matching via element names, you've got to add extra
attributes in order to validate (at the schema level or against the
back-end data).  If you do the matching via attributes (<data name="a"
dataId="1">some value 1</data>), then it's perhaps easier to manage with
the XML tools but harder for the humans.  It doesn't really matter to
me; we use both patterns and they both work.
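
To make the element-name matching concrete, here is a minimal sketch in
Python. It is an illustration only, not our actual code: the <root>
wrapper and the function name grid_rows are assumptions added so that
the fragment parses as a single document. It collects the column names
from <layout> and groups the <data> children into rows by dataId.

```python
import xml.etree.ElementTree as ET

# The example from the text, wrapped in a hypothetical <root> element
# so that it parses as one well-formed document.
DOC = """<root>
  <layout type="grid">
    <a type="string"/>
    <b type="string"/>
    <c type="key"/>
  </layout>
  <data>
    <a dataId="1">some value 1</a>
    <a dataId="2">some value 2</a>
    <a dataId="3">some value 3</a>
    <b dataId="1">some value 4</b>
    <b dataId="2">some value 5</b>
    <b dataId="3">some value 6</b>
    <c dataId="1">123456</c>
    <c dataId="2">123457</c>
    <c dataId="3">123458</c>
  </data>
</root>"""


def grid_rows(xml_text):
    """Match data cells to layout columns by element name (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Column order comes from the order of the children of <layout>.
    columns = [col.tag for col in root.find("layout")]
    rows = {}
    for cell in root.find("data"):
        if cell.tag not in columns:
            continue  # loose matching: data without a layout column is skipped
        rows.setdefault(cell.get("dataId"), {})[cell.tag] = cell.text
    return columns, rows


columns, rows = grid_rows(DOC)
for data_id, row in rows.items():
    print(data_id, [row.get(name) for name in columns])
```

Nothing in the markup itself ties <a> under <layout> to <a> under
<data>; the name is the only link, which is why validation needs the
extra attributes mentioned above.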

Changing this from a graph to a tree is trivial in this case (add a root
node). The issues start to get messier when you've got to add some
cross-cutting aspect. For example, perhaps each piece of data comes from
a different source and you've got different manipulation privileges on
various data items.  So now you've got to add an authorizations tree,
and then you've got to add a data-to-authorizations mapping tree, and at
this point you no longer have something that's easy to understand or
manipulate.  If you've ever written an XPath like:

	/root/authmapp/*[@name = current()/@name]
		/auth[@groupId = /root/service/groupId][contains(text(),'Write')]

then really, what you are doing is graph manipulation and not tree
manipulation: "jump back to the root (or some other arbitrary
well-defined point) and match it up to the current node" translates into
"see if there is an arc from some point X in the graph to this point".
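
As a rough illustration of that "is there an arc" reading, the sketch
below spells out the same test in Python. The document layout (the
<item> and <field> element names, the group ids, the permission
strings) and the function name can_write are all assumptions made up
for the example; only the path steps are taken from the XPath above.
ElementTree's XPath subset has no current() or contains(), so the
predicates are written as explicit loops.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped to fit the XPath in the text.
DOC = """<root>
  <service><groupId>g1</groupId></service>
  <authmapp>
    <item name="a">
      <auth groupId="g1">Read,Write</auth>
      <auth groupId="g2">Read</auth>
    </item>
    <item name="b">
      <auth groupId="g1">Read</auth>
    </item>
  </authmapp>
  <data>
    <field name="a">some value 1</field>
    <field name="b">some value 4</field>
  </data>
</root>"""


def can_write(root, current):
    """Return True if an 'arc' links the current node to a Write auth entry."""
    # /root/service/groupId (this sketch only looks at the first one)
    group_id = root.findtext("service/groupId")
    # /root/authmapp/*[@name = current()/@name]
    for mapped in root.find("authmapp"):
        if mapped.get("name") != current.get("name"):
            continue
        # auth[@groupId = ...][contains(text(), 'Write')]
        for auth in mapped.findall("auth"):
            if auth.get("groupId") == group_id and "Write" in (auth.text or ""):
                return True
    return False


root = ET.fromstring(DOC)
for field in root.find("data"):
    print(field.get("name"), can_write(root, field))
```

The function never uses the position of the current node in the tree,
only its name: it jumps to a fixed point and looks for a matching edge,
which is the graph lookup described above.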

If you've got to start to drag in external document references then
loose coupling via element or attribute names etc. starts to become
fragile.  If the documents are produced in different domains then you've
got to start spending resources ensuring that mappings even exist long
before you can start to make them robust.  (Maybe that's ok, that's
partly why we have jobs.)
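
A first step in "ensuring that mappings even exist" can be as simple as
comparing the names used on each side. The sketch below assumes two
separately produced documents, a layout and a data document, that are
coupled only by element names; the sample documents and the function
name unmatched_names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents produced in different domains.
LAYOUT_DOC = """<layout type="grid">
  <a type="string"/>
  <b type="string"/>
  <c type="key"/>
</layout>"""

DATA_DOC = """<data>
  <a dataId="1">some value 1</a>
  <d dataId="1">no matching column</d>
</data>"""


def unmatched_names(layout_xml, data_xml):
    """Report element names that have no counterpart in the other document."""
    layout_names = {el.tag for el in ET.fromstring(layout_xml)}
    data_names = {el.tag for el in ET.fromstring(data_xml)}
    data_only = sorted(data_names - layout_names)    # data with no column
    layout_only = sorted(layout_names - data_names)  # columns with no data
    return data_only, layout_only


data_only, layout_only = unmatched_names(LAYOUT_DOC, DATA_DOC)
print("data without layout:", data_only)
print("layout without data:", layout_only)
```

A check like this only shows that a name exists on both sides; it says
nothing about whether the two domains mean the same thing by it.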

The bottom line is that XML can't serialize graphs completely cleanly
(and nor would I expect it to). I'll try and find time to reply to
Rick's comments on the trees vs. graphs issue to take a blind stab at
what alternatives might look like (and whether we really need them).  

In the meantime, let me make it clear, I'm not unhappy with using XML
to get the job done.  It is a massive leap forward from the tools we've
had to do this in the past.  I've said it before, but once more: the
standardization of algorithms and best practices for tree manipulation
is the real blessing XML has bestowed on the IT industry.  Regardless of
serialization and data format issues and strange legacy corner cases,
XML has done more to move the industry forward than almost any other
technology of the last 20 years simply because many, many individuals no
longer have to worry about things like coding a binary tree traversal
ever again.
