
Re: [xml-dev] DOM or SAX: Sense and Sensibility



On 01/11/07 10:58 PM, "PaulT" <pault12@pacbell.net> wrote:

> 
> ----- Original Message -----
> From: "Bob Hutchison" <hutch@xampl.com>
> 
>> What's the consequence of getting it wrong? Serious trouble. You end up with
>> slow, ugly, unmaintainable code. Worse, I've seen developers use the
>> resultant mess as a reason to avoid XML altogether (we really are still in
>> the early days of XML). A little real-life example... Consider an XML-based
>> specification of some events and their response (not SAX events, but
>> application events). The code that interprets the XML file was using a DOM
>> representation, and it wasn't great code -- someone's first attempt at using
>> the DOM wound up in the product (that's another issue).
> 
> I think that was not the developer's fault. In the absence of XPath
> in DOM, every developer had to re-invent the XPath
> wheel, because manual navigation in a DOM tree is masochistic.

I am not intending to blame the developer here. In fact I was quite pleased
he tried to do something using XML. The problem was he chose the wrong tool.
The resulting mess is not his responsibility.

> 
> I think that now, when DOM finally gets XPath, many people
> will just use XPath, and even a novice developer could write
> the code you're talking about so that it will be clean and tiny.
> 
> The code for the task you're talking about would be
> something like :
> 
> event = root.GetByXPath ("/config/events/event[name = 'event_name']");
> event_attribute = event.GetByXPath("./@attribute");
> 
> etc. XPath makes a *huge* difference. It is slow,
> sometimes terribly slow (Perl/Python), but that should not be
> a problem for many applications.

This would have been better. It's probably inconsequential nit-picking, but
the events are not known by name; they have to be iterated over. The format
of the event description is consistent.

Actually, the DOM code itself could probably have been improved
significantly. But it wasn't (and that's part of the point, I think).
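
To make the iteration point concrete, here is a minimal sketch of walking
every event with DOM plus XPath. It assumes the JDK's javax.xml.xpath API and
borrows the /config/events/event layout, the name child and the 'attribute'
attribute from the example quoted above; the class and method names are
hypothetical, not the code from the product.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists every event instead of looking one up by name.
final class EventLister {

    // Returns one "name -> attribute" string per <event>, in document order.
    static List<String> describeEvents(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select all events; no name predicate is needed.
            NodeList events = (NodeList) xpath.evaluate(
                    "/config/events/event", doc, XPathConstants.NODESET);

            List<String> out = new ArrayList<>();
            for (int i = 0; i < events.getLength(); i++) {
                Element event = (Element) events.item(i);
                // Relative expressions are evaluated against the current event.
                String name = xpath.evaluate("name", event);
                String attribute = xpath.evaluate("@attribute", event);
                out.add(name + " -> " + attribute);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not read event file", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<config><events>"
                + "<event attribute='a'><name>first</name></event>"
                + "<event attribute='b'><name>second</name></event>"
                + "</events></config>";
        describeEvents(xml).forEach(System.out::println);
    }
}
```

The loop is the only real difference from the lookup-by-name version: the
node set comes back in document order, and each event is then queried with
short relative paths.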

> 
>> The system I was working on has a different issue: neither SAX nor DOM is
>> quite right. You want an in-memory representation of a collection of
>> 'documents' but the DOM is too 'distant' -- the DOM is about the document,
>> we needed something else. I wound up writing an XML data binding (didn't
>> know it was called that then :-) This was *really* easy to get all the
>> developers in my group (about 17) using this.
> 
> XML Data Binding rocks, but DOM + XPath can replace
> it in many cases. DOM and XPath are still kinda fat, slow
> and 'standalone' (they're not native entities in many
> existing languages/tools, the way 'Hashtable' is, for example),
> but those are minor problems, I think. Should work fine for many applications.

Personally, I think that data binding is a significant and ignored
technology. I understand why it is ignored -- it is felt to be not ready for
real use. This will be changing soon, and I think a lot of issues and their
solutions will have to be re-examined.

We (my group, not the group with the DOM problem) used the data binding tool
I wrote to build up an in-memory object structure that would live in a
web-server. Clients would communicate with this application using XML
messages (with their own data binding) that would trigger handlers in the
application. This in-memory representation was persisted using XML. The size
(about 2000 classes, 1300 generated from about 800 XML elements in a dozen
or so namespaces -- but this is really misleading) and complexity of this
structure (it had lots of cyclic references) would pretty much preclude the
DOM + XPath route. Again the DOM is a meta description of something we
needed to implement more directly. XPath, it seems to me, takes advantage of
this distance.

> 
>> As a matter of interest, recently I've used the pull-parser XPP. I have a
>> suspicion that this reversal of control flow will make it easier for a lot
>> of people to work with event based processing. Personally, I think I still
>> prefer SAX.
> 
> And you use SAX for what? I guess to write your application-specific
> data binding.

Yes, initially. But also for a lot of other things :-) One of the more interesting
applications was using an XML document and SAX events to drive an
application for testing purposes. Someone could write up a script in XML and
have the system run it. It was actually much more sophisticated than I'm
making it sound -- it could record and playback sessions in real time or
sped up (maintaining event order but not necessarily inter-event time). This
gave us a way to automate acceptance tests (we were an XP shop) and
regression tests.
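
A rough sketch of the idea, for illustration only: a SAX handler that treats
each element of a script as a step to run, in document order. The <step>
element and its 'action' and 'delayMillis' attributes are assumptions made up
for this example, not the format we actually used, and the "action" here is
just recorded in a list rather than driving a real application.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical script runner: every <step> start tag triggers one action.
final class ScriptRunner extends DefaultHandler {
    private final List<String> performed = new ArrayList<>();
    private final boolean realTime;

    ScriptRunner(boolean realTime) {
        this.realTime = realTime;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (!"step".equals(qName)) {
            return; // ignore the wrapper element and anything unknown
        }
        String delay = atts.getValue("delayMillis");
        if (realTime && delay != null) {
            pause(Long.parseLong(delay)); // keep the recorded inter-event time
        }
        // Sped-up playback skips the pause but keeps the event order.
        performed.add(atts.getValue("action"));
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Parses the script and returns the actions in the order they ran.
    static List<String> run(String xml, boolean realTime) {
        try {
            ScriptRunner runner = new ScriptRunner(realTime);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), runner);
            return runner.performed;
        } catch (Exception e) {
            throw new IllegalStateException("could not run script", e);
        }
    }

    public static void main(String[] args) {
        String script = "<script>"
                + "<step action='login' delayMillis='50'/>"
                + "<step action='search' delayMillis='50'/>"
                + "<step action='logout'/>"
                + "</script>";
        System.out.println(run(script, false));
    }
}
```

Because SAX delivers start tags in document order, the script order is the
execution order for free; real-time versus sped-up playback only changes
whether the handler waits between steps.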

> 
> Now consider, if you have DOM with XPath, would you bother
> translating ( manually, with SAX )
> 
> <doc title="title">
> <elements>
> <e id='1'>value</e>
> <e id='2'>value</e>
> </elements>
> </doc>
> 
> Into
> 
> class Doc {
>   Array getElements();
>   String getTitle();
> }
> 
> so that
> 
> title = doc.getTitle();
> elements = doc.getElements();
> 
> Or you'd just  do:
> 
> title = root.GetByXPath ("/doc/@title");
> elements = root.GetByXPath("/doc/elements/*");

Well, the data binding tool is more generic than you might think. It takes a
collection of XML documents (not schemas or DTDs) and generates Java classes
from them. The generating tool does this automatically. At runtime, a couple
of lines of Java are enough:

Builder builder = new Builder();
builder.build("file-name or the usual Java io suspects");
IDoc doc = (IDoc)builder.root();

At which point we have an in-memory representation of the document, where
you could write:

String title = doc.getTitle();
Iterator elements = doc.allElements();
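
For comparison, this is roughly what the manual SAX translation you describe
looks like for your sample document. It is a hand-written sketch, not the
output of the generating tool: BoundDoc and DocBinder are hypothetical names,
and only getTitle() and allElements() mirror the calls shown above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical bound object for <doc title="..."><elements><e>..</e></elements></doc>.
final class BoundDoc {
    private final String title;
    private final List<String> elements;

    BoundDoc(String title, List<String> elements) {
        this.title = title;
        this.elements = elements;
    }

    String getTitle() {
        return title;
    }

    Iterator<String> allElements() {
        return elements.iterator();
    }
}

// Hand-written SAX handler that fills in a BoundDoc.
final class DocBinder extends DefaultHandler {
    private String title;
    private final List<String> elements = new ArrayList<>();
    private StringBuilder text; // non-null only while inside an <e>

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if ("doc".equals(qName)) {
            title = atts.getValue("title");
        } else if ("e".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length); // may be called several times per <e>
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("e".equals(qName)) {
            elements.add(text.toString());
            text = null;
        }
    }

    static BoundDoc bind(String xml) {
        try {
            DocBinder binder = new DocBinder();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), binder);
            return new BoundDoc(binder.title, binder.elements);
        } catch (Exception e) {
            throw new IllegalStateException("could not bind document", e);
        }
    }
}
```

Even for a document this small the handler needs state to know where it is,
which is the part that generating the classes takes off the developer's
hands.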

I think this is starting to go off topic a bit, but I'd be happy to talk
about this on a more suitable thread :-)

> 
>> I could go on...
> 
> Would you? I found your letter to be very reasonable
> and pragmatic.

Thanks. This might be considered by some, like those that know me, to be a
dangerous thing to encourage :-)


Cheers,
Bob
> 
> Rgds.Paul.
>