   Re: [xml-dev] Question for the XPath and DOM folks

> 7/20/2002 4:28:44 PM, Uche Ogbuji <uche.ogbuji@fourthought.com> wrote:
> >> 
> >> "The XPath model relies on the XML Information Set [XML Information set]
> >> and represents Character Information Items in a single logical text node
> >> where DOM may have multiple fragmented Text nodes due to cdata sections,
> >> entity references, etc. Instead of returning multiple nodes where XPath sees
> >> a single logical text node, only the first non-empty DOM Text or
> >> CDATASection node of any logical XPath text will be returned in the node
> >> set. 
> >
> >Yikes!   This is a *very* *very* bad job.  Luckily that spec is still a WD and 
> >I hope they'll fix it before release.  If they can't do better than that then 
> >they should just leave DOM/XPath interaction to application specifics.
> Well, it was a VERY VERY VERY bad job for the "W3C" (if one can think of it
> as a unified entity rather than a collection of working groups  made up
> of competitors, loosely coordinated by the staff and director) 
> to have created the situation where there are multiple,
> inconsistent data models defined by various XML-related Recommendations.

Agreed.  Of course I shudder to think of how the simplicity and elegance of 
XPath would have been marred by a full-blown DOM model.

I think DOM should have done what SAX did.  Level one handles the 80/20: 
elements/attributes/text.  Add everything else as optional higher levels.  
Most E/A/T-based specs have compatible data models, so I suspect there would 
have been less problem.

> More importantly, the W3C has learned from this mistake, and I don't
> think it would happen under the current organization and process.

I suppose this "learning" is what is causing the XQuery/XPath meld?  If so, 
I'm not sure there is a net gain from learning such a lesson if specs 
influence each other to become *more* complex.

> In the long run my personal (and official corporate, FWIW) position is that the
> data models MUST be reconciled, even at the cost of some backwards incompatibility.
> ("Re-breaking the bone so that it can heal cleanly" is my favorite metaphor here).
> In the short run, it's not at all clear what is to be done.  I/we do not want
> to hold DOM Level 3 hostage to this, however, because it could take awhile ....
> DOM Level 3 provides basically 2 ways to deal with this:  Load-time options to
> create an "InfoSet" view with no CDATA sections and unexpanded entity references,

This is a good start.  Basically, it's adding the more modest profile I talk 
about above, but ex post facto.  Not ideal, but the best of a bad pick.  It 
does sharply reduce my objection to XPath/DOM.  I as an implementor would 
probably end up *mandating* this Infoset view on DOMs on which the user chooses 
to call XPath APIs.
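
To make the "Infoset view" idea concrete, here is a minimal sketch using 
Python's xml.dom.minidom.  It is an after-the-fact approximation, not the DOM 
Level 3 load-time option itself: minidom does not implement that option as far 
as I know, and the helper names replace_cdata and to_infoset_view are my own 
invention.  It assumes entity references were already expanded by the parser 
and only deals with CDATA sections and adjacent text nodes.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def replace_cdata(node, doc):
    """Swap every CDATASection below `node` for a plain Text node."""
    for child in list(node.childNodes):  # copy: we mutate while iterating
        if child.nodeType == Node.CDATA_SECTION_NODE:
            node.replaceChild(doc.createTextNode(child.data), child)
        elif child.nodeType == Node.ELEMENT_NODE:
            replace_cdata(child, doc)


def to_infoset_view(doc):
    """Hypothetical helper: leave one Text node per logical run of text."""
    replace_cdata(doc.documentElement, doc)
    # normalize() merges adjacent Text nodes and recurses into elements.
    doc.documentElement.normalize()
    return doc


doc = parseString("<p>one <![CDATA[<two>]]> three</p>")
to_infoset_view(doc)
merged = doc.documentElement.firstChild.data
```

After the call, the p element has a single Text child, which is what an XPath 
text() step expects to see.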

> and the XPath interfaces to allow one to essentially translate between the XPath
> view of a document and the DOM view of a document.  The key point is that an
> XPathResult doesn't return "a" node, it returns a way of iterating across the
> DOM view of the nodes corresponding to the XPath view of the nodes.
> That's what the "manually gather" bit means here:

Ah.  Mike Olson, who has recently been looking into DOML3/XPath (he's always 
been our DOM champion) mentioned this, and he was very excited about it.  It 
translates *extremely* well to Python 2.2, where you can end up just returning 
generator objects from the XPath data model, with *huge* efficiencies to be 
gained thereby.
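
As a rough illustration of the generator approach, the sketch below walks the 
DOM nodes that make up one logical XPath text node, starting from the first 
Text or CDATASection node.  It uses xml.dom.minidom and current Python syntax 
rather than 2.2, the function names are hypothetical, and it ignores 
EntityReference nodes, which a fuller version would have to descend into.

```python
from xml.dom import Node
from xml.dom.minidom import Document

TEXT_TYPES = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def logical_text_nodes(first):
    """Lazily yield `first` and each adjacent Text/CDATASection sibling."""
    node = first
    while node is not None and node.nodeType in TEXT_TYPES:
        yield node
        node = node.nextSibling


def logical_text(first):
    """Gather the string value of the logical text node starting at `first`."""
    return "".join(n.data for n in logical_text_nodes(first))


# Build <p>one <![CDATA[<two>]]> three<br/>after</p> by hand.
doc = Document()
para = doc.createElement("p")
doc.appendChild(para)
para.appendChild(doc.createTextNode("one "))
para.appendChild(doc.createCDATASection("<two>"))
para.appendChild(doc.createTextNode(" three"))
para.appendChild(doc.createElement("br"))
para.appendChild(doc.createTextNode("after"))

gathered = logical_text(para.firstChild)
fragments = list(logical_text_nodes(para.firstChild))
```

The walk stops at the br element, so the trailing "after" belongs to a 
separate logical text node.  Nothing is copied until the caller iterates.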

> >> Applications using XPath in an environment with fragmented text nodes
> >> must manually gather the text of a single logical text node possibly from
> >> multiple nodes beginning with the first Text node or CDATASection node
> >> returned by the implementation."
> Just to make life more interesting, there's a couple more issues to wrestle with:
> how to map the XPath "nodes have a namespace property" view onto the DOM "namespace
> declaration nodes upwards in the tree define the namespace a node is in" view;

Yeah.  Fix DOM.  It's broken.  DOM L2's treatment of namespaces is one of the 
most egregious examples of design-by-committee that I've ever seen.  I can 
almost reproduce the committee discussion in my head by reading the spec.  
Member A says "Damn it!  Namespaces are local properties of the node.  
Period".  Member B says "Damn it!  We signed up to handle XML 1.0, not this 
newfangled namespaces thing.  Namespace declarations are just special 
attributes that can be interpreted with the XMLNS pixie dust semantics by 
apps, if they so choose."  The committee is deadlocked, so both views, 
regardless of the fact that they are contradictory, are directly supported.  
The result is DOM Level 2.
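
A small sketch of the two views side by side, using xml.dom.minidom as a 
stand-in for a DOM Level 2 implementation (the urn: namespace names and the 
element names are made up for the example).  The same namespace shows up both 
as a property of the node and as an ordinary attribute, and nothing stops the 
two from disagreeing.

```python
from xml.dom import XMLNS_NAMESPACE
from xml.dom.minidom import parseString

doc = parseString('<a xmlns:x="urn:example"><x:b/></a>')
root = doc.documentElement
child = root.firstChild

# View 1: the namespace is a local property of the node.
child_ns = child.namespaceURI

# View 2: the declaration is just an attribute in the xmlns namespace.
decl_value = root.getAttributeNS(XMLNS_NAMESPACE, "x")

# The views can drift apart: this element claims a namespace that no
# declaration attribute anywhere in the tree backs up.
orphan = doc.createElementNS("urn:other", "y:c")
root.appendChild(orphan)
orphan_ns = orphan.namespaceURI
has_decl = root.hasAttributeNS(XMLNS_NAMESPACE, "y")
```

Here orphan_ns is "urn:other" while has_decl is False, so code that trusts the 
node property and code that trusts the declaration attributes will give 
different answers about the same tree.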


I must say, Mike.  Knowing you, I bet you fought for a simpler resolution than 
what resulted.  This is one example of why I'd like the W3C veil of secrecy 
abolished.  I want to be able to flame whoever led us into this mess.  Of 
course, this is precisely why the veil is never likely to be lifted  ;-)

> and
> how to deal with the fact that the XPath data model is in flux.

Again simple.  Stick to XPath 1.0.

Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
The many heads of XML modeling - http://adtmag.com/article.asp?id=6393
Will XML live up to its promise? - http://www-106.ibm.com/developerworks/xml/li

