xml-dev - UNSUBSCRIBE UNSUBSCRIBE UNSUBSCRIBE Re: [xml-dev] Question for the XPath

To: Uche Ogbuji <uche.ogbuji@fourthought.com>
Subject: UNSUBSCRIBE UNSUBSCRIBE UNSUBSCRIBE Re: [xml-dev] Question for the XPath and DOM folks
From: Edward Gloor <egloor@qwest.com>
Date: Tue, 23 Jul 2002 14:37:35 -0600
Cc: Dare Obasanjo <dareo@microsoft.com>, xml-dev@lists.xml.org
References: <E17W07g-0005zM-00@malatesta.local>

UNSUBSCRIBE

Uche Ogbuji wrote:

> > Given the following XML in a DOM document
> >
> > <foo>
> > bar
> > <![CDATA[
> > baz
> > ]]>
> > quux
> > </foo>
> >
> > and the following XPath
> >
> > //text()
> >
> > what should be the resulting DOM nodes and why? I can think of two answers but they both have problems.
> >
> > PS: Why is http://www.w3.org/TR/2002/WD-DOM-Level-3-XPath-20020712/ returning a 404 when it is linked from http://www.w3.org/DOM/ ?
> >
>
> XPath is defined against a certain model of an XML document.  The section that
> answers your question is 5.7:
>
> "Character data is grouped into text nodes. As much character data as possible
> is grouped into each text node: a text node never has an immediately following
> or preceding sibling that is a text node. The string-value of a text node is
> the character data. A text node always has at least one character of data.
>
> "Each character within a CDATA section is treated as character data. Thus,
> <![CDATA[<]]> in the source document will treated the same as &lt;. Both will
> result in a single < character in a text node in the tree. Thus, a CDATA
> section is treated as if the <![CDATA[ and ]]> were removed and every
> occurrence of < and & were replaced by &lt; and &amp; respectively."
>
> Therefore to a conforming XPath processor,
>
> <foo>
> bar
> <![CDATA[
> baz
> ]]>
> quux
> </foo>
>
> is precisely the same as
>
> <foo>
> bar
> baz
> quux
> </foo>
>
> i.e. one element node with one text node child.
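
A rough way to check the equivalence described above with only the Python standard library. This is a sketch, not 4XPath: it assumes xml.etree.ElementTree, which (like the XPath model) does not keep CDATA sections as separate nodes. The names WITH_CDATA and WITHOUT_CDATA are made up for the illustration, and the second string keeps the newlines that sat around the CDATA markers so the comparison is exact.

```python
import xml.etree.ElementTree as ET

# The document from the question, CDATA section included.
WITH_CDATA = "<foo>\nbar\n<![CDATA[\nbaz\n]]>\nquux\n</foo>"

# The same character data with only the CDATA markers removed.
# (Assumption for this sketch: the surrounding newlines are preserved.)
WITHOUT_CDATA = "<foo>\nbar\n\nbaz\n\nquux\n</foo>"

# ElementTree folds all character data before the first child element
# into the element's .text, whether or not it came from a CDATA section.
text_a = ET.fromstring(WITH_CDATA).text
text_b = ET.fromstring(WITHOUT_CDATA).text

print(repr(text_a))
print(text_a == text_b)
```

Both parses should end up with the same single run of character data under foo, which is the "one element node with one text node child" picture.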
>
> There is actually an open bug against 4XPath right now: it leaks a bit in
> this respect.  E.g. in some cases it can return a text node child of an
> attribute when operating on a DOM (attributes have text node children in
> DOM, but not in XPath).  Your post is a handy reminder for me to fix this bug.
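
To make the DOM versus XPath mismatch concrete, here is a small sketch using xml.dom.minidom from the Python standard library rather than 4Suite. The DOM tree keeps the CDATA section as its own node, so foo has three children; the helper xpath_text_values is hypothetical (not part of any library) and shows one way an XPath layer sitting on a DOM could coalesce adjacent character-data nodes as section 5.7 requires.

```python
from xml.dom import minidom, Node

DOC = "<foo>\nbar\n<![CDATA[\nbaz\n]]>\nquux\n</foo>"

# Node types that count as character data in the XPath model.
CHAR_DATA = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def xpath_text_values(element):
    """Hypothetical helper: string-values of the XPath text-node children.

    Adjacent DOM Text and CDATASection children are joined into one value,
    and empty runs are dropped, since an XPath text node always has at
    least one character.
    """
    values, run = [], []
    for child in element.childNodes:
        if child.nodeType in CHAR_DATA:
            run.append(child.data)
        else:
            # Any other node (element, comment, ...) ends the current run.
            values.append("".join(run))
            run = []
    values.append("".join(run))
    return [v for v in values if v]


root = minidom.parseString(DOC).documentElement

# DOM view: three sibling nodes under foo.
dom_kinds = [child.nodeType for child in root.childNodes]
print(dom_kinds)

# XPath view: those siblings collapse into a single text node.
print(xpath_text_values(root))
```

Returning the raw DOM children for //text() would hand back three nodes; coalescing first gives the single text node the XPath model calls for, at the cost of that node no longer mapping to exactly one DOM node.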
>
> As an illustration, here's what a session with 4XPath does (interactive
> Python prompt):
>
> >>> DOC = """<foo>
> ... bar
> ... <![CDATA[
> ... baz
> ... ]]>
> ... quux
> ... </foo>"""
> >>> from Ft.Xml.Domlette import NonvalidatingReader
> >>> doc = NonvalidatingReader.parseString(DOC, "http://dummybaseuri.com")
> >>> from Ft.Xml.XPath import Evaluate
> >>> result = Evaluate("//text()", contextNode=doc)
> >>> print result
> [<cText at 0x81ae434>]
> >>> print result[0].data
>
> bar
>
> baz
>
> quux
>
> >>>
>
> --
> Uche Ogbuji                                    Fourthought, Inc.
> http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
> Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
> The many heads of XML modeling - http://adtmag.com/article.asp?id=6393
> Will XML live up to its promise? - http://www-106.ibm.com/developerworks/xml/library/x-think11.html
>

--
Edward R Gloor
QWEST Communications
W - (303) 244-1348
P - (303) 852-8644
