RE: [xml-dev] Beware of XPath expressions that produce false positives

Hi Folks,

Thank you Rick and Dimitre.

This discussion has been excellent. Most notably, it has revealed that there are many factors to consider when defining “empty element”. Here is the list of factors:

Whitespace

If an element contains whitespace, should it be considered empty? For example, should the B element in this instance document be considered empty:

<Row>
   <A>foo</A>
   <B>   </B>
   <C>bar</C>
</Row>

Suppose the element is preprocessed to strip whitespace. For example, if the element is processed by an XSLT program that contains the instruction

<xsl:strip-space elements="*"/>

 

then it will be impossible for subsequent instructions to even know that the element originally contained whitespace.
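To make the whitespace point concrete, here is a minimal sketch in Python (not XSLT), using the standard library's xml.dom.minidom. The helper strip_whitespace_text is a hypothetical stand-in that only approximates what xsl:strip-space does; it is there to show that once whitespace-only text nodes are removed, nothing downstream can tell they ever existed.

```python
from xml.dom import minidom

doc = minidom.parseString("<Row><A>foo</A><B>   </B><C>bar</C></Row>")
b = doc.getElementsByTagName("B")[0]

# Before stripping: B has exactly one child, a text node holding three spaces.
children_before = [(n.nodeType == n.TEXT_NODE, n.data) for n in b.childNodes]


def strip_whitespace_text(node):
    """Remove whitespace-only text nodes (a rough analogue of xsl:strip-space)."""
    for child in list(node.childNodes):
        # XML whitespace is limited to space, tab, carriage return and line feed.
        if child.nodeType == child.TEXT_NODE and child.data.strip(" \t\r\n") == "":
            node.removeChild(child)
        else:
            strip_whitespace_text(child)


strip_whitespace_text(doc.documentElement)

# After stripping: B has no children at all, so it looks like <B/>.
children_after = len(b.childNodes)
```

After the stripping pass, the B element is indistinguishable from one that was written as <B/> in the first place.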

Comment

If an element contains a comment, should it be considered empty? For example, should the B element in this instance document be considered empty:

<Row>
   <A>foo</A>
   <B><!-- Hello, world --></B>
   <C>bar</C>
</Row>

Processing Instruction

If an element contains a processing instruction, should it be considered empty? For example, should the B element in this instance document be considered empty:

<Row>
   <A>foo</A>
   <B><?my-pi x="blah"?></B>
   <C>bar</C>
</Row>
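Comments and processing instructions are real child nodes, which is why they matter to any test based on node(). The following Python sketch (again using xml.dom.minidom rather than XPath) lists the kinds of child nodes a parser reports for a B element. The function name child_kinds and the KIND table are my own illustrative choices.

```python
from xml.dom import Node, minidom

# Human-readable labels for the node types relevant to this discussion.
KIND = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def child_kinds(xml_text):
    """Return the kind of each child node of the document's root element."""
    root = minidom.parseString(xml_text).documentElement
    return [KIND.get(n.nodeType, "other") for n in root.childNodes]


comment_case = child_kinds("<B><!-- Hello, world --></B>")
pi_case = child_kinds('<B><?my-pi x="blah"?></B>')
bare_case = child_kinds("<B/>")
```

Only the bare <B/> yields an empty list; the comment and the processing instruction each show up as one child node.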

Attribute

If an element contains an attribute, should it be considered empty? For example, should the B element in this instance document be considered empty:

<Row>
   <A>foo</A>
   <B x="10"/>
   <C>bar</C>
</Row>

Namespace

If a namespace declaration appears on an element, should the element be considered empty? For example, should the B element in this instance document be considered empty:

<Row>
   <A>foo</A>
   <B xmlns:b="test"/>
   <C>bar</C>
</Row>

 

Suppose there are no namespace declarations on the element, but there are in-scope namespaces; is the element empty?

Note that there will always be at least one in-scope namespace. Implicit on the root element of every XML document is this namespace declaration:

      <Row xmlns:xml="http://www.w3.org/XML/1998/namespace">

Suppose the element is bound to a namespace; is the element empty?
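One practical wrinkle: in the XPath data model a namespace declaration is not an attribute, but a DOM parser reports it as one. The Python sketch below (xml.dom.minidom; split_attributes is a hypothetical helper name) separates ordinary attributes from namespace declarations on a single element. It looks only at declarations written on the element itself, not at namespaces inherited from ancestors.

```python
from xml.dom import minidom


def split_attributes(xml_text):
    """Return (ordinary attribute names, namespace declaration names) for the root element."""
    elem = minidom.parseString(xml_text).documentElement
    ordinary, declarations = [], []
    attrs = elem.attributes
    for i in range(attrs.length):
        name = attrs.item(i).name
        # The DOM exposes xmlns and xmlns:prefix declarations as attributes.
        if name == "xmlns" or name.startswith("xmlns:"):
            declarations.append(name)
        else:
            ordinary.append(name)
    return ordinary, declarations


attribute_case = split_attributes('<B x="10"/>')
namespace_case = split_attributes('<B xmlns:b="test"/>')
both_case = split_attributes('<B xmlns:b="test" x="10"/>')
```

So a DOM-based test that counts attributes will already reject <B xmlns:b="test"/>, whereas an XPath test on @* alone will not.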

My Definition of Empty Element

An element is empty if and only if it consists of an element node and nothing else, and the only in-scope namespace is the xml namespace. Thus, the element has no attributes, no child elements, no text nodes, no comments, no PIs, and no namespace declarations.

 

Here is a table showing examples of B elements that do and do not conform to my definition of empty:

B element                        Is empty?
-------------------------------  ---------
<B/>                             true
<B></B>                          true
<B>&null;</B>                    true
<B>  </B>                        false
<B><!-- Hello, world --></B>     false
<B><bad/></B>                    false
<B>99</B>                        false
<B x="10"/>                      false
<B xmlns:b="test"/>              false


XPath that Implements a Test for Empty Element

The following XPath 2.0 expression, evaluated with the element as the context node, yields the desired result for each example in the table. Caveat: this assumes that whitespace has not been stripped by preprocessing (see the section titled Whitespace).

 

empty((
    node(),
    @*,
    namespace::*[not(name() eq 'xml')]
))

Yikes!

For such a basic task, defining “empty element”, that is a lot to take into consideration.
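For anyone without an XPath 2.0 processor at hand, the same test can be approximated in Python with xml.dom.minidom. This is a sketch under stated assumptions, not a substitute for the XPath: the names is_empty, has_namespace_in_scope and check are my own, any xmlns declaration on the element or an ancestor is treated as an in-scope namespace (including an undeclaration such as xmlns=""), and the &null; row of the table is skipped because its entity declaration is not shown.

```python
from xml.dom import Node, minidom


def has_namespace_in_scope(elem):
    """True if elem or any ancestor element carries an xmlns declaration."""
    node = elem
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            name = attrs.item(i).name
            if name == "xmlns" or name.startswith("xmlns:"):
                return True
        node = node.parentNode
    return False


def is_empty(elem):
    """Approximate the test: no child nodes, no attributes, no namespaces."""
    no_children = len(elem.childNodes) == 0        # corresponds to node()
    no_attributes = elem.attributes.length == 0    # @* plus own declarations
    return no_children and no_attributes and not has_namespace_in_scope(elem)


def check(xml_text):
    """Parse a standalone element and apply is_empty to it."""
    return is_empty(minidom.parseString(xml_text).documentElement)


examples = [
    "<B/>",
    "<B></B>",
    "<B>  </B>",
    "<B><!-- Hello, world --></B>",
    "<B><bad/></B>",
    "<B>99</B>",
    '<B x="10"/>',
    '<B xmlns:b="test"/>',
]
results = {xml_text: check(xml_text) for xml_text in examples}
```

Under these assumptions, only <B/> and <B></B> come out as empty, which agrees with the table.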

 

/Roger


