XML.org: OASIS Mailing List Archives

RE: Application of Postel's Principle to XML

Hi Folks,

 

I recently came across this statement in a requirements document:

 

Verify using XPath that there are no elements or attributes
in an XML document with more than X amount of characters.

 

That seemingly simple requirement has ambiguities that senders and receivers must wrestle with. Here are the ambiguities:

 

1. Does "X amount of characters" include whitespace?

2. Can the XML contain mixed content?

 

Consider this XML document:

 

<root>
    <child>0123
        <grandchild>abc</grandchild>
    </child>
</root>

 

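The <child> element has mixed content. Its children are, in document order: a text node ('0123' followed by a line break and indentation), the <grandchild> element (which contains the text node 'abc'), and a final whitespace-only text node.

 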

Which of the following should be considered the string length of the <child> element’s content?

 

(a) The string length is computed using the child text nodes:

 

string-length('0123    ') + string-length('     ')

 

(b) The string length is computed using all descendant text nodes:

 

string-length('0123    ') + string-length('abc') + string-length('     ')

 

(c) The string length is computed using only the child text nodes that are not whitespace-only:

 

string-length('0123    ')

 

(d) The string length is computed using only the child text nodes that are not whitespace-only, after normalizing space:

 

string-length('0123')

 

(e) The string length is computed using all descendant text nodes that are not whitespace-only, after normalizing space:

 

string-length('0123') + string-length('abc')
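
The five interpretations can be compared side by side. The following is a minimal Python sketch, not part of the original requirement; it assumes the document is indented exactly as shown above (four spaces per level, LF line endings), and the helper names are made up for illustration. The exact counts depend on that indentation, so they differ from the abbreviated string literals in (a) through (e).

```python
import xml.etree.ElementTree as ET

# Assumed layout: four-space indentation, LF line endings.
DOC = (
    "<root>\n"
    "    <child>0123\n"
    "        <grandchild>abc</grandchild>\n"
    "    </child>\n"
    "</root>"
)

def child_text_nodes(elem):
    """Text nodes that are direct children of elem, in document order."""
    nodes = [elem.text] + [sub.tail for sub in elem]
    return [t for t in nodes if t]

def descendant_text_nodes(elem):
    """All text nodes anywhere below elem, in document order."""
    return list(elem.itertext())

def normalize_space(s):
    """Approximation of XPath normalize-space(): trim and collapse whitespace."""
    return " ".join(s.split())

def candidate_lengths(elem):
    kids = child_text_nodes(elem)
    desc = descendant_text_nodes(elem)
    return {
        "a": sum(len(t) for t in kids),                       # child text nodes
        "b": sum(len(t) for t in desc),                       # all descendant text nodes
        "c": sum(len(t) for t in kids if t.strip()),          # skip whitespace-only nodes
        "d": sum(len(normalize_space(t)) for t in kids if t.strip()),
        "e": sum(len(normalize_space(t)) for t in desc if t.strip()),
    }

child = ET.fromstring(DOC).find("child")
print(candidate_lengths(child))
# {'a': 18, 'b': 21, 'c': 13, 'd': 4, 'e': 7}
```

With this layout the five readings give five different answers for the same element, from 4 characters up to 21.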

 

Conservative Sender

Postel’s principle says that a sender should be conservative in what it sends. What does it mean to be conservative in this case?

 

Conservative means that the XPath should find the longest string and verify that it has no more than X characters. For the example above, (b) yields the longest string. Thus, the XPath must verify that the combined string length of all descendant text nodes of an element does not exceed $x, and that the string length of each attribute value does not exceed $x. The root element's string value contains every text node in the document, so no other element can have a longer one; the XPath can simply check the root element and all attributes:

 

                (string-length(/*) le $x) and (empty(//@*[string-length() gt $x]))
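
For readers without an XPath 2.0 processor at hand, the same sender-side check can be approximated in Python. This is only a sketch under the assumption that the standard-library ElementTree parser is acceptable; `sender_ok` is a hypothetical name, not something from the requirement.

```python
import xml.etree.ElementTree as ET

def sender_ok(xml_text, x):
    """Conservative check: the root's raw string value and every
    attribute value must be at most x characters long."""
    root = ET.fromstring(xml_text)
    # Counterpart of string-length(/*): all descendant text, whitespace included.
    root_length = len("".join(root.itertext()))
    # Counterpart of empty(//@*[string-length() gt $x]).
    attrs_ok = all(
        len(value) <= x
        for elem in root.iter()
        for value in elem.attrib.values()
    )
    return root_length <= x and attrs_ok

print(sender_ok('<a k="12345">hello</a>', 5))  # True
print(sender_ok('<a k="12345">hello</a>', 4))  # False
```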

 

Liberal Receiver

Postel’s principle says that a receiver should be liberal in what it receives. What does it mean to be liberal in this case?

 

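Liberal means that for each element the XPath should check that, after normalizing whitespace in the element's string value, the string length does not exceed $x characters. And, of course, that the length of each attribute value does not exceed $x characters. Here’s the XPath (the element and attribute checks are written as two separate paths, because in XPath 2.0 a single path step may not return a mix of strings and attribute nodes):

 

                empty((//*/normalize-space(), //@*/string())[string-length(.) gt $x])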
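
A rough Python counterpart of the receiver-side check, again only a sketch using the standard-library ElementTree parser: `receiver_ok` and `normalize_space` are hypothetical helper names, and `str.split()` treats a slightly wider set of characters as whitespace than XPath's normalize-space() does.

```python
import xml.etree.ElementTree as ET

def normalize_space(s):
    """Approximation of XPath normalize-space(): trim and collapse whitespace."""
    return " ".join(s.split())

def receiver_ok(xml_text, x):
    """Liberal check: every element's whitespace-normalized string value
    and every attribute value must be at most x characters long."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if len(normalize_space("".join(elem.itertext()))) > x:
            return False
        if any(len(value) > x for value in elem.attrib.values()):
            return False
    return True

padded = "<msg id='42'>\n   hello   world\n</msg>"
print(receiver_ok(padded, 11))  # True: "hello world" has 11 characters
print(receiver_ok(padded, 10))  # False
```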

 

So if $x is 10, then receivers would accept this XML document:

 

<root>
    <child>0123
        <grandchild>abc</grandchild>
    </child>
</root>

 

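but it would not be acceptable to senders, because the unnormalized string value of <root>, which includes the indentation and line breaks, is longer than 10 characters.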
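
The disagreement can be checked numerically. The sketch below assumes the document is laid out exactly as shown (four-space indentation, LF line endings) and looks only at <root>, whose string value is the longest in the document under either reading.

```python
import xml.etree.ElementTree as ET

# Assumed layout: four-space indentation, LF line endings.
DOC = (
    "<root>\n"
    "    <child>0123\n"
    "        <grandchild>abc</grandchild>\n"
    "    </child>\n"
    "</root>"
)
X = 10  # the limit used in the example

root = ET.fromstring(DOC)
raw = "".join(root.itertext())     # what the sender-side check measures
normalized = " ".join(raw.split()) # what the receiver-side check measures

print(len(raw), len(raw) <= X)                # 27 False
print(len(normalized), len(normalized) <= X)  # 8 True
```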

 

I am interested in seeing other, actual requirements that are ambiguous and that senders and receivers must wrestle with.

 

/Roger

 




Copyright 1993-2007 XML.org. This site is hosted by OASIS