XML.orgXML.org
FOCUS AREAS |XML-DEV |XML.org DAILY NEWSLINK |REGISTRY |RESOURCES |ABOUT
OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]
Representing binary trees in XML … flat versus recursive

Hi Folks,

Suppose you want to represent binary trees in XML. Here are two ways to do it.

First, an example binary tree

Recursive Implementation

The root node consists of a value, an optional left subtree, and an optional right subtree.

A subtree is a node.

A node consists of a value, an optional left subtree, and an optional right subtree.

Here is a recursive XML representation of the above binary tree:

<binary-tree>
   
<root>
       
<value>root</value>
       
<left-subtree>
           
<value>A</value>
           
<left-subtree>
               
<value>C</value>
           
</left-subtree>
           
<right-subtree>
               
<value>D</value>
           
</right-subtree>
       
</left-subtree>
       
<right-subtree>
           
<value>B</value>
           
<left-subtree>
               
<value>E</value>
           
</left-subtree>
           
<right-subtree>
               
<value>F</value>
           
</right-subtree>
       
</right-subtree>
   
</root>
</binary-tree>

 

Here is the corresponding XML Schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
    
<xs:element name="binary-tree">
       
<xs:complexType>
           
<xs:sequence>
               
<xs:element name="root" type="node-type" />
           
</xs:sequence>
       
</xs:complexType>
   
</xs:element>
   
    
<xs:complexType name="node-type">
       
<xs:sequence>
           
<xs:element name="value" type="xs:string" />
           
<xs:element name="left-subtree" type="node-type" minOccurs="0" />
           
<xs:element name="right-subtree" type="node-type" minOccurs="0" />
       
</xs:sequence>
   
</xs:complexType>
   
</xs:schema>

 

Advantages:

Relative to the flat version, it is free of extra markup (the flat version requires additional attributes).

The XML Schema is simpler and shorter than the flat version.

Disadvantages:

Binary trees with many levels will result in deeply nested XML elements. Deep nesting could cause some XML parsers to fail and thus be exploited as an attack vector.

Flat Implementation

The root node consists of a value and it references an optional left subtree and it references an optional right subtree.

A subtree is a node.

References are accomplished using an ID/IDREF pair.

A node has a unique (ID) identifier and consists of a value and it references an optional left subtree and it references an optional right subtree.

Here is a flat XML representation of the above binary tree:

<binary-tree>
   
<root>
       
<value>root</value>
       
<left-subtree idref="node-A" />
       
<right-subtree idref="node-B" />
   
</root>
   
<node id="node-A">
       
<value>A</value>
       
<left-subtree idref="node-C" />
       
<right-subtree idref="node-D" />
   
</node>
   
<node id="node-B">
       
<value>B</value>
       
<left-subtree idref="node-E" />
       
<right-subtree idref="node-F" />
   
</node>
   
<node id="node-C">
       
<value>C</value>
   
</node>
   
<node id="node-D">
       
<value>D</value>
   
</node>
   
<node id="node-E">
       
<value>E</value>
   
</node>
   
<node id="node-F">
       
<value>F</value>
   
</node>
</binary-tree>

Here is the corresponding XML Schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
    
<xs:element name="binary-tree">
       
<xs:complexType>
           
<xs:sequence>
               
<xs:element name="root" type="node-type" />
               
<xs:element name="node" type="node-type" minOccurs="0" maxOccurs="unbounded" />
           
</xs:sequence>
       
</xs:complexType>
   
</xs:element>
   
    
<xs:complexType name="node-type">
       
<xs:sequence>
           
<xs:element name="value" type="xs:string" />
           
<xs:element name="left-subtree" type="subtree-type" minOccurs="0" />
           
<xs:element name="right-subtree" type="subtree-type" minOccurs="0" />
       
</xs:sequence>
       
<xs:attribute name="id" type="xs:ID" />
   
</xs:complexType>
   
    
<xs:complexType name="subtree-type">
       
<xs:simpleContent>
           
<xs:extension base="empty">
               
<xs:attribute name="idref" type="xs:IDREF" />
           
</xs:extension>
       
</xs:simpleContent>
   
</xs:complexType>
   
    
<xs:simpleType name="empty">
       
<xs:restriction base="xs:string">
           
<xs:length value="0" />
       
</xs:restriction>
   
</xs:simpleType>
   
</xs:schema>

 

Advantages:

The XML consists simply of a long (flat) series of node elements.

Disadvantages:

Relative to the recursive version, it requires extra markup (the ID/IDREF attributes).

The XML Schema is more complex and longer than the flat version.

Trivia

The Windows EXE file format stores some of its information in a binary tree [1] and the binary tree uses the flat form. [Of course, EXE files don’t use XML (EXE files are binary), but the approach of referencing the next level of the tree is in the same spirit as the flat XML described above.]

Questions

Are there other ways to represent binary trees in XML?

Of the two ways to represent binary trees shown above, which way is easier to process with XSLT? Which way is easier to process with other programming languages?

If you have used the recursive approach, have you encountered problems with XML parsers failing on deeply nested elements?

/Roger

[1] https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#the-rsrc-section



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]


News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 1993-2007 XML.org. This site is hosted by OASIS