DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Fri, 12 Nov 2010 11:38:56 -0500
Hi Folks,
My understanding is that an XML document is first processed by an XML parser, which creates an in-memory tree representation of the XML document. Then, an application such as an XML Schema validator or an XSLT processor operates on the in-memory tree representation. Here is a simple graphic I created to show this:
http://www.xfront.com/DOM-versus-XDM/How-an-XML-document-is-processed.gif
It is my understanding that different XML parsers may create different in-memory models, depending on whether the parser builds a DOM tree or an XDM tree.
Here are two places where differences arise:
- CDATA sections
- Entities
Also, there are differences with respect to:
- Concurrent access
------------------------------
CDATA SECTIONS: DOM VERSUS XDM
------------------------------
This XML document contains a CDATA section:
<root>
hello <![CDATA[if A < B then ...]]> world
</root>
As mentioned, there are two ways to model XML documents:
- Document Object Model (DOM) [1]
- XQuery and XPath Data Model (XDM) [2]
The two models represent the above XML document differently:
- A DOM tree will have a node for the CDATA section, as evidenced by
the fact that the DOM API defines a CDATASection interface [3].
Here is a graphic I created to show the DOM tree for the XML document:
http://www.xfront.com/DOM-versus-XDM/DOM-implementation-of-CDATA.gif
- An XDM tree does not have a node for the CDATA section. The CDATA
section is resolved; i.e., the contents of the CDATA section are
merged with the surrounding text.
Here is a graphic I created to show the XDM tree for the XML document:
http://www.xfront.com/DOM-versus-XDM/XDM-implementation-of-CDATA.gif
Notice that in the DOM tree there are three nodes under the Element node, whereas in XDM there is only one Text node under the Element node.
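The node counts above can be checked with a small Python sketch. It uses the standard library's xml.dom.minidom as the DOM side. Python's standard library has no XDM implementation, so xml.etree.ElementTree serves here only as a stand-in for a model that folds CDATA content into the surrounding text; that substitution is an assumption of this sketch, not a claim that ElementTree implements XDM. The document is written on one line so that indentation whitespace does not clutter the output.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

XML = "<root>hello <![CDATA[if A < B then ...]]> world</root>"

# DOM view: minidom keeps the CDATA section as a node of its own.
dom_root = minidom.parseString(XML).documentElement
dom_children = [(child.nodeName, child.data) for child in dom_root.childNodes]

# Merged view (stand-in for XDM): the CDATA content is folded into one string.
merged_text = ET.fromstring(XML).text

for name, data in dom_children:
    print(name, repr(data))
print(repr(merged_text))
```

The DOM side reports a text node, a CDATA section node, and another text node, while the merged side reports a single string.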
------------------------------
ENTITIES: DOM VERSUS XDM
------------------------------
This XML document uses an entity:
<root>
hello if A < B then ... world
</root>
DOM and XDM represent entities differently:
- A DOM tree will have a node for the entity, as evidenced by
the fact that the DOM API defines an EntityReference interface [4].
Here is a graphic I created to show the DOM tree of the XML document:
http://www.xfront.com/DOM-versus-XDM/DOM-implementation-of-entities.gif
- An XDM tree does not have a node for the entity. The entity
is resolved; i.e., the entity is replaced by its replacement
text and is merged with the surrounding text.
Here is a graphic I created to show the XDM tree of the XML document:
http://www.xfront.com/DOM-versus-XDM/XDM-implementation-of-entities.gif
Notice that in the DOM tree there are three nodes under the Element node, whereas in XDM there is only one Text node under the Element node.
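The resolved view can be sketched in Python. The entity name `cond` and its declaration are hypothetical, introduced only for illustration. xml.etree.ElementTree stands in for a model that resolves entities (it is not an XDM implementation); its parser expands internal entities while parsing, so the application sees only the merged text and never a node for the entity.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity name "cond" is an assumption of this sketch.
XML = """<!DOCTYPE root [
  <!ENTITY cond "if A &lt; B then ...">
]>
<root>hello &cond; world</root>"""

root = ET.fromstring(XML)

# The entity has been replaced by its replacement text and merged
# with the surrounding character data.
resolved_text = root.text
child_element_count = len(root)

print(repr(resolved_text))
print(child_element_count)
```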
---------------------------------
CONCURRENT ACCESS: DOM VERSUS XDM
---------------------------------
There are occasions where multiple applications (processes) need to operate on the same in-memory tree. Recently, Hans-Juergen Rennau reported [5] problems with concurrent access to DOM trees. He found no problems with concurrent access to XDM trees.
I then learned [6] that the DOM specification does not require implementations to provide a thread-safe DOM API; i.e., it does not require that concurrent access to a DOM tree be properly synchronized.
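When an implementation does not synchronize access itself, callers that share a DOM tree can add their own synchronization. The following is a minimal Python sketch of that idea, assuming xml.dom.minidom as the DOM implementation and a single lock that guards every access to the tree; the helper name `add_items` is hypothetical.

```python
import threading
from xml.dom import minidom

doc = minidom.parseString("<root/>")
tree_lock = threading.Lock()  # one lock guards all access to the shared tree

def add_items(worker_id, count):
    """Append <item> elements to the shared tree, one locked step at a time."""
    for i in range(count):
        with tree_lock:
            item = doc.createElement("item")
            item.setAttribute("id", f"{worker_id}-{i}")
            doc.documentElement.appendChild(item)

threads = [threading.Thread(target=add_items, args=(w, 100)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Reads go through the same lock as writes.
with tree_lock:
    item_count = len(doc.getElementsByTagName("item"))
print(item_count)
```

A coarse lock like this serializes readers as well as writers, so it trades throughput for simplicity.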
QUESTIONS
1. Are the above description and graphic of how XML documents are processed correct?
2. Are the above description and graphics of the differences in how CDATA sections and entities are represented in DOM and XDM correct?
3. Is the above description of the differences in thread-safety of DOM and XDM correct?
4. Will an application behave differently depending on whether the XML parser it uses generates DOM or XDM? If so, isn't that really bad?
5. Do XML Schema validators use DOM or XDM to represent the XML Schema and the XML instance document?
6. If I were to create my own XML Schema validator, do I have the option of choosing to use DOM or XDM? Or, does the XML Schema specification require me to use one of them? If so, which one?
7. Do XSLT processors use DOM or XDM to represent the XSLT document and the XML instance document?
8. If I were to create my own XSLT processor, do I have the option of choosing to use DOM or XDM? Or, does the XSLT specification require me to use one of them? If so, which one?
9. For each of the following products, does it use DOM or XDM?
XML Schema Validators
- XERCES: DOM or XDM?
- SAXON: DOM or XDM?
- XSV: DOM or XDM?
- MSXML: DOM or XDM?
- LIBXML: DOM or XDM?
XSLT Processors
- XALAN: DOM or XDM?
- SAXON: DOM or XDM?
- MSXML: DOM or XDM?
- XSLTPROC: DOM or XDM?
/Roger
[1] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html
[2] http://www.w3.org/TR/xpath-datamodel/
[3] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-667469212
[4] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-11C98490
[5] http://sourceforge.net/mailarchive/forum.php?thread_name=4CDBE667.2050400%40saxonica.com&forum_name=saxon-help
[6] http://sourceforge.net/mailarchive/message.php?msg_name=4CDBE667.2050400%40saxonica.com
- Follow-Ups:
- Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Martin Honnen <Martin.Honnen@gmx.de>
- Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Amelia A Lewis <amyzing@talsever.com>
- Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Martin Honnen <Martin.Honnen@gmx.de>
- Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Martin Honnen <Martin.Honnen@gmx.de>
- Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Michael Kay <mike@saxonica.com>
- Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: David Carlisle <davidc@nag.co.uk>