Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Michael Kay <mike@saxonica.com>
- To: xml-dev@lists.xml.org
- Date: Fri, 12 Nov 2010 17:03:54 +0000
DOM and XDM are not the only models of XML: there's also JDOM, DOM4J,
XOM, etc. They all differ in various respects, and most of them have
options and levels. Note too that DOM is defined as an API, whereas XDM
is defined as an object model, so XDM doesn't discuss API-oriented
issues such as thread-safety.
Schema processors often work in pure streaming mode, without building a
tree representation of the document in memory.
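To make the streaming idea concrete, here is a minimal sketch using Python's standard-library xml.sax module. It is not how any particular schema processor works; the DepthTracker handler, the scan function and the sample input are all made up for illustration. It checks a simple property of the document (element count and maximum nesting depth) purely from parser events, without ever building a tree.

```python
import xml.sax


class DepthTracker(xml.sax.ContentHandler):
    """Hypothetical handler: counts elements and tracks maximum nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.elements = 0

    def startElement(self, name, attrs):
        # Called once per start tag; only running counters are kept.
        self.depth += 1
        self.elements += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        self.depth -= 1


def scan(xml_bytes):
    """Return (element_count, max_depth) for the document, using SAX events only."""
    handler = DepthTracker()
    xml.sax.parseString(xml_bytes, handler)
    return handler.elements, handler.max_depth


if __name__ == "__main__":
    print(scan(b"<root><a><b/></a><c/></root>"))
```

The memory used here is a handful of counters regardless of document size, which is the property a streaming validator relies on.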
The XSLT 2.0 specification is defined in terms of XDM, but actual
products may use any of these models underneath, performing the
necessary mappings (e.g. expanding entity references and masking CDATA
sections) as required. Many XSLT processors have their own internal tree
model which will have some kind of relationship to the XDM model used in
the specification, but often not a direct representation - for example,
a naive implementation of the way XDM describes namespaces would be
horrendously inefficient.
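As a rough sketch of what "masking CDATA sections" can look like when the underlying tree is a DOM, the following uses Python's standard-library xml.dom.minidom, chosen only for illustration. The mask_cdata helper is hypothetical, not part of any XSLT processor, and the sample is the CDATA document from the question with the line breaks removed.

```python
from xml.dom import minidom, Node


def mask_cdata(element):
    """Replace CDATA section nodes with plain text nodes, then merge adjacent text."""
    doc = element.ownerDocument
    for child in list(element.childNodes):  # copy, since we mutate while walking
        if child.nodeType == Node.CDATA_SECTION_NODE:
            element.replaceChild(doc.createTextNode(child.data), child)
        elif child.nodeType == Node.ELEMENT_NODE:
            mask_cdata(child)
    element.normalize()  # merges runs of adjacent Text nodes into one
    return element


doc = minidom.parseString("<root>hello<![CDATA[if A< B then ...]]> world</root>")
root = doc.documentElement

# DOM view: Text, CDATASection, Text
before = [child.nodeType for child in root.childNodes]

mask_cdata(root)

# XDM-like view: a single text node holding the merged content
after = [child.data for child in root.childNodes]
```

A real processor would more likely do this mapping lazily, or in a wrapper around the DOM, rather than rewriting the tree in place as this sketch does.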
Michael Kay
Saxonica
On 12/11/2010 16:38, Costello, Roger L. wrote:
> Hi Folks,
>
> My understanding is that an XML document is first processed by an XML parser, which creates an in-memory tree representation of the XML document. Then, an application such as an XML Schema validator or an XSLT processor operates on the in-memory tree representation. Here is a simple graphic I created to show this:
>
> http://www.xfront.com/DOM-versus-XDM/How-an-XML-document-is-processed.gif
>
>
> It is my understanding that the in-memory model created by different XML parsers may be different, depending on whether the XML parser creates a DOM or XDM in-memory model.
>
> Here are two places where differences arise:
>
> - CDATA sections
> - Entities
>
> Also, there are differences with respect to:
>
> - Concurrent access
>
>
> ------------------------------
> CDATA SECTIONS: DOM VERSUS XDM
> ------------------------------
>
> This XML document contains a CDATA section:
>
> <root>
> hello<![CDATA[if A< B then ...]]> world
> </root>
>
> As mentioned, there are two ways to model XML documents:
>
> - Document Object Model (DOM) [1]
>
> - XML Data Model (XDM) [2]
>
> The two models represent the above XML document differently:
>
> - A DOM tree will have a node for the CDATA section, as evidenced by
> the fact that the DOM API has a method for accessing CDATA sections [3].
> Here is a graphic I created to show the DOM tree for the XML document:
>
> http://www.xfront.com/DOM-versus-XDM/DOM-implementation-of-CDATA.gif
>
> - An XDM tree does not have a node for the CDATA section. The CDATA
> section is resolved; i.e., the contents of the CDATA section are
> merged with the surrounding text.
> Here is a graphic I created to show the XDM tree for the XML document:
>
> http://www.xfront.com/DOM-versus-XDM/XDM-implementation-of-CDATA.gif
>
>
> Notice that in the DOM tree there are three nodes under the Element node, whereas in XDM there is only one Text node under the Element node.
>
> ------------------------------
> ENTITIES: DOM VERSUS XDM
> ------------------------------
>
> This XML document uses an entity:
>
> <root>
> hello if A&lt; B then ... world
> </root>
>
> DOM and XDM represent entities differently:
>
> - A DOM tree will have a node for the entity, as evidenced by
> the fact that the DOM API has a method for accessing entities [4].
> Here is a graphic I created to show the DOM tree of the XML document:
>
> http://www.xfront.com/DOM-versus-XDM/DOM-implementation-of-entities.gif
>
> - An XDM tree does not have a node for the entity. The entity
> is resolved; i.e., the entity is replaced by its replacement
> text and is merged with the surrounding text.
> Here is a graphic I created to show the XDM tree of the XML document:
>
> http://www.xfront.com/DOM-versus-XDM/XDM-implementation-of-entities.gif
>
>
> Notice that in the DOM tree there are three nodes under the Element node, whereas in XDM there is only one Text node under the Element node.
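Whether an entity reference actually survives as a node in a DOM tree depends on the parser and how it is configured. As a small check, assuming Python's standard-library xml.dom.minidom (picked only for illustration; other DOM implementations and options may behave differently), the sketch below parses a document that uses the predefined entity &lt; and reports what ends up under the root element. The child_summary helper and the SAMPLE string are made up for this example.

```python
from xml.dom import minidom, Node

# Sample document using the predefined entity &lt; (an assumption for this sketch).
SAMPLE = "<root>hello if A&lt; B then ... world</root>"


def child_summary(xml_text):
    """Return (nodeType, data) pairs for the children of the document element."""
    root = minidom.parseString(xml_text).documentElement
    return [(child.nodeType, getattr(child, "data", None))
            for child in root.childNodes]


summary = child_summary(SAMPLE)
node_types = {node_type for node_type, _ in summary}
text = "".join(data for _, data in summary if data is not None)

if __name__ == "__main__":
    print(summary)
```

With this parser the entity is expanded during parsing, so only text nodes appear under the element and no entity reference node is present.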
>
> ---------------------------------
> CONCURRENT ACCESS: DOM VERSUS XDM
> ---------------------------------
>
> There are occasions where multiple applications (processes) need to operate on the same in-memory tree. Recently, Hans-Juergen Rennau reported [5] problems with concurrent access to DOM trees. He found no problems with concurrent access to XDM trees.
>
> I then learned [6] that the DOM specification does not require implementations to provide a thread-safe DOM API; i.e., it does not require that concurrent access to a DOM tree be properly synchronized.
>
>
> QUESTIONS
>
> 1. Are the above description and graphic of how XML documents are processed correct?
>
> 2. Are the above description and graphics of the differences in how CDATA sections and entities are represented in DOM and XDM correct?
>
> 3. Is the above description of the differences in thread-safety of DOM and XDM correct?
>
> 4. Will an application behave differently depending on whether the XML parser it uses generates DOM or XDM? If so, isn't that really bad?
>
> 5. Do XML Schema validators use DOM or XDM to represent the XML Schema and the XML instance document?
>
> 6. If I were to create my own XML Schema validator, do I have the option of choosing to use DOM or XDM? Or, does the XML Schema specification require me to use one of them? If so, which one?
>
> 7. Do XSLT processors use DOM or XDM to represent the XSLT document and the XML instance document?
>
> 8. If I were to create my own XSLT processor, do I have the option of choosing to use DOM or XDM? Or, does the XSLT specification require me to use one of them? If so, which one?
>
> 9. For each of the following products, does it use DOM or XDM?
>
> XML Schema Validators
>
> - XERCES: DOM or XDM?
>
> - SAXON: DOM or XDM?
>
> - XSV: DOM or XDM?
>
> - MSXML: DOM or XDM?
>
> - LIBXML: DOM or XDM?
>
> XSLT Processors
>
> - XALAN: DOM or XDM?
>
> - SAXON: DOM or XDM?
>
> - MSXML: DOM or XDM?
>
> - XSLTPROC: DOM or XDM?
>
>
> /Roger
>
>
> [1] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html
>
> [2] http://www.w3.org/TR/xpath-datamodel/
>
> [3] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-667469212
>
> [4] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-11C98490
>
> [5] http://sourceforge.net/mailarchive/forum.php?thread_name=4CDBE667.2050400%40saxonica.com&forum_name=saxon-help
>
> [6] http://sourceforge.net/mailarchive/message.php?msg_name=4CDBE667.2050400%40saxonica.com