Re: [xml-dev] DOM versus XDM: Differences in handling CDATA sections, entities, and concurrency
- From: Amelia A Lewis <amyzing@talsever.com>
- To: xml-dev@lists.xml.org
- Date: Fri, 12 Nov 2010 15:15:41 -0500
Okay. A part of my response is misleading (well, and the rest of it is
nasty, but I can live with nasty more than misleading).
On Fri, 12 Nov 2010 12:36:02 -0500, Amelia A Lewis wrote:
> On Fri, 12 Nov 2010 11:38:56 -0500, Costello, Roger L. wrote:
>> DOM and XDM represent entities differently:
This is true.
>> - A DOM tree will have a node for the entity, as evidenced by
>> the fact that the DOM API has a method for accessing entities [4].
This is sort of true.
>> Here is a graphic I created to show the DOM tree of the XML document:
That's wrong.
Now.
DOM has two node types related to entities. In the Node interface,
these types are indicated by the ENTITY_NODE and ENTITY_REFERENCE_NODE
constants, which is what Node.getNodeType() returns for such nodes.
An Entity node will respond to getNodeName() (an entity has a name), but
not to getNodeValue() or getAttributes() (both return null). An Entity
node will respond
to getChildNodes(). Note that an Entity node does *not* represent a
character reference or a reference to a predefined entity. Entity
nodes have several additional methods (apart from the foregoing, which
are defined for Node, the base interface): getInputEncoding(),
getNotationName(), getPublicId(), getSystemId(), getXmlEncoding(),
getXmlVersion(). It should be clear from these methods that an entity
in DOM represents something relatively complex (like notations, or
external parsed and unparsed entities, or even parameter entities).
From the Javadoc:
"An XML processor may choose to completely expand entities before the
structure model is passed to the DOM; in this case there will be no
EntityReference nodes in the document tree.
XML does not mandate that a non-validating XML processor read and
process entity declarations made in the external subset or declared in
parameter entities. This means that parsed entities declared in the
external subset need not be expanded by some classes of applications,
and that the replacement text of the entity may not be available. When
the replacement text is available, the corresponding Entity node's
child list represents the structure of that replacement value.
Otherwise, the child list is empty."
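To make the Entity side concrete, here is a minimal sketch that asks a
DOM for its Entity nodes by way of DocumentType.getEntities(). The
class name, the sample document and the notationByEntity() helper are
made up for illustration; only the JAXP and org.w3c.dom calls are real.
It assumes the JDK's default parser, which reads the internal subset.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Entity;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

public class EntityNodes {
    // Hypothetical sample: one internal parsed entity, one unparsed entity.
    static final String XML =
        "<!DOCTYPE doc [\n"
      + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
      + "  <!ENTITY greeting \"hello\">\n"
      + "  <!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
      + "]>\n"
      + "<doc>&greeting;</doc>";

    // Maps each declared general entity name to its notation name.
    // Parsed entities have no notation, so their value is null.
    static Map<String, String> notationByEntity(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Map<String, String> result = new HashMap<>();
        DocumentType doctype = doc.getDoctype();
        if (doctype == null) {
            return result; // no DTD, hence no Entity nodes
        }
        NamedNodeMap entities = doctype.getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            Entity entity = (Entity) entities.item(i);
            result.put(entity.getNodeName(), entity.getNotationName());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        notationByEntity(XML).forEach((name, notation) ->
            System.out.println(name + " -> notation " + notation));
    }
}
```

Note that the Entity nodes hang off the DocumentType, not off the
element tree, which is one reason a picture of "the DOM tree" with an
entity node in the middle of the content is off.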
An EntityReference node will respond to getNodeName() (an entity
reference has a name, which is the name of the referenced entity), but
not to getNodeValue() or getAttributes(). It has no extension
methods. An EntityReference refers to some defined Entity.
From the Javadoc:
"EntityReference nodes may be used to represent an entity reference in
the tree. Note that character references and references to predefined
entities are considered to be expanded by the HTML or XML processor so
that characters are represented by their Unicode equivalent rather than
by an entity reference. Moreover, the XML processor may completely
expand references to entities while building the Document, instead of
providing EntityReference nodes. If it does provide such nodes, then
for an EntityReference node that represents a reference to a known
entity an Entity exists, and the subtree of the EntityReference node is
a copy of the Entity node subtree. [...]" (the elision represents a
special case; let's deal with the primary case first)
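Whether you ever see an EntityReference node is up to the processor. A
rough sketch of that, assuming a JAXP parser that honors
DocumentBuilderFactory.setExpandEntityReferences(): the class name,
the sample document and the referenceNames() helper are invented here.
The sketch only collects the names of EntityReference nodes; what (if
anything) sits underneath them varies between parser versions, so it
does not look.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntityReferenceNodes {
    // Hypothetical sample: one general entity reference, one predefined
    // entity reference, one character reference.
    static final String XML =
        "<!DOCTYPE doc [ <!ENTITY greeting \"hello\"> ]>\n"
      + "<doc>&greeting; &amp; &#169;</doc>";

    // Parses the document and returns the names of all EntityReference
    // nodes found in the tree, in document order.
    static List<String> referenceNames(String xml, boolean expand) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setExpandEntityReferences(expand);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        List<String> names = new ArrayList<>();
        collect(doc, names);
        return names;
    }

    // Depth-first walk that records every ENTITY_REFERENCE_NODE it meets.
    private static void collect(Node node, List<String> names) {
        if (node.getNodeType() == Node.ENTITY_REFERENCE_NODE) {
            names.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collect(child, names);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("expanded:     " + referenceNames(XML, true));
        System.out.println("not expanded: " + referenceNames(XML, false));
    }
}
```

With expansion on there should be no EntityReference nodes at all; with
it off, only the reference to the declared entity shows up, never the
predefined entity or the character reference.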
Note that all of the above is corner case stuff anyway.
Now, if you happen to want to deal with this ... mess ... in XDM, some
provision is made.
In the XDM, a Document node will respond to two accessors that no other
node responds to:
xs:string? dm:unparsed-entity-public-id(node, string)
xs:anyURI? dm:unparsed-entity-system-id(node, string)
Both of these are actually *defined* on all seven node types, but you
only get anything useful from the document node (everything else
returns empty sequence). These two functions/accessors provide access
to a property defined on the Document node: unparsed-entities. Just in
case it wasn't clear, here again the entities in question are unparsed;
unlike the DOM, if you wanna do something with them, it's up to you to
go and parse them.
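For comparison only, here is a toy emulation of the shape of
dm:unparsed-entity-system-id, written on top of the DOM. This is not
any XDM implementation's API; the class and the method
unparsedEntitySystemId() are hypothetical, and Optional.empty() stands
in for the empty sequence. It assumes the parser reports the system
identifier of an unparsed entity through Entity.getSystemId().

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Entity;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class UnparsedEntityLookup {
    // Hypothetical sample with one parsed and one unparsed entity.
    static final String XML =
        "<!DOCTYPE doc [\n"
      + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
      + "  <!ENTITY greeting \"hello\">\n"
      + "  <!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
      + "]>\n"
      + "<doc/>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Accepts any node, but only a document node can yield a value;
    // everything else gives "empty", as do parsed or unknown entities.
    static Optional<String> unparsedEntitySystemId(Node node, String entityName) {
        if (node.getNodeType() != Node.DOCUMENT_NODE) {
            return Optional.empty();
        }
        DocumentType doctype = ((Document) node).getDoctype();
        if (doctype == null) {
            return Optional.empty();
        }
        Node found = doctype.getEntities().getNamedItem(entityName);
        if (found == null) {
            return Optional.empty();
        }
        Entity entity = (Entity) found;
        if (entity.getNotationName() == null) {
            return Optional.empty(); // parsed entity, not an unparsed one
        }
        return Optional.ofNullable(entity.getSystemId());
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(unparsedEntitySystemId(doc, "logo"));
        System.out.println(unparsedEntitySystemId(doc.getDocumentElement(), "logo"));
    }
}
```

All you get back is an identifier. Fetching and interpreting whatever
it points at is left to the caller.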
Now ... possibly this will help. I have some doubts, but since the
original posting wanted to draw a distinction between the DOM vs XDM
handling of entities, it might be worthwhile to have wasted half an
hour or so looking things up in this fashion.
I'll add that the initial posting shows a fairly severe confusion
between specification and implementation. While it is true that the
specifications in question have different implementation profiles and
constraints, I'm not at all certain that most of the questions asked
make any sense outside the context of a specific implementation of each
of the specifications, in a host language.
Amy!
(whose random .signature generator appears to be in a *puckish* mood)
--
Amelia A. Lewis amyzing {at} talsever.com
Being your slave, what should I do but tend
upon the hours and times of your desire?
I have no precious time at all to spend,
nor services to do, till you require.