Re: [xml-dev] XML basics
- From: Amelia A Lewis <amyzing@talsever.com>
- To: Joe Fawcett <joefawcett@hotmail.com>
- Date: Tue, 1 Mar 2011 21:41:46 -0500
On Tue, 1 Mar 2011 08:07:49 +0000, Joe Fawcett wrote:
> Thanks for your comments, can you suggest a good term for a generic building
> block of XML if I can't use the term 'node'?
There really isn't one that's generally agreed upon. That's true for
'node', btw. Once you've created a tree, there's relatively little
argument that an element is a 'node', but, just for instance, in the
XQuery Data Model, it's possible to treat namespaces as not-nodes
(detail isn't all that relevant here, I think). Is text content a
node? How about ignorable whitespace? Comments, processing
instructions? Entity references? The XML declaration? The internal
subset? What about the contents of the internal subset (which aren't
quite XML, but do have those familiar pointy brackets)? SAX has a
characters() event--how many of those make up a node? In the DOM,
there are namespace attributes (which are nodes); other APIs treat
namespaces and attributes as disjoint sets (and the XDM permits
namespace bindings to be treated as something approximately like
metadata on the tree, with no nodes available to navigate to).
Is a document a node? What's a document, then? Is an external parsed
entity a node?
DOM has 12 'node' types; the infoset has 11; XDM has 7 (this from
memory, so I might have fudged a number or two, but the point should
remain: the degree of variance indicates a rather slippery term, which
means that it's up to you to define what you mean by it).
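
To make the variance concrete, here is a minimal sketch in Python (my
choice of language for illustration, not anything the thread settled
on). It parses one small document with two standard-library tree APIs
and asks each what the root's children are. The sample document and the
helper names dom_child_kinds and etree_view are made up for this
example.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

# Hypothetical sample: a comment, some text, and one child element.
SOURCE = "<doc><!--note-->text<item/></doc>"

def dom_child_kinds(xml_text):
    """Class names of the nodes minidom builds for the root's children."""
    root = minidom.parseString(xml_text).documentElement
    return [type(child).__name__ for child in root.childNodes]

def etree_view(xml_text):
    """Child count and leading text as ElementTree reports them."""
    root = ET.fromstring(xml_text)
    # ElementTree stores text as a property of the element, and its
    # default parser does not keep comments in the tree.
    return len(root), root.text

print(dom_child_kinds(SOURCE))
print(etree_view(SOURCE))
```

With the default parsers, minidom reports three child nodes (a comment,
a text node, an element), while ElementTree reports one child and hangs
the text off the parent as a string. Same syntax, two different answers
to "what are the nodes here?".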
For a book on XML basics, you might reasonably say that a common
programmatic representation of XML syntax in memory is as a tree of
nodes, but unless you want to descend into the swamp (keep in mind that
the XML Infoset spec came along after DOM and SAX and XPath and
attempted to unify these three very different models, along with other
inputs), it might be best to then innocently mention that which
syntactic constructs constitute a node is not well-defined.
If you do achieve a definition ... what will it be? 'Node' in common
usage indicates participation in a graph--nodes and edges, nodes and
connections. But (according to some very popular APIs) there are nodes
that are not the children of their parents. There are also nodes that
are not visible in the syntax (if you accept that namespaces define
nodes this is easy to show: xml:lang="en_US" with no xmlns:xml
declaration).
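
A small sketch of that last point, again in Python with the
standard-library ElementTree (an illustrative choice; the helper name
attribute_names is hypothetical). The document below never declares
xmlns:xml, yet a namespace-aware parser still resolves the prefix,
because the 'xml' prefix is bound by definition.

```python
import xml.etree.ElementTree as ET

# The namespace name reserved for the 'xml' prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"

def attribute_names(xml_text):
    """Attribute names of the root element, as ElementTree expands them."""
    root = ET.fromstring(xml_text)
    return sorted(root.attrib)

# No xmlns:xml declaration appears anywhere in this document.
print(attribute_names('<doc xml:lang="en_US"/>'))
```

The attribute name comes back qualified with the XML namespace URI even
though nothing in the syntax binds it. Whether that binding counts as a
'node' is exactly where the APIs part ways.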
The preferred programmatic and algorithmic representations of XML vary
both by usage and by the predilections of API designers, and a number
of terms (notably including 'node') are overloaded.
The *syntax* is core; it's well-defined by a fairly terse collection of
BNF in the base specification (which is usually amended by including
the namespaces spec, to our sorrow, as I have come to think). How that
information is defined for programmatic examination and manipulation
varies pretty widely, even among the W3C-produced specifications for
XML, and even keeping to a limited set of implementation languages. An
event-oriented API (SAX or StAX in Java, for instance) is a reasonable
next step. You probably don't want to ignore tree models in a book on
basics, but ... arm yourself, for there be dragons.
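
For what an event-oriented view looks like in practice, a rough sketch
using Python's xml.sax binding (SAX and StAX in Java would be the nearer
match to what I named above; Python is used here only for brevity). The
EventLog class and log_events helper are hypothetical names for this
example.

```python
import xml.sax

class EventLog(xml.sax.ContentHandler):
    """Records element events and every chunk passed to characters()."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.text_chunks = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # The parser is free to deliver one run of text in several calls.
        self.text_chunks.append(content)

def log_events(xml_text):
    """Parse xml_text and return the handler holding the recorded events."""
    handler = EventLog()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler

log = log_events("<p>one &amp; two</p>")
print(log.events)
print(len(log.text_chunks), "".join(log.text_chunks))
```

The number of characters() calls is up to the parser, so the sketch
prints the count rather than assuming it; only the joined text is
stable. No tree and no nodes are involved unless the application builds
them.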
Amy!
--
Amelia A. Lewis amyzing {at} talsever.com
The less I seek my source for some definitive, the closer I am to fine.
-- Indigo Girls