Re: CDATA vs. EMPTY
- From: Peter Flynn <peter@silmaril.ie>
- To: xml-dev@lists.xml.org
- Date: Mon, 09 Jul 2001 00:19:26 +0100
On Sun, 08 Jul 2001, Bjoern Hoehrmann wrote:
> Hi,
>
> XML 1.0 SE says: "An element with no content is said to be empty".
> Does the following fragment have any content?
>
> <elem><![CDATA[]]></elem>
>
> The Recommendation further reads: "The representation of an empty
> element is either a start-tag immediately followed by an end-tag, or an
> empty-element tag". This is true for the fragment in its canonical
> representation.
This is an unfortunate side-effect of the ability of XML to be
used without a DTD. In order to permit this, elements which
would otherwise have been declared EMPTY (if there had been a
DTD in operation) had to have a distinguished form, for which
the NET trick was initially used, in order to keep compatibility
with SGML (the only software around at the time). Thus <elem/>
meant "I am an element which would have been declared
EMPTY if there had been a DTD".
It was only a short step from there to some people saying that
there was no significant difference between <elem></elem> and
<elem/> (in the absence of a DTD). A minority differed, feeling
that the first form implied that there *might* be content in some
circumstances, whereas the second form meant quite definitely
that there never could be content, and that this was a
distinction worth preserving.
We now have a position where elements with declared content are
represented in documents in the NET form on those occasions
when they happen -- perhaps by chance -- not to have content.
The element is frequently seen both with content and in NET
form in the same document instance, both with and without DTDs
or Schemas.
While this is possibly not important for non-persistent or
trivial applications, it is probably suboptimal for persistent
documents, as it creates an unnecessary inconsistency which may
not easily be explained to future users.
In answer to your question: yes, your example does have content,
but it does not have character data content. The direct
equivalence of <elem/> with <elem></elem> only holds when the >
of the start-tag is followed directly by the < of the end-tag.
Absence of evidence is not necessarily evidence of absence.
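
The rule above (the > of the start-tag followed directly by the < of
the end-tag) can be sketched as a purely lexical check. This is only an
illustration of that rule, not an XML parser: the helper name
is_lexically_empty and both regular expressions are my own assumptions,
and attribute values containing < or > are deliberately not handled.

```python
import re

# Hypothetical helper for illustration only; not part of any XML library.
# Simplified name pattern (ASCII-oriented), not the full XML Name production.
_NAME = r"[A-Za-z_:][\w.:\-]*"

# Start-tag whose ">" is followed directly by the "<" of the matching end-tag.
_START_END = re.compile(rf"<({_NAME})(\s[^<>]*)?></\1\s*>")

# Empty-element (NET-style) tag such as <elem/>.
_EMPTY_TAG = re.compile(rf"<{_NAME}(\s[^<>]*)?/>")


def is_lexically_empty(fragment: str) -> bool:
    """Return True if the fragment is <x/> or <x></x> with nothing between."""
    s = fragment.strip()
    return bool(_START_END.fullmatch(s) or _EMPTY_TAG.fullmatch(s))


if __name__ == "__main__":
    for frag in ("<elem/>", "<elem></elem>", "<elem><![CDATA[]]></elem>"):
        print(frag, is_lexically_empty(frag))
```

Under this check <elem/> and <elem></elem> are treated alike, while
<elem><![CDATA[]]></elem> is not, because something sits between the
two tags even though it contributes no character data.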
///Peter