RE: CDATA sections in W3C XML Infoset


> -----Original Message-----
> From: Bob Kline [
> Sent: Saturday, March 31, 2001 5:23 PM
> To: Charles Reitzel
> Cc: xml-dev@lists.xml.org; Tim Bray
> Subject: Re: CDATA sections in W3C XML Infoset
> I hope you're right.  Doesn't appear to be a universally held point of
> view, though.  From earlier in this thread [1]:
> > I'd take that as meaning that the DOM does not conform to the
> > Infoset spec.  Accordingly, the DOM is what needs to be changed, not
> > Infoset.

I can assure you (unofficially, but after having participated in several
DOM WG discussions on this matter) that the DOM plans to support CDATA sections for the
foreseeable future.  They are needed by XML editors and databases;
the DOM API is widely used by both, so CDATA section support will remain
as long as CDATA sections remain in the XML serialization format.

The DOM *will* change to accommodate the InfoSet: by a parse-time option to throw
away CDATA section markers; probably by some yet-to-be-determined mappings from the
v 2.0 XPath/XSLT and Query data models to the more "raw syntax" data model in
the DOM; by some reconciliation between the syntactical representation of
namespace declarations as attributes and their more abstract representation in
the InfoSet; and so forth.
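To make the "raw syntax" versus abstract data model distinction concrete, here is a minimal sketch using two parsers from the Python standard library. It is only an illustration, not the DOM WG's planned parse-time option: `xml.dom.minidom` keeps a CDATA section as its own node, while `xml.etree.ElementTree` folds it into ordinary character data, which is closer to the InfoSet view. The names `SAMPLE`, `dom_node_types` and `flattened_text` are made up for this example.

```python
from xml.dom import Node, minidom
import xml.etree.ElementTree as ET

# Hypothetical sample: plain text, then a CDATA section, then plain text.
SAMPLE = "<doc>a<![CDATA[<b> & ]]>c</doc>"


def dom_node_types(xml_text):
    """Return the node types of the root element's children as minidom sees them."""
    root = minidom.parseString(xml_text).documentElement
    return [child.nodeType for child in root.childNodes]


def flattened_text(xml_text):
    """Return the root element's text as ElementTree sees it (no CDATA markers)."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    # minidom reports the CDATA section as a separate node between two text nodes.
    print(dom_node_types(SAMPLE))
    # ElementTree reports a single string with the CDATA content merged in.
    print(repr(flattened_text(SAMPLE)))
```

An editor that must write the document back out unchanged needs the first view; a query or schema processor is usually happy with the second.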

I don't REALLY think there is all that much disagreement here.  CDATA sections are
a bit of a mess to use even at the pure text level; they're useful for escaping
blocks of non-well-formed content, but dangerous because the content might contain
the character string that ends a CDATA section ("]]>").  Used carefully, they are useful
in certain limited circumstances (such as Bob Kline's application) but I've
heard very little demand for them to be supported by XPath/XSLT, XQuery,
Schema, etc.  Thus the InfoSet folks chose to leave them out.  I *hope* that
Mr. Cowan's quote means something like "better for the DOM to figure out how to
peacefully co-exist with XPath/Query/Schema than for the other specs to have
to wrestle with the raw syntax stuff that the DOM has to deal with."
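The delimiter hazard mentioned above can be sketched in a few lines of Python. This is one common workaround, not something any of the specs mandate: whenever the content contains "]]>", close the section after the "]]" and open a new one before the ">". The function name `wrap_in_cdata` is hypothetical.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(text):
    """Wrap text in CDATA sections, splitting any embedded "]]>" across two sections."""
    # "]]>" becomes "]]" + "]]>" (end of section) + "<![CDATA[" (new section) + ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


if __name__ == "__main__":
    payload = "if (a[b[0]]>1) { x = '<tag>' & y; }"
    document = "<script>" + wrap_in_cdata(payload) + "</script>"
    # A parser that merges CDATA into character data should give the payload back.
    assert ET.fromstring(document).text == payload
```

Note that the result is two adjacent CDATA sections rather than one, so a DOM that preserves the markers will show two nodes where the author wrote a single block of text.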

I'd remind people once again of the Common XML Usage Guidelines at
http://simonstl.com/articles/cxmlspec.txt.  It is sort of like an
ancient map of the XML world, with the "Common XML Core" identifying the
civilized world and all sorts of "here be dragons" notations
denoting the Terra Incognita of Interoperability.  The InfoSet is, to
a certain extent, the W3C's admission of the truths behind Common XML -- the
parts of XML syntax that it doesn't include in the abstract data model are
among the most dragon-infested regions of XML-space, especially CDATA sections
and entity references!

[If this reply sounds a bit schizophrenic, it's because my inner minimalist hates
CDATA sections and hopes they die a painful death in XML 2.0, but my outer
DOM/day job persona sees all the uses they have in the real world today].