OASIS Mailing List Archives



RE: filtering noise (was Re: SAX LexicalHandler::comment issue)


> -----Original Message-----
> From: David Brownell [
> Sent: Friday, July 06, 2001 12:18 PM
> To: Simon St.Laurent
> Cc: xml-dev@lists.xml.org
> Subject: Re: filtering noise (was Re: SAX
> LexicalHandler::comment issue)
> I'm not sure I'd say that's where DOM's bias towards "noise"
> nodes came from, but that might be it.

The DOM's bias toward noise comes basically from the collective sense that if it's in XML syntax (and, more recently, in the InfoSet), the DOM should be capable of loading, seeing, and manipulating it.  Also, if there's noise in a particular instance, the chances are that it's music to somebody's ears, or they wouldn't have put it there in the first place.

I'm of two minds.  On one hand, I agree that noise should be discouraged, if not outright deprecated.  Setting DOM defaults to discourage noise makes sense from this perspective.

But I've also been answering "help!" questions about the DOM long enough to know that you can't assume that anyone will RTFM (RTFS?), and setting the default to throw away noise will inevitably lead to howls from people who *need* their <expletive deleted> comments, PIs, and CDATA sections. 
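
To make the trade-off concrete, here is a minimal sketch (not anything the DOM spec defines) of noise removal as an explicit opt-in step on top of a parser that keeps everything by default. It uses Python's standard `xml.dom.minidom`; the `strip_noise` helper, the `NOISE_TYPES` set, and the sample document are hypothetical names invented for this illustration, and the choice of what counts as "noise" is an assumption.

```python
from xml.dom import minidom, Node

# Node types treated as "noise" in this sketch (an assumption, not a DOM rule).
NOISE_TYPES = {Node.COMMENT_NODE, Node.PROCESSING_INSTRUCTION_NODE}


def strip_noise(node):
    """Remove comments and PIs below `node`; turn CDATA sections into plain text."""
    owner = node.ownerDocument or node  # a Document node has no ownerDocument
    for child in list(node.childNodes):  # copy: the tree is edited while walking
        if child.nodeType in NOISE_TYPES:
            node.removeChild(child)
        elif child.nodeType == Node.CDATA_SECTION_NODE:
            node.replaceChild(owner.createTextNode(child.data), child)
        else:
            strip_noise(child)


SAMPLE = "<doc><!-- note --><?app hint?><p>a<![CDATA[<b>]]>c</p></doc>"

doc = minidom.parseString(SAMPLE)  # default: comments, PIs, CDATA all preserved
strip_noise(doc)                   # opt-in: the caller asks for the noise to go
doc.normalize()                    # merge the adjacent text nodes left behind
result = doc.documentElement.toxml()
```

With the filtering kept as a separate call, people who need their comments, PIs, and CDATA sections lose nothing unless they ask for it, while those who want a quieter tree pay one extra line.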

I don't think the DOM can take the lead here; either the InfoSet has to first define the difference between music and noise, or the XML Core folks have to deprecate the noise from XML syntax, and then the DOM can follow.  Until then, I think most people expect the DOM to present an API to XML as the XML folks define it, hideous warts and all.

So, I guess my answer is just as cowardly as John Cowan's explanation of why the InfoSet still represents the noise :~)

Seriously, folks ... to paraphrase Clemenceau and Gen. Jack D. Ripper, "XML is too important to be left to the experts."  The trouble with most people who work on these specs is not that they're stupid, but that they know too damn much about how this stuff works (and worked in SGML), how it really is useful under some circumstances, and how to ignore it when it's not useful. If y'all want simplicity, sanity, layering, modularity, etc., you're going to have to collectively hold some feet to the fire, or maybe vote with your own feet.