Re: filtering noise (was Re: SAX LexicalHandler::comment issue)
- From: David Brownell <david-b@pacbell.net>
- To: xml-dev@lists.xml.org
- Date: Fri, 06 Jul 2001 15:26:18 -0700
> == Mike.Champion@SoftwareAG-USA.com
> But I've also been answering "help!" questions about the DOM long enough to
> know that you can't assume that anyone will RTFM (RTFS?), and setting the
> default to throw away noise will inevitably lead to howls from people who
> *need* their <expletive deleted> comments, PIs, and CDATA sections.
I've been answering "help!" questions (in general; less recently for DOM :)
long enough to know that howls come up regardless of whether or not you do
the right thing! So for such issues I look at other factors: which approach
makes better systems easier to develop? Which wastes less memory? (That
can be a real concern for DOM developers. I've seen "noise" costs
in the 20% range for some data models, though they vary wildly.)
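To make "throwing away noise" concrete, here is a minimal sketch, assuming Python's standard xml.dom.minidom rather than any API discussed in this thread. The helper name strip_noise is hypothetical. It drops comments and PIs from an already-built tree and folds CDATA sections into ordinary text; a parser that filtered by default would simply never create those nodes.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Node types treated as "noise" in this sketch (an assumption, not a standard list).
NOISE = {Node.COMMENT_NODE, Node.PROCESSING_INSTRUCTION_NODE}


def _strip(node, doc):
    # Iterate over a copy, since children are removed or replaced while walking.
    for child in list(node.childNodes):
        if child.nodeType in NOISE:
            node.removeChild(child)
        elif child.nodeType == Node.CDATA_SECTION_NODE:
            # Keep the character data, drop the CDATA wrapper.
            node.replaceChild(doc.createTextNode(child.data), child)
        else:
            _strip(child, doc)


def strip_noise(doc):
    """Remove comments and PIs, and turn CDATA sections into plain text."""
    _strip(doc, doc)
    # Merge text nodes that became adjacent after the removals.
    doc.normalize()
    return doc


doc = minidom.parseString(
    '<?xml version="1.0"?><!-- top --><root><?pi data?>a<!-- c --><![CDATA[<b>]]></root>'
)
strip_noise(doc)
root = doc.documentElement
```

After the call, root holds a single text node whose data is the concatenation of the plain text and the former CDATA content, and the top-level comment is gone from the document.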
> I don't think the DOM can take the lead here; either the InfoSet has to
> first define the difference between music and noise, or the XML Core folks
> have to deprecate the noise from XML syntax, and then the DOM can follow.
> ...
> So, I guess my answer is just as cowardly as John Cowan's explanation of why
> the InfoSet still represents the noise :~)
Yep ... nobody willing to take a stance about policy, beyond "enable all of
them". That's a symptom of organizations at certain points in their growth;
I've seen it (too) many times! There are much worse process outcomes,
but I'll still prefer better ones.
> Seriously, folks ... to paraphrase Clemenceau and Gen. Jack D. Ripper, "XML
> is too important to be left to the experts." The trouble with most people
> who work on these specs is not that they're stupid, but that they know too
> damn much about how this stuff works (and worked in SGML), how it really is
> useful under some circumstances, and how to ignore it when it's not useful.
> If y'all want simplicity, sanity, layering, modularity, etc. you're going to
> have to collectively put some feet to the fire, or maybe vote with your own
> feet.
I think there's a certain feeling that neither of those options seems to be
particularly viable with respect to W3C. (Consider that upcoming workshop
addressing the fact that lots of XML-ish specs don't seem to layer cleanly.)
On the other hand, I did (re)submit feedback this morning to the DOM WG
that noisy data representations shouldn't be the default, which is as much as
most of us are in a position to do.
- Dave