OASIS Mailing List Archives



Re: filtering noise (was Re: SAX LexicalHandler::comment issue)

On 06 Jul 2001 07:53:23 -0700, David Brownell wrote:
> > > To me, it comes down to not wanting to be stuck with the
> > > syntactic sugar DOM insists on.  I don't see attributes as
> > > being in that category, since they hold real data.  I'd rather
> > > just not spend the memory.
> > 
> > That doesn't strike me as a problem of the DOM - it strikes me as a
> > processing problem that hasn't been well-solved.
> Well, not consistently.  We are in a maze of twisty little passages,
> all different ... :)  There are plenty of solutions to this one.

Sure.  But everyone seems to point fingers at everyone else's processing
models, rather than looking at parsing architectures themselves.
> > The DOM (and Infoset, IMHO) need to be able to represent everything XML
> > 1.0 offers.  People who need less should be able to turn those things off.
> I'd turn that around:  people who need more (in DOM) should be able
> to turn them on.  Core APIs should bias towards simplifying; it's easy to
> add complexity later (likely even inevitable), but you can't add simplicity
> after-the-fact.

The DOM folks certainly could have created Core and Extra APIs for XML -
maybe it's time to go back and do that.  JDOM took a different approach
to adding simplicity after-the-fact.  It might be easier just to build a
SAXFilter that kills all comment, PI, CDATA section, and/or ignorable
whitespace events, with properties to assist developers who do or don't
want some of those aspects of XML reported.  Then you could have your
DOM and enjoy it too.
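
A rough sketch of the kind of filter described above, written against
Python's xml.sax rather than the Java SAX API this thread is about. The
class name NoiseFilter, its keyword options, the Recorder handler, and the
filtered_events helper are all hypothetical names invented for
illustration. It assumes the parent reader accepts the standard
lexical-handler property and calls comment/startCDATA/endCDATA on whatever
object is registered there; non-validating readers may never report
ignorable whitespace at all, so that switch is only exercised if the
parent does.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler
from xml.sax.saxutils import XMLFilterBase


class NoiseFilter(XMLFilterBase):
    """Hypothetical filter: drops 'noise' events unless they are switched on."""

    def __init__(self, parent, *, comments=False, pis=False,
                 cdata=False, ignorable_whitespace=False):
        super().__init__(parent)
        self.comments = comments
        self.pis = pis
        self.cdata = cdata
        self.ignorable_whitespace = ignorable_whitespace
        self._lexical = None  # downstream lexical handler, if one is set

    def setProperty(self, name, value):
        # Keep the downstream lexical handler here instead of handing it to
        # the parent, so lexical events have to pass through this filter.
        if name == property_lexical_handler:
            self._lexical = value
        else:
            super().setProperty(name, value)

    def parse(self, source):
        # Register the filter itself as the parent's lexical handler.
        self.getParent().setProperty(property_lexical_handler, self)
        super().parse(source)

    # Content events that count as noise.
    def processingInstruction(self, target, data):
        if self.pis:
            super().processingInstruction(target, data)

    def ignorableWhitespace(self, chars):
        if self.ignorable_whitespace:
            super().ignorableWhitespace(chars)

    # Lexical events. CDATA text still arrives through characters();
    # only the section boundaries are suppressed.
    def comment(self, content):
        if self.comments and self._lexical is not None:
            self._lexical.comment(content)

    def startCDATA(self):
        if self.cdata and self._lexical is not None:
            self._lexical.startCDATA()

    def endCDATA(self):
        if self.cdata and self._lexical is not None:
            self._lexical.endCDATA()

    def startDTD(self, name, public_id, system_id):
        if self._lexical is not None:
            self._lexical.startDTD(name, public_id, system_id)

    def endDTD(self):
        if self._lexical is not None:
            self._lexical.endDTD()


class Recorder(ContentHandler):
    """Hypothetical downstream consumer that just records what it sees."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))

    def processingInstruction(self, target, data):
        self.events.append(("pi", target))

    def comment(self, content):
        self.events.append(("comment", content))

    def startCDATA(self):
        self.events.append(("cdata-start", None))

    def endCDATA(self):
        self.events.append(("cdata-end", None))


def filtered_events(xml_text, **options):
    """Parse xml_text through a NoiseFilter and return the recorded events."""
    recorder = Recorder()
    noise_filter = NoiseFilter(xml.sax.make_parser(), **options)
    noise_filter.setContentHandler(recorder)
    noise_filter.setProperty(property_lexical_handler, recorder)
    noise_filter.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.events


SAMPLE = "<a>x<!--c--><?p d?><![CDATA[y]]></a>"
```

Calling filtered_events(SAMPLE) reports only the element and its text,
while filtered_events(SAMPLE, comments=True, pis=True, cdata=True) lets
everything through. A tree builder hung off the filter instead of the
Recorder would never see the suppressed events, so it would never spend
memory on them.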

Of course, XML itself is a grand example of adding simplicity after the
fact (of SGML), so maybe there's time yet.

> > Unfortunately, no one seemed to like the
> > controlled-streaming-into-a-tree model at the time these things started,
> > and now we've just got pileups.
> I think there were plenty of folk who liked it, it's just that they
> weren't the ones calling the shots ... :)


> One thing to keep in mind is that DOM came out of the
> "Dynamic HTML in JavaScript" world, which didn't start
> out as a decent (systems) programming language.  The
> browser DOM implementations couldn't easily adopt such
> models.

Heh.  My first book was on Dynamic HTML, so I appreciate the evil

Nonetheless, I don't think the problem is that browsers were incapable
of handling such models - the problem comes from assuming that there is a
browser infrastructure building the DOM, and not exploring that build
process thoroughly.