Re: SAX2 ... missing features?

From: "David Brownell" <david-b@pacbell.net>

> SAX2 has been out for a while, and I'm curious what folks'
> pet peeves are.  Please share!  I'd hope that some of these
> could turn into (backwards compatible) updates.

David, here is my list, again in no particular order:

-New Features:
* concatenate-characters
Setting this feature would cause the SAX Driver to concatenate contiguous
characters (including resolved entities resulting in character data) into a
single call to ContentHandler.characters().  This must be a very common
requirement, one that applications have to manage themselves at present (and
some are probably unaware of the need).  This could also be achieved with a
filter - see New Filter Classes below.

* preserve-systemIds
Setting this feature would prevent the SAX Driver from making system
identifiers absolute before calling the EntityResolver.  This has been
previously suggested and is registered in SourceForge as #434478.
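
As a rough sketch of how an application might ask for such features while
still working with drivers that do not know them, the helper below wraps
XMLReader.setFeature().  The feature URIs and the OptionalFeatures class are
invented for illustration only; neither SAX2 nor any parser defines them.

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

/** Hypothetical helper for requesting optional, non-standard SAX2 features. */
class OptionalFeatures {
    // Assumed URIs, made up for this sketch; no SAX driver recognizes them.
    static final String CONCATENATE_CHARACTERS =
            "http://example.org/sax/features/concatenate-characters";
    static final String PRESERVE_SYSTEM_IDS =
            "http://example.org/sax/features/preserve-systemIds";

    /**
     * Tries to switch the feature on.  Returns true if the driver accepted
     * it, false if the driver does not recognize or support it, in which
     * case the application has to fall back to doing the work itself.
     */
    static boolean tryEnable(XMLReader reader, String featureUri) {
        try {
            reader.setFeature(featureUri, true);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }
}
```

An application would call OptionalFeatures.tryEnable(reader,
OptionalFeatures.CONCATENATE_CHARACTERS) and, on a false result, install its
own buffering instead.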

-New Filter Classes
I wonder why the XMLFilter lumps all the core interfaces together into one
filter.  Would it not make more sense to follow the typical Java I/O model
and create Filter classes for each type of Handler?  This would allow
applications to have finer-grained control, and it would offer the same
facilities for SAX extensions.  For example, I could foresee a use for a
CharacterConcatenationFilter which implements ContentHandlerFilter,
concatenating contiguous characters as described above.
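
A minimal sketch of such a filter follows.  Since ContentHandlerFilter does
not exist in SAX2, this version is built on the existing XMLFilterImpl helper
instead; the class name and the buffering strategy are assumptions, not part
of any released API.  It buffers character data and passes it on as a single
characters() call when some other content event arrives.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter that merges contiguous characters() calls into one. */
class CharacterConcatenationFilter extends XMLFilterImpl {
    private final StringBuilder buffer = new StringBuilder();

    CharacterConcatenationFilter() {
    }

    CharacterConcatenationFilter(XMLReader parent) {
        super(parent);
    }

    /** Passes any buffered text downstream as a single characters() call. */
    private void flush() throws SAXException {
        if (buffer.length() > 0) {
            char[] text = new char[buffer.length()];
            buffer.getChars(0, text.length, text, 0);
            buffer.setLength(0);
            super.characters(text, 0, text.length);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Hold the text back until a different kind of event arrives.
        buffer.append(ch, start, length);
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        flush();
        super.startPrefixMapping(prefix, uri);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        flush();
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        flush();
        super.endElement(uri, localName, qName);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length)
            throws SAXException {
        flush();
        super.ignorableWhitespace(ch, start, length);
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        flush();
        super.processingInstruction(target, data);
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        flush();
        super.skippedEntity(name);
    }

    @Override
    public void endDocument() throws SAXException {
        flush();
        super.endDocument();
    }
}
```

The filter would be placed between the driver and the application's handler,
for example by passing the driver's XMLReader to the constructor and calling
setContentHandler() on the filter.  Comments are reported through
LexicalHandler, which XMLFilterImpl does not relay, so text on either side of
a comment is merged by this sketch.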

-Provide streaming interfaces for comments and PIs
Currently the SAX Driver is forced to buffer the comment/PI text, which can
be arbitrarily large.

-Provide raw content model and internal entity values
The DeclHandler provides useful information for creating DTD documentation.
Additional pieces of information that are missing, however, are the raw values
(where %PE;s are unexpanded) of content models and internal entities.
This is used to good effect in other DTD documentation utilities such as
Norman Walsh's DTDParse [1].

-Namespace mapping
The current design of ContentHandler (startPrefixMapping/endPrefixMapping)
requires that the application maintain a stack of scoped namespace contexts
(probably by using NamespaceSupport).  Each call to startPrefixMapping must
push a prefix/uri pair onto the stack, and calls to endPrefixMapping should
pop the stack.  It would be easier for applications if there were an
additional method called changePrefixMapping which provided the latest
prefix/uri mapping.  In this way the application could simply maintain a map
rather than a stack.  Probably too late to do anything about this though ;-(
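
To illustrate the bookkeeping the current design asks of an application, here
is a rough sketch of a handler that tracks scoped mappings with
NamespaceSupport.  The class name and the lookup() method are made up for the
example.  It follows the usual NamespaceSupport pattern of one context per
element, pushed before the first declaration and popped at endElement.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.NamespaceSupport;

/** Hypothetical handler that tracks in-scope prefix mappings. */
class ScopedNamespaceHandler extends DefaultHandler {
    private final NamespaceSupport namespaces = new NamespaceSupport();
    // True once a context has been pushed for the element about to start.
    private boolean contextPushed = false;

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // Mappings are reported before the startElement they belong to,
        // so the element's context must be pushed here on first use.
        if (!contextPushed) {
            namespaces.pushContext();
            contextPushed = true;
        }
        namespaces.declarePrefix(prefix, uri);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // Elements without declarations still need a context of their own.
        if (!contextPushed) {
            namespaces.pushContext();
        }
        contextPushed = false;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Leaving the element restores the enclosing scope's mappings.
        namespaces.popContext();
    }

    /** Returns the URI currently bound to the prefix, or null if unbound. */
    String lookup(String prefix) {
        return namespaces.getURI(prefix);
    }
}
```

With a plain map, a nested redeclaration of a prefix would overwrite the
outer binding, and endPrefixMapping reports only the prefix, not the URI that
comes back into scope; that gap is what the stack fills here.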

Rob Lugt
ElCel Technology

[1] http://www.nwalsh.com/perl/dtdparse/index.html