Fredrik Lindgren wrote:
> Please explain how the parser would both give the
> TypedContentHandler it's typed data and fulfill the
> expectations of the ContentHandler interface with regard
> to elements and attributes.
Actually, I can't explain how this would work because it
wouldn't. I was wrong when I wrote: "TypedContentHandler works exactly
like ContentHandler except that it can do just a little bit more." The
problem is that the "little bit more" that I mentioned would sometimes
mean "just a little bit less."
As I finally came to understand after a three-hour-long, but
very enjoyable, dinner with Elliotte Rusty Harold (excellent chicken,
BTW!), the core problem with my proposal is the part about using the
"callbackCharacters" feature to "turn off" callbacks to
ContentHandler.characters(). This changes the behavior of the
interface in such a way that classes that implement
TypedContentHandler don't see the expected behavior of the
ContentHandler.characters() method. This is not good. The contracts of
the ContentHandler interface should be met by interfaces that extend
it. Thus, TypedContentHandler.characters() should be called whenever
the SAX 2 specification says that ContentHandler.characters() should
be called. My proposal was syntactically legal in Java, it was simple,
it was many things... but it wasn't good design.
So, to do this "correctly" one could still extend
ContentHandler and create TypedContentHandler; however, the feature
flags would not be used. The result would be exactly what I proposed
but with both features turned on. A program that wanted characters
would catch the characters() callback, and one that wanted values would
catch the values() callback. The XMLReader would make both callbacks.
Of course, this would be somewhat inefficient... But, it would keep
things working properly and would allow programmers to "speak in
types", which has some value. The result would probably be just about
the same as what people do today, calling an appropriate conversion
routine on the data passed by each call to characters.
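
A rough sketch of the double-callback arrangement described above, in
Java. Only the TypedContentHandler name and its values() method come
from the proposal; the values(Object) signature, the reportText()
helper standing in for a type-aware XMLReader, and the RecordingHandler
class are assumptions made for illustration.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DoubleCallbackSketch {

    /** Hypothetical extension of ContentHandler; not part of SAX 2. */
    public interface TypedContentHandler extends ContentHandler {
        // Assumed signature: the proposal only names a values() callback.
        void values(Object value) throws SAXException;
    }

    /**
     * Stand-in for what a type-aware XMLReader would do for one run of
     * character data: make BOTH callbacks, with no feature flag involved.
     */
    public static void reportText(TypedContentHandler handler, String text,
                                  Object typedValue) throws SAXException {
        char[] ch = text.toCharArray();
        handler.characters(ch, 0, ch.length); // always made, per the ContentHandler contract
        handler.values(typedValue);           // made in addition, never instead
    }

    /** A handler may act on either callback and ignore the other. */
    public static class RecordingHandler extends DefaultHandler
            implements TypedContentHandler {
        public final StringBuilder text = new StringBuilder();
        public Object lastValue;

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void values(Object value) {
            lastValue = value;
        }
    }

    public static void main(String[] args) throws SAXException {
        RecordingHandler handler = new RecordingHandler();
        reportText(handler, "42", Integer.valueOf(42));
        System.out.println(handler.text + " / " + handler.lastValue);
    }
}
```

Because characters() is never suppressed, code written against plain
ContentHandler keeps working unchanged; the cost is the second call for
every run of text.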
However, the double callback would be very ugly, so it would
probably be better to provide a new interface with a
"getValueFromCharacters" method that would return a typed equivalent
of whatever was passed to characters(). This routine would rely on
information in the schema, etc. to make sure that conversions were
done properly and thus would be less error prone than what people do
today. Also, it wouldn't have to be called on every callback since
sometimes strings are what one wants. Doing it this way would mean
some improvements in performance over some of the alternatives and
should be considered "non-polluting" in any case.
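
One possible shape for such an interface, as a sketch only. The
method name getValueFromCharacters is from the text above; the
TypedValueSource interface name, the method signature, and the toy
FixedTypeSource class (which takes a type name directly instead of
consulting a real schema) are all assumptions. Note that SAX may
deliver one text node in several characters() calls, so a real handler
would probably convert the accumulated text rather than each chunk.

```java
import java.math.BigDecimal;

public class ValueConverterSketch {

    /** Hypothetical interface; its name and signature are assumptions. */
    public interface TypedValueSource {
        Object getValueFromCharacters(char[] ch, int start, int length);
    }

    /**
     * Toy implementation that is told the type up front. A real one
     * would look up the declared type of the current element or
     * attribute in the schema.
     */
    public static class FixedTypeSource implements TypedValueSource {
        private final String schemaType;

        public FixedTypeSource(String schemaType) {
            this.schemaType = schemaType;
        }

        @Override
        public Object getValueFromCharacters(char[] ch, int start, int length) {
            String raw = new String(ch, start, length);
            String lexical = raw.trim();
            switch (schemaType) {
                case "xs:int":
                    return Integer.valueOf(lexical);
                case "xs:decimal":
                    return new BigDecimal(lexical);
                case "xs:boolean":
                    return Boolean.valueOf(lexical.equals("true") || lexical.equals("1"));
                default:
                    return raw; // strings are handed back untouched
            }
        }
    }

    public static void main(String[] args) {
        TypedValueSource source = new FixedTypeSource("xs:int");
        char[] ch = " 42 ".toCharArray();
        // Called only where a typed value is wanted; elsewhere the
        // handler keeps using the characters as a plain string.
        Object value = source.getValueFromCharacters(ch, 0, ch.length);
        System.out.println(value.getClass().getSimpleName() + " " + value);
    }
}
```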
Of course, you could get back to exactly the behaviour that I
was proposing simply by making TypedContentHandler a peer of
ContentHandler rather than an extension of it. You would copy
ContentHandler and add the values() method. This would serve many
situations well, and by *not* being an extension of ContentHandler
would prevent subclasses or extensions from getting confused. But,
then, you would be complicating the construction of SAX pipelines and
filter chains. Not good -- thus, probably shouldn't be pursued.
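
To make the pipeline problem concrete, here is a small sketch. The
peer interface is abbreviated to two methods (a full peer would copy
every ContentHandler method), and PeerAdapter and Recorder are
hypothetical helper classes. XMLFilterImpl is the standard SAX helper,
and its setContentHandler() accepts only a ContentHandler.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class PeerHandlerSketch {

    /** Hypothetical peer of ContentHandler (abbreviated); it does NOT extend it. */
    public interface TypedContentHandler {
        void characters(char[] ch, int start, int length) throws SAXException;
        void values(Object value) throws SAXException;
    }

    /** Adapter needed before a peer handler can sit behind a standard filter. */
    public static class PeerAdapter extends DefaultHandler {
        private final TypedContentHandler target;

        public PeerAdapter(TypedContentHandler target) {
            this.target = target;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            target.characters(ch, start, length);
        }
    }

    /** Records whatever reaches it. */
    public static class Recorder implements TypedContentHandler {
        public final StringBuilder text = new StringBuilder();
        public Object lastValue;

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void values(Object value) {
            lastValue = value;
        }
    }

    public static void main(String[] args) throws SAXException {
        Recorder recorder = new Recorder();
        XMLFilterImpl filter = new XMLFilterImpl();
        // filter.setContentHandler(recorder); // would not compile: not a ContentHandler
        filter.setContentHandler(new PeerAdapter(recorder));

        char[] ch = "42".toCharArray();
        filter.characters(ch, 0, ch.length); // forwarded through the adapter
        System.out.println(recorder.text + " / " + recorder.lastValue);
    }
}
```

The characters arrive, but nothing in a standard filter knows about
values(), so the typed callback has no path through the chain unless
every filter in it is rewritten or wrapped.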
Anyway, enjoy and thanks for all the fish. Sorry 'bout the
diversion. I was wrong.
bob wyman