At 9:51 AM -0500 3/8/04, Karl Waclawek wrote:
>> Being able to rely on
>> startDocument()/endDocument() in the ContentHandler allows all the
>> initialization and tear-down code to easily go in the same class as
>> the code that fills the data structure. It's all neatly unified.
>Why could it not go in the same class in the other case?
If the ContentHandler doesn't have any initialization or cleanup
methods (or at least any reliably invoked ones) then it can't do the
initialization or cleanup. Something else has to do it. You could add
custom initialization or cleanup methods and then have the something
else call these.
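
A minimal sketch of that arrangement, for illustration only. The class
names and the init() and cleanup() methods are hypothetical; they are
not part of SAX, and the driver has to remember to call them in the
right order around every parse:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects element names.
// init() and cleanup() are illustrative names, not SAX callbacks.
class NameCollector extends DefaultHandler {
    private List<String> names;

    // The driver must call this before every parse.
    public void init() {
        names = new ArrayList<>();
    }

    // The driver must call this after every parse to fetch the result.
    public List<String> cleanup() {
        List<String> result = names;
        names = null;
        return result;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        names.add(qName); // names is still null if init() was skipped
    }
}

// Hypothetical driver: the "something else" that owns the call order.
class ExternalSetupDriver {
    static List<String> parse(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        NameCollector handler = new NameCollector();
        reader.setContentHandler(handler);
        handler.init();
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.cleanup();
    }
}
```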
But that's still ugly and less than ideal. As I teach my students, if
certain public methods must be invoked in a certain order, then
something is wrong. They should be made private and combined into one
public method. Each public method call should be atomic and
independent of other public methods. Having the ContentHandler do its
own initialization and cleanup makes the code clean and robust.
Relying on others to do it makes the code ugly and brittle. It's
analogous to the difference between programming in a language like C
with explicit memory allocation and deallocation and a language like
Java with automatic memory management. Both will get the job done, but
one's a heck of a lot cleaner and less bug-prone.
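
For comparison, a sketch of a handler that does its own setup and
teardown in startDocument() and endDocument(). The class names are
again hypothetical, and the sketch assumes the parser invokes both
callbacks on every parse, which is exactly the guarantee under
discussion:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that manages its own per-document state.
class ReusableCollector extends DefaultHandler {
    private List<String> current;
    private List<String> lastResult;

    @Override
    public void startDocument() {
        current = new ArrayList<>(); // fresh state for each document
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        current.add(qName);
    }

    @Override
    public void endDocument() {
        lastResult = current; // publish the finished result
        current = null;       // release the working state
    }

    public List<String> getLastResult() {
        return lastResult;
    }
}

// Hypothetical driver: it only parses, with no setup or teardown calls.
class ReuseDemo {
    static List<List<String>> parseAll(String... docs) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        ReusableCollector handler = new ReusableCollector();
        reader.setContentHandler(handler);
        List<List<String>> results = new ArrayList<>();
        for (String doc : docs) {
            reader.parse(new InputSource(new StringReader(doc)));
            results.add(handler.getLastResult());
        }
        return results;
    }
}
```

The same handler instance is reused across documents without the
driver knowing anything about its internal state.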
Oh, it just hit me why startDocument() is not an adequate replacement
for endDocument(). There's often work you want to do at the end of a
parse irrespective of whether there's a next document or not. For
instance, you might want to store the results in a database
somewhere, or update some other variable. The purpose of
endDocument() is not solely to clean up any data structures that were
used. We need both startDocument() and endDocument(), not just one.
Yes, they may not be named precisely correctly, but we do need them,
and not being able to rely on them is a major hassle.
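
A sketch of that kind of end-of-parse work, with hypothetical names
throughout. The Consumer here is only a stand-in for a database write
or some other update; startDocument() resets the count and
endDocument() hands off the result, so neither can replace the other:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that counts elements and reports the total
// once the document is finished.
class ElementCounter extends DefaultHandler {
    private final Consumer<Integer> sink; // stand-in for a database write
    private int count;

    ElementCounter(Consumer<Integer> sink) {
        this.sink = sink;
    }

    @Override
    public void startDocument() {
        count = 0; // initialization
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        count++;
    }

    @Override
    public void endDocument() {
        sink.accept(count); // end-of-parse work, whether or not another document follows
    }
}

class CounterDemo {
    static List<Integer> run(String... docs) throws Exception {
        List<Integer> stored = new ArrayList<>();
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new ElementCounter(stored::add));
        for (String doc : docs) {
            reader.parse(new InputSource(new StringReader(doc)));
        }
        return stored;
    }
}
```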
Elliotte Rusty Harold
Effective XML (Addison-Wesley, 2003)