At 5:54 PM -0800 1/5/04, Jeff Rafter wrote:
>This is one of those questions that are more for curiosity than anything,
>but does anyone have any information on why ignorableWhitespace was included
>in ContentHandler as opposed to LexicalHandler? Based on my understanding of
>the guidelines used in determining what belongs in the default interfaces
>and what belongs in the extension interfaces it seems to fall under the
>latter. It is non-imperative lexical information associated with the parse.
>Comments?
That is incorrect. XML parsers must report all content, ignorable or
otherwise. It is not optional to report this content, unlike, for
example, CDATA section boundaries. The word "ignorable" is an
unfortunate choice here. It means the application receiving the data
may choose to ignore it. However, the parser cannot ignore this
content. It must provide it.
It's also the case that a lot of white space many people think is
ignorable really isn't. White space is only really ignorable if
there's a DTD, and even then you may choose not to ignore it. I
prefer the less loaded term "boundary white space", which identifies
all white-space-only text nodes, not just those that are ignorable.
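
To make the distinction concrete, here is a minimal sketch using the
standard JAXP/SAX API. The class name WhitespaceDemo, the count()
helper, and the two sample documents are made up for illustration.
It parses the same markup with and without a DTD and tallies how
many characters arrive through characters() versus
ignorableWhitespace(). Which callback delivers the boundary white
space when a DTD is present depends on the parser, so treat the
split as something to observe rather than rely on. What should not
change is the total: every character of content is reported either
way.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX itself.
class WhitespaceDemo {
    // The DTD declares <a> as element-only content, so the white space
    // around <b> is "ignorable" in the XML sense.
    static final String WITH_DTD =
        "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b (#PCDATA)>]>\n"
        + "<a>\n  <b>text</b>\n</a>";

    // Same instance, no DTD: nothing tells the parser the white space
    // is ignorable, so it is ordinary character data.
    static final String WITHOUT_DTD = "<a>\n  <b>text</b>\n</a>";

    /** Returns { chars seen by characters(), chars seen by ignorableWhitespace() }. */
    static int[] count(String xml) throws Exception {
        final int[] totals = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                totals[0] += length;
            }

            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                totals[1] += length;
            }
        };
        // Default (non-validating) parser from the platform's JAXP implementation.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return totals;
    }

    public static void main(String[] args) throws Exception {
        int[] with = count(WITH_DTD);
        int[] without = count(WITHOUT_DTD);
        System.out.println("with DTD:    characters=" + with[0]
            + " ignorableWhitespace=" + with[1]);
        System.out.println("without DTD: characters=" + without[0]
            + " ignorableWhitespace=" + without[1]);
    }
}
```

An application that wants to treat boundary white space uniformly
can simply forward ignorableWhitespace() to the same code path as
characters() and make its own decision about what to discard.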
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA