8/2/02 12:54:53 PM, Mike Brown <mike@skew.org> wrote:
>Although you know the element contains a single string of character data,
>'Shipping Address (< 31)', a parser, acting on the lexical constructs it
>encounters, is likely, though not required, to report this as 3 calls to
>characters() in the case of SAX, or as 3 adjacent text nodes in the case of
>DOM. This is normal behavior. In SAX you just have to live with it. In DOM you
>may have the option of normalizing the text beforehand; check the DOM specs
>and docs for your parser.
Correct in the case of SAX; the spec explicitly says that characters() may deliver its results in
chunks. Incorrect in the case of DOM; the spec says that a parser must represent contiguous text as
a single node when creating a DOM and that adjacent text nodes in a DOM can only arise out of
subsequent manipulation.
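
To make the distinction concrete, here is a minimal sketch in Python using the standard library's xml.sax and xml.dom.minidom. Neither was mentioned in the thread, and the names TextCollector, sax_text and dom_text_after_edit are made up for illustration. The SAX half buffers whatever chunks characters() delivers and joins them at the end of the element, so it does not care how many calls the parser makes. The DOM half creates an adjacent text node by manipulation after parsing, then merges it with normalize().

```python
import xml.sax
from xml.dom import minidom

XML = "<label>Shipping Address (&lt; 31)</label>"


class TextCollector(xml.sax.ContentHandler):
    """Buffers characters() chunks and joins them when the element ends."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.text = None

    def startElement(self, name, attrs):
        # Start a fresh buffer for this element's character data.
        self.chunks = []

    def characters(self, content):
        # May be called once or several times for one run of text.
        self.chunks.append(content)

    def endElement(self, name):
        # Only now is the complete string known.
        self.text = "".join(self.chunks)


def sax_text(xml_string):
    """Return (joined text, list of chunks as delivered by the parser)."""
    handler = TextCollector()
    xml.sax.parseString(xml_string.encode("utf-8"), handler)
    return handler.text, handler.chunks


def dom_text_after_edit(xml_string):
    """Append a text node after parsing, then normalize the element.

    Returns (child count before normalize, child count after, merged text).
    """
    doc = minidom.parseString(xml_string)
    elem = doc.documentElement
    # Manipulation after parsing: this creates an adjacent text node.
    elem.appendChild(doc.createTextNode(" [edited]"))
    before = len(elem.childNodes)
    elem.normalize()  # merges adjacent text nodes
    after = len(elem.childNodes)
    return before, after, elem.firstChild.data


if __name__ == "__main__":
    text, chunks = sax_text(XML)
    print(len(chunks), repr(text))
    print(dom_text_after_edit(XML))
```

The number of SAX chunks printed depends on the underlying parser, which is the point: code that reads the buffer only in endElement() behaves the same however the text is split.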