"Scherpenzeel, Wim" wrote:
> Now as I understood, whitespace is insignificant, provided there is a DTD to
> specify to the parser that there aren't any mixed content elements.
>
> Yet the following two documents yield different hash-totals, can anybody
> explain why?
>
> Document 1:
>
> <!DOCTYPE foo SYSTEM "E:\My Documents\xml\voorbeelden\DOMHash\foobar.dtd">
> <foo>
> <bar>blablabla</bar>
> </foo>
>
> Document 2:
>
> <!DOCTYPE foo SYSTEM "E:\My Documents\xml\voorbeelden\DOMHash\foobar.dtd">
> <foo><bar>blablabla</bar></foo>
>
> DTD:
>
> <!ELEMENT foo (bar)>
> <!ELEMENT bar (#PCDATA)>
According to the XML specification, all white space is significant:
the processor (parser) must pass it to the application and let the
application decide what to do with it.
"Insignificant whitespace" is a concept introduced by SAX 1.0 for the
situation you described, namely whitespace that appears between child
elements in an element known to have element content. SAX reports such
whitespace through a separate ignorableWhitespace callback, which makes
it easier for applications to process. Notice two things, though:
(1) the whitespace is still passed to the application, and (2) SAX 1.0
is outside the XML 1.0 spec. Document 1 contains line breaks between
<foo> and <bar> that Document 2 does not, so unless the hashing
application chooses to discard them, it is reasonable that your
documents yield different hash totals.
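
As a rough illustration of the point above, here is a minimal Python
sketch using the standard xml.dom.minidom module. It makes several
assumptions that are not in the original post: the DOCTYPE line is left
out so that the example is self-contained (the parse is therefore
non-validating), and tree_digest is a hypothetical toy digest, not the
actual DOMHash algorithm. It only shows that the whitespace reaches the
application as text nodes, and that the digests agree once the
application decides to drop whitespace-only text.

```python
import hashlib
from xml.dom import Node, minidom

# The two documents from the question, without the DOCTYPE line
# (assumption: omitted so the sketch needs no external DTD file).
DOC1 = "<foo>\n<bar>blablabla</bar>\n</foo>"
DOC2 = "<foo><bar>blablabla</bar></foo>"


def tree_digest(node, skip_blank_text=False):
    """Toy digest over element names and text data (NOT real DOMHash)."""
    h = hashlib.sha1()

    def walk(n):
        if n.nodeType == Node.ELEMENT_NODE:
            h.update(b"E:" + n.tagName.encode("utf-8"))
            for child in n.childNodes:
                walk(child)
            h.update(b"/E")
        elif n.nodeType == Node.TEXT_NODE:
            # The application, not the parser, decides whether
            # whitespace-only text counts.
            if skip_blank_text and not n.data.strip():
                return
            h.update(b"T:" + n.data.encode("utf-8"))

    walk(node)
    return h.hexdigest()


def digest(xml_text, skip_blank_text=False):
    root = minidom.parseString(xml_text).documentElement
    return tree_digest(root, skip_blank_text)


def child_count(xml_text):
    # Number of direct children of <foo>, whitespace text nodes included.
    return len(minidom.parseString(xml_text).documentElement.childNodes)


if __name__ == "__main__":
    print(child_count(DOC1), child_count(DOC2))
    print(digest(DOC1) == digest(DOC2))
    print(digest(DOC1, True) == digest(DOC2, True))
```

In this sketch <foo> in Document 1 has three children (a text node, the
<bar> element, another text node) while in Document 2 it has one, so the
digests differ until skip_blank_text is set.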
-- Ron