OASIS Mailing List Archives

   RE: [xml-dev] Handling of empty content and white space-only content?


> If an element is declared to contain only other elements, then
> whitespace is not significant. If, however, an element is declared
> to contain parsable-character-data, then all whitespace is
> significant whether or not it comes directly after or directly before
> an element tag.

This is a popular interpretation of the spec, but it's not what it actually
says. The only place the spec uses the adjective "significant" in relation
to whitespace is in the first paragraph of 2.10:

In editing XML documents, it is often convenient to use "white space"
(spaces, tabs, and blank lines) to set apart the markup for greater
readability. Such white space is typically not intended for inclusion in the
delivered version of the document. On the other hand, "significant" white
space that should be preserved in the delivered version is common, for
example in poetry and source code.

This is clearly a rather informal introduction designed to motivate the
normative statements that follow in the next paragraph (and the use of
quotation marks should probably be read as "so-called"):

An XML processor MUST always pass all characters in a document that are not
markup through to the application. A  validating XML processor MUST also
inform the application which of these characters constitute white space
appearing in element content.

It's understandable that people should refer to "white space appearing in
element content" as "insignificant whitespace", but it's not a formally
defined term.
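
As a rough illustration of the first MUST above, here is a minimal Python
sketch. It assumes the standard library's xml.etree.ElementTree, which sits on
a non-validating parser, so it can only show that every non-markup character
reaches the application; it never reports which white space appears in element
content. The sample document and the helper name text_chunks are made up for
the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <poem> element holds only white space plus one
# child, and the <line> text contains two consecutive spaces.
DOC = "<poem>\n  <line>Roses  are red</line>\n</poem>"


def text_chunks(xml_text):
    """Return each run of character data the parser hands over, in document order."""
    root = ET.fromstring(xml_text)
    # itertext() yields element text and tail text without trimming anything.
    return list(root.itertext())


if __name__ == "__main__":
    for chunk in text_chunks(DOC):
        # repr() makes newlines and doubled spaces visible.
        print(repr(chunk))
```

Nothing is dropped or collapsed: the newline and indentation around <line> are
delivered just like the doubled space inside it. Deciding which of those
characters matter is left to the application.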

I think it would be quite legitimate to refer to the first of the two spaces
between "A" and "validating" in the second extract above as "insignificant"
in the sense of the first paragraph, but there's certainly nothing in the
XML spec that causes it to be treated as such.

Incidentally, I hate the spelling of "white space" as two words. The term
does not mean "space that is white" (as distinct from space that is red or
green), therefore it should be a single word.

Michael Kay



Copyright 2001 XML.org. This site is hosted by OASIS