   Re: White Space

  • From: David Brownell <david-b@pacbell.net>
  • To: David Megginson <david@megginson.com>
  • Date: Mon, 16 Aug 1999 10:07:33 -0700

David Megginson wrote:
> arkin writes:
>  > A generic SAX parser has two methods of reporting character data, one
>  > clearly indicates that such character data is whitespace. What type of
>  > whitespace should be reported as whitespace? Can the application simply
>  > ignore whatever character data is reported as whitespace?
> The only whitespace reported that way is whitespace in element-only
> content: that means that there has to be a DTD, and the DTD has to say
> that an element can contain only other elements.  This is a reporting
> requirement for validating parsers from the XML 1.0 recommendation.

Hmm, the XML spec never quite seemed clear about that to me.  It didn't
quite include a definition of the term "ignorable whitespace".

What about an empty element "<EMPTY>  <!-- spaces!! --> </EMPTY>" ...
isn't that "ignorable" whitespace as well?  It "must be" passed to the
app, and clearly isn't regular character text.

FWIW I concluded "ignorable" whitespace is within elements that have a
content model that's not "ANY" or a mixed content model.  That is, it's
wherever normal characters can't appear.
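For anyone who wants to check how their own parser draws this line, here is a minimal sketch. It is not from the thread: the class name, the sample document, and the `parse` helper are all made up for illustration. It assumes a validating JAXP parser and a DTD in which `list` has element-only content and `item` has mixed content, then collects what arrives through each of the two SAX callbacks.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceReport {
    // Hypothetical sample: <list> is element-only content, <item> is mixed content.
    static final String DOC =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE list [\n"
      + "  <!ELEMENT list (item)*>\n"
      + "  <!ELEMENT item (#PCDATA)>\n"
      + "]>\n"
      + "<list>\n  <item> a </item>\n  <item>   </item>\n</list>";

    /** Returns { text seen via characters(), text seen via ignorableWhitespace() }. */
    static String[] parse(String xml) throws Exception {
        StringBuilder chars = new StringBuilder();
        StringBuilder ignorable = new StringBuilder();

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // assumption: validation on, so the content model is used
        SAXParser parser = factory.newSAXParser();

        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one run of text, so accumulate.
                chars.append(ch, start, length);
            }

            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                ignorable.append(ch, start, length);
            }
        });
        return new String[] { chars.toString(), ignorable.toString() };
    }

    public static void main(String[] args) throws Exception {
        String[] result = parse(DOC);
        System.out.println("characters:          [" + result[0].replace("\n", "\\n") + "]");
        System.out.println("ignorableWhitespace: [" + result[1].replace("\n", "\\n") + "]");
    }
}
```

With a parser that follows the reading above, the indentation between the `item` elements should show up as ignorable whitespace, while the spaces inside the second `item` should still arrive as ordinary character data, since `#PCDATA` is allowed there.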

>  > The XML specification clearly indicates some guidelines for handling
>  > white space in a consistent manner that saves the application developer
>  > from dealing with it, and will solve all of our problems (maybe except
>  > world hunger). Would it be reasonable to define two SAX parser layers,
>  > one before and one after the white space stripping?
> You can use the same API for both, but any whitespace stripping must
> be strictly at the application's discretion.

Where "application" is a fuzzy notion:  everything above the XML processor,
which could primarily consist of library code that doesn't want to give
such options to its callers.
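One way to picture the "two layers, same API" idea is a SAX2 filter sitting between the parser and whatever is above it. The sketch below is an illustration only: the class name `IgnorableWhitespaceFilter` and the sample document are hypothetical, and it assumes the SAX2 `XMLFilterImpl` helper class. It drops only what the parser has already flagged as ignorable and passes everything else through untouched.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class IgnorableWhitespaceFilter extends XMLFilterImpl {
    public IgnorableWhitespaceFilter() {
        super();
    }

    public IgnorableWhitespaceFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        // Deliberately not forwarded: the layer above never sees this event.
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document with an element-only content model for <list>.
        String doc =
            "<!DOCTYPE list [\n"
          + "  <!ELEMENT list (item)*>\n"
          + "  <!ELEMENT item (#PCDATA)>\n"
          + "]>\n"
          + "<list>\n  <item>a</item>\n</list>";

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // assumption: validating parser underneath
        XMLReader raw = factory.newSAXParser().getXMLReader();

        // Same XMLReader interface before and after stripping.
        XMLReader stripped = new IgnorableWhitespaceFilter(raw);
        stripped.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                System.out.println("characters: [" + new String(ch, start, length) + "]");
            }

            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                System.out.println("ignorable whitespace leaked through the filter");
            }
        });
        stripped.parse(new InputSource(new StringReader(doc)));
    }
}
```

Whether a library exposes the raw reader, the filtered one, or both to its callers is exactly the discretion in question; the filter only makes the choice explicit at one point in the stack.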

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

