xml-dev - Re: SAX: ignorable whitespace question

From: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>
To: DML Development List <xml-dev@ic.ac.uk>
Date: Mon, 03 Aug 1998 11:15:19 +0900

On Sun, 2 Aug 1998, Peter Murray-Rust wrote:

> XML (unlike HTML) does not normalise character content
> and all characters that are not markup are passed to the application.
> Ignorable whitespace is a device that SAX provides to help the application
> decide what action it may be able to take. If you are writing a SAX-based
> application you will need to understand this concept.

I think line ends are an exception: a CR+LF pair or a lone CR is always
normalized to a single LF.
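
A minimal sketch of that line-end normalization, in Python purely for
illustration. The function name normalize_line_ends is hypothetical; a real
XML processor does this internally before it reports any text.

```python
def normalize_line_ends(text: str) -> str:
    """Translate CR+LF pairs and lone CRs to a single LF."""
    # Handle the two-character sequence first, so that a CR+LF pair
    # becomes one LF rather than two.
    return text.replace("\r\n", "\n").replace("\r", "\n")


sample = "some  text\r\n\t"
normalized = normalize_line_ends(sample)
print(repr(normalized))
```

Only line ends change; the doubled space and the tab pass through untouched.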

Eric Prud'hommeaux wrote:
> In that regard, it would seem that text is handled differently from
> system identifiers and attribute values.

As for attribute values, a different normalization does apply to them.  As
for system identifiers, I do not understand your point.

> How about leading and trailing whitespace, or tags with just
> whitespace? For example, is "<tag>some  text\r\n\t</tag>" reported
> completely as characters and not split into characters("some  text")
> and ignorable("\r\n\t")? 

Right.  They are not split.
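
To make this concrete, here is a small sketch using Python's xml.sax module
with its bundled expat-based reader (an assumption of this example, not
something the thread relies on). The Recorder class and the record helper
are hypothetical names. A SAX parser may deliver one run of text in several
characters() calls, so the sketch joins the chunks before looking at them.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Record which callback each run of character data arrives through."""

    def __init__(self):
        super().__init__()
        self.chars = []      # chunks reported via characters()
        self.ignorable = []  # chunks reported via ignorableWhitespace()

    def characters(self, content):
        self.chars.append(content)

    def ignorableWhitespace(self, whitespace):
        self.ignorable.append(whitespace)


def record(document: bytes) -> Recorder:
    """Parse the document and return the handler with its recorded chunks."""
    handler = Recorder()
    xml.sax.parseString(document, handler)
    return handler


result = record(b"<tag>some  text\r\n\t</tag>")
text = "".join(result.chars)
print(repr(text), result.ignorable)
```

There is no DTD here, so nothing can be classified as ignorable: the
trailing whitespace arrives through characters() along with the rest of the
text, with the CR+LF already reduced to LF.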

> Is the whitespace in "<t1>\n  <t2/>\n</t1>"
> ignorable? 

If 1) the DTD is available, 2) the element type t1 has element content, and
3) the XML processor uses the DTD to distinguish element content from mixed
content, then the whitespace in <t1> is ignorable.
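
A sketch for trying the three conditions against a particular parser, again
in Python with xml.sax. The WhitespaceLog class and the internal DTD subset
are assumptions made for illustration. The document satisfies conditions 1
and 2; which callback the whitespace then arrives through depends on
condition 3. I would expect the bundled expat-based reader, which does not
validate, to report it all through characters(), but that is an expectation
to check rather than a guarantee.

```python
import xml.sax


class WhitespaceLog(xml.sax.ContentHandler):
    """Log every text callback together with the callback's name."""

    def __init__(self):
        super().__init__()
        self.events = []  # list of (callback name, text) pairs

    def characters(self, content):
        self.events.append(("characters", content))

    def ignorableWhitespace(self, whitespace):
        self.events.append(("ignorableWhitespace", whitespace))


# The internal DTD subset declares that t1 has element content (only t2).
DOC = (
    b"<!DOCTYPE t1 [\n"
    b"  <!ELEMENT t1 (t2)>\n"
    b"  <!ELEMENT t2 EMPTY>\n"
    b"]>\n"
    b"<t1>\n  <t2/>\n</t1>"
)

log = WhitespaceLog()
xml.sax.parseString(DOC, log)
for name, text in log.events:
    print(name, repr(text))

# Whichever callback was used, the text inside <t1> is whitespace only.
all_whitespace = "".join(text for _, text in log.events)
```

A processor that does use the DTD this way should show the same text under
ignorableWhitespace instead.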

> I also assume from the XML spec that SAX is acting in the
> role of XML processor and must translate \r's so it would really be
> characters(" some  text\n\t").

Quite.

Makoto

Fuji Xerox Information Systems

Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
