
   Re: Attribute normalisation and character entities

  • From: David Brownell <david-b@pacbell.net>
  • To: Richard Tobin <richard@cogsci.ed.ac.uk>
  • Date: Thu, 27 Jan 2000 15:00:58 -0800

Richard Tobin wrote:
> 
> How is an attribute containing a character reference to a whitespace
> character (other than space) supposed to be normalised?
> 
> Section 3.3.3 seems to me to say that character references are not
> subject to the translation to #x20 - the four bulleted points are
> an exhaustive disjunction.
>
> However the Oasis test suite, in tests sa02 and not-sa02, requires
> that they are replaced with spaces.
> 
> Which is correct?

As a data point, those output tests were originally generated using
the then-current version of XP.  I suspect Tom Passim's observation
is close:  except for CDATA, _whitespace_ should be replaced with just
one space.

As I've commented elsewhere, I find that much of the entity processing
in the XML spec seems to be specified as a collection of special cases
(updated via errata as inconsistencies turn up) rather than being based
on simple and consistent rules.  This is another place that it seems to
be happening.

There are two curious points in 3.3.3 ... first, that character and
entity refs may appear, and second that CRLF sequences may appear (line
endings already having been normalized).

How would these appear?  If we assume that 4.4 applies first, then
those OASIS cases are correct, and they'd appear "doubly escaped" as:

   <element
	char-ref-attr = "foo &#38;#9; bar"
	ent-ref-attr1 = "AT&#38;amp;T"
	ent-ref-attr2 = "AT&amp;amp;T"
	crlf-attr     = "a&#xD;&#xA;b"
	/>

If we assume that 3.3.3 has needless duplication of 4.4 then I
can't see how the literal CRLF can ever show up as input to the
normalization, since line-ends have already been normalized.

On the other hand, I don't think anyone actually writes what
ent-ref-attr2 has; "AT&amp;T" is what people write.  Perhaps 4.4 applies
first, _and_ there is needless duplication (for entity refs).
Or 3.3.3 has both duplication and several errors.
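
To make the literal reading of 3.3.3 concrete, here is a minimal Python
sketch of the four bulleted cases as Richard describes them.  It is not
taken from XP or from the OASIS suite; the function name, the "entities"
map and the "cdata" flag are hypothetical, and the tokenizer is far
looser than a real parser.  Under this reading a character reference to
a tab survives as a tab, while a literal tab becomes a space.

```python
import re

# One token per match: hex char ref, decimal char ref, entity ref, or any
# single character.  (Illustrative only; no well-formedness checking.)
_TOKEN = re.compile(
    r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([A-Za-z_:][\w.:-]*);|(.)",
    re.DOTALL,
)

# Predefined entities are mapped straight to their characters for brevity.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "apos": "'", "quot": '"'}


def normalize_attribute(raw, entities=None, cdata=True):
    """Sketch of a literal reading of XML 1.0 section 3.3.3.

    raw      -- attribute value text, line ends already normalized
    entities -- hypothetical map of internal entity name -> replacement text
    cdata    -- False applies the extra trimming for non-CDATA attributes
    """
    entities = entities or {}
    out = []
    for m in _TOKEN.finditer(raw):
        hex_ref, dec_ref, name, ch = m.groups()
        if hex_ref is not None:
            # Character reference: append the referenced character as-is.
            out.append(chr(int(hex_ref, 16)))
        elif dec_ref is not None:
            out.append(chr(int(dec_ref)))
        elif name is not None:
            if name in PREDEFINED:
                out.append(PREDEFINED[name])
            else:
                # Entity reference: recursively process the replacement text.
                out.append(normalize_attribute(entities[name], entities))
        elif ch in " \t\r\n":
            # Literal whitespace character: append #x20.
            out.append(" ")
        else:
            out.append(ch)
    value = "".join(out)
    if not cdata:
        # Non-CDATA: collapse runs of #x20 and drop leading/trailing #x20.
        value = re.sub(" +", " ", value).strip(" ")
    return value


print(repr(normalize_attribute("foo &#9; bar")))   # 'foo \t bar'
print(repr(normalize_attribute("foo \t bar")))     # 'foo   bar'
print(repr(normalize_attribute("a&#xD;&#xA;b")))   # 'a\r\nb'
```

If the test suite's expected output is right instead, the two character
reference branches would have to append a space for whitespace
characters, which is exactly the point in dispute.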

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Unsubscribe by posting to majordom@ic.ac.uk the message
unsubscribe xml-dev  (or)
unsubscribe xml-dev your-subscribed-email@your-subscribed-address

Please note: New list subscriptions now closed in preparation for transfer to OASIS.






 
