   RE: [xml-dev] attributes vs. elements


> -----Original Message-----
> From: John Cowan [mailto:jcowan@reutershealth.com] 
> Sent: Friday, September 13, 2002 17:54
> To: Alessandro Triglia
> Cc: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] attributes vs. elements
> Alessandro Triglia scripsit:
> > > From: mgushee@havenrock.com [mailto:mgushee@havenrock.com]
> > >   * White space in elements can be preserved, whereas white space in
> > >     attributes is normalized (leading and trailing spaces are stripped,
> > >     each extent of internal white space is collapsed to a single space
> > >     character).
> >
> > I don't follow you.  If the schema is specified in XML Schema and the
> > datatype of an attribute is "string" (whose "whitespace" facet is
> > "preserve" by default), I don't believe white space is normalized.
>
> Whitespace normalization depends on the DTD, if any.

If a schema is given, why would a DTD be needed?  Is there a reason why
one would want to add a DTD to an instance that is known to be
described/constrained by a schema?  (Also, what would guarantee that the
DTD and the schema are consistent with each other?)

> If an attribute is declared in the DTD as of type CDATA, or is not
> defined in the DTD, or the parser does not read the relevant part of
> the DTD, then minimal normalization of its value is done: namely, tabs,
> newlines, and CRs are changed to spaces.

I understand that this applies also when no DTD is present.  Is this
point non-controversial?  I mean, is there general agreement that
attribute values are to be normalized by XML processors even if DTDs are
not used at all?  I ask this question because it is difficult to
separate, in the XML 1.0 Recommendation, the DTD-specific prescriptions
from those regarding the basic behavior of XML processors.
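
The no-DTD case can be checked against a real parser. This is only a
minimal sketch, assuming Python's standard xml.etree.ElementTree module
(backed by the non-validating expat parser); the element name "e" and
attribute name "a" are made up for illustration. With no DOCTYPE at all,
the literal tab and newline in the attribute value should each come back
as a single space, while leading, trailing, and repeated spaces are kept:

```python
import xml.etree.ElementTree as ET

# No DOCTYPE at all, so the attribute is treated as if it were CDATA.
# The attribute value contains a literal tab and a literal newline.
doc = '<e a="  x\ty\nz  "/>'

no_dtd_value = ET.fromstring(doc).get("a")

# Tab and newline are each replaced by one space; nothing is stripped
# or collapsed.
print(repr(no_dtd_value))  # '  x y z  '
```

One parser agreeing does not settle how the Recommendation should be
read, but it is consistent with the "minimal normalization" described
above.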

> A CR can only exist in an attribute value by the use of Stupid DTD
> Tricks.
>
> If the attribute is declared in the DTD with a type other than CDATA,
> and the parser knows it, then further normalization is done: leading
> and trailing spaces are removed, and all other runs of spaces are
> collapsed to a single space.

So if there is no DTD, this further normalization never occurs.  This
means that an XML Schema "string" will have its leading, trailing, and
multiple spaces preserved.  A "preserve" value for the white-space facet
effectively means "replace" when applied to attributes.  Correct?
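
The difference between the two cases can be sketched as follows, again
assuming Python's xml.etree.ElementTree on top of expat, which reads
ATTLIST declarations in the internal DTD subset. The element "e", the
attribute "a", and the NMTOKENS declaration are hypothetical, chosen
only to trigger the non-CDATA rule:

```python
import xml.etree.ElementTree as ET

body = '<e a="  x   y  "/>'

# Internal DTD subset declaring the attribute with a tokenized type.
dtd = '<!DOCTYPE e [<!ATTLIST e a NMTOKENS #IMPLIED>]>'

# Same element, parsed once without and once with the declaration.
without_dtd = ET.fromstring(body).get("a")
with_dtd = ET.fromstring(dtd + body).get("a")

print(repr(without_dtd))  # '  x   y  '  (spaces untouched)
print(repr(with_dtd))     # 'x y'        (stripped and collapsed)
```

So without the declaration a schema processor downstream would still
see the leading, trailing, and multiple spaces.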

Alessandro Triglia

> All this is done conceptually *before* any XML Schema processing is
> done.  Nothing analogous is done to character content.  So:
>
> > an attribute of type "string" behaves differently from
> > an element of type "string" as to white space handling?
>
> Yes.
>
> -- 
> Knowledge studies others / Wisdom is self-known;      John Cowan
> Muscle masters brothers / Self-mastery is bone;       jcowan@reutershealth.com
> Content need never borrow / Ambition wanders blind;
> Vitality cleaves to the marrow / Leaving death behind.    --Tao 33
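
The attribute/element asymmetry confirmed above can be illustrated at
the parser level, before any schema processing. A minimal sketch,
assuming Python's xml.etree.ElementTree and a made-up element "e" that
carries the same characters in an attribute and in its content:

```python
import xml.etree.ElementTree as ET

# Identical characters (with a literal tab and newline) in both places.
doc = '<e a="x\ty\nz">x\ty\nz</e>'
elem = ET.fromstring(doc)

attr_value = elem.get("a")  # attribute value: normalized by the parser
text_value = elem.text      # character content: passed through as written

print(repr(attr_value))  # 'x y z'
print(repr(text_value))  # 'x\ty\nz'
```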

The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/
