OASIS Mailing List Archives

   Re: [xml-dev] Namespaces, Xml Schema Whitespace normalization, xs:anyURI

  • To: lists@jeffrafter.com, xml-dev@lists.xml.org
  • Subject: Re: [xml-dev] Namespaces, Xml Schema Whitespace normalization, xs:anyURI, and URILiterals in XPath 2.0
  • From: Michele Vivoda <idmichele@yahoo.it>
  • Date: Tue, 28 Mar 2006 18:11:06 +0200 (CEST)
  • In-reply-to: <442838A4.9070403@jeffrafter.com>

I will add some more problems.


I think that spaces in URIs, according to the RFC,
are not allowed if they are significant; they
must be escaped as %20. If they appear among
the characters of a URI they should
be ignored: they are there only to allow
the URI to be split across multiple lines.
(This part comes [historically] from the URL RFC.)

So 

http://www.example.com/Example with two  spaces

is either not a valid xs:anyURI or is no different from

http://www.example.com/Examplewithtwospaces

Perhaps the right whitespace handling would be
a hypothetical "REMOVE ALL"?
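
A minimal sketch in Python of the difference, assuming the
"collapse" behaviour that XML Schema defines for whitespace and
a hypothetical "remove all" mode. The function names collapse and
remove_all are only illustrative; neither is part of any spec or
library.

```python
import re

# Whitespace characters recognised by XML: space, tab, LF, CR.
_XML_WS_RUN = re.compile(r"[ \t\n\r]+")


def collapse(value: str) -> str:
    """XML Schema style 'collapse': squeeze each whitespace run to a
    single space, then drop leading and trailing spaces."""
    return _XML_WS_RUN.sub(" ", value).strip(" ")


def remove_all(value: str) -> str:
    """Hypothetical 'remove all': delete every whitespace character."""
    return _XML_WS_RUN.sub("", value)


uri = "http://www.example.com/Example with two  spaces"

collapsed = collapse(uri)   # inner double space becomes a single space
stripped = remove_all(uri)  # no whitespace is left at all

print(collapsed)
print(stripped)
```

With "collapse" the result still contains spaces, so it differs
from the original string only in the double space. With
"remove all" the result is the same string as the second URI above.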

In the xs:anyURI lexical value spaces are allowed but
"discouraged unless specified as %20". I think
this covers two different cases (significant and
not significant spaces) with a single rule.

There is no canonical value 
definition for xs:anyURI in the datatypes spec
to help in determining if spaces should be 
ignored or not.

This might be due to the fact that there is
no such thing as a normalized URI, at least
not for generic URIs. The only sure thing
is that irrelevant spaces should be removed,
according to the URI RFC.

For me REMOVE ALL would fix the situation, but
I am not sure about the consequences for your problem.

I also see another "datatype" where "REMOVE ALL"
whitespace handling could be good: credit card
numbers.

Regards,
Michele



--- Jeff Rafter <lists@jeffrafter.com> wrote:
> I was considering filing a bug in the bugzilla db
> for XPath 2.0 but 
> decided that I am so unsure about the issue that I
> would bring it here 
> for a little discussion (and if that subject line
> can't start a 
> permathread, I don't know what can).
> 
> In XPath 2.0 (CR) many of the namespace properties
> are defined as 
> xs:anyURI with requirements on whitespace
> normalization defined in the 
> XML Schema spec. Now, I am not sure which rules are
> being referred to 
> but I can only guess that they are the whitespace
> normalization rules in 
> the structures spec [2] because there are no rules
> for normalization in 
> the updated wording in the errata for xs:anyURI [3].
> Michael Kay's 
> corrective wording for URILiteral is:
> 
> "The URILiteral is subjected to whitespace
> normalization as defined for 
> the xs:anyURI type in [XML Schema]: this means that
> leading and trailing 
> whitespace is removed, and any other sequence of
> whitespace characters 
> is replaced by a single space (#x20) character.
> Whitespace normalization 
> is done after the expansion of CharRefs, so writing
> a newline (say) as &#xA; does not prevent its being
> normalized to a space character." [4]
> 
> Now this leads me to my larger question: is
> whitespace normalization 
> allowed for namespace declarations? If not, does
> this ruin their 
> comparability? In Namespaces in XML 1.1 (I am using
> 1.1 because it 
> contains better wording for what was already
> understood in 1.0), it 
> states that namespaces must be compared lexically
> and that the 
> comparison should take place after attribute
> normalization (so CharRefs 
> are expanded) [5]. Because of this, you may end up
> with a 
> single-normalized namespace IRI/URI and a
> double-normalized namespace 
> property in XPath 2.0. Consider the namespace name:
> 
>    xmlns:foo="http://www.example.com/Example with two  spaces"
> 
> The namespace name will be viewed as (after
> normalization):
> 
>    http://www.example.com/Example with two  spaces
> 
> While the doubly normalized property value will be
> (after XML Schema 
> whitespace normalization):
> 
>    http://www.example.com/Example with two spaces
> 
> If this is true then lexical comparison will fail.
> Is this accurate?
> 
> 
> [1]
> http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/
>    * Note, XPath 2.0 refers to the REC first edition
> not the SE
> 
> [2]
> http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/#section-White-Space-Normalization-during-Validation
> 
> [3] http://www.w3.org/2001/05/xmlschema-errata#e2-11
> 
> [4]
> http://www.w3.org/Bugs/Public/show_bug.cgi?id=2462
> 
> [5] http://www.w3.org/TR/xml-names11/#IRIComparison
> 
> 
> Thanks,
> Jeff Rafter
> 


	

	
		