Re: [xml-dev] XML schema xs:string and non BMP character like 𐌀, length restriction

FWIW, I agree with your assessment.

Pete Cordell
Codalogic Ltd
Twitter: http://twitter.com/petecordell
Interface XML to C++ the easy way using C++ XML
data binding to convert XSD schemas to C++ classes.
Visit http://codalogic.com/lmx/ or http://www.xml2cpp.com
for more info
----- Original Message ----- 
From: "Martin Honnen" <Martin.Honnen@gmx.de>
To: "xml-dev" <xml-dev@lists.xml.org>
Sent: Friday, October 12, 2012 11:16 AM
Subject: [xml-dev] XML schema xs:string and non BMP character like &#x10300;, length restriction


> Hi,
>
> I am seeing inconsistencies between different schema validating parsers 
> when it comes to Unicode characters outside of the BMP, like &#x10300; for 
> instance, and length restrictions on xs:string.
>
> For the sample http://home.arcor.de/martin.honnen/xml/oneCharInstance1.xml 
> which has the contents
>
> <?xml version="1.0" encoding="utf-8" ?>
> <root>
>   <test>&#x10300;</test>
> </root>
>
> the XSV validator and Saxon 9.4 EE don't report any validation errors when 
> validating against the schema 
> http://home.arcor.de/martin.honnen/xml/oneCharSchema1.xsd (which has as its 
> contents
>
> <?xml version="1.0" encoding="utf-8"?>
> <xs:schema attributeFormDefault="unqualified" 
> elementFormDefault="qualified" 
> xmlns:xs="http://www.w3.org/2001/XMLSchema">
>   <xs:element name="root">
>     <xs:complexType>
>       <xs:sequence>
>         <xs:element maxOccurs="unbounded" name="test" type="one-char" />
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
>   <xs:simpleType name="one-char">
>     <xs:restriction base="xs:string">
>       <xs:length value="1"/>
>     </xs:restriction>
>   </xs:simpleType>
> </xs:schema>
>
> ).
>
> However Xerces Java 2.11 reports "[Error] oneCharInstance1.xml:3:25: 
> cvc-length-valid: Value '?' with length = '2'
>  is not facet-valid with respect to length '1' for type 'one-char'." so it 
> seems to consider the contents of the "test" element as a string with two 
> characters.
>
> MSXML 6 and .NET's validating parser report similar errors.
>
> In my view Xerces, MSXML and .NET get it wrong, because in terms of the XML 
> specification and the schema data types &#x10300; is a single XML character, 
> but I would like confirmation from others on the list before filing bugs.
>
>
>
> -- 
>
> Martin Honnen --- MVP Data Platform Development
> http://msmvps.com/blogs/martin_honnen/
>
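
The "length = '2'" in the Xerces message is consistent with counting UTF-16 code units rather than XML characters, although that is only an assumption about how those validators measure length, not something confirmed in this thread. A minimal Python sketch of the two ways of counting, using hypothetical helper names chosen for illustration:

```python
# U+10300 (OLD ITALIC LETTER A) lies outside the Basic Multilingual Plane.
value = "\U00010300"


def code_point_length(s: str) -> int:
    """Length in Unicode code points, i.e. XML characters."""
    return len(s)


def utf16_code_unit_length(s: str) -> int:
    """Length in UTF-16 code units (two bytes per unit)."""
    return len(s.encode("utf-16-le")) // 2


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


print(code_point_length(value))        # 1
print(utf16_code_unit_length(value))   # 2
print([hex(u) for u in surrogate_pair(0x10300)])  # ['0xd800', '0xdf00']
```

A length facet of 1 is satisfied under the first count and violated under the second, which matches the split between XSV and Saxon on one side and Xerces, MSXML and .NET on the other.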


