Re: [xml-dev] XML schema xs:string and non BMP character like 𐌀, length restriction
- From: "Pete Cordell" <petexmldev@codalogic.com>
- To: "Martin Honnen" <Martin.Honnen@gmx.de>,"xml-dev" <xml-dev@lists.xml.org>
- Date: Fri, 12 Oct 2012 11:25:50 +0100
FWIW, I agree with your assessment.
Pete Cordell
Codalogic Ltd
Twitter: http://twitter.com/petecordell
Interface XML to C++ the easy way using C++ XML
data binding to convert XSD schemas to C++ classes.
Visit http://codalogic.com/lmx/ or http://www.xml2cpp.com
for more info
----- Original Message -----
From: "Martin Honnen" <Martin.Honnen@gmx.de>
To: "xml-dev" <xml-dev@lists.xml.org>
Sent: Friday, October 12, 2012 11:16 AM
Subject: [xml-dev] XML schema xs:string and non BMP character like 𐌀, length restriction
> Hi,
>
> I am seeing inconsistencies between different schema validating parsers
> when it comes to Unicode characters outside of the BMP, like 𐌀 for
> instance, and length restrictions on xs:string.
>
> For the sample http://home.arcor.de/martin.honnen/xml/oneCharInstance1.xml
> which has the contents
>
> <?xml version="1.0" encoding="utf-8" ?>
> <root>
>   <test>𐌀</test>
> </root>
>
> the XSV validator and Saxon 9.4 EE don't report any validation errors when
> validating against the schema
> http://home.arcor.de/martin.honnen/xml/oneCharSchema1.xsd (which has as its
> contents
>
> <?xml version="1.0" encoding="utf-8"?>
> <xs:schema attributeFormDefault="unqualified"
>            elementFormDefault="qualified"
>            xmlns:xs="http://www.w3.org/2001/XMLSchema">
>   <xs:element name="root">
>     <xs:complexType>
>       <xs:sequence>
>         <xs:element maxOccurs="unbounded" name="test" type="one-char" />
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
>   <xs:simpleType name="one-char">
>     <xs:restriction base="xs:string">
>       <xs:length value="1"/>
>     </xs:restriction>
>   </xs:simpleType>
> </xs:schema>
>
> ).
>
> However Xerces Java 2.11 reports "[Error] oneCharInstance1.xml:3:25:
> cvc-length-valid: Value '?' with length = '2'
> is not facet-valid with respect to length '1' for type 'one-char'." so it
> seems to consider the contents of the "test" element as a string with two
> characters.
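
A reported length of 2 is what one would expect if the value were measured in UTF-16 code units rather than in characters, since Java strings are sequences of UTF-16 code units. The following is only a minimal sketch of that difference, not a description of how Xerces is implemented. It assumes the sample character is U+10300 (OLD ITALIC LETTER A), and the helper names are hypothetical.

```python
# Assumption: the character in the sample instance is U+10300
# (OLD ITALIC LETTER A), which lies outside the BMP.
ch = "\U00010300"


def code_point_length(s: str) -> int:
    """Number of Unicode code points (Python's len() counts these)."""
    return len(s)


def utf16_code_units(s: str) -> list:
    """The UTF-16 code units needed to represent s."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def utf16_code_unit_length(s: str) -> int:
    """Length as seen by an API that counts UTF-16 code units."""
    return len(utf16_code_units(s))


print(code_point_length(ch))                   # counted as characters
print(utf16_code_unit_length(ch))              # counted as UTF-16 code units
print([hex(u) for u in utf16_code_units(ch)])  # the surrogate pair
```

For a BMP character such as "a" both counts agree, which is why the discrepancy only shows up with supplementary characters.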
>
> MSXML 6 and .NET's validating parser report similar errors.
>
> In my view Xerces, MSXML and .NET get it wrong: in terms of the XML
> specification and the schema data type, 𐌀 is a single XML character.
> But I would like confirmation from others on the list before filing bugs.
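
The argument rests on two points: the XML 1.0 Char production includes the range #x10000-#x10FFFF, and XML Schema Part 2 measures the length of xs:string in units of XML characters. A rough sketch of a length-facet check written that way might look as follows. The function names are hypothetical and this is not taken from any of the validators mentioned; the sample character is assumed to be U+10300.

```python
def is_xml_char(cp: int) -> bool:
    """XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def satisfies_length(value: str, required: int) -> bool:
    """Check an xs:length facet by counting XML characters (code points)."""
    for c in value:
        if not is_xml_char(ord(c)):
            raise ValueError(f"U+{ord(c):04X} is not a legal XML character")
    # Python strings are sequences of code points, so len() counts characters.
    return len(value) == required


sample = "\U00010300"  # assumed content of the <test> element
print(satisfies_length(sample, 1))
```

Note that the surrogate code points #xD800-#xDFFF are excluded from Char, so a surrogate pair is never two XML characters; it is only an encoding detail of UTF-16.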
>
>
>
> --
>
> Martin Honnen --- MVP Data Platform Development
> http://msmvps.com/blogs/martin_honnen/
>