XML.org: OASIS Mailing List Archives

Re: [xml-dev] XML Schema 1.1 xpath 2.0 regex question

Neither the XSD nor the XPath regex syntax permits \x. If Xerces accepts it, then it's a non-conformant extension. You'll probably find it works in Saxon if you use the "j" flag, which is also a non-conformant extension - it switches from using the XSD regex syntax to the Java regex syntax.

The conformant way to write this in XSD is `&#x20;` (but don't use this with the "x" flag)
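
As a minimal sketch of that suggestion, the second assertion from the schema quoted below could be rewritten with an XML character reference in place of the \x escape. The XML parser resolves the reference to a space before the XPath expression is compiled, so the regex engine only ever sees a literal space. This is an illustration only and has not been checked against a particular Xerces or Saxon release.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:complexType>
          <xs:sequence>
             <xs:element name="a" type="xs:string"/>
          </xs:sequence>
          <!-- The character reference below is resolved by the XML parser,
               so the regex passed to matches() is 'hello +world'. -->
          <xs:assert test="matches(a, 'hello&#x20;+world')"/>
       </xs:complexType>
    </xs:element>

</xs:schema>
```

The caveat about the "x" flag follows from how that flag works: it removes whitespace from the regex before matching, so the pattern above would be reduced to 'hello+world' and would no longer match 'hello   world'.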

Michael Kay
Saxonica

On 17 Dec 2021, at 11:02, Mukul Gandhi <mukulg@softwarebytes.org> wrote:

Hi all,
   I have another question on the same topic.

I have the following XML instance document:

<?xml version="1.0"?>
<X>
  <a>hello   world</a>
</X>

And the following XML Schema 1.1 document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:complexType>
          <xs:sequence>
             <xs:element name="a" type="xs:string"/>
          </xs:sequence>
          <xs:assert test="matches(a, 'hello[ ]+world')"/>
          <xs:assert test="matches(a, 'hello\x{0020}+world')"/>
       </xs:complexType>
    </xs:element>

</xs:schema>

(The XSD validation requirement is that the string value of element "a" must be the word 'hello', followed by one or more space characters, and then the word 'world'.)

The intent of both xs:assert elements is the same. The only difference is that the second xs:assert refers to the space character by its Unicode code point in hex notation, following Java's regex convention, while the first specifies the space character as a literal.

Apache Xerces has no problem with either xs:assert and reports the XML instance document as valid, whereas Saxon reports a regex syntax error in the second xs:assert ("Syntax error at char 7 in regular expression: Escape character 'x' not allowed").

With respect to the XSD validation example above, any thoughts on which validation result is correct, and on what the relevant specs say about compliance?

Is it also acceptable for Xerces to state, as an implementation-defined feature, "we support specifying characters within XSD 1.1 regular expressions using Unicode code point hex notation (\x{...})"?

I'm also curious to know whether Saxon supports specifying characters within XSD 1.1 regular expressions using Unicode code point notation.


--
Regards,
Mukul Gandhi


