Re: [xml-dev] A dandy little technique for constraining your strings to ASCII

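Strange, my earlier post seems to have been garbled one way or another.

Sorry if this is a double post.

But here is how I would do this (technically, ignoring the 'why' of the use case): define the ASCII restriction once as a named simple type, then derive the other types from it.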

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Test">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Name" type="NameType" />
                <xs:element name="Description" type="DescriptionType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    
    <xs:simpleType name="NameType">
        <xs:restriction base="ascii">
            <xs:maxLength value="10" />
        </xs:restriction>
    </xs:simpleType>
    
    <xs:simpleType name="DescriptionType">
        <xs:restriction base="ascii">
            <xs:maxLength value="20" />
        </xs:restriction>
    </xs:simpleType>
    
    <xs:simpleType name="ascii">
        <xs:restriction base="xs:string">
            <xs:pattern value="[\p{IsBasicLatin}]*" />
        </xs:restriction>
    </xs:simpleType>
    
</xs:schema>
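
To make concrete what the ascii type plus maxLength accept, here is a rough Python emulation of the two facets. It is only a sketch, not a schema validator: the helper name conforms is hypothetical, and it assumes that \p{IsBasicLatin} means the block U+0000 to U+007F and that a pattern facet has to match the whole value.

```python
import re

# Assumption: \p{IsBasicLatin} is the Unicode block U+0000-U+007F (7-bit ASCII).
ASCII_PATTERN = re.compile(r"[\x00-\x7F]*")


def conforms(value: str, max_length: int) -> bool:
    """Hypothetical helper that mimics the pattern and maxLength facets."""
    # A pattern facet must match the entire value, hence fullmatch.
    matches_pattern = ASCII_PATTERN.fullmatch(value) is not None
    # For string types, maxLength counts characters.
    return matches_pattern and len(value) <= max_length


if __name__ == "__main__":
    for candidate in ["Widget", "W\u00eddget", "A" * 11]:
        print(repr(candidate), conforms(candidate, 10))
```

With a limit of 10 (as in NameType), "Widget" passes, while a name containing a non-ASCII letter or one that is 11 characters long is rejected.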


From: "Roger L. Costello" <costello@mitre.org>
To: xml-dev@lists.xml.org
Sent: Wednesday, October 21, 2015 7:07:02 PM
Subject: [xml-dev] A dandy little technique for constraining your strings to ASCII

Hi Folks,

So, you’ve created an XML schema, and it contains a lot of elements and attributes of type string.

You want each string constrained to just ASCII characters. Use the pattern facet for that.

Here’s a dandy little technique you can use:

At the top of your schema, after the XML declaration and before the xs:schema element, place this DOCTYPE with a named entity declaration:


<!DOCTYPE xs:schema [
<!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>

 

The entity ( ASCII ) can then be referenced in each pattern facet:

 

<xs:simpleType name="NameType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="10" />
        <xs:pattern value="&ASCII;" />
    </xs:restriction>
</xs:simpleType>

<xs:simpleType name="DescriptionType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="20" />
        <xs:pattern value="&ASCII;" />
    </xs:restriction>
</xs:simpleType>

 

At parse time, the XML parser replaces each entity reference ( &ASCII; ) with its replacement text ( [\p{IsBasicLatin}]* ), so the schema processor only ever sees the expanded pattern.
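
A quick way to convince yourself of this is to parse a cut-down version of the schema and look at the attribute value the parser hands back. The sketch below uses Python's standard xml.etree.ElementTree; the trimmed schema string and the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down schema: one simple type whose pattern facet references the entity.
SCHEMA = r"""<?xml version="1.0"?>
<!DOCTYPE xs:schema [
<!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:simpleType name="NameType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="10" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>
</xs:schema>
"""

XS = "{http://www.w3.org/2001/XMLSchema}"

root = ET.fromstring(SCHEMA)
pattern = root.find(f".//{XS}pattern")

# The entity reference has already been expanded by the parser;
# the attribute holds the entity's replacement text.
expanded = pattern.get("value")
print(expanded)
```

Anything that consumes the parsed schema, a schema validator included, works with the expanded value and never sees &ASCII; itself.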

 

The entity provides useful documentation; i.e., I assert that this:

<xs:pattern value="&ASCII;" />

is more readable than this:

<xs:pattern value="[\p{IsBasicLatin}]*" />

 

Here’s a complete schema to illustrate the technique:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xs:schema [
<!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="Test">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Name" type="NameType" />
                <xs:element name="Description" type="DescriptionType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:simpleType name="NameType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="10" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="DescriptionType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="20" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>

</xs:schema>

/Roger

 

 



