Hi Folks,
So, you’ve created an XML schema. And it contains a lot of elements and attributes of type string.
You want each string constrained to just ASCII characters (U+0000 through U+007F, which is the Unicode Basic Latin block). Use the pattern facet for that.
Here’s a dandy little technique you can use:
At the top of your schema, place this named entity declaration:
<!DOCTYPE xs:schema [
    <!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>
The entity ( ASCII ) can then be referenced in each pattern facet:
<xs:simpleType name="NameType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="10" />
        <xs:pattern value="&ASCII;" />
    </xs:restriction>
</xs:simpleType>
<xs:simpleType name="DescriptionType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="20" />
        <xs:pattern value="&ASCII;" />
    </xs:restriction>
</xs:simpleType>
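At parse time, the XML parser replaces each entity reference ( &ASCII; ) with its replacement text ( [\p{IsBasicLatin}]* ), so the schema processor only ever sees the expanded pattern. Note that this relies on the parser being allowed to read the DOCTYPE; a parser configured to reject DOCTYPE declarations will not load the schema.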
The entity name also documents the intent; i.e., I assert that this:
<xs:pattern value="&ASCII;" />
is more readable than this:
<xs:pattern value="[\p{IsBasicLatin}]*" />
Here’s a complete schema to illustrate the technique:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xs:schema [
    <!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Test">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Name" type="NameType" />
                <xs:element name="Description" type="DescriptionType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:simpleType name="NameType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="10" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="DescriptionType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="20" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>
</xs:schema>
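To try the schema out, here is a small instance document. Treat it as a sketch: the element values are made up for illustration, and it assumes you validate it against the schema above with whatever validator you normally use. The document as written should validate; the variant described in the trailing comment should be rejected, because é (U+00E9) lies outside the Basic Latin block.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Expected to be valid: both values are ASCII and within the maxLength limits (10 and 20). -->
<Test>
    <Name>Roger</Name>
    <Description>ASCII text only</Description>
</Test>
<!-- Expected to be invalid: replace the Name element above with
         <Name>Jos&#xE9;</Name>
     In a live element the character reference expands to U+00E9,
     which the pattern [\p{IsBasicLatin}]* does not allow. -->
```

The elements carry no namespace because the schema declares no targetNamespace.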
/Roger