Hi Folks,
So, you are using an XSD that someone created. Later, they change the XSD. “How is the new XSD different from the old?” you wonder.
XSDs that are syntactically very similar may induce very different semantics and, vice versa, XSDs that semantically describe the same system may have rather different syntactic representations.
Thus, a list of syntactic differences, however accurate and complete, may not reveal the real implications of those differences for the correctness and potential use of the XSDs involved. In other words, such a list, although easy to follow, understand, and manipulate, may fail to expose the semantic differences between two versions of a schema in terms of the bugs that were fixed or the features (or new bugs...) that were added.
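As a small illustration of the second half of that claim, here is a sketch of two declarations that look different but accept the same content. The element names TaskA and TaskB and the type name task are hypothetical, chosen only so that both forms fit in one schema document.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Form 1: the content model is an anonymous type declared inline -->
  <xs:element name="TaskA">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="startDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Form 2: the same content model as a named type, referenced by the element -->
  <xs:complexType name="task">
    <xs:sequence>
      <xs:element name="startDate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="TaskB" type="task"/>
</xs:schema>
```

A textual diff of the two forms reports several changed lines, yet both elements require exactly one startDate child of type xs:date. Conversely, changing a single maxOccurs value is a one-attribute edit that changes which instances are valid.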
The following is a neat example of a schema, its updated version, and their semantic differences:
The XML Schema describes a company: each employee works on multiple tasks. Each task has a start date. Each employee has a manager. Each manager manages multiple employees.
Here is an XML instance that conforms to the schema. John Doe is an employee. He works on a task with start date Jan. 1, 2016. His manager is Sally Smith. Sally Smith manages John Doe.
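A sketch of that instance, reconstructed from the description and from the original schema listed later in this post. The only detail not stated above is the xs:date lexical form 2016-01-01.

```xml
<Company>
  <Employee>
    <name>John Doe</name>
    <Task>
      <startDate>2016-01-01</startDate>
    </Task>
    <!-- keyref EmployeeManager: must match the name of some Manager -->
    <managedBy>Sally Smith</managedBy>
  </Employee>
  <Manager>
    <name>Sally Smith</name>
    <manages>
      <!-- keyref ManagerEmployee: must match the name of some Employee -->
      <employee>John Doe</employee>
    </manages>
  </Manager>
</Company>
```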
A design review with a domain expert reveals two bugs in the schema: first, employees should not be assigned more than two tasks; second, managers are also employees and can handle tasks too.
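A new version of the XSD is created. The two versions share the same set of named elements but they are not identical. The new version adds an inheritance relation between Manager and Employee (the manager type extends the employee type) and sets the multiplicity on the association between Employee and Task to 0..2 (maxOccurs="2"). The employeeKey selector is also widened to cover Manager elements, since managers are now employees.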
The following XML instance conforms to the updated schema. John Doe is an employee. He works on a task with start date Jan. 1, 2016. His manager is Sally Smith. Sally Smith manages John Doe, and she is also an employee. She works on a task with start date Jan. 1, 2015. Her manager is Big Boss.
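A sketch of that instance, again reconstructed from the description and from the updated schema listed later in this post (dates are written in xs:date form). Big Boss has no element of its own here. Judging from the schema, that is permitted because the EmployeeManager keyref selects only Employee elements, so a Manager's managedBy value is not checked.

```xml
<Company>
  <Employee>
    <name>John Doe</name>
    <Task>
      <startDate>2016-01-01</startDate>
    </Task>
    <managedBy>Sally Smith</managedBy>
  </Employee>
  <Manager>
    <name>Sally Smith</name>
    <!-- Task and managedBy are inherited from the employee type -->
    <Task>
      <startDate>2015-01-01</startDate>
    </Task>
    <managedBy>Big Boss</managedBy>
    <manages>
      <employee>John Doe</employee>
    </manages>
  </Manager>
</Company>
```

Against the original schema this document should be rejected, because there a Manager may contain only name followed by manages.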
The first XML document is a valid instance of both the original schema and the updated schema. The second XML document is a valid instance of the updated schema but not of the original schema. This second XML document serves as concrete proof of a real change between the two schema versions and of its effect on the meaning of the schema. Note that the updated schema's semantics does not simply include the original schema's semantics: the original allows an employee any number of tasks, while the updated version allows at most two, so some instances are valid only against the original. When applied to the version history of a schema, such instances provide semantic insight into the schema's evolution.
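A witness can also exist in the opposite direction, valid against the original schema but not the updated one. As a sketch, the following instance gives John Doe three tasks; the second and third start dates are made up for illustration. Judging from the schemas below, it should validate against the original version, where Task has maxOccurs="unbounded", and fail against the updated version, where Task has maxOccurs="2".

```xml
<Company>
  <Employee>
    <name>John Doe</name>
    <!-- Three tasks: allowed by the original schema, one too many for the updated one -->
    <Task>
      <startDate>2016-01-01</startDate>
    </Task>
    <Task>
      <startDate>2016-02-01</startDate>
    </Task>
    <Task>
      <startDate>2016-03-01</startDate>
    </Task>
    <managedBy>Sally Smith</managedBy>
  </Employee>
  <Manager>
    <name>Sally Smith</name>
    <manages>
      <employee>John Doe</employee>
    </manages>
  </Manager>
</Company>
```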
The above description is adapted from the paper “A Manifesto for Semantic Model Differencing” (http://link.springer.com/content/pdf/10.1007%2F978-3-642-21210-9_19.pdf).
Below are the schemas described above.
Original Version
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Company">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Employee" maxOccurs="unbounded"/>
        <xs:element ref="Manager" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="employeeKey">
      <xs:selector xpath="Employee"/>
      <xs:field xpath="name"/>
    </xs:key>
    <xs:key name="managerKey">
      <xs:selector xpath="Manager"/>
      <xs:field xpath="name"/>
    </xs:key>
    <xs:keyref refer="managerKey" name="EmployeeManager">
      <xs:selector xpath="Employee"/>
      <xs:field xpath="managedBy"/>
    </xs:keyref>
    <xs:keyref refer="employeeKey" name="ManagerEmployee">
      <xs:selector xpath="Manager/manages"/>
      <xs:field xpath="employee"/>
    </xs:keyref>
  </xs:element>
  <xs:element name="Employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="Task" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="managedBy" minOccurs="0" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Manager">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element name="manages" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="employee" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="Task">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="startDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Updated Version
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Company">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Employee" maxOccurs="unbounded"/>
        <xs:element ref="Manager" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="employeeKey">
      <xs:selector xpath="Employee | Manager"/>
      <xs:field xpath="name"/>
    </xs:key>
    <xs:key name="managerKey">
      <xs:selector xpath="Manager"/>
      <xs:field xpath="name"/>
    </xs:key>
    <xs:keyref refer="managerKey" name="EmployeeManager">
      <xs:selector xpath="Employee"/>
      <xs:field xpath="managedBy"/>
    </xs:keyref>
    <xs:keyref refer="employeeKey" name="ManagerEmployee">
      <xs:selector xpath="Manager/manages"/>
      <xs:field xpath="employee"/>
    </xs:keyref>
  </xs:element>
  <xs:complexType name="employee">
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="Task" minOccurs="0" maxOccurs="2"/>
      <xs:element name="managedBy" minOccurs="0" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Employee" type="employee"/>
  <xs:complexType name="manager">
    <xs:complexContent>
      <xs:extension base="employee">
        <xs:sequence>
          <xs:element name="manages" minOccurs="0" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="employee" type="xs:string"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="Manager" type="manager"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="Task">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="startDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>