RE: [xml-dev] The <any/> element: bane of security or savior of versioning?

Hi Folks,
 
Below is an approach for creating schemas that are backward and forward
compatible without using the <any/> element.  The key to this approach
is using Schematron to validate extensions. 

First I describe the approach, then I list its advantages and
disadvantages, and then I solicit your thoughts on this approach. 

CREATING BACKWARD-FORWARD COMPATIBLE SCHEMAS WITHOUT USING THE <any/>
ELEMENT
 
The approach will be demonstrated using a Book example.  I will show
three versions of the Book schema, each version an extension of the
previous version.
 
The version #1 Book schema declares an optional, repeatable <Element>
element into which future extensions can be placed:

<element name="Book">
    <complexType>
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string"/>
            <element name="Date" type="gYear"/>
            <element name="ISBN" type="string"/>
            <element name="Publisher" type="string"/>
            <element name="Element" minOccurs="0"
                     maxOccurs="unbounded">
                <complexType>
                    <sequence>
                        <element name="Name" type="string"/>
                        <element name="Value" type="string"/>
                        <element name="Datatype" type="string"/>
                    </sequence>
                </complexType>
            </element>
        </sequence>
    </complexType>
</element> 
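
The fragment above omits the enclosing <schema> element. As a minimal
sketch, it might sit inside a wrapper like the following. The target
namespace URI is a placeholder assumed here for illustration: the
Schematron rules later in this message use a bk prefix, but the
namespace it stands for is never given.

```xml
<?xml version="1.0"?>
<!-- Sketch only: http://www.example.org/book is a placeholder namespace -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:bk="http://www.example.org/book"
        targetNamespace="http://www.example.org/book"
        elementFormDefault="qualified">

    <!-- The <element name="Book"> declaration shown above goes here -->

</schema>
```

With a wrapper like this, instances would declare the same namespace
on the root element, e.g. <Book xmlns="http://www.example.org/book">.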

The content of Book is: Title, Author, Date, ISBN, Publisher, and zero
or more Element elements.

Here's a sample XML instance:

<Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillan Publishing</Publisher>
</Book>

... Time elapses. It is decided to update the Book schema. In addition
to the title, author, date of publication, ISBN, and publisher, we also
want XML instances to give the number of pages in the book.  The first
(extension) <Element> will hold the NumPages information.  A Schematron
rule is used to validate that this is the case:

<element name="Book">
    <complexType>
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string"/>
            <element name="Date" type="gYear"/>
            <element name="ISBN" type="string"/>
            <element name="Publisher" type="string"/>
            <element name="Element" minOccurs="0"
                     maxOccurs="unbounded">
                <complexType>
                    <sequence>
                        <element name="Name" type="string"/>
                        <element name="Value" type="string"/>
                        <element name="Datatype" type="string"/>
                    </sequence>
                </complexType>
            </element>
        </sequence>
    </complexType>
</element> 
<annotation>
    <appinfo>
        <sch:pattern name="Book Extensions">
            <sch:rule context="bk:Book/bk:Element[1]">
                <sch:assert test="bk:Name='NumPages' and
                                  bk:Datatype='nonNegativeInteger'">
                    NumPages is the first extension information item
                </sch:assert>
            </sch:rule>
        </sch:pattern>
    </appinfo>
</annotation>

Here's a sample XML instance:

<Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillan Publishing</Publisher>
    <Element>
        <Name>NumPages</Name>
        <Value>345</Value>
        <Datatype>nonNegativeInteger</Datatype>
    </Element>
</Book>

This instance will validate against the version #1 schema as well as
the version #2 schema.

Further, the version #1 instance shown above will validate against this
new schema.
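
To see what the Schematron rule adds, consider an instance whose
<Element> matches the content model declared in the XML Schema but
which the rule would reject, because its first <Element> is not
NumPages. This is only a sketch: the namespace URI is a placeholder
assumed for illustration, bound so that the bk: prefix in the rule
context has something to match.

```xml
<Book xmlns="http://www.example.org/book">
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillan Publishing</Publisher>
    <!-- Name/Value/Datatype are all present, so the content model is
         satisfied, but the rule on bk:Book/bk:Element[1] expects
         Name = NumPages and Datatype = nonNegativeInteger -->
    <Element>
        <Name>Hardcover</Name>
        <Value>true</Value>
        <Datatype>boolean</Datatype>
    </Element>
</Book>
```

One thing to watch: a Schematron rule fires only on nodes that match
its context. If the instance elements are in no namespace while the
rule context uses bk:, the rule never fires and the instance passes
without being checked.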

... More time elapses. It is decided to update the Book schema again.
We want XML instances to also provide an indication of whether the Book
is hardcover. The second <Element> will hold the Hardcover information.
A second Schematron rule is added to validate that this is the case: 

<element name="Book">
    <complexType>
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string"/>
            <element name="Date" type="gYear"/>
            <element name="ISBN" type="string"/>
            <element name="Publisher" type="string"/>
            <element name="Element" minOccurs="0"
                     maxOccurs="unbounded">
                <complexType>
                    <sequence>
                        <element name="Name" type="string"/>
                        <element name="Value" type="string"/>
                        <element name="Datatype" type="string"/>
                    </sequence>
                </complexType>
            </element>
        </sequence>
    </complexType>
</element> 
<annotation>
    <appinfo>
        <sch:pattern name="Book Extensions">
            <sch:rule context="bk:Book/bk:Element[1]">
                <sch:assert test="bk:Name='NumPages' and
                                  bk:Datatype='nonNegativeInteger'">
                    NumPages is the first extension information item
                </sch:assert>
            </sch:rule>
            <sch:rule context="bk:Book/bk:Element[2]">
                <sch:assert test="bk:Name='Hardcover' and
                                  bk:Datatype='boolean'">
                    Hardcover is the second extension information item
                </sch:assert>
            </sch:rule>
        </sch:pattern>
    </appinfo>
</annotation>

Now the content of Book is: Title, Author, Date, ISBN, and Publisher,
followed by a first Element holding the NumPages information and a
second Element indicating whether the book is a hardcover.

Here's a sample XML instance:

<Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillan Publishing</Publisher>
    <Element>
        <Name>NumPages</Name>
        <Value>345</Value>
        <Datatype>nonNegativeInteger</Datatype>
    </Element>
    <Element>
        <Name>Hardcover</Name>
        <Value>true</Value>
        <Datatype>boolean</Datatype>
    </Element>
</Book>

This instance will validate against the version #1, version #2, and
version #3 schemas.

In fact, all instances will validate against all schemas. There is
backward and forward compatibility among all schema versions!

NOTES:

1. I embedded the Schematron stuff within the XML Schema document.
Alternatively, I could put the Schematron stuff in a separate document.
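
As a sketch of the second option, the version #3 rules might look like
this in a standalone Schematron document. It assumes the Schematron 1.5
namespace (which matches the name attribute used on <sch:pattern>
above) and a placeholder URI for the bk prefix; neither is stated in
the schemas shown earlier.

```xml
<?xml version="1.0"?>
<!-- Standalone Schematron 1.5 schema (sketch).
     http://www.example.org/book is a placeholder namespace. -->
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
    <sch:ns prefix="bk" uri="http://www.example.org/book"/>
    <sch:pattern name="Book Extensions">
        <!-- First extension slot: NumPages -->
        <sch:rule context="bk:Book/bk:Element[1]">
            <sch:assert test="bk:Name='NumPages' and
                              bk:Datatype='nonNegativeInteger'">
                NumPages is the first extension information item
            </sch:assert>
        </sch:rule>
        <!-- Second extension slot: Hardcover -->
        <sch:rule context="bk:Book/bk:Element[2]">
            <sch:assert test="bk:Name='Hardcover' and
                              bk:Datatype='boolean'">
                Hardcover is the second extension information item
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

Keeping the rules in their own file means the XML Schema document can
stay unchanged from version to version; only the Schematron document
grows.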

2. I specified the datatype of the extension elements in a <Datatype>
element.  Alternatively, I could use xsi:type, e.g.

    <Value xsi:type="xs:nonNegativeInteger">345</Value>
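
One detail to watch with xsi:type: the type it names must be derived
from the declared type of the element, and xs:nonNegativeInteger is not
derived from xs:string. A sketch of how this alternative might look,
assuming <Value> is redeclared with type anySimpleType, <Datatype> is
dropped, and the xs and xsi prefixes are bound in the instance as
shown:

```xml
<!-- Schema side (assumed change): no Datatype child, and Value
     accepts any simple type so that xsi:type can narrow it -->
<element name="Element" minOccurs="0" maxOccurs="unbounded">
    <complexType>
        <sequence>
            <element name="Name" type="string"/>
            <element name="Value" type="anySimpleType"/>
        </sequence>
    </complexType>
</element>

<!-- Instance side: xsi:type selects the concrete datatype -->
<Element xmlns:xs="http://www.w3.org/2001/XMLSchema"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Name>NumPages</Name>
    <Value xsi:type="xs:nonNegativeInteger">345</Value>
</Element>
```

A side effect of this variant is that the XML Schema validator itself
checks the content of <Value> against the type named by xsi:type.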

ADVANTAGES OF THIS APPROACH

Compare using the <any/> element to achieve backward-forward
compatibility with the approach described above:

(a) The <any/> element permits any element (subject to its namespace
and processContents attributes), and that element can contain anything.

(b) The approach described above constrains extensions just to the
<Element> element.  In the above example I allowed an unbounded number
of <Element> occurrences, but I could easily have put a limit on the
number of extensions by specifying a numeric value for maxOccurs.
Also, in the above example I allowed <Name>, <Value>, and <Datatype> to
hold any string, but I could easily constrain each of them as well.
Thus the extensibility is easily controlled.
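
A sketch of what such a tightened declaration might look like. The
specific limits (at most 5 extensions, 64 and 256 characters) and the
choice of NCName for <Name> are arbitrary values picked for
illustration:

```xml
<!-- Sketch: at most 5 extensions, each with bounded content -->
<element name="Element" minOccurs="0" maxOccurs="5">
    <complexType>
        <sequence>
            <!-- Name must look like an XML name, at most 64 characters -->
            <element name="Name">
                <simpleType>
                    <restriction base="NCName">
                        <maxLength value="64"/>
                    </restriction>
                </simpleType>
            </element>
            <!-- Value is free text, at most 256 characters -->
            <element name="Value">
                <simpleType>
                    <restriction base="string">
                        <maxLength value="256"/>
                    </restriction>
                </simpleType>
            </element>
            <!-- Datatype must also look like an XML name -->
            <element name="Datatype" type="NCName"/>
        </sequence>
    </complexType>
</element>
```

There is a tension with forward compatibility here: any constraint
placed on <Name>, <Value>, or <Datatype> in version #1 also binds every
future extension. An enumeration of permitted datatypes, for instance,
would cause a later extension that uses a new datatype to fail
validation against the older schema.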

Assertion: the approach being described in this message represents a
more controlled, safer approach to achieving backward-forward
compatible schemas than a strategy which uses the <any/> element.

Thus, the approach being described in this message allows the creation
of backward-forward compatible schemas that are also safe.

DISADVANTAGES OF THIS APPROACH

The approach depends on the use of both XML Schemas and Schematron to
express the constraints.  Thus a person who wishes to use this approach
must be fluent with both schema languages.  Also, it means that two
tools are needed to validate XML instances - an XML Schema validator
and a Schematron validator.

The extensions appear a bit "different".  Rather than the XML instances
appearing as

    <NumPages>345</NumPages>

they appear as

    <Element>
        <Name>NumPages</Name>
        <Value>345</Value>
        <Datatype>nonNegativeInteger</Datatype>
    </Element>

The value of the <Name> element is the element name, the value of the
<Value> element is the element value, and the value of the <Datatype>
element is the element's datatype.

I believe that this approach limits extensions to only simple values.

Rick Jelliffe: when is Schematron going to have the ability to do
datatype assertions, e.g. "The value of the <Value> element is of
datatype xs:nonNegativeInteger"?
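
For what it is worth, one way such a datatype assertion might be
written is with the XPath 2.0 "castable as" operator. The sketch below
assumes ISO Schematron and an implementation that supports the xslt2
query binding; the bk namespace URI is a placeholder.

```xml
<?xml version="1.0"?>
<!-- Sketch: ISO Schematron with an XPath 2.0 query binding.
     http://www.example.org/book is a placeholder namespace. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
    <sch:ns prefix="bk" uri="http://www.example.org/book"/>
    <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>
    <sch:pattern id="extension-datatypes">
        <!-- Extensions that claim to be non-negative integers -->
        <sch:rule context="bk:Element[bk:Datatype='nonNegativeInteger']">
            <sch:assert test="bk:Value castable as xs:nonNegativeInteger">
                The Value of a nonNegativeInteger extension must be a
                non-negative integer
            </sch:assert>
        </sch:rule>
        <!-- Extensions that claim to be booleans -->
        <sch:rule context="bk:Element[bk:Datatype='boolean']">
            <sch:assert test="bk:Value castable as xs:boolean">
                The Value of a boolean extension must be a boolean
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

Each rule selects extensions by their declared <Datatype> and then
checks that the text of <Value> could be cast to that type, so a value
such as "abc" for NumPages would be reported.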

QUESTION

What other advantages and disadvantages do you see for the above
approach?

/Roger

