RE: [xml-dev] Comparison of Xml documents


The original problem is the comparison of two XML documents of the same
class, where the class is defined by a schema.
The question is: can we compare two documents by using the schema that
defines their class?

In particular, suppose two instance documents belong to the same class and
contain the same elements, only in a different order. Does the schema give
us enough information to determine whether the two instance documents are
the same?

That is why I mentioned three kinds of ordering:

1) order in a document according to tag names (saying nothing about tags
that share the same name)
2) order in a document according to some feature of tags with the same name
These two pieces of information contribute to the definition of a class of
documents (but are they enough to compare documents on the ordering
criterion?).

3) order in the instances themselves. Let me explain:
Suppose two tags each contain only one type of sub-tag, the schema specifies
no ordering of type 1 or 2 for them, and both tags contain the same sub-tags
(in the same order or not). How can we specify comparison information that
lets us determine whether the two tags are the same?
Example:
  A: <e1><e2 att="1"/><e2 att="2"/></e1>
  B: <e1><e2 att="2"/><e2 att="1"/></e1>
The e2 tags are not ordered, because the order information in the schema is
"not ordered", but are A and B the same in this case?
The person who designed the class of documents should know.
It depends on a semantic feature that cannot be expressed in a schema with
the existing specs (XSD, RELAX NG...):
as far as I know, no schema spec allows this (there is no "comparison" spec?).
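
To make the two possible answers for A and B concrete, here is a minimal
sketch in Python using the standard xml.etree.ElementTree module. The
function names canonical and same are hypothetical, and the "ordered" flag
stands in for the comparison information that the schema cannot carry; it is
an assumption of this sketch, not a feature of any schema language. Tail
text and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET

def canonical(elem, ordered):
    """Build a hashable, comparable form of an element (hypothetical helper)."""
    children = [canonical(child, ordered) for child in elem]
    if not ordered:
        # Discard document order among siblings by sorting their canonical forms.
        children.sort()
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),  # attribute order never matters in XML
        (elem.text or "").strip(),
        tuple(children),
    )

def same(xml_a, xml_b, ordered):
    """Compare two documents, treating sibling order as significant or not."""
    return canonical(ET.fromstring(xml_a), ordered) == canonical(ET.fromstring(xml_b), ordered)

A = '<e1><e2 att="1"/><e2 att="2"/></e1>'
B = '<e1><e2 att="2"/><e2 att="1"/></e1>'

print(same(A, B, ordered=True))   # False: sibling order is significant
print(same(A, B, ordered=False))  # True: sibling order is ignored
```

Which of the two results is the right one is exactly the decision that only
the designer of the class of documents can make.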

Conclusion:
When defining a class of XML documents, it would be nice to be able to give
"comparison" information (are two instances of the class the same?)
alongside the "validation" information (is the instance in the class?),
which is sometimes not sufficient for comparison.

-----Original Message-----
From: Bob Wyman [mailto:bob@wyman.us]
Sent: Thursday, 20 November 2003 18:48
To: 'Alaric B Snell'; GARNIER Pierre
Cc: xml-dev@lists.xml.org
Subject: RE: [xml-dev] Comparison of Xml documents


Alaric B Snell wrote:
> in ASN.1 you can specify if order is important within a
> compound object - if order matters it's a SEQUENCE, 
> otherwise it's a SET :-)
    Alaric, you may be overstating the case for ASN.1 here. There are two
kinds of ordering discussed in Pierre Garnier's message. I think ASN.1 only
handles one of them, which is clearly the one you mentioned; however, some
folk might think you also meant the other.
    In ASN.1 you can specify that things are either a SEQUENCE or a SET.
Elements within a SEQUENCE are ordered while elements within a SET are not.
Thus, given:

    ordered ::= SEQUENCE {
       part1 INTEGER,
       part2 INTEGER
    }

   Your XML encoding could be:
    <ordered>
      <part1>1111</part1>
      <part2>2222</part2>
    </ordered>

   But, the following would be illegal since part2 comes before part1:
    <ordered>
      <part2>2222</part2>
      <part1>1111</part1>
    </ordered>

    If you want to be able to provide part1 and part2 in either order, you
would define a SET instead. You would write:
    ordered ::= SET {
       part1 INTEGER,
       part2 INTEGER
    }
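
    The difference between the two definitions can be sketched as a check
on the order of child element names. This is only a toy model of the rule
described above, written in Python with the standard xml.etree.ElementTree
module; it is not an ASN.1 or XER decoder, and the names FIELDS,
valid_as_sequence and valid_as_set are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Component names taken from the ASN.1 definitions above.
FIELDS = ["part1", "part2"]

def child_names(xml_text):
    """Return the tag names of the root element's children, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]

def valid_as_sequence(xml_text):
    # SEQUENCE: components must appear in the declared order.
    return child_names(xml_text) == FIELDS

def valid_as_set(xml_text):
    # SET: each component appears once, in any order.
    return sorted(child_names(xml_text)) == sorted(FIELDS)

in_order = "<ordered><part1>1111</part1><part2>2222</part2></ordered>"
swapped = "<ordered><part2>2222</part2><part1>1111</part1></ordered>"

print(valid_as_sequence(in_order), valid_as_sequence(swapped))  # True False
print(valid_as_set(in_order), valid_as_set(swapped))            # True True
```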

    But this is only one kind of ordering mentioned in Pierre's message.
This is ordering of the structural elements of a message. I think he also
wants to have elements ordered according to their values. Thus, if you
allowed multiple "part1"s to appear in a SEQUENCE or SET, the question is
which should come first? Pierre seems to indicate that he wants the order
determined by the value of instances of the elements -- not by their types.
i.e. he wants to permit:
    <ordered>
      <part>111</part>
      <part>222</part>
    </ordered>

   while prohibiting:
    <ordered>
      <part>222</part>
      <part>111</part>
    </ordered>
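
    A value-based rule like this one can at least be checked outside the
schema. The sketch below is one possible reading of it, in Python with the
standard xml.etree.ElementTree module. It assumes the values are integers
and that "ordered" means non-decreasing; both are assumptions made for the
illustration, and parts_in_value_order is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def parts_in_value_order(xml_text):
    """Check that the <part> children appear in non-decreasing integer order."""
    values = [int(part.text) for part in ET.fromstring(xml_text).findall("part")]
    # Compare each value with its successor.
    return all(a <= b for a, b in zip(values, values[1:]))

permitted = "<ordered><part>111</part><part>222</part></ordered>"
prohibited = "<ordered><part>222</part><part>111</part></ordered>"

print(parts_in_value_order(permitted))   # True
print(parts_in_value_order(prohibited))  # False
```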

    I don't think ASN.1 supports this kind of constraint in the standard
syntax. Or, have I missed something in the spec?

		bob wyman





 


Copyright 2001 XML.org. This site is hosted by OASIS