xml-dev - Let the publisher validate the xml and the make a msg digest


To: xml-dev@lists.xml.org
Subject: Let the publisher validate the xml and the make a msg digest
From: Niels Peter Strandberg <nielspeter@npstrandberg.com>
Date: Mon, 10 Mar 2003 19:14:50 +0100

When an XML document is authored, the author can attach an XML schema or 
DTD reference to it. The receiver of the document then validates it 
against the referenced schema or DTD to verify that it is valid.

The XML document might be used over and over again without any changes 
being made to it, and it might even be validated every time. This is a 
waste of time!

Let the author do the validation of the finished XML document. If the 
document validates successfully against the referenced XML schema or 
DTD, why should the receiver need to check it again to see if it is 
valid, when the author has already tested it?

My suggestion is that after the document has been validated by the 
author, a message digest is created, similar to the ones used in 
cryptography, and the digest value is appended to the XML document.

All the receiver has to do is run the XML document through the same 
message digest algorithm and compare the two values. If they are equal, 
nothing in the document has changed since the author made the digest, 
so there is no need to validate again.
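
As a rough illustration of the two sides, here is a minimal Python sketch. 
The function names are made up for this example, and MD5 is assumed only 
because the digest shown later in this post is 32 hex digits long; any 
digest algorithm that both sides agree on would do.

```python
import hashlib


def make_digest(xml_bytes: bytes) -> str:
    """Return the hex MD5 digest of the raw document bytes."""
    return hashlib.md5(xml_bytes).hexdigest()


def is_unchanged(xml_bytes: bytes, published_digest: str) -> bool:
    """Receiver side: recompute the digest and compare it with the published one."""
    return make_digest(xml_bytes) == published_digest


# Author side: this would run only after the document has validated.
document = b"<Family><Person><Name>Fred Flintstone</Name></Person></Family>"
published = make_digest(document)

# Receiver side: an untouched copy matches, a modified copy does not.
print(is_unchanged(document, published))
print(is_unchanged(document.replace(b"Fred", b"Barney"), published))
```

The first call prints True and the second prints False, because changing 
even one byte of the document changes the digest.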

So this gives you not only confirmation that the document is valid, 
but also that its content has not changed.

This also allows DOM builders (if they are changed) to skip verifying 
that the data they receive from the SAX reader consists of legal XML 
characters, is well-formed, etc., since those checks also add a lot of 
overhead. Just look at JDOM when it builds a JDOM document.

Example:
           <?xml version="1.0"?>
           <Family>
                     <Person>
                               <Name>Fred Flintstone</Name>
                     </Person>
                     <Person>
                               <Name>Vilma Flintstone</Name>
                     </Person>
           </Family>

When I run this through openssl to make a message digest, with the 
command:  "openssl dgst flintstone.xml"
it returns a digest: "b99060bb744edd6aac5193da6957afcb" (the problem 
with this digest is that white space is also included, so re-indenting 
the file changes the digest even though the content is the same!)
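
One possible way around the white space problem is to canonicalize the 
document before hashing it. The sketch below uses the canonicalize() 
function from Python's xml.etree.ElementTree (Python 3.8 or later); the 
helper name canonical_digest is made up for this example, and stripping 
text white space is an assumption that only suits documents where that 
white space carries no meaning.

```python
import hashlib
from xml.etree.ElementTree import canonicalize


def canonical_digest(xml_text: str) -> str:
    """Digest the canonical (C14N 2.0) form instead of the raw bytes."""
    # strip_text=True removes white space before and after text content,
    # so indentation between elements no longer affects the digest.
    # Note that it also trims white space inside elements like <Name>.
    canonical = canonicalize(xml_text, strip_text=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


indented = """<?xml version="1.0"?>
<Family>
    <Person>
        <Name>Fred Flintstone</Name>
    </Person>
</Family>"""
compact = "<Family><Person><Name>Fred Flintstone</Name></Person></Family>"

# Same content, different layout: the canonical digests are equal.
print(canonical_digest(indented) == canonical_digest(compact))
```

The price is that the receiver has to parse the document to canonicalize 
it, which eats into the time saved by skipping validation.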

Then we can do something like this:

           <?xml version="1.0"?>
           <?digest value="b99060bb744edd6aac5193da6957afcb"?> <!-- or whatever -->
           <Family>
                     <Person>
                               <Name>Fred Flintstone</Name>
                     </Person>
                     <Person>
                               <Name>Vilma Flintstone</Name>
                     </Person>
           </Family>

The receiver can then read and remove the digest, and then verify it 
by running the same message digest command shown above on the 
remaining document.
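
The read, remove, and verify steps could look roughly like this in 
Python. This is only a sketch: the processing instruction form 
<?digest value="..."?> and the function names attach_digest and 
verify_and_strip are assumptions made for the example, not an existing 
convention, and the digest is taken over the raw bytes of the document 
without the digest line.

```python
import hashlib
import re

# Hypothetical processing instruction that carries the digest.
DIGEST_PI = re.compile(rb'<\?digest value="([0-9a-f]+)"\?>\n?')


def attach_digest(xml_bytes: bytes) -> bytes:
    """Author side: insert the digest PI right after the XML declaration."""
    digest = hashlib.md5(xml_bytes).hexdigest().encode("ascii")
    pi = b'<?digest value="' + digest + b'"?>\n'
    end = 0
    if xml_bytes.startswith(b"<?xml "):
        end = xml_bytes.index(b"?>") + 2
        if xml_bytes[end:end + 1] == b"\n":
            end += 1
    return xml_bytes[:end] + pi + xml_bytes[end:]


def verify_and_strip(received: bytes):
    """Receiver side: return the original bytes if the digest matches, else None."""
    match = DIGEST_PI.search(received)
    if match is None:
        return None  # no digest present: fall back to normal validation
    original = received[:match.start()] + received[match.end():]
    actual = hashlib.md5(original).hexdigest().encode("ascii")
    return original if actual == match.group(1) else None


doc = (b'<?xml version="1.0"?>\n<Family>\n'
       b'  <Person><Name>Fred Flintstone</Name></Person>\n</Family>\n')
stamped = attach_digest(doc)

print(verify_and_strip(stamped) == doc)                       # untouched copy
print(verify_and_strip(stamped.replace(b"Fred", b"Barney")))  # modified copy
```

The untouched copy comes back byte for byte identical to the original, 
while the modified copy yields None, so the receiver knows it has to 
validate that one the usual way.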

It could be interesting to do some benchmarking on this.

These are just some thoughts!


Regards, Niels Peter Strandberg
