xml-dev - Re: [xml-dev] md5sum / sha1sum for XML?

To: davep@dpawson.co.uk
Subject: Re: [xml-dev] md5sum / sha1sum for XML?
From: Richard Salz <rsalz@us.ibm.com>
Date: Fri, 14 Jul 2006 00:17:50 -0400
Cc: xml-dev@lists.xml.org
In-reply-to: <1152809419.5850.25.camel@marge>

If you're sending XML from one place to another and it may be 'touched' by
one or more XML processors, then you can't use md5sum, because the XML may
be parsed and rewritten in a way that is semantically equivalent but
syntactically different, so the MD5 digest no longer matches.  E.g.,
attributes may be written out in a different order, or an empty element
can be written as either <foo></foo> or <foo/>.
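
To make the problem concrete, here is a minimal sketch in Python. The two
documents are made up for illustration (they are not from the original
question); they differ only in attribute order and in how the empty element
is spelled, yet hashing the raw bytes gives two different MD5 values.

```python
import hashlib

# Two hypothetical documents with the same meaning but different bytes:
# attribute order differs and the empty element is spelled differently.
doc_a = b'<doc b="2" a="1"><foo></foo></doc>'
doc_b = b'<doc a="1" b="2"><foo/></doc>'

digest_a = hashlib.md5(doc_a).hexdigest()
digest_b = hashlib.md5(doc_b).hexdigest()

print(digest_a)
print(digest_b)
print("match:", digest_a == digest_b)  # the raw-byte digests differ
```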

In order to deal with this, you should canonicalize the XML and then hash 
that bytestream; this will give an identical digest value, even in the 
face of those changes.  For the hashing, use SHA1.  For the 
canonicalization use exclusive c14n, as it is more robust when your XML is 
transported inside other XML (e.g., it becomes the body of a SOAP 
message).  If you are always generating the XML, you might be able to make 
some simplifying assumptions and come up with a simpler c14n mechanism; I 
strongly suggest you avoid the temptation to do that.  If, in fact, you 
use exc-c14n/sha1, you can probably leverage a large pool of bundled 
and/or open source code, because those mechanisms are used in WS-Security 
for generating a digital signature of a SOAP message; in essence you are 
generating a <dsig:Reference> element of a standard XML digital signature, 
as defined by W3C/IETF.
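
A rough sketch of the canonicalize-then-hash idea, under some loud
assumptions: it uses Python's standard library, whose
xml.etree.ElementTree.canonicalize() (Python 3.8+) implements C14N 2.0 and
NOT the exclusive c14n recommended above, so the digests it produces will
not in general match what an exc-c14n implementation computes. It is only
meant to show the shape of the approach; for real interoperability you
would want a library that implements exclusive c14n. The helper name
xml_digest and the sample documents are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def xml_digest(xml_text: str) -> str:
    """Canonicalize the XML text, then return the SHA-1 hex digest.

    Assumption: stdlib C14N 2.0 stands in for exclusive c14n here.
    """
    canonical = ET.canonicalize(xml_data=xml_text)  # returns a str
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


# Same meaning, different syntax (attribute order, empty-element form).
doc_a = '<doc b="2" a="1"><foo></foo></doc>'
doc_b = '<doc a="1" b="2"><foo/></doc>'

# Both documents canonicalize to the same byte stream, so the digests agree.
print(xml_digest(doc_a) == xml_digest(doc_b))
```

Note that whitespace between elements is significant to canonicalization
by default, so two documents that differ only in indentation would still
hash differently in this sketch.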

The second question is: how do you "protect" the digest value?  Are you
concerned about tampering along the way?  How do you currently protect 
your md5sum values?  It may be enough to generate the XML digest and 
send/store it the same way you do your md5sum value.  Or you might need to 
go whole hog and use an XML signature.

Hope this helps.

        /r$

--
SOA Appliances
Application Integration Middleware
