Re: Attribute-Value Normalization problem

Sounds like a job for Canonical XML, which is designed expressly for digital
signature verification as part of XML Signature.  Canonical XML is needed so
that a hash of the document stays identical across the transformations XML
1.0 allows, which change a document's format but not its content.  It is
similar to the Infoset, but addresses the special requirements of hash
comparison.

See http://www.w3.org/TR/xml-c14n and http://www.w3.org/Signature/
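
As a rough sketch of how this helps with the write/re-read cycle, the Python
snippet below canonicalizes the sample document twice and hashes each result.
It assumes Python 3.8+, whose xml.etree.ElementTree.canonicalize() implements
C14N 2.0 rather than the 1.0 draft linked above; both write a TAB inside an
attribute value as a character reference.  SHA-256 stands in for DOM-Hash, and
canonical_digest is a made-up helper name.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_digest(xml_text: str) -> str:
    """Hash the canonical form of xml_text (SHA-256 is an assumed stand-in)."""
    canonical = ET.canonicalize(xml_text)  # returns a str when no `out` is given
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = '<test Attr="&#x09;">some text</test>'

# Cycle 1: parse the original and write it out in canonical form.
cycle1 = ET.canonicalize(original)

# Cycle 2: parse what cycle 1 wrote and canonicalize it again.
cycle2 = ET.canonicalize(cycle1)

# The TAB is serialized as a character reference, so it survives re-parsing.
print(cycle1 == cycle2)
print(canonical_digest(original) == canonical_digest(cycle1))
```

Because the canonical serializer never writes a literal TAB into an attribute
value, the second parse has nothing to normalize to a SPACE, and the digests
should agree from one cycle to the next.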

Charles Reitzel

>Date: Fri, 26 Jan 2001 12:19:27 -0500 (EST)
>From: Jianjun Zhang <jiazhang@eecs.umich.edu>
>Subject: Attribute-Value Normalization problem
>I have encountered a problem regarding the Attribute-Value Normalization.
>I have the following XML (as an example):
><test Attr="&#x09;">some text</test>
>I need to construct a DOM from it and then write it back to a
>file repeatedly. During each cycle, I would generate a DOM-Hash Digest of
>the document and compare the new digest with the digest from the last
>cycle (to make sure that the document is not changed). 
>The Attribute-Value Normalization specification (as in XML Spec 1.0
>Section 3.3.3) treats a character reference differently from other entity
>references (not recursively processed), which gives me much grief. 
>The first time I process the document, &#x09; is replaced by a TAB
>character. After I generate digest, I write it back to a file.  However,
>the second time I process it again from the file I just wrote, the
>TAB character is replaced by a SPACE character. The new digest based on
>this DOM no longer matches the old one, though there are no actual changes
>to the file.
>Is there any easy way (without always processing twice before trusting the
>results) to circumvent this? My further question is: Why does the spec
>treat character references differently? Why can't we also recursively
>normalize character references?
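
The asymmetry described in the quoted message can be reproduced with a short
Python sketch.  It assumes the standard-library ElementTree parser (expat),
which applies XML 1.0 section 3.3.3 attribute-value normalization: a character
reference contributes the referenced character as-is, while a literal TAB, CR
or LF in the source is replaced by a SPACE.  The second string is written by
hand to mimic a serializer that emits the TAB unescaped; the variable names
are illustrative only.

```python
import xml.etree.ElementTree as ET

# The document as originally authored: TAB given as a character reference.
with_reference = '<test Attr="&#x09;">some text</test>'

# What a naive serializer would write back: a literal TAB in the attribute.
with_literal_tab = '<test Attr="\t">some text</test>'

# First cycle: the reference is resolved to a real TAB and kept.
first_pass = ET.fromstring(with_reference).get("Attr")

# Second cycle: the literal TAB is normalized to a SPACE by the parser.
second_pass = ET.fromstring(with_literal_tab).get("Attr")

print(repr(first_pass), repr(second_pass))
```

The character reference is the escape hatch the spec provides for keeping
literal whitespace in an attribute value, so a serializer that wants stable
round trips has to write the TAB back as a reference.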