OASIS Mailing List Archives





   Re: [xml-dev] canonicalization


At 12:49 PM -0500 3/4/02, Simon St.Laurent wrote:

>I'm sorry, Elliotte, but I think we've reached the basis of our
>confusion.  I see documents as actually having content, not as a
>framework for performing potentially randomly-sequenced infoset
>transformations in order to figure out what the content might/should be.

If the infoset terminology is throwing you, I can go back to documents.

>The old distinction between logical and entity models made it possible
>to distinguish the content of a document (entity model) from its content
>- the logical model - while still permitting some shortcuts on the
>entity side. 

Try not to confuse entity processing with XIncludes. They are two 
different things. XInclude does not separate the logical and physical 
views of a document. XInclude does not define new logical or physical 
models for existing documents. All it does is define a process that 
goes from one document to another document.

>You appear to be defining XML processing as a set of algorithms
>performed upon infosets, each of which is a unique abstraction.  While I
>don't mind work at that level in cases where we already know what the
>document contains, I have to oppose it as a technique for determining
>the actual contents of the document.

I agree. This is not a technique for determining the actual 
contents of the document. These processes can only be applied *after* 
you know the actual contents of a document. It is a technique for 
going from one actual document to another actual document. Each 
document is complete unto itself. XInclude does not change the actual 
contents of a document. The key sentence in the XInclude spec is, 
"Processing of external entities (as with the rest of DTDs) occurs at 
parse time. XInclude operates on information sets and thus is 
orthogonal to parsing." Before you can get an infoset, you have to 
parse. The parser does not and should not resolve XIncludes. This is 
a separate operation.
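
To make the two steps concrete, here is a minimal sketch using Python's xml.etree modules. Nothing in it comes from the XInclude spec itself: the wrapper element, the file name, and the canned loader are all made up for illustration, and ElementInclude is only one possible processor. It parses first, then runs inclusion as an explicit second step on a copy of the parsed tree.

```python
import copy
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical sample document; "something.xml" is never actually fetched.
SOURCE = (
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="something.xml"/>'
    '</doc>'
)

def fake_loader(href, parse, encoding=None):
    # Hypothetical loader: returns canned content instead of reading href.
    if parse == "xml":
        return ET.fromstring("<included>hello</included>")
    return "hello"

# Step 1: parse. The parser reports the xi:include element as-is.
root = ET.fromstring(SOURCE)
children_after_parse = [child.tag for child in root]

# Step 2: a separate, explicit operation produces a second document.
# ElementInclude.include modifies its argument in place, so work on a copy.
expanded = copy.deepcopy(root)
ElementInclude.include(expanded, loader=fake_loader)

print(children_after_parse)
print([child.tag for child in expanded])
print([child.tag for child in root])
```

The parsed tree still holds the xi:include element after the second step has run on the copy; only the new tree holds the included content.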

>In ways I find important, XML works because I can say "show me the
>bytes" and actually do something with the bytes, not with potentially
>unknown understanding of what infoset those bytes are really talking
>about.  It used to be that in a worst case I could say "show me the
>canonicalized bytes", but XInclude appears to take away that option.

XInclude does not take away that option. It does not change the bytes 
in an XML document. It allows you to produce a new sequence of bytes, 
by applying certain rules, just as XSLT or a SAX filter does. But you 
can still say "show me the bytes" of the original document, in which 
case some of those bytes may spell out < x i n c l u d e : i n c l u 
d e  h r e f = ...

Consider this very simple document with a single empty root element:

<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
            href="http://www.example.com/something.xml"/>

The infoset for this document contains a single xi:include element 
information item, regardless of what's at that URL. 
Changing the document at http://www.example.com/something.xml does 
not change this document's infoset. It does not change this 
document's bytes. It does not change this document's canonical form.
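
A rough way to check this claim, again sketched with Python's standard library (ET.canonicalize needs Python 3.8 or later and implements C14N 2.0, which is an assumption on my part about which canonical form to use). The dict standing in for the web, the loader, and the wrapper element are hypothetical. The wrapper is there only because ElementInclude looks at child elements, not at the root.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

URL = "http://www.example.com/something.xml"
SOURCE = (
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
    f'<xi:include href="{URL}"/>'
    '</doc>'
)

# Stand-in for the web: whatever is "currently" published at the URL.
remote = {URL: "<greeting>hello</greeting>"}

def loader(href, parse, encoding=None):
    # Hypothetical loader that reads from the dict, not the network.
    return ET.fromstring(remote[href]) if parse == "xml" else remote[href]

def expand(source):
    # Parse, resolve includes as a separate step, canonicalize the result.
    tree = ET.fromstring(source)
    ElementInclude.include(tree, loader=loader)
    return ET.canonicalize(ET.tostring(tree, encoding="unicode"))

canonical_before = ET.canonicalize(SOURCE)
expanded_before = expand(SOURCE)

remote[URL] = "<greeting>goodbye</greeting>"  # the remote document changes

canonical_after = ET.canonicalize(SOURCE)
expanded_after = expand(SOURCE)

print(canonical_before == canonical_after)  # canonical form of the source
print(expanded_before == expanded_after)    # canonical form of the expansion
```

The canonical form of the source depends only on the source's own bytes. Only the second document, the one produced by running inclusion, depends on what the URL serves.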


| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|             http://www.cafeconleche.org/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |


