OASIS Mailing List Archives

   Re: What is an XML Document? [Was: Re: [xml-dev] canonicalization]

At 1:56 PM -0500 3/5/02, Daniel Veillard wrote:

>   Well that sequence of bytes may actually become a set of sequences
>as soon as one is dealing with external entities.

A good point. The way the spec is written, though, I think it's 
consistent to claim that the document is only the byte/character 
sequence that references the external entities. It does not actually 
include the merged text of the entities. The spec also states that:

[Definition: A textual object is a well-formed XML document if:]

1. Taken as a whole, it matches the production labeled document.

2. It meets all the well-formedness constraints given in this specification.

3. Each of the parsed entities which is referenced directly or 
indirectly within the document is well-formed.

Point 3 in particular indicates that the entities are not part of the 
document, even though the parser may treat them as if they were.
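
To make the distinction concrete, here is a minimal sketch using Python's standard expat binding. The document, the entity name "chap", and the file name "chap1.xml" are all hypothetical. The parser reports the external reference, but the replacement text never appears in the document's own byte sequence, and nothing is merged unless the application chooses to load it.

```python
import xml.parsers.expat

# Hypothetical document: the entity "chap" and the file "chap1.xml" are
# invented for illustration and do not need to exist on disk.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE book [
  <!ENTITY chap SYSTEM "chap1.xml">
]>
<book>&chap;</book>"""


def scan(doc):
    """Parse doc; return (system IDs of external entity references, character data)."""
    refs, text = [], []
    parser = xml.parsers.expat.ParserCreate()

    def on_external_entity(context, base, system_id, public_id):
        # Record the reference without loading the entity.
        refs.append(system_id)
        return 1  # nonzero tells expat to continue parsing

    parser.ExternalEntityRefHandler = on_external_entity
    parser.CharacterDataHandler = text.append
    parser.Parse(doc, True)
    return refs, "".join(text)


if __name__ == "__main__":
    refs, text = scan(DOC)
    print("external references:", refs)
    print("character data seen:", repr(text))
```

Whether a given parser then fetches and parses "chap1.xml" is a separate step, which fits the reading that point 3 places a requirement on the entities rather than making them part of the document.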

>   Still the Jabber case is an interesting example in my opinion because
>they stretch the usual principle of keeping instances "atomic" and instead
>agree to work on a long lived "never ending" document. And in such use
>case entities don't work (because there isn't even a DOCTYPE at the
>start of the connection), while XInclude does (assuming the parser handles
>them of course). It's interesting to see various specifications taken from
>a Jabber viewpoint; a lot of them actually require a full document
>instance and won't work directly in such a context.

Another good point. However, the BNF grammar and well-formedness 
constraints make it clear that an infinite sequence cannot possibly 
be a well-formed XML document. Thus my definition of data object 
should be revised to say "either a finite sequence of bytes or a 
finite sequence of Unicode characters". I don't know if a Jabber 
document is truly infinite or just indefinitely large. (Looking at 
the spec I think it's just indefinite.)
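
A rough sketch of the Jabber-style situation, assuming a simplified stream whose root element is called "stream" and whose children are called "message" (both names are placeholders, not taken from the Jabber spec). An incremental parser can report each completed child while the root is still open, yet any finite prefix cut off before the root's end tag fails the well-formedness check.

```python
import xml.etree.ElementTree as ET


def stream_events(chunks):
    """Feed chunks incrementally; return the tags of elements as they complete."""
    parser = ET.XMLPullParser(events=("end",))
    tags = []
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            tags.append(elem.tag)
    return tags


def is_well_formed(text):
    """True if text parses as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical stream: the root start tag arrives once, children keep coming,
# and the root end tag has not been sent yet.
CHUNKS = ["<stream>", "<message>one</message>", "<message>two</message>"]

if __name__ == "__main__":
    print(stream_events(CHUNKS))                    # completed children so far
    print(is_well_formed("".join(CHUNKS)))          # root still open
    print(is_well_formed("".join(CHUNKS) + "</stream>"))
```

On this view the stream only becomes a document once the closing tag is sent, which is the "indefinite, not infinite" case.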

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|             http://www.cafeconleche.org/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
