Re: XInclude vs SAX vs validation
- From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- To: John Cowan <jcowan@reutershealth.com>
- Date: Wed, 22 Aug 2001 13:35:23 -0400
At 2:06 PM -0400 8/21/01, John Cowan wrote:
>I think this is a Bad Thing (comments should be discarded with the
>rest of the lexical cruft). In any event, citing the Infoset is
>no justification. Remember that conformance to the Infoset means
>citing what parts of it you use and what parts you don't.
>XInclude is silent on this point, which means that it transmits
>just so much of the Infoset as the underlying infoset creators
>provided.
>
But in this case the underlying infoset is created from an XML document, and according to the infoset spec "There is a comment information item for each XML comment in the original document, except for those appearing in the DTD (which are not represented)."
This doesn't seem to leave a lot of room for interpretation. Each namespace well-formed XML document has exactly one infoset. If a parser throws away pieces of that infoset, whether arbitrarily or deliberately, it may be producing legal infosets and it may be behaving in accord with the Infoset spec, but it is not producing the infoset for the document in question. It is producing some modified infoset.
Furthermore, section 4 of the XInclude spec states "The input for the inclusion transformation consists of a source infoset. The output, called the result infoset, is a new infoset which merges the source infoset with the infosets of resources identified by URI references appearing in xi:include elements." In other words, at least for the included documents (if perhaps not for the including document) it is *not* acceptable to use any old infoset. You have to use the infoset for the identified resource, and this infoset includes comments when comments are present in the included XML document.
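To make the distinction concrete, here is a minimal sketch in Python rather than any particular XInclude implementation. It assumes Python 3.8 or later (for the insert_comments option of xml.etree.ElementTree.TreeBuilder); DOC and the helper names are made up for illustration. The default tree builder drops comments, so the tree it returns corresponds to a modified infoset, while the opt-in builder keeps one node per comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a resource pulled in by an xi:include element.
DOC = "<root><!-- a comment --><child/></root>"


def parse_default(text):
    """Parse with the default tree builder, which discards comments."""
    return ET.fromstring(text)


def parse_with_comments(text):
    """Parse with a tree builder that keeps a node for each comment."""
    builder = ET.TreeBuilder(insert_comments=True)
    parser = ET.XMLParser(target=builder)
    return ET.fromstring(text, parser=parser)


def count_comments(root):
    """Count comment nodes; ElementTree marks them with tag ET.Comment."""
    return sum(1 for node in root.iter() if node.tag is ET.Comment)


if __name__ == "__main__":
    print(count_comments(parse_default(DOC)))
    print(count_comments(parse_with_comments(DOC)))
```

Both trees are perfectly usable, but only the second still has something corresponding to the comment information item that the Infoset spec says exists for this document.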
>Historically, comments are in the infoset because they are in the
>XPath 1.0 data model, and they are there, IIRC, because some
>(benighted, IMHO) people thought that scripts embedded in HTML
>were comments because they began with "<!--" and ended with "-->".
>
I don't dispute that comments shouldn't have been put in the infoset. However, since they are in the infoset now, they are also in the specs built on it (e.g. XInclude), and I think APIs should be expressive enough to allow us to fully implement W3C specs.
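As a rough illustration of what "expressive enough" means for SAX, the sketch below uses Python's xml.sax binding, which I am assuming here only as a convenient stand-in for SAX2 in general. Comments reach the application only if it registers a handler under the SAX2 lexical-handler property. The EventCollector class and sax_events function are hypothetical names invented for this example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class EventCollector(ContentHandler):
    """Records element starts and, if registered as lexical handler, comments."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    # Lexical-handler callbacks; only comment() matters for this sketch.
    def comment(self, content):
        self.events.append(("comment", content))

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass


def sax_events(text, report_comments):
    """Parse text and return the events seen, optionally including comments."""
    handler = EventCollector()
    reader = xml.sax.make_parser()
    reader.setContentHandler(handler)
    if report_comments:
        # Without this property the application never hears about comments.
        reader.setProperty(property_lexical_handler, handler)
    reader.parse(io.BytesIO(text.encode("utf-8")))
    return handler.events


if __name__ == "__main__":
    doc = "<root><!-- a comment --><child/></root>"
    print(sax_events(doc, report_comments=False))
    print(sax_events(doc, report_comments=True))
```

An XInclude processor layered on an API that offers only the first mode has no way to pass comments from the included document through to the result infoset.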
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.ibiblio.org/xml/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/ |
+----------------------------------+---------------------------------+