   Re: XML Torture Test: Parsers Fail

  • From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
  • To: xml-dev@ic.ac.uk
  • Date: Wed, 7 Apr 1999 10:07:46 -0400

>I'm not so sure that IE5 is wrong in reporting an error (when unreferenced
>General Entities are DTD chunks).  The XML REC says (in 4.3.2 "Well-Formed
>Parsed Entities")
>"An external general parsed entity is well-formed if it matches the
>production labeled extParsedEnt", which is an optional TextDecl [77]
>followed by 'content' [43].  Non-validating processors are not required to
>read external entities, but they are not forbidden to read them if they are
>not referenced.

I agree that IE5 can read the external entity if it feels like it. However,
the document is still well-formed because the entity is never referenced
and is not part of the document. This document meets the criterion for
well-formedness in Section 2.1; i.e.

1. Taken as a whole, it matches the production labeled document.

2. It meets all the well-formedness constraints.

3. Each of the parsed entities which is referenced directly or indirectly
within the document is well-formed.

#3 is the kicker here. The non-well-formed entity that causes the problem
is never referenced.  I'm not sure what indirectly referenced means though.
Perhaps that provides some wiggle room. The only other relevant instance of
"indirect" I see in the spec is in the No Recursion well-formedness
constraint in Section 4.1. This states that "A parsed entity must not
contain a recursive reference to itself, either directly or indirectly."

In this context an indirect reference seems to mean one that did not occur
in the main document but that appears in one of the other external parsed
entities that was included by a different entity reference. The annotated
spec seems to support this interpretation
<http://www.xml.com/axml/notes/Recursion.html> though the example given
uses purely internal entities.
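
As a rough illustration of that reading of "indirectly", here is a small
Python sketch using the standard library's Expat-based ElementTree parser.
The entity names "a" and "b" are made up for the example, and, like the
annotated spec's example, it uses purely internal entities. It only shows
how this one parser behaves, not what the spec requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical entities: "a" refers to "b", and "b" refers back to "a".
# Neither entity names itself, so the recursion is only indirect.
DECLS = """<!DOCTYPE doc [
  <!ENTITY a "&b;">
  <!ENTITY b "&a;">
]>"""


def parse_error(text):
    """Return the parser's error message, or None if the text is accepted."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as exc:
        return str(exc)


# The entities are declared but never referenced from the document.
declared_only = parse_error(DECLS + "<doc/>")

# The document references "a", which pulls in "b", which pulls in "a" again.
referenced = parse_error(DECLS + "<doc>&a;</doc>")

print(declared_only)
print(referenced)
```

With this parser the loop is only reported once something in the document
actually references one of the entities.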

The word "indirect" also appears in these well-formedness constraints:

Well-Formedness Constraint: No External Entity References
 Attribute values cannot contain direct or indirect entity references to
external entities.

Well-Formedness Constraint: No < in Attribute Values
 The replacement text of any entity referred to directly or indirectly in
an attribute value (other than "&lt;") must not contain a <.

The annotated spec doesn't really address these two constraints in this way.
It seems remotely possible that what's really meant is an unparsed entity,
but if that's so, why didn't the authors just say that?  Furthermore, an
unparsed entity has no reason not to contain these things. Again, it seems
that what is meant is simply an entity reference whose value uses another
entity reference that violates the constraint.
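
To make that interpretation concrete, here is a sketch in Python, again
using the standard library's Expat-based ElementTree parser. All entity
names and the file name "chunk.ent" are hypothetical. "outer" and "wrap"
do not violate either constraint themselves; they only refer to an entity
that does.

```python
import xml.etree.ElementTree as ET

# Hypothetical declarations:
#   inner - replacement text is a literal "<" (via a character reference)
#   outer - refers to inner, so it reaches "<" only indirectly
#   ext   - an external parsed entity (the file is never actually read here)
#   wrap  - refers to ext, so it reaches an external entity only indirectly
DOC = """<!DOCTYPE doc [
  <!ENTITY inner "&#60;">
  <!ENTITY outer "&inner;">
  <!ENTITY ext SYSTEM "chunk.ent">
  <!ENTITY wrap "&ext;">
]>
<doc attr="{value}"/>"""


def rejected(value):
    """Return True if the parser rejects the document with this attribute value."""
    try:
        ET.fromstring(DOC.format(value=value))
        return False
    except ET.ParseError:
        return True


for value in ["plain", "&inner;", "&outer;", "&ext;", "&wrap;"]:
    print(value, rejected(value))
```

This parser treats the indirect cases ("&outer;" and "&wrap;") the same
way as the direct ones, which is consistent with the reading above.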

In short, I think IE5 is definitely incorrect in not accepting a
declaration of a malformed entity in the absence of an actual reference to
that entity.
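
For comparison, here is how one other non-validating parser treats the
situation: a minimal sketch using Python's standard Expat-based
ElementTree parser. The entity names and the file name "bad.ent" are
hypothetical. Since this parser does not read external entities, the
malformed case is demonstrated with an internal entity as an analogue.
It shows one parser's behavior only and does not settle what the spec
requires.

```python
import xml.etree.ElementTree as ET

# An external entity is declared but never referenced. The hypothetical
# file "bad.ent" does not need to exist; this parser never reads it.
EXTERNAL_UNREFERENCED = """<!DOCTYPE doc [
  <!ENTITY bad SYSTEM "bad.ent">
]>
<doc>hello</doc>"""

# Internal analogue: the replacement text is not well-formed content
# (a start tag with no end tag), but nothing references the entity.
INTERNAL_UNREFERENCED = """<!DOCTYPE doc [
  <!ENTITY bad "<open>never closed">
]>
<doc>hello</doc>"""

# Same declaration, but now the document references the entity.
INTERNAL_REFERENCED = INTERNAL_UNREFERENCED.replace("hello", "&bad;")


def is_accepted(text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_accepted(EXTERNAL_UNREFERENCED))
print(is_accepted(INTERNAL_UNREFERENCED))
print(is_accepted(INTERNAL_REFERENCED))
```

The two documents that merely declare the entity are accepted; only the
one that references the malformed entity is rejected.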

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|        XML: Extensible Markup Language (IDG Books 1998)            |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |


