- To: "Perry A. Caro" <caro@adobe.com>,<xml-dev@lists.xml.org>
- Subject: RE: [xml-dev] Ambiguities in section 4.3.2 of XML 1.0 SE
- From: "Derek Denny-Brown" <derekdb@microsoft.com>
- Date: Tue, 12 Aug 2003 09:35:37 -0700
- Thread-index: AcNgZbfADAQVzfmLTPKhbwgQAquS5QAiMfmQ
- Thread-topic: [xml-dev] Ambiguities in section 4.3.2 of XML 1.0 SE
Unfortunately (for XML parsers at least, and for users somewhat), XML
clearly does disallow ]]> in _lots_ of contexts, even when it is clearly
not a CDATA context. I know I spent a few days trying to implement that
check efficiently.
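
One quick way to see how the context matters is to hand a few small
documents to an existing parser. The sketch below assumes Python's
standard xml.etree.ElementTree module (which is backed by expat); the
helper name is_well_formed and the three sample documents are made up for
illustration, loosely following the examples in the quoted message below.
It reports what one particular parser does and is not a ruling on what
the spec requires.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the expat-backed parser accepts doc (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# "]]>" inside an attribute value literal.
IN_ATTRIBUTE = '<foo bar="]]>"/>'

# "]]>" as plain character data in element content, outside any CDATA section.
IN_CONTENT = '<foo>]]></foo>'

# "]]>" as the replacement text of an internal entity used in an attribute.
VIA_ENTITY = '<!DOCTYPE foo [<!ENTITY cdend "]]>">]><foo bar="&cdend;"/>'

for label, doc in [
    ("attribute value", IN_ATTRIBUTE),
    ("element content", IN_CONTENT),
    ("entity in attribute", VIA_ENTITY),
]:
    print(f"{label}: accepted={is_well_formed(doc)}")
```

The attribute-value case is accepted and the element-content case is
rejected; the entity case is left for the reader to run, since that is
exactly where the quoted message argues the spec is ambiguous.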
<rant>
The fact that it requires a 2-character lookahead, and that ']' is way
at the other end of the lower 128 characters from all the other special
characters, makes it a royal pain. (By that I mean 'switch' statements
tend to be compiled to much less efficient code, just because you added
a case for that specific character. I actually tried it with 2
different compilers, and the perf of the switch statement was completely
foobared in both, just by adding case ']'...)
</rant>
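
To make the lookahead concrete, here is a minimal sketch of a
character-data scanner, written in Python for brevity rather than in the
language of the parser described above. The function name
scan_char_data and its error message are hypothetical. It stops at '<'
or '&' (which begin markup or a reference) and has to peek two
characters ahead every time it sees ']' in order to reject "]]>". It
says nothing about the compiler behaviour mentioned in the rant; it only
shows the extra case the inner loop has to carry.

```python
def scan_char_data(text: str, start: int = 0) -> int:
    """Return the index where a run of character data ends.

    The run ends at '<' or '&', or at the end of the input.
    Raises ValueError if the run contains the sequence "]]>".
    """
    i = start
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "<" or ch == "&":
            # Start of markup or of a reference: character data stops here.
            return i
        if ch == "]" and text.startswith("]>", i + 1):
            # Two-character lookahead: this ']' is followed by "]>".
            raise ValueError(f"']]>' not allowed in character data (offset {i})")
        i += 1
    return n


print(scan_char_data("some text<foo/>"))
```

A lone ']' or a ']]' that is not followed by '>' passes through
untouched, which is why the check cannot be decided on the first ']'
alone.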
-derek
> -----Original Message-----
> From: Perry A. Caro [mailto:caro@adobe.com]
> Sent: Monday, August 11, 2003 5:03 PM
> To: xml-dev@lists.xml.org
> Subject: [xml-dev] Ambiguities in section 4.3.2 of XML 1.0 SE
>
> [I sent the following to xml-editor@w3.org. Am I completely crazy, or
> are some clarifications called for in the spec? Would you think either
> of the following examples were not well-formed?
>
> Example: <foo bar="]]>"/>
>
> Example: <!ENTITY cdend "]]>">
> ...
> <foo bar="&cdend;"/>
> ]
>
> With respect to section 4.3.2 of the XML 1.0 Specification Second
> Edition and by implication XML 1.1 CR, there appear to be several
> ambiguities engendered by the following statement:
>
> An internal general parsed entity is well-formed if its replacement
> text matches the production labeled content.
>
> ... when considered in the context of CDATA Sections and "]]>". For
> example, this would imply that the following declaration in an
> internal DTD subset would result in an internal general parsed entity
> that is not well-formed:
>
> <!ENTITY cdend "]]>">
>
> ... because the replacement text does not match the [43] content
> production.
>
> If so ...
>
> 1) This contradicts statements about Literals in section 2.3, namely:
>
> Literal data is any quoted string not containing the quotation mark
> used as a delimiter for that string. Literals are used for
> specifying the content of internal entities (EntityValue),
>
> ... and production [9] EntityValue. Production [9] permits "]]>" as a
> replacement text.
>
> Furthermore, [10] AttValue also permits "]]>". It would be nonsensical
> for <foo bar="]]>"/> to be well-formed, but not <foo bar="&cdend;"/>,
> using the entity declaration above.
>
> 2) This contradicts the last paragraph of section 4.3.2:
>
> A consequence of well-formedness in entities is that the logical
> and physical structures in an XML document are properly nested; no
> start-tag, end-tag, empty-element tag, element, comment, processing
> instruction, character reference, or entity reference can begin in
> one entity and end in another.
>
> The list appears to be intended to be exhaustive. The lack of "CDATA
> Section" in the list might be interpreted to mean that you can start a
> CDATA Section in one entity, and end it in another. Therefore, the
> declaration of &cdend; above should be well-formed.
>
> ===============================
>
> Since the well-formedness of internal general parsed entities is
> completely defined by productions [71] GEDecl, [73] EntityDef, and [9]
> EntityValue, what is the value of the statement in section 4.3.2? What
> does it intend to clarify?
>
> Perry A. Caro
> Adobe Systems Incorporated
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
>