   Re: Feature Manifest (Was:RE: Parser Behaviour (serious))

  • From: THOMAS PASSIN <tpassin@idsonline.com>
  • To: <xml-dev@xml.org>
  • Date: Sat, 8 Apr 2000 00:41:51 -0400

To help anyone interested in getting started on Peter Murray-Rust's proposal
to review the various combinations of parser behavior in the XML
Recommendation, I have pulled out all (I hope!) of the behaviors that are
specified with "may" or "at user option".  I didn't indicate which section
each piece came from, but that's not hard to find.  The text that relates
most closely to the concern that Peter originally expressed is, I think, the
following:

"If the entity is external, and the processor is not attempting to validate
the XML document, the processor may, but need not, include the entity's
replacement text. If a non-validating parser does not include the
replacement text, it must inform the application that it recognized, but did
not read, the entity."

The whole list is a little long for one of these postings, but here it is:

============================================================================
"at user option"
     Conforming software may or must (depending on the modal verb in the
sentence) behave as described; if it does, it must provide users a means to
enable or disable the behavior described.
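
As an illustration of what "a means to enable or disable" can look like in
practice, here is a minimal sketch using Python's standard xml.sax module
(my choice of library, not something the Rec prescribes). The feature
toggled here, external general entities, is just one example of an optional
behavior exposed as a user-settable switch.

```python
import xml.sax
from xml.sax.handler import feature_external_ges

# Create a SAX parser and toggle one optional behavior on and off.
parser = xml.sax.make_parser()

parser.setFeature(feature_external_ges, True)    # user opts in
enabled = parser.getFeature(feature_external_ges)

parser.setFeature(feature_external_ges, False)   # user opts out
disabled = parser.getFeature(feature_external_ges)

print(enabled, disabled)
```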

"may"
     Conforming documents and XML processors are permitted to but need not
behave as described.

"error"
     A violation of the rules of this specification; results are undefined.
Conforming software may detect and report an error and may recover from it.

"match"
     (Of strings or names:) Two strings or names being compared must be
identical. Characters with multiple possible representations in ISO/IEC
10646 (e.g. characters with both precomposed and base+diacritic forms) match
only if they have the same representation in both strings. At user option,
processors may normalize such characters to some canonical form.
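
To make the matching rule concrete, here is a small sketch in Python. The
matches() helper and its normalize flag are hypothetical names of mine, and
NFC is only one possible "canonical form"; the Rec does not name one.

```python
import unicodedata

PRECOMPOSED = "\u00e9"   # e-acute as a single code point
DECOMPOSED = "e\u0301"   # 'e' followed by a combining acute accent


def matches(a, b, normalize=False):
    """Compare two names; optionally normalize both to NFC first."""
    if normalize:
        a = unicodedata.normalize("NFC", a)
        b = unicodedata.normalize("NFC", b)
    return a == b


# Without normalization the two spellings are different strings.
strict = matches(PRECOMPOSED, DECOMPOSED)
# With the user option switched on, they compare equal.
relaxed = matches(PRECOMPOSED, DECOMPOSED, normalize=True)
print(strict, relaxed)
```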

At user option, an XML processor may issue a warning when a declaration
mentions an element type for which no declaration is provided, but this is
not an error.

At user option, an XML processor may issue a warning if attributes are
declared for an element type not itself declared, but this is not an error.

For interoperability, an XML processor may at user option issue a warning
when more than one attribute-list declaration is provided for a given
element type, or more than one attribute definition is provided for a given
attribute, but this is not an error.

If the same entity is declared more than once, the first declaration
encountered is binding; at user option, an XML processor may issue a warning
if entities are declared multiple times.
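
A quick way to see the "first declaration is binding" rule is to declare the
same entity twice in the internal subset and look at what the application
receives. This sketch assumes Python's xml.etree.ElementTree (which sits on
expat); the entity name "e" and its values are made up for the example, and
no warning is issued here since that part is optional.

```python
import xml.etree.ElementTree as ET

# The same general entity is declared twice in the internal DTD subset.
DOC = """<!DOCTYPE r [
  <!ENTITY e "first">
  <!ENTITY e "second">
]>
<r>&e;</r>"""

root = ET.fromstring(DOC)
value = root.text   # replacement text taken from the binding declaration
print(value)
```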

... an XML processor may, but need not, make it possible for an application
to retrieve the text of comments.
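
The practical effect is that two conforming tools can hand an application
different views of the same document. As a sketch, here are two Python
standard-library APIs (picked by me purely for illustration) in their
default configurations: one drops the comment, the other keeps it.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

DOC = "<r><!--note--></r>"

# ElementTree's default builder does not keep comments in the tree.
et_children = list(ET.fromstring(DOC))

# minidom keeps the comment as a node the application can read.
dom_child = minidom.parseString(DOC).documentElement.firstChild

print(et_children, dom_child.data)
```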

Processors may signal an error if they receive documents labeled with
versions they do not support.

... an XML processor may signal an error if a fragment identifier is given
as part of a system identifier.

An XML processor attempting to retrieve the entity's content may use the
public identifier to try to generate an alternative URI. If the processor is
unable to do so, it must use the URI specified in the system literal.

Although an XML processor is required to read only entities in the UTF-8 and
UTF-16 encodings, it is recognized that other encodings are used around the
world, and it may be desired for XML processors to read entities that use
them.

In an encoding declaration, the values "UTF-8", "UTF-16", "ISO-10646-UCS-2",
and "ISO-10646-UCS-4" should be used for the various encodings and
transformations of Unicode / ISO/IEC 10646, the values "ISO-8859-1",
"ISO-8859-2", ... "ISO-8859-9" should be used for the parts of ISO 8859, and
the values "ISO-2022-JP", "Shift_JIS", and "EUC-JP" should be used for the
various encoded forms of JIS X-0208-1997. XML processors may recognize other
encodings; it is recommended that character encodings registered (as
charsets) with the Internet Assigned Numbers Authority [IANA], other than
those just listed, should be referred to using their registered names. Note
that these registered names are defined to be case-insensitive, so
processors wishing to match against them should do so in a case-insensitive
way.
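
The case-insensitive matching mentioned at the end of that excerpt might be
sketched like this in Python. The SUPPORTED tuple and the recognize() helper
are hypothetical; a real processor would have its own table of encodings.

```python
# Encoding names this hypothetical processor knows, spelled as in the Rec.
SUPPORTED = ("UTF-8", "UTF-16", "ISO-8859-1", "Shift_JIS", "EUC-JP")


def recognize(declared):
    """Return the canonical spelling for a declared encoding, or None."""
    wanted = declared.lower()
    for name in SUPPORTED:
        if name.lower() == wanted:
            return name
    return None


print(recognize("utf-8"), recognize("shift_jis"), recognize("KOI8-R"))
```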

When an XML processor recognizes a reference to a parsed entity, in order to
validate the document, the processor must include its replacement text. If
the entity is external, and the processor is not attempting to validate the
XML document, the processor may, but need not, include the entity's
replacement text. If a non-validating parser does not include the
replacement text, it must inform the application that it recognized, but did
not read, the entity.
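
Here is a rough sketch of the "recognized, but did not read" case, using
Python's xml.parsers.expat bindings. This is only one way a non-validating
parser can surface the information; the file name chapter.xml and the helper
parse_without_reading() are invented for the example, and the handler
deliberately never opens the file.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE r [
  <!ENTITY ext SYSTEM "chapter.xml">
]>
<r>a&ext;b</r>"""


def parse_without_reading(document):
    """Parse, recording external entity references instead of reading them."""
    skipped = []
    text = []
    parser = xml.parsers.expat.ParserCreate()

    def on_external_entity(context, base, system_id, public_id):
        # The application is told about the entity; nothing is fetched.
        skipped.append(system_id)
        return 1  # non-zero tells expat to carry on parsing

    parser.ExternalEntityRefHandler = on_external_entity
    parser.CharacterDataHandler = text.append
    parser.Parse(document, True)
    return skipped, "".join(text)


skipped, text = parse_without_reading(DOC)
print(skipped, text)
```

The application sees the surrounding character data but not the replacement
text, so it has to decide for itself what a missing chapter.xml means.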

XML processors must provide applications with the name and external
identifier(s) of any notation declared and referred to in an attribute
value, attribute definition, or entity declaration. They may additionally
resolve the external identifier into the system identifier, file name, or
other information needed to allow the application to call a processor for
data in the notation described. (It is not an error, however, for XML
documents to declare and refer to notations for which notation-specific
applications are not available on the system where the XML processor or
application is running.)

=========================================================
More excerpts, from Section 5, on conforming processors:

Non-validating processors are required to check only the document entity,
including the entire internal DTD subset, for well-formedness. While they
are not required to check the document for validity, they are required to
process all the declarations they read in the internal DTD subset and in any
parameter entity that they read, up to the first reference to a parameter
entity that they do not read; that is to say, they must use the information
in those declarations to normalize attribute values, include the replacement
text of internal entities, and supply default attribute values. They must
not process entity declarations or attribute-list declarations encountered
after a reference to a parameter entity that is not read, since the entity
may have contained overriding declarations.
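
For instance, a default attribute value declared in the internal subset
should reach the application even from a non-validating parser. A small
check, assuming Python's xml.etree.ElementTree (expat underneath); the
attribute name "kind" and its default are made up:

```python
import xml.etree.ElementTree as ET

# The default for "kind" is declared in the internal DTD subset.
DOC = """<!DOCTYPE r [
  <!ATTLIST r kind CDATA "plain">
]>
<r/>"""

# The element carries no attribute in the instance; the parser supplies it.
kind = ET.fromstring(DOC).get("kind")
print(kind)
```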

5.2 Using XML Processors
The behavior of a validating XML processor is highly predictable; it must
read every piece of a document and report all well-formedness and validity
violations. Less is required of a non-validating processor; it need not read
any part of the document other than the document entity. This has two
effects that may be important to users of XML processors:

Certain well-formedness errors, specifically those that require reading
external entities, may not be detected by a non-validating processor.
Examples include the constraints entitled Entity Declared, Parsed Entity,
and No Recursion, as well as some of the cases described as forbidden in
"4.4 XML Processor Treatment of Entities and References".

The information passed from the processor to the application may vary,
depending on whether the processor reads parameter and external entities.
For example, a non-validating processor may not normalize attribute values,
include the replacement text of internal entities, or supply default
attribute values, where doing so depends on having read declarations in
external or parameter entities.

For maximum reliability in interoperating between different XML processors,
applications which use non-validating processors should not rely on any
behaviors not required of such processors. Applications which require
facilities such as the use of default attributes or internal entities which
are declared in external entities should use validating XML processors.
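
Where a validating processor is not an option, "not relying" on the optional
behavior means the application carries the default itself. A sketch,
assuming Python's xml.etree.ElementTree in its default configuration, which
does not read the external subset; r.dtd is a hypothetical file that is
never opened, and "plain" stands in for whatever default it would declare.

```python
import xml.etree.ElementTree as ET

# Any default for "kind" would live in the external subset, which is not read.
DOC = '<!DOCTYPE r SYSTEM "r.dtd"><r/>'

root = ET.fromstring(DOC)

# What the parser reports on its own: no attribute at all.
from_parser = root.get("kind")

# The application supplies its own fallback instead of trusting the DTD.
kind = root.get("kind", "plain")
print(from_parser, kind)
```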




