


   Re: [xml-dev] Postel's law, exceptions


Michael Champion wrote:

> That makes XML 1.0 processing sortof a Big Bang that either produces a
> fully formed Infoset, or an error message.

As I've said, this now seems to be the usual understanding of 'infoset'. But
Infoset is a concept unknown to XML 1.0, and parsing is the necessary first
step of *every* instance of XML 1.0 processing. Many things might be built
upon the output of a particular XML 1.0 parse, including a tree, or a graph,
or the abstraction into Infoset form of the data items identified in that
parse. Yet it seems to me that an 'infoset' would not usually be the desired
final product of processing an XML instance, in large part because such an
'infoset' is a terminal output product:  it cannot be passed or pipelined
into any other context because it is utterly specific to the circumstances
in which it is produced.

What is required if the output of particular XML processing is to be passed
to other XML processing is a document, which of course will first be parsed
before any other processing is performed on it in that new environment. It
is this conveyance of XML instances from context to context which is so well
suited to the internetwork topology, and particularly to the
Web-as-we-know-it. Instances are available as entity bodies which we may GET
at a particular URL, process, and then republish as new instances at other
URLs. It seems doubtful that the aggregation of RSS feed items should be
considered a substantially different process.
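
The conveyance described above can be sketched in a few lines. This is a
minimal illustration only, assuming Python's standard xml.etree.ElementTree
module; the stage names and the sample feed markup are hypothetical. What
passes between stages is a serialized document, and each stage begins by
parsing it.

```python
import xml.etree.ElementTree as ET


def stage_uppercase_titles(document: str) -> str:
    """Parse a document, transform it, and republish it as a new document."""
    root = ET.fromstring(document)  # every stage starts with a parse
    for title in root.iter("title"):
        title.text = (title.text or "").upper()
    # Hand on a document, not the in-memory tree.
    return ET.tostring(root, encoding="unicode")


def stage_count_items(document: str) -> int:
    """A later, independent stage: it parses again before doing anything else."""
    root = ET.fromstring(document)
    return len(root.findall("item"))


# Hypothetical sample instance, standing in for an entity body fetched by GET.
source = (
    "<feed>"
    "<item><title>first</title></item>"
    "<item><title>second</title></item>"
    "</feed>"
)
republished = stage_uppercase_titles(source)
item_count = stage_count_items(republished)
```

The second stage knows nothing of the first stage's tree; it depends only
on receiving something it can parse.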

What is necessary for this process to work reliably is the draconian XML 1.0
parse as the first step in every new process. Whatever 'liberal' and
'conservative' might have meant in Postel's original usage, in the context
of XML instances which we can GET at URLs, 'liberal' in what we accept means
that we acknowledge the instance is not likely to be in a form which our
process can use directly. The only form which our process could use directly
would be a very particular data structure. In their own terms, processes
operate only upon specific data structures, and neither a concrete instance
document nor an abstract infoset is what a process can use directly. The
difference is that the process can be designed to be liberal in accepting
numerous schematically differing concrete instances to parse and then to
instantiate the output of the parse as the particular data structure which
the process requires. That liberality cannot be extended to some sort of
'infoset' as input, because input which is not a parseable document must
either correspond perfectly to a very specific and typed schematic or be
useless to a particular process, and that is the most illiberal of demands.
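
A rough sketch of that division of labour, again assuming Python's
xml.etree.ElementTree. The Headline structure and the two input vocabularies
(an RSS-like <item> form and an invented attribute-based <entry> form) are
assumptions made for illustration, not anything defined in this thread. The
parse is draconian about well-formedness, while the process is liberal about
which schematic form it will map into the one data structure it actually
uses.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Headline:
    """The one specific data structure this hypothetical process operates on."""
    title: str
    link: str


def headlines(document: str) -> List[Headline]:
    # Draconian first step: ET.ParseError is raised if the input is not
    # well-formed, and nothing further happens.
    root = ET.fromstring(document)

    found: List[Headline] = []
    # Liberal second step: accept schematically differing instances.
    # Form 1 (RSS-like): <item><title>..</title><link>..</link></item>
    for item in root.iter("item"):
        found.append(
            Headline(
                item.findtext("title", default=""),
                item.findtext("link", default=""),
            )
        )
    # Form 2 (invented for this sketch): <entry title=".." href=".."/>
    for entry in root.iter("entry"):
        found.append(Headline(entry.get("title", ""), entry.get("href", "")))
    return found
```

Both forms arrive as documents and leave as the same list of Headline
objects. A caller that tried to hand this process a ready-made tree or
'infoset' would have to match its internal expectations exactly.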


Walter Perry


