   Re: XML Information Set Requirements, W3C Note 18-February-1999

  • From: Paul Prescod <paul@prescod.net>
  • To: xml-dev@ic.ac.uk
  • Date: Sun, 21 Feb 1999 13:20:52 -0600

Bill la Forge wrote:
> From: Paul Prescod <paul@prescod.net>
> >I'm not glum. It just is not the mandate of the infoset group to invent
> >new purposes for XML. The infoset group is exactly like a supreme court
> >interpreting -- but not changing -- the constitution, which in this case
> >is the XML specification. The terminology used in the XML specification is
> >"document". Therefore that should be the terminology used by the infoset
> >people.
> >
> >As far as a "document focus" being limiting: XML's current popularity in
> >all sorts of fields indicates that that is not the case. If you take a
> >database and encode it for transmission over a wire then you have a
> >document. If you encode a message from one computer to another then you
> >also have a document. I don't see this view as in any way limiting XML's
> >problem domain.
> The issue here isn't a matter of inventing new purposes for XML.
> The issue is more a matter of recognizing what is happening.

I don't know what you are referring to in my post above.

The point of my message was that there is no dichotomy between "document
processing" and "stream processing." A document is a stream. You can also
have a stream of documents.
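
A minimal sketch of the "a document is a stream" idea, assuming Python's
standard xml.sax module (a SAX-style API). The names ElementCounter and
count_elements, and the sample input, are hypothetical and only for
illustration. The handler sees the document as a sequence of events and
never builds a tree:

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts start tags as events arrive; no tree is kept in memory."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag, in document order.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(xml_bytes):
    """Parse one document as a stream of SAX events and return tag counts."""
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.counts


# Hypothetical sample document.
SAMPLE = b"<order><item/><item/><note>rush</note></order>"
print(count_elements(SAMPLE))  # {'order': 1, 'item': 2, 'note': 1}
```

Calling count_elements once per incoming message would be one way to treat
a stream of documents with the same code.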

Therefore the terminology of XML does not need to change to support stream
processing. It already supports it! Some of the features of XML could
support streaming better but that isn't the infoset group's job. Even in
thinking of XML 2.0 there is no dichotomy: any feature that makes XML
documents more modular (i.e. local namespaces) also makes streaming easier
and vice versa. They are two sides of the same coin (if you'll excuse the

> API like SAX and SAXON appear to have broad applicability beyond
> applications where everything needs to be read into memory. It isn't just
> streams, but very large documents too. The W3C's DOM and XSL are far too
> expensive (and even unnecessary!) for the majority of XML's applications.

I don't know what this has to do with my post above either, but I'll
respond anyhow.

What is an "XML application" and how does one count them? Would
100,000,000 web pages each count as an "XML application?"

> The real advantage of considering streams is that you are also accommodating
> very large documents as well. This is an important consideration for the majority
> of the XML community as it exists today.

There are two basic paths you can take: figure out how to handle extremely
large documents or figure out how to combine small documents into
hyperdocuments or "uber-documents." My experience is that the latter
requires your systems to be more complex but also allows them to do more
complex things.

Nevertheless, I am totally in favor of supporting the former also. (Did
something in my post above indicate otherwise?) Stream-based document
processing can be simpler than hyperdocument processing, and I'm in favor
of simplicity when it can be achieved.
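
One way the "handle extremely large documents" path might look in practice,
sketched with Python's standard xml.etree.ElementTree.iterparse. The element
name "record", the attribute "amount", and the function sum_amounts are
assumptions made up for this example. Each record is processed as soon as
its end tag is read, and its contents are then released:

```python
import io
import xml.etree.ElementTree as ET


def sum_amounts(source):
    """Add up the 'amount' attribute of every <record>, one at a time."""
    total = 0
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            total += int(elem.get("amount"))
            # Drop the record's attributes and children once it is handled.
            elem.clear()
    return total


# Hypothetical input; a real source could be a large file or a socket.
data = io.BytesIO(b"<log><record amount='3'/><record amount='4'/></log>")
print(sum_amounts(data))  # 7
```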

> I think the request here is that the W3C simply give some consideration for
> large document and stream processing. Not doing so could create real
> problems for the entire industry.

I don't believe that the W3C has forgotten about stream processing. One of
the more controversial parts of the XML namespace specification is
intimately tied to stream processing (local namespaces). I think that I
can safely say that when XML was being developed, streaming uses were as
high in the minds of the working group as tree-based uses. Stream-based
processing has always been more common in the SGML world than tree
processing. Okay then, why are the DOM and XSL tree-based? Well, the web
infrastructure favors small documents inherently. Large streams must be
broken up on the server side for performance reasons. Bandwidth, not RAM,
is the limiting factor in Web user interfaces.

 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself

"In general, as syntactic description becomes deeper, what appear to 
be semantic questions fall increasingly within its scope; and it is 
not entirely obvious whether or where one can draw a natural bound 
between grammar and 'logical grammar'."  - Noam Chomsky, 1963

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
