   Re: SAX RFD: ModSAX Predefined Features

  • From: David Brownell <db@eng.sun.com>
  • To: David Megginson <david@megginson.com>
  • Date: Mon, 08 Mar 1999 22:51:18 -0800

Again, I think that unifying these under the generic get/set
API (with Boolean.TRUE and Boolean.FALSE objects as values
for features that are really boolean) could be useful.

Documentation for each feature should specify whether it's
changeable mid-parse ... I'd suggest "no" as the default answer!
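
To make the two points above concrete, here is a minimal sketch of a generic get/set feature table that hands back Boolean.TRUE / Boolean.FALSE for boolean features and defaults to refusing changes mid-parse. The class and method names (FeatureSet, beginParse, endParse) are hypothetical and not part of any SAX proposal; only the feature URI comes from this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the SAX proposal itself.
class FeatureSet {
    private final Map<String, Object> values = new HashMap<>();
    private boolean parsing = false;

    // Generic getter: boolean features come back as Boolean.TRUE / Boolean.FALSE,
    // and unset features come back as null.
    Object get(String uri) {
        return values.get(uri);
    }

    // Generic setter: by default, refuses any change while a parse is running.
    void set(String uri, Object value) {
        if (parsing) {
            throw new IllegalStateException("cannot change " + uri + " mid-parse");
        }
        values.put(uri, value);
    }

    // Called by the (assumed) parser around each parse.
    void beginParse() { parsing = true; }
    void endParse() { parsing = false; }
}
```

A caller would write something like set("http://xml.org/sax/features/validation", Boolean.TRUE) before parsing; a feature documented as changeable mid-parse would need its own exemption from the check.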

Mike Dacon commented about the "API archaeology" aspect of this
name; perhaps the "Parser2" style naming convention can avoid
losing technical context (i.e. this is still a parser, even
if it's parsing a DOM or a stream of SAX events :-).

> 1. http://xml.org/sax/features/validation

Good.  (I'm curious if folks prefer one parser, which can
have this feature toggled, vs two, where the parser comes
with at least an initial value.)

> 2A. http://xml.org/sax/features/external-general-entities
> 2B. http://xml.org/sax/features/external-parameter-entities

Right, two kinds of parsed entities, two control knobs.
Validating parsers must refuse to change these knobs.
(OK, _five_ kinds of parser -- validating, and four kinds
of nonvalidating parser!  ;-)

> 3. http://xml.org/sax/features/namespaces

I'd rather have this just kick in modified XML syntax rules
(e.g. entity names may never be scoped, and scoped names may
have only one interior colon).

With that, one can layer the rest of namespace processing
on top in any of several fashions.  A DOM can be built which
exposes namespace declarations; or a filter can munge names
and strip out the declarations.  The "munge" feature could
get its own namespace URI.
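
A rough sketch of the "munge" filter idea, using plain maps rather than any real SAX or DOM types. The NameMunger class and the {uri}local output form are assumptions made for illustration. It only looks at declarations on the element itself, so a real filter would also have to track declarations inherited from ancestors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "munge" filter; the {uri}local form is an assumption.
class NameMunger {
    // Takes one element's attributes (qualified name -> value), including its
    // xmlns:prefix declarations, and returns them with known prefixes replaced
    // by their URIs and the declarations themselves stripped out.
    static Map<String, String> munge(Map<String, String> attrs) {
        // First pass: collect the prefix -> URI bindings declared here.
        Map<String, String> prefixes = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            if (e.getKey().startsWith("xmlns:")) {
                prefixes.put(e.getKey().substring(6), e.getValue());
            }
        }
        // Second pass: rewrite scoped names and drop the declarations.
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            String name = e.getKey();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                continue; // strip the declaration itself
            }
            int colon = name.indexOf(':');
            if (colon > 0 && prefixes.containsKey(name.substring(0, colon))) {
                name = "{" + prefixes.get(name.substring(0, colon)) + "}"
                        + name.substring(colon + 1);
            }
            out.put(name, e.getValue());
        }
        return out;
    }
}
```

The alternative described above, a DOM that exposes the declarations, would simply skip this step and pass the attributes through untouched.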

> 4. http://xml.org/sax/features/unbuffered-input
>   True means ensure that the parser does not buffer input from a
>   Reader or InputStream supplied by the application (actually,
>   one-character look-ahead will usually be required); false means do
>   not ensure that the parser does not buffer input.  This feature might
>   be useful for reading multiple documents from a single stream.

I'm not sure this is a common enough feature to need to be
predefined ... support for "XML Islands" within HTML may become
important, but much of this can be done (at least in Java) by
requiring pushback to be done at appropriate points.
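
For what it's worth, here is a small sketch of the pushback approach using java.io.PushbackReader. The IslandReader class and the "read through a known end tag" rule are assumptions made to keep the example short; a real parser would find the end of the document by parsing, not by string matching.

```java
import java.io.IOException;
import java.io.PushbackReader;

// Hypothetical sketch: reads one "document" up to a known end tag from a shared stream.
class IslandReader {
    // Reads through endTag, then peeks one character and pushes it back, so
    // the next reader of the stream starts exactly where this one stopped.
    static String readThrough(PushbackReader in, String endTag) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
            // Stop as soon as the text read so far ends with endTag.
            if (sb.length() >= endTag.length()
                    && sb.indexOf(endTag, sb.length() - endTag.length()) >= 0) {
                break;
            }
        }
        int lookahead = in.read();   // the one-character look-ahead
        if (lookahead != -1) {
            in.unread(lookahead);    // give it back to the stream
        }
        return sb.toString();
    }
}
```

Calling readThrough twice on one PushbackReader wrapped around a stream holding two documents returns them one after the other, with no character lost between them.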

> http://xml.org/sax/features/normalize-text

This is a good filter feature, I think.

Lars suggested a "Catalog" feature.  There are different sorts of
catalog, and they need configuration, so the value of this could
be a URI for the catalog, not just a boolean.  Plus, this would
seem to be up to the "EntityResolver" to handle ... yes?  It'd
perhaps suggest that one could ask the next filter in the stream
for the resolver it was using ... :-)
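
As a sketch of how that might look, here is a resolver built on the existing org.xml.sax.EntityResolver interface, configured with a catalog URI rather than a boolean. The CatalogResolver class, its in-memory map, and the addEntry and getCatalogUri methods are hypothetical. Loading entries from the catalog URI is left out, since catalog formats differ.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical sketch: a resolver backed by an in-memory catalog.
class CatalogResolver implements EntityResolver {
    private final String catalogUri;  // identifies the catalog this resolver uses
    private final Map<String, String> publicToSystem = new HashMap<>();

    CatalogResolver(String catalogUri) {
        this.catalogUri = catalogUri;
    }

    // Lets a later filter ask which catalog is in use.
    String getCatalogUri() {
        return catalogUri;
    }

    void addEntry(String publicId, String systemId) {
        publicToSystem.put(publicId, systemId);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String mapped = publicToSystem.get(publicId);
        // Returning null tells the parser to fall back to its default handling.
        return mapped == null ? null : new InputSource(mapped);
    }
}
```

With a generic get/set API, the value stored under the catalog feature could then be the catalog URI string, and the resolver would stay an ordinary EntityResolver as far as the parser is concerned.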

Good discussion, gang!

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)



Copyright 2001 XML.org. This site is hosted by OASIS