xml-dev - Re: [xml-dev] DESIGN PROPOSAL: Java XMLIterator

To: Rob Lugt <roblugt@elcel.com>, John Cowan <jcowan@reutershealth.com>, xml-dev@lists.xml.org
Subject: Re: [xml-dev] DESIGN PROPOSAL: Java XMLIterator
From: James Clark <jjc@jclark.com>
Date: Tue, 18 Dec 2001 19:01:41 +0700


> First question: Are you prepared to put pullax into the public domain?  If
> so it may be a good starting point for the pull-API that John is
> suggesting.

I have no desire to retain any intellectual property rights to pullax. 
However, it is not entirely clear to me that the public domain is the best 
place for a "standard" API. It might be better to assign the copyright to a 
suitable organization (e.g. OASIS, Sun) to ensure appropriate change 
control.  However, if the consensus is that public domain is best, I am 
happy to go along with that.  Currently I'm using the BSD license, which 
is about the most liberal open source license.  In my experience, the BSD 
license is generally more acceptable than public domain.  Some 
organizations find the absence of a copyright holder (as is the case with 
public domain code) unsettling; public domain also provides no protection 
against removal of the no-warranty notice or against use of the name of 
the author's organization in advertising.

> Second, I followed your invitation to look at the JavaDoc on your web site
> [1].  I haven't had a chance to look at it in detail, but one of the first
> things I noticed was the XmlInputSource class [2].  I don't understand
> what this class is supposed to achieve as it doesn't have any methods.  I
> see it has three sub-classes, each one offering a subset of a SAX
> InputSource - but without a common interface how can a parser use an
> XmlInputSource?  Unless I'm mistaken it will have to cast it into one of
> the known sub-classes, which isn't very attractive.  Also, why is it a
> class rather than an interface?

Let me explain what I'm trying to achieve.  I want to allow the input to 
the scanner to be specified in one of three ways:

1) in the same way as in an entity declaration, using a system identifier, 
a base URI to be used to resolve the system identifier into an absolute URI 
if the system identifier is relative, and, optionally, a public identifier 
(this corresponds to the XmlExternalId class)

2) as an InputStream, optionally with the URI from which the InputStream 
was retrieved, which will serve as the base URI of relative URIs occurring 
in that entity (in the absence of xml:base attributes), and optionally 
with an encoding to be used (if missing, the encoding is autodetected) 
(this corresponds to the XmlInputStreamInputSource class)

3) as a Reader, optionally with the URI from which the Reader was 
retrieved, which will serve as the base URI of relative URIs occurring in 
that entity (in the absence of xml:base attributes), and optionally with 
an encoding to supply the [character encoding scheme] infoset property of 
the document info item (this corresponds to the XmlReaderInputSource 
class)
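
As a rough illustration of case 1, the sketch below shows how a system identifier might be resolved against a base URI. The class ExternalIdSketch, its fields, and the absoluteUri method are hypothetical names chosen for this example; they are not taken from the pullax JavaDoc, and the real XmlExternalId may look quite different. Only java.net.URI from the standard library is assumed.

```java
import java.net.URI;

// Hypothetical stand-in for case 1; not the actual pullax XmlExternalId.
final class ExternalIdSketch {
    final String systemId; // as written in the entity declaration, may be relative
    final String baseUri;  // used only when systemId is relative; may be null
    final String publicId; // optional; may be null

    ExternalIdSketch(String systemId, String baseUri, String publicId) {
        if (systemId == null) {
            throw new IllegalArgumentException("systemId is required");
        }
        this.systemId = systemId;
        this.baseUri = baseUri;
        this.publicId = publicId;
    }

    // Resolve the system identifier against the base URI when it is relative.
    URI absoluteUri() {
        URI system = URI.create(systemId);
        if (system.isAbsolute() || baseUri == null) {
            return system;
        }
        return URI.create(baseUri).resolve(system);
    }
}
```

With a system identifier of "chapters/ch1.xml" and a base URI of "http://example.org/docs/book.xml", absoluteUri() yields "http://example.org/docs/chapters/ch1.xml"; an already absolute system identifier is returned unchanged.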

Also I want the entity resolver to be able to return one of these three 
kinds of thing.

I don't like combining these three different kinds of thing into one class, 
because I think it makes the semantics less crisp.

The key to the current design is that XmlInputSource has a constructor with 
package level access.  This means that the only possible classes that can 
derive from XmlInputSource are those in the com.thaiopensource.pullax 
package, specifically XmlExternalId, XmlInputStreamInputSource and 
XmlReaderInputSource.  Thus using the type XmlInputSource is just a trick 
to get the union of these three classes; such a type is necessary for 
expressing the return value of XmlEntityResolver.resolve.  When a parser 
implementation receives an XmlInputSource it will have to check which of 
these three classes it is an instance of, and cast it accordingly. 
Admittedly this is a bit icky; I prefer APIs not to require clients to 
cast.  In mitigation I would plead that it's the parser implementor rather 
than the parser user who is being forced to cast.  One possibility would be 
to add

  XmlExternalId toExternalId();
  XmlInputStreamInputSource toInputStreamInputSource();
  XmlReaderInputSource toReaderInputSource();

methods to XmlInputSource where exactly one of these would return a 
non-null value.  For a C++ interface I would certainly prefer to have those 
methods rather than rely on RTTI, but for Java it's not clear to me that 
it's worth adding them.
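
The two styles discussed above can be compared in a minimal sketch. The class names and the three to...() method names come from this message; the constructors, fields, and the SourceDispatch helper are assumptions made for illustration and are not the actual pullax API. In pullax the types would live in com.thaiopensource.pullax, so the package-level constructor would stop outside code from subclassing; here everything sits in one package only so that the sketch compiles as a single file.

```java
import java.io.InputStream;
import java.io.Reader;

abstract class XmlInputSource {
    // Package-level constructor: classes outside this package cannot extend
    // XmlInputSource, so the set of subclasses is closed.
    XmlInputSource() {}

    // Optional accessors; exactly one returns a non-null value.
    XmlExternalId toExternalId() { return null; }
    XmlInputStreamInputSource toInputStreamInputSource() { return null; }
    XmlReaderInputSource toReaderInputSource() { return null; }
}

final class XmlExternalId extends XmlInputSource {
    final String systemId; // assumed field
    XmlExternalId(String systemId) { this.systemId = systemId; }
    @Override XmlExternalId toExternalId() { return this; }
}

final class XmlInputStreamInputSource extends XmlInputSource {
    final InputStream in; // assumed field
    XmlInputStreamInputSource(InputStream in) { this.in = in; }
    @Override XmlInputStreamInputSource toInputStreamInputSource() { return this; }
}

final class XmlReaderInputSource extends XmlInputSource {
    final Reader reader; // assumed field
    XmlReaderInputSource(Reader reader) { this.reader = reader; }
    @Override XmlReaderInputSource toReaderInputSource() { return this; }
}

// Hypothetical parser-side helper showing both dispatch styles.
final class SourceDispatch {
    // Style 1: check the runtime class and cast.
    static String describeByCast(XmlInputSource s) {
        if (s instanceof XmlExternalId) {
            return "external id: " + ((XmlExternalId) s).systemId;
        }
        if (s instanceof XmlInputStreamInputSource) {
            return "byte stream";
        }
        if (s instanceof XmlReaderInputSource) {
            return "character stream";
        }
        throw new IllegalArgumentException("unknown XmlInputSource subclass");
    }

    // Style 2: use the accessor methods, no casts needed.
    static String describeByAccessor(XmlInputSource s) {
        XmlExternalId id = s.toExternalId();
        if (id != null) {
            return "external id: " + id.systemId;
        }
        if (s.toInputStreamInputSource() != null) {
            return "byte stream";
        }
        if (s.toReaderInputSource() != null) {
            return "character stream";
        }
        throw new IllegalArgumentException("unknown XmlInputSource subclass");
    }
}
```

Both helpers give the same answer for every source; the difference is only whether the parser implementation relies on instanceof and casts or on methods of the base class.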

James
