> First question: Are you prepared to put pullax into the public domain? If
> so it may be a good starting point for the pull-API that John is
> suggesting.
I have no desire to retain any intellectual property rights to pullax.
However, it is not entirely clear to me that the public domain is the best
place for a "standard" API. It might be better to assign the copyright to a
suitable organization (e.g. OASIS, Sun) to ensure appropriate change
control. However, if the consensus is that public domain is best, I am
happy to go along with that. Currently I'm using the BSD copyright, which
is about the most liberal open source license. In my experience, the BSD
license is generally more acceptable than public domain. Some
organizations find the absence of a copyright holder (as is the case with
public domain code) unsettling; public domain also provides no protection
against removal of the no-warranty disclaimer or against use of the name of
the author's organization in advertising.
> Second, I followed your invitation to look at the JavaDoc on your web site
> [1]. I haven't had a chance to look at it in detail, but one of the first
> things I noticed was the XmlInputSource class [2]. I don't understand
> what this class is supposed to achieve as it doesn't have any methods. I
> see it has three sub-classes, each one offering a subset of a SAX
> InputSource - but without a common interface how can a parser use a
> XmlInputSource? Unless I'm mistaken it will have to cast it into one of
> the known sub-classes, which isn't very attractive. Also, why is it a
> class rather than an interface?
Let me explain what I'm trying to achieve. I want to allow the input to
the scanner to be specified in one of three ways:
1) in the same way as in an entity declaration, using a system identifier,
a base URI to be used to resolve the system identifier into an absolute URI
if the system identifier is relative, and, optionally, a public identifier
(this corresponds to the XmlExternalId class)
2) as an InputStream, optionally with a URI specifying the URI from which
the InputStream was retrieved which will serve as the base URI of relative
URIs occurring in that entity (in the absence of xml:base attributes), and
optionally with an encoding to be used (if missing, the encoding is
autodetected) (this corresponds to the XmlInputStreamInputSource class)
3) as a Reader, optionally with a URI specifying the URI from which the
Reader was retrieved which will serve as the base URI of relative URIs
occurring in that entity (in the absence of xml:base attributes), and
optionally with an encoding to supply the [character encoding scheme]
infoset property of the document info item (this corresponds to the
XmlReaderInputSource class)
Also I want the entity resolver to be able to return one of these three
kinds of thing.
I don't like combining these three different kinds of thing into one class,
because I think it makes the semantics less crisp.
The key to the current design is that XmlInputSource has a constructor with
package level access. This means that the only possible classes that can
derive from XmlInputSource are those in the com.thaiopensource.pullax
package, specifically XmlExternalId, XmlInputStreamInputSource and
XmlReaderInputSource. Thus using the type XmlInputSource is just a trick
to get the union of these three classes; such a type is necessary for
expressing the return value of XmlEntityResolver.resolve. When a parser
implementation receives an XmlInputSource it will have to check which of
these three classes it is an instance of, and cast it accordingly.
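
To make the trick concrete, here is a minimal sketch of the closed hierarchy and of the check a parser implementation would perform. The class names are the ones above; the field names, the describe helper and the wrapper class InputSourceSketch are invented for illustration and are not taken from the pullax JavaDoc. The classes are nested only so that the sketch fits in one file; in pullax they would be top-level classes in the com.thaiopensource.pullax package.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

class InputSourceSketch {

    // Stand-in for XmlInputSource. The constructor has package-level access,
    // so code outside the package cannot add a fourth subclass.
    abstract static class XmlInputSource {
        XmlInputSource() { }
    }

    // Hypothetical fields: system identifier, base URI, optional public identifier.
    static final class XmlExternalId extends XmlInputSource {
        final String systemId, baseUri, publicId;
        XmlExternalId(String systemId, String baseUri, String publicId) {
            this.systemId = systemId;
            this.baseUri = baseUri;
            this.publicId = publicId;
        }
    }

    // Hypothetical fields: byte stream, optional URI, optional encoding.
    static final class XmlInputStreamInputSource extends XmlInputSource {
        final InputStream in;
        final String uri, encoding;
        XmlInputStreamInputSource(InputStream in, String uri, String encoding) {
            this.in = in;
            this.uri = uri;
            this.encoding = encoding;
        }
    }

    // Hypothetical fields: character stream, optional URI, optional encoding.
    static final class XmlReaderInputSource extends XmlInputSource {
        final Reader reader;
        final String uri, encoding;
        XmlReaderInputSource(Reader reader, String uri, String encoding) {
            this.reader = reader;
            this.uri = uri;
            this.encoding = encoding;
        }
    }

    // What a parser implementation has to do: test the runtime class, then cast.
    static String describe(XmlInputSource source) {
        if (source instanceof XmlExternalId) {
            XmlExternalId id = (XmlExternalId) source;
            return "external id: " + id.systemId;
        }
        if (source instanceof XmlInputStreamInputSource) {
            XmlInputStreamInputSource s = (XmlInputStreamInputSource) source;
            return "input stream, encoding: "
                    + (s.encoding == null ? "autodetect" : s.encoding);
        }
        if (source instanceof XmlReaderInputSource) {
            XmlReaderInputSource r = (XmlReaderInputSource) source;
            return "reader, base URI: " + r.uri;
        }
        // Unreachable as long as the hierarchy stays closed.
        throw new IllegalArgumentException("unknown XmlInputSource subclass");
    }

    public static void main(String[] args) {
        System.out.println(describe(
                new XmlExternalId("doc.xml", "http://example.org/", null)));
        System.out.println(describe(
                new XmlInputStreamInputSource(
                        new ByteArrayInputStream(new byte[0]), null, null)));
        System.out.println(describe(
                new XmlReaderInputSource(
                        new StringReader("<a/>"), "http://example.org/a.xml", null)));
    }
}
```

Because the set of subclasses is fixed, the final throw can only be reached if the package itself is changed.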
Admittedly this is a bit icky; I prefer APIs not to require clients to
cast. In mitigation I would plead that it's the parser implementor rather
than the parser user who is being forced to cast. One possibility would be
to add
    XmlExternalId toExternalId();
    XmlInputStreamInputSource toInputStreamInputSource();
    XmlReaderInputSource toReaderInputSource();
methods to XmlInputSource where exactly one of these would return a
non-null value. For a C++ interface I would certainly prefer to have those
methods rather than rely on RTTI, but for Java it's not clear to me that
it's worth adding them.
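
For comparison, a rough sketch of how that alternative might look. Only the three method names come from the proposal above; the null-returning defaults in the base class, the simplified constructors, the kind helper and the wrapper class ConversionSketch are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

class ConversionSketch {

    // The base class supplies null-returning defaults; each subclass overrides
    // exactly one method to return itself, so exactly one result is non-null.
    abstract static class XmlInputSource {
        XmlInputSource() { }
        XmlExternalId toExternalId() { return null; }
        XmlInputStreamInputSource toInputStreamInputSource() { return null; }
        XmlReaderInputSource toReaderInputSource() { return null; }
    }

    static final class XmlExternalId extends XmlInputSource {
        final String systemId; // hypothetical field
        XmlExternalId(String systemId) { this.systemId = systemId; }
        @Override XmlExternalId toExternalId() { return this; }
    }

    static final class XmlInputStreamInputSource extends XmlInputSource {
        final InputStream in; // hypothetical field
        XmlInputStreamInputSource(InputStream in) { this.in = in; }
        @Override XmlInputStreamInputSource toInputStreamInputSource() { return this; }
    }

    static final class XmlReaderInputSource extends XmlInputSource {
        final Reader reader; // hypothetical field
        XmlReaderInputSource(Reader reader) { this.reader = reader; }
        @Override XmlReaderInputSource toReaderInputSource() { return this; }
    }

    // The parser implementation now tests for null instead of using
    // instanceof and a cast.
    static String kind(XmlInputSource source) {
        XmlExternalId id = source.toExternalId();
        if (id != null) {
            return "external id " + id.systemId;
        }
        if (source.toInputStreamInputSource() != null) {
            return "input stream";
        }
        if (source.toReaderInputSource() != null) {
            return "reader";
        }
        throw new IllegalStateException("no conversion returned a value");
    }

    public static void main(String[] args) {
        System.out.println(kind(new XmlExternalId("doc.xml")));
        System.out.println(kind(
                new XmlInputStreamInputSource(new ByteArrayInputStream(new byte[0]))));
        System.out.println(kind(new XmlReaderInputSource(new StringReader("<a/>"))));
    }
}
```

The parser still has to work through three cases, so what the methods buy is the removal of the cast, at the cost of three extra methods on the base class.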
James