   Re: Announcement: SAX Java Implementation (pre-release)

  • From: David Megginson <ak117@freenet.carleton.ca>
  • To: xml-dev Mailing List <xml-dev@ic.ac.uk>
  • Date: Mon, 13 Apr 1998 10:05:36 -0400

Tim Bray writes:

 > At 05:31 PM 12/04/98 -0400, David Megginson wrote:
 > > > 1. Why has a SAX prefix been added to all classes?
 > >
 > >There are a few benefits to this decision:
 > Kind of unconvincing, I'd have to say.  If someone doesn't have it
 > together enough to figure out how to use java packages they're
 > not going to have much luck with SAX anyhow.  And we really shouldn't
 > be worrying about legacy SAX implementations at this stage; we're
 > all bleeding-edge types around here.  And if somebody
 > wants a C binding, that's going to be different enough from 
 > Java SAX anyhow that we shouldn't do the SAX prefix just because
 > they're going to have to.

If no one wants this, I will happily remove it.  If you do want the
"SAX..." prefix on all interfaces, please speak up now.

 > > > 3. The interface for reading character streams needs more
 > > > specification if it is to be interoperable.
 > >
 > > > a) There's a critical ambiguity in the concept of a character stream:
 > > > a Java concept of a char does not correspond to the XML concept of a
 > > > character.
 > >What does everyone else think about this point?  Is this a good case
 > >for pragmatism over logical consistency, or am I introducing an ugly
 > >kludge that will come back to haunt us all?
 > Is it maybe the right thing to be brutally clear and just have a UTF-16
 > character stream?  I haven't looked at Java chars as closely as James
 > has, but his description sounds exactly like UTF-16.  A 16-bit UTF-16
 > quantity is not precisely a character, but the places where it isn't
 > (non BMP chars) exhibit graceful degradation; if the app knows about
 > UTF-16 it does the right thing, otherwise it looks like two unknown
 > characters, nothing breaks.

Fair enough -- we could specify that SAXCharacterStream is a UTF-16
stream, or we could even name it SAXUTF16Stream.  How will this
interact with Larry Wall's decision to use UTF-8 as the internal
encoding for the next Perl?
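
To make the UTF-16 point concrete, here is a minimal sketch of how one
XML character outside the BMP shows up as two Java chars. The class and
method names are made up for illustration, and the
Character.isHighSurrogate / Character.isLowSurrogate helpers are an
assumption about the runtime: they come from later Java releases than
the one current as of this message.

```java
class Utf16Demo {
    // Counts XML characters (Unicode code points) in a sequence of Java chars.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // A high surrogate followed by a low surrogate is one character.
            if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++; // skip the second half of the pair
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a', then U+10000 written as a surrogate pair, then 'b'
        String s = "a\uD800\uDC00b";
        System.out.println(s.length());          // prints 4 (Java chars)
        System.out.println(countCodePoints(s));  // prints 3 (XML characters)
    }
}
```

An application that ignores surrogates sees four chars here instead of
three characters, which is the graceful degradation described above.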

 > > > b) Is it legal for a byte order mark character to be present at the
 > > > start of the character stream? The right answer is that it should not
 > > > be legal: this should be stripped out in the byte to character
 > > > conversion process.
 > >
 > >This is a tricky point.  I had planned to leave it in -- what is the
 > >default behaviour for java.io.Reader (and for other languages with
 > >character streams)?
 > No; if there's a BOM, that should be eaten by the underlying char stream
 > machinery, which should read it and thereafter transparently swap bytes
 > or not to produce Java chars without the app having to work at it.
 > The spec is clear on this point, and at one with sensible implementation
 > practice.

Should we require all versions to use the Java byte order, or only the
Java version?
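
For the Java side, the BOM handling Tim describes could be sketched
roughly as below. This assumes the bytes have already been decoded to
chars and only guards against a U+FEFF that the decoding layer passed
through; the class and method names are hypothetical and not part of
any SAX draft.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class BomStrippingDemo {
    static final int BOM = 0xFEFF;

    // Returns a Reader positioned after a leading U+FEFF, if one is present.
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader pr = new PushbackReader(in, 1);
        int first = pr.read();
        // Push back anything that is not a BOM so the parser still sees it.
        if (first != -1 && first != BOM) {
            pr.unread(first);
        }
        return pr;
    }

    // Convenience wrapper for the demo: reads all chars after BOM removal.
    static String readWithoutBom(String text) {
        try {
            Reader r = skipBom(new StringReader(text));
            StringBuilder sb = new StringBuilder();
            for (int c = r.read(); c != -1; c = r.read()) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithoutBom("\uFEFF<doc/>")); // prints <doc/>
        System.out.println(readWithoutBom("<doc/>"));       // prints <doc/>
    }
}
```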

 > > > 6. I strongly object to including the name argument in
 > > > SAXEntityResolver.resolveEntity.  There's nothing in XML that says
 > > > that the name should be used in resolving an entity and so there's no
 > > > reason to suppose a parser will make it available.  I also think it's
 > > > wrong in principle to make use of it.  This business with "[document]"
 > > > and "[dtd]" is gross. At the very least the spec should say that name
 > > > may be null if this information is not available.
 > >
 > >I'm neutral on this point, though I do agree that "[document]" and
 > >"[dtd]" are ugly.  Does anyone object to the removal of the name
 > >argument?
 > I'm with James - any use of the entity name by an application is
 > potentially actively harmful, nuke it. -Tim

That's two 'nays' and one abstention (mine).  If anyone wants to keep
the entity name argument, please put your case forward quickly.
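
For discussion, a resolver without the name argument might look
something like the sketch below. Everything in it is an assumption made
for illustration: the interface name, the use of public and system
identifiers as the only inputs, and the String return value are not
taken from the pre-release.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; not the pre-release SAXEntityResolver signature.
interface SketchEntityResolver {
    // Returns a replacement system identifier, or null to use the original.
    String resolveEntity(String publicId, String systemId);
}

// Resolves entities from a small in-memory catalogue keyed on public identifier.
class CatalogResolver implements SketchEntityResolver {
    private final Map<String, String> catalog = new HashMap<>();

    void map(String publicId, String localSystemId) {
        catalog.put(publicId, localSystemId);
    }

    @Override
    public String resolveEntity(String publicId, String systemId) {
        if (publicId == null) {
            return null; // nothing to look up; caller keeps systemId
        }
        return catalog.get(publicId); // null when there is no mapping
    }
}
```

The entity name never enters the lookup, so a parser that does not
track it loses nothing.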

Thanks, and all the best,


David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com



News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS