
   Re: Feeler for SML (Simple Markup Language)

  • From: Tyler Baker <tyler@infinet.com>
  • To: David Megginson <david@megginson.com>
  • Date: Thu, 11 Nov 1999 17:20:03 -0500

David Megginson wrote:

> "Hunter, David" <dhunter@Mobility.com> writes:
>
> > Perhaps from the point of view of a parser writer, this might be a
> > good thing.  If you knew you were never going to need these
> > constructs, you could build a smaller, faster parser.
>
> Not much smaller, I'm afraid -- for an event-based parser, support for
> PIs and attributes adds almost no overhead (I remember experimenting
> with putting them in and leaving them out when I was writing
> AElfred).
>
> AElfred, by the way, was under 15K in a compressed JAR file when I was
> maintaining it, though it wasn't strictly conformant (it didn't report
> all required errors) -- I still believe that someone could write a
> Java-based XML parser in under 10K (compressed) if they had the time
> and inclination and made more use of the standard Java libraries.

In AElfred's case, that worked well for applets, but it would not work as well for cell phones
or PDAs, because what really counts in those environments is memory usage. Whether you use
java.util.Hashtable or your own custom version, a hashtable class and any supporting utility
classes will be loaded into memory one way or another. Writing your own smaller-footprint
hashtable would make more sense here, but only so long as none of the rest of your code called
libraries that load java.util.Hashtable into memory. Since all kinds of core Java libraries use
java.util.Hashtable all over the place, you are probably better off just using
java.util.Hashtable anyway.

This is one of the problems with embedded Java as I understand it: you still need to bundle a
whole bunch of unnecessary libraries with your application instead of being allowed to use just
the bare essentials that you really need.

Most of AElfred's footprint, from what I remember, seemed to be character-handling code and not
actual XML parsing code. If you restrict XML to a single character encoding such as UTF-8 and
get rid of validation and DTD handling (in the XML parser I have written, more of my code goes
to parsing DTDs than to parsing the XML document itself), then I would not be surprised if you
could get things under 5K if you really wanted to.
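
To give a rough idea of how little is left once encoding detection, DTDs, and validation are
gone, here is a minimal sketch of an event-based scanner that reports only elements and
character data. It is not taken from AElfred or any other real parser: the class name
TinyParser, the Handler interface, and the events() helper are all made up for illustration.
It assumes the input has already been decoded (say, from UTF-8) into a String, and it does no
well-formedness checking, no entity expansion, and no attribute, comment, or PI handling.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal scanner: elements and character data only. */
public class TinyParser {
    /** Hypothetical callback interface for the three events reported. */
    public interface Handler {
        void startElement(String name);
        void endElement(String name);
        void characters(String text);
    }

    private final String in; // input, assumed already decoded into characters
    private int pos = 0;

    public TinyParser(String in) {
        this.in = in;
    }

    /** Walks the input once, firing events; does not check that tags match. */
    public void parse(Handler h) {
        while (pos < in.length()) {
            if (in.charAt(pos) == '<') {
                int close = in.indexOf('>', pos);
                if (close < 0) {
                    throw new IllegalStateException("unterminated tag at " + pos);
                }
                String tag = in.substring(pos + 1, close);
                if (tag.startsWith("/")) {
                    h.endElement(tag.substring(1).trim());
                } else if (tag.endsWith("/")) {
                    // Empty-element tag: report a start and an end event.
                    String name = tag.substring(0, tag.length() - 1).trim();
                    h.startElement(name);
                    h.endElement(name);
                } else {
                    h.startElement(tag.trim());
                }
                pos = close + 1;
            } else {
                // Character data runs up to the next '<' or the end of input.
                int next = in.indexOf('<', pos);
                if (next < 0) {
                    next = in.length();
                }
                h.characters(in.substring(pos, next));
                pos = next;
            }
        }
    }

    /** Demonstration helper: records each event as a string. */
    public static List<String> events(String xml) {
        final List<String> out = new ArrayList<>();
        new TinyParser(xml).parse(new Handler() {
            public void startElement(String name) { out.add("start:" + name); }
            public void endElement(String name) { out.add("end:" + name); }
            public void characters(String text) { out.add("text:" + text); }
        });
        return out;
    }
}
```

Calling TinyParser.events("<a><b>hi</b><c/></a>") yields start and end events for a, b, and c
plus one text event for "hi". A conformant parser has to do a great deal more than this, which
is the point: the scanning loop itself is the small part.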

I think for environments with severe memory constraints, some of Don's ideas really make
sense. Removing comments, PIs, and attributes does not shrink your parser much, though, as
handling each of those takes only a few lines of code. Character encoding, DTDs, and
validation are where most of the bloat in an XML parser tends to go.
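
As a rough illustration of the "few lines of code" claim, the sketch below shows one way
comments, PIs, and attributes might be handled. The class and method names (MarkupExtras,
skipComment, skipPI, attributes) are invented for this example and do not come from any real
parser. It assumes already-decoded input, skips comments and PIs rather than reporting them,
and does not expand entity references or check for duplicate attribute names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helpers for comments, PIs, and attributes. */
public class MarkupExtras {
    /** If a comment starts at pos, returns the index just past "-->"; otherwise returns pos. */
    public static int skipComment(String s, int pos) {
        if (!s.startsWith("<!--", pos)) {
            return pos;
        }
        int end = s.indexOf("-->", pos + 4);
        if (end < 0) {
            throw new IllegalStateException("unterminated comment");
        }
        return end + 3;
    }

    /** If a PI starts at pos, returns the index just past "?>"; otherwise returns pos. */
    public static int skipPI(String s, int pos) {
        if (!s.startsWith("<?", pos)) {
            return pos;
        }
        int end = s.indexOf("?>", pos + 2);
        if (end < 0) {
            throw new IllegalStateException("unterminated processing instruction");
        }
        return end + 2;
    }

    /** Parses name="value" pairs from the part of a start tag after the element name. */
    public static Map<String, String> attributes(String s) {
        Map<String, String> attrs = new LinkedHashMap<>();
        int i = 0;
        while (true) {
            while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
                i++;
            }
            if (i >= s.length()) {
                return attrs;
            }
            int eq = s.indexOf('=', i);
            if (eq < 0) {
                throw new IllegalStateException("attribute without value");
            }
            String name = s.substring(i, eq).trim();
            int q = eq + 1;
            while (q < s.length() && Character.isWhitespace(s.charAt(q))) {
                q++;
            }
            if (q >= s.length() || (s.charAt(q) != '"' && s.charAt(q) != '\'')) {
                throw new IllegalStateException("attribute value must be quoted");
            }
            // The value ends at the next occurrence of the same quote character.
            int end = s.indexOf(s.charAt(q), q + 1);
            if (end < 0) {
                throw new IllegalStateException("unterminated attribute value");
            }
            attrs.put(name, s.substring(q + 1, end));
            i = end + 1;
        }
    }
}
```

For example, MarkupExtras.attributes(" id=\"1\" lang='en'") returns a map with id mapped to
"1" and lang mapped to "en". Compared with encoding detection or a DTD parser, none of this
amounts to much code.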

Tyler

