   Re: SAX/C++: C++-specific design principles

  • From: Steinar Bang <sb@metis.no>
  • To: xml-dev@ic.ac.uk
  • Date: 03 Dec 1999 11:55:19 +0100

>>>>> David Megginson <david@megginson.com>:

> 1. Use references when there can never be a null value, pointers
>    otherwise.

Sounds reasonable.
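
To make the rule concrete, here is a minimal sketch.  The Locator and
DocumentHandler types below are hypothetical stand-ins for illustration,
not the actual SAX/C++ interfaces: the locator is optional, so it travels
as a pointer that may be null, while character data is always present,
so it travels as a reference.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only; not the real SAX/C++ API.
struct Locator {
    int line = 0;
};

struct DocumentHandler {
    std::string lastText;
    const Locator* locator = nullptr;  // not owned; may legitimately be absent

    // A locator is optional, so it is passed as a pointer that may be null.
    void setDocumentLocator(const Locator* loc) { locator = loc; }

    // Character data is always present, so it is passed by reference.
    void characters(const std::string& text) { lastText = text; }

    bool hasLocator() const { return locator != nullptr; }
};
```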

> 2. Pointers never change ownership -- if a Parser (for example) wants
>    to own an InputSource, it needs to make its own copy.  The app has
>    to free everything that it allocates, and the SAX driver, likewise.

A good basic practice.
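
A rough sketch of what this could look like, assuming an InputSource
that is cheap enough to copy.  Both classes and the setInputSource()
name are invented for the example and are not taken from any SAX draft.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only; not the real SAX/C++ API.
struct InputSource {
    std::string systemId;
};

class Parser {
public:
    // Takes a reference (never null) and stores its own copy, so the
    // caller remains the sole owner of the object it passed in.
    void setInputSource(const InputSource& source) { source_ = source; }

    const std::string& systemId() const { return source_.systemId; }

private:
    InputSource source_;  // the parser's private copy
};
```

The application can then delete its InputSource as soon as the call
returns, and the parser never has to guess who frees what.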

> 3. Callbacks cannot be const, since they often change the state of the 
>    client app.

Agree.
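
For instance, even a trivial handler that only counts elements has to
modify itself inside the callback.  The interface below is a
hypothetical, cut-down one, just to show why a const callback would get
in the way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, cut-down handler interface; not the real SAX/C++ API.
struct DocumentHandler {
    virtual ~DocumentHandler() {}
    // Deliberately not const: implementations are expected to change state.
    virtual void startElement(const std::string& name) = 0;
};

class ElementCounter : public DocumentHandler {
public:
    // Mutates the handler, which a const member function would forbid
    // (short of declaring every member mutable).
    void startElement(const std::string&) override { ++count_; }

    std::size_t count() const { return count_; }

private:
    std::size_t count_ = 0;
};
```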

> 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
>    with most existing C++ code.

Disagree.  This just defers the task of decoding from UTF-8 to UTF-16,
which every forward-looking XML application will eventually have to
do.  For Asian languages this will also incur extra overhead, since
I'm led to believe they will mostly store documents as UTF-16, so that
we will have a UTF-16 to UTF-8 to UTF-16 transformation through the
SAX interface.

(I have a SAX (or "SAXoid") C++ wrapper around expat, where I
currently use plain std::string& to transfer text.  But this is just
a transitional stage until I manage to get full wide char support in
the underlying system.  What I send through SAX isn't UTF-8, but
ISO8859-1 with all unknown characters changed into ".", since this is
all the underlying system understands.)
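
The kind of down-conversion described above could be sketched roughly
as follows.  This is not the code from my wrapper; utf8ToLatin1() is a
made-up name, and the sketch assumes the input is UTF-8.  Code points
up to U+00FF are kept, everything else (including malformed input)
becomes ".".  It does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode UTF-8 and keep only code points that fit
// in ISO 8859-1 (U+0000..U+00FF).  Anything else, including malformed
// sequences, is replaced by '.'.
std::string utf8ToLatin1(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t extra;  // number of continuation bytes expected
        unsigned long cp;   // code point being assembled
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else { out += '.'; ++i; continue; }  // stray or invalid lead byte
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // A missing or malformed continuation byte ends the sequence;
            // the offending byte is re-examined on the next iteration.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0xFF) out += '.';
        else out += static_cast<char>(static_cast<unsigned char>(cp));
    }
    return out;
}
```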

> 5. Use char * rather than string, to avoid forcing a lot of allocation 
>    overhead on the SAX driver.

Hm... when I wrote my expat wrapper, I didn't even stop to think about
this, since strings are so easy to use, and it would become a string
in the first map<> lookup anyway.

But I guess late evaluation is always a good thing (I'm using this
heavily on the AttributeList, where no C++ objects will be created
until someone asks for the first attribute).
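
Roughly, the late evaluation looks like the sketch below.  This is a
simplified illustration and not my actual class; the names are made
up.  The one thing taken from expat is the layout of the attribute
array its start-element handler receives: alternating name and value
pointers, terminated by a null pointer.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical wrapper for illustration.  It keeps only the raw pointer
// array and builds C++ objects on the first lookup.
class AttributeList {
public:
    // atts: name, value, name, value, ..., terminated by a null pointer.
    // The array is only assumed valid for the duration of the callback.
    explicit AttributeList(const char** atts) : raw_(atts), built_(false) {}

    // Returns the attribute value, or an empty string if it is absent.
    std::string getValue(const std::string& name) {
        build();
        std::map<std::string, std::string>::const_iterator it = map_.find(name);
        return it == map_.end() ? std::string() : it->second;
    }

    // True once the map has been built; exposed only to show the laziness.
    bool materialized() const { return built_; }

private:
    // Copies the raw pairs into the map, at most once.
    void build() {
        if (built_) return;
        for (const char** p = raw_; p && p[0] && p[1]; p += 2) {
            map_[p[0]] = p[1];
        }
        built_ = true;
    }

    const char** raw_;
    std::map<std::string, std::string> map_;
    bool built_;
};
```

A handler that never looks at the attributes pays for one pointer copy
and nothing else.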

But I would rather see "const wchar_t*" (which I believe at least
Xerces-C uses) than "const char*".

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)






 
