   Re: Request for Discussion: SAX 1.0 in C++

  • From: John Aldridge <john.aldridge@informatix.co.uk>
  • To: xml-dev@ic.ac.uk
  • Date: Tue, 14 Dec 1999 12:55:27 +0000

At 13:10 14/12/99 +0100, Steinar Bang <sb@metis.no> wrote:
>>>>>> John Aldridge <john.aldridge@informatix.co.uk>:
>
>> Why?  What's wrong with storing UTF-16 encoded data in a 32 bit
>> wchar_t?  I know it uses more storage space; but there won't
>> typically be that much data around in this format at once.
>
>We store a lot of strings, so I think a quadrupling of the storage
>space compared with what we do today, or doubling wrt. UTF-16, will
>be significant.

I'm guessing that this will be fairly unusual, though.  I suspect that most
clients of such a streaming interface will be processing the data on the
fly, and not hanging on to large chunks of it for the duration of the
program run.

Of course, you don't have to store the strings in your data structures in
the same format as they are passed to you from SAX.
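As a rough illustration of that last point, a client could re-encode each string into something more compact before storing it. The sketch below is an assumption on my part, not part of any SAX proposal: `to_utf8` is a hypothetical helper that treats the incoming wstring as a sequence of UTF-16 code units (one per wchar_t, whatever the width of wchar_t) and assumes the input is well formed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of SAX): re-encode UTF-16 code units held
// in a wstring as UTF-8 for compact storage.
// Assumption: input is well-formed UTF-16; an unpaired surrogate is not
// rejected, it is simply encoded as a three-byte sequence.
std::string to_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);

        // Combine a high/low surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            unsigned long lo = static_cast<unsigned long>(in[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }

        if (cp < 0x80) {                       // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With something like this, an element name made only of ASCII characters costs one byte per character in the stored copy, regardless of how wide wchar_t is on the platform.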

>> I'd much rather have the format defined to be wstring (or wchar_t*, if you
>> must, but that's another debate), because of the compatibility with wide
>> string literals.
>
>Hm... I don't know anything about wide string literals and their
>behaviour wrt. wstring, text editors and debuggers.  Could you
>elaborate, maybe...?

Brief summary:

    L'a'   is a wchar_t containing the character 'a'
    L"abc" is a wchar_t[] containing the characters 'a', 'b', 'c', '\0'

basic_string<wchar_t> (aka wstring) has constructors and comparison
operators and the like which take wchar_t* arguments.
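A minimal sketch of the summary above, using only the standard library. The function name `wide_literal_demo` is made up for illustration; it returns true when every check holds.

```cpp
#include <string>

// Hypothetical demo function: exercises wide literals together with
// std::wstring's const wchar_t* constructor and operators.
bool wide_literal_demo() {
    const wchar_t ch = L'a';        // a single wide character
    const wchar_t lit[] = L"abc";   // 'a', 'b', 'c', '\0'

    // The array has four elements because of the terminating L'\0'.
    static_assert(sizeof(lit) / sizeof(lit[0]) == 4,
                  "wide literal includes its terminator");

    std::wstring s(lit);            // constructor taking const wchar_t*
    s += L"def";                    // operator+= taking const wchar_t*

    return s.size() == 6
        && s[0] == ch
        && s == L"abcdef"           // operator== against a wide literal
        && s != L"abc";
}
```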

It seems to me that code like:

void DocumentHandler::startElement (
    const std::wstring &name, const AttributeList &atts)
{
    if (name == L"Paragraph") ...
}

is going to be a whole lot neater than

void DocumentHandler::startElement (
    const std::basic_string<SAXChar> &name, const AttributeList &atts)
{
    static const SAXChar paraString[] =
        {'P','a','r','a','g','r','a','p','h','\0'};
    if (name == paraString) ...
}
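
For anyone who wants to compile the two styles side by side, here is a self-contained sketch. The free functions `isParagraphWide` and `isParagraphSax` are hypothetical stand-ins for the handler bodies above, and SAXChar is assumed to be char16_t (a C++11 type) purely so that basic_string<SAXChar> has standard character traits; the real definition of SAXChar is whatever the SAX C++ binding ends up choosing.

```cpp
#include <string>

// Assumption: SAXChar is a 16-bit character type. char16_t is used here
// only as a stand-in so the example compiles with a standard library.
typedef char16_t SAXChar;

// Wide-string style: the element name is written inline as a wide literal.
bool isParagraphWide(const std::wstring& name) {
    return name == L"Paragraph";
}

// SAXChar style: with no literal syntax assumed for SAXChar, the name has
// to be spelled out one character at a time, terminator included.
bool isParagraphSax(const std::basic_string<SAXChar>& name) {
    static const SAXChar paraString[] =
        {'P','a','r','a','g','r','a','p','h','\0'};
    return name == paraString;
}
```

Both functions do the same comparison; the difference is only in how much ceremony is needed to write the constant.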
-- 
Cheers,
John
