   Re: [xml-dev] Fast text output from SAX?

Elliotte Rusty Harold wrote:

>Bob. You may not need to be lectured on this, but some other people 
>do, as the plethora of software that crashes on unexpected input 
>proves. It has been proposed in this very thread to use binary 
>formats precisely to avoid the overhead of checking for data 
>correctness. Just slam some bits into memory and assume everything is 
>hunky dory. I have seen any number of binary formats that achieve 
>speed gains precisely by doing this. And it is my contention that if 
>this is disallowed (as I think it should be) much, perhaps all, of 
>the speed advantages of these binary formats disappear.

Actually the speed advantages wouldn't be significantly changed, at 
least not for XBIS. Since XBIS already uses handles to refer to names 
it'd only need to verify the characters of a name the first time it sees 
it; this would be very low overhead for most documents, where a limited 
set of (element and attribute) names are used throughout the document 
(which is the whole reason the handle approach is used in the first 
place). XBIS already scans the characters of content, too, so it'd just 
need to add a single conditional check in most cases to make sure a 
character is legal. What else would need to be checked? Attribute 
uniqueness could be handled by a fast hash index into an array of 
booleans, with full comparisons only needed on collisions. Those are the 
main issues that come to mind for me.
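
To make the three checks above concrete, here is a rough Java sketch. It is 
not XBIS code: the class names (NameTable, XmlChars, AttributeSet), the 
64-slot table size, and the simplified name test are all assumptions made 
for illustration. Only the legal-character ranges are taken directly from 
the XML 1.0 Char production.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handle table: a name's characters are checked only when it is first defined. */
final class NameTable {
    private final List<String> names = new ArrayList<>();

    /** Validates the name once, stores it, and returns its handle. */
    int define(String name) {
        if (!isPlausibleName(name)) {
            throw new IllegalArgumentException("Illegal name: " + name);
        }
        names.add(name);
        return names.size() - 1;
    }

    /** Later references by handle need no character checks. */
    String lookup(int handle) {
        return names.get(handle);
    }

    /** Simplified stand-in for the XML Name production (assumption, not the full rule). */
    static boolean isPlausibleName(String s) {
        if (s.isEmpty()) return false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean start = Character.isLetter(c) || c == '_' || c == ':';
            boolean rest = start || Character.isDigit(c) || c == '-' || c == '.';
            if (i == 0 ? !start : !rest) return false;
        }
        return true;
    }
}

/** Legality test for content characters, following the XML 1.0 Char production. */
final class XmlChars {
    static boolean isLegal(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}

/** Attribute uniqueness for one element: boolean array indexed by hash, full compare on collision. */
final class AttributeSet {
    private static final int SLOTS = 64; // power of two, so masking selects a slot
    private final boolean[] seen = new boolean[SLOTS];
    private final List<String> names = new ArrayList<>();

    /** Returns false if the name duplicates an attribute already added to this element. */
    boolean add(String name) {
        int slot = name.hashCode() & (SLOTS - 1);
        // Only when the slot is already taken do we pay for string comparisons.
        if (seen[slot] && names.contains(name)) {
            return false;
        }
        seen[slot] = true;
        names.add(name);
        return true;
    }

    /** Prepares for the next element, clearing only the slots that were used. */
    void reset() {
        for (String n : names) {
            seen[n.hashCode() & (SLOTS - 1)] = false;
        }
        names.clear();
    }
}
```

In this sketch a reader would call define() when a new name first appears in 
the stream, isLegal() inside the existing content scan, and add()/reset() 
once per attribute and per element. The common path in each case is a single 
array lookup or range test.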

Most of the well-formedness issues of text XML (start/end tags missing 
or out of order, attribute quoting errors, etc.) are impossible to 
represent in XBIS format in the first place. I'd estimate that full 
well-formedness checking wouldn't add more than 10% overhead to XBIS 
performance. Of course, I fully expect you'll dispute this, Elliotte... :-)

  - Dennis

Dennis M. Sosnoski
Enterprise Java, XML, and Web Services
Training and Consulting
Redmond, WA  425.885.7197
