
   Re: [xml-dev] Fast text output from SAX?


Dennis Sosnoski wrote:

> ...
>
>>
> I haven't looked at esXML/esDOM in any detail, but it sounds like what 
> you're doing is defining a whole different way of working with XML 
> document data. That's fine, but it doesn't really allow for direct 
> comparisons in the same terms as other approaches which preserve the 
> XML parse event stream - you're assuming (or at least suggesting) that 
> everyone will use your APIs for working with XML documents, while I'm 
> looking at the more modest issue of efficiently transporting XML 
> documents from one place to another while preserving standard APIs.

Many people use DOM as an API for business application access to 
business document/object XML.  I'm proposing a reformulation of DOM 
because DOM more or less does what I want, but without some of the design 
constraints that have left it unwieldy and inefficient.  I expect to 
support the existing DOM and SAX2 standards also, but these are 
necessarily inefficient.

>
> To give a direct comparison with esXML/esDOM I'd need to define a 
> native API for working with the XBIS serialization of a document 
> directly. That's not something I see as worthwhile, given the wide 
> variety of APIs already available for working with XML. It'd be 
> interesting to at least see how the document size compares, though - 
> if you want to investigate, the XBIS site http://www.xbis.org 
> currently has size comparisons between text and XBIS for several 
> different documents and collections of documents. The documents are 
> all (except for a modified form of the XML recommendation itself, 
> which I'm prohibited from redistributing) included in the download.

No, that's not true.  One would take an application using standard, 
best-practice APIs and methods, replace the management of data and the 
library calls with the esXML model, and then compare.  The code will get much 
simpler and the performance should improve.  I don't expect other code 
to be rewritten to benchmark against an esXML/esDOM combination, I 
expect the application to be rewritten to take advantage of a different 
model.  Is that a non-starter for some applications that exist?  Yes, 
but that is a requirement of complete holistic optimization.

(I do think that DOM is broken in two or more ways, but we might as well 
clean up many things while fixing that.)

In other words, a test application that logically does the following can 
be expressed using various combinations of formats, APIs, and methods to 
get the best configuration for each, and the results can then be compared 
at an equal overall level.

Example workloads:

create document, insert elements/attributes/values linearly, randomly, 
reverse
output document

input document, read sequentially, randomly, reverse

input document, perform various read/update/delete ratios
output result

input document, take pieces of input and create new outputs
output results

input document, create new version as a delta
output document, delta

input document, delta, perform read/update/delete, insert, append
output new delta
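
To make the first two workloads concrete, here is a minimal sketch of a 
harness for "create document, insert linearly/randomly/reverse, output" and 
"input document, read sequentially/randomly/reverse".  It assumes Python's 
standard xml.dom.minidom as a stand-in for a conventional DOM baseline; it 
is not esXML/esDOM.  The names build_document, read_ids, and best_time are 
made up for illustration, as are the "item" element and "id" attribute.

```python
import random
import time
from xml.dom import minidom


def build_document(n, order="linear", seed=0):
    """Create a document and insert n <item> elements in the given order."""
    if order not in ("linear", "random", "reverse"):
        raise ValueError("unknown order: %r" % order)
    rng = random.Random(seed)
    doc = minidom.Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    for i in range(n):
        item = doc.createElement("item")
        item.setAttribute("id", str(i))
        item.appendChild(doc.createTextNode("value-%d" % i))
        children = root.childNodes
        if order == "linear" or not children:
            root.appendChild(item)                  # always at the end
        elif order == "reverse":
            root.insertBefore(item, children[0])    # always at the front
        else:
            root.insertBefore(item, rng.choice(children))  # random position
    return doc


def read_ids(xml_text, order="linear", seed=0):
    """Parse a document and read the id attributes in the given order."""
    doc = minidom.parseString(xml_text)
    items = list(doc.getElementsByTagName("item"))
    if order == "reverse":
        items.reverse()
    elif order == "random":
        random.Random(seed).shuffle(items)
    return [item.getAttribute("id") for item in items]


def best_time(fn, *args, repeat=3):
    """Return the best wall-clock time of fn(*args) over several runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for order in ("linear", "random", "reverse"):
        t_build = best_time(build_document, 1000, order)
        text = build_document(1000, order).toxml()   # "output document"
        t_read = best_time(read_ids, text, order)
        print("%-8s build %.4fs  parse+read %.4fs" % (order, t_build, t_read))
```

The same three functions would be reimplemented against each format/API 
combination under test, so that each side runs in its own best 
configuration and only the logical workload is held constant.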

Do these repeatedly, with various kinds of payloads, numbers and 
lengths of elements, attributes, nesting, arraying, access patterns.
Cover many small document/objects being processed quickly (routing, 
stats, other kinds of applications), large objects being transformed or 
randomly accessed to do complex processing, implementation of complex 
data models (implementing, maintaining, and using a directed graph, 
dictionaries, etc.).
Create a schema/template model, including static schema and application 
channel prototyping.
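
The payload variation above (numbers and lengths of elements, attributes, 
nesting) could be driven by a small generator along these lines.  This is 
only a sketch using Python's standard xml.etree.ElementTree; make_payload 
and its parameters (depth, fanout, attrs, text_len) are hypothetical names, 
and the element and attribute names are placeholders.

```python
import xml.etree.ElementTree as ET


def make_payload(depth, fanout, attrs, text_len):
    """Build XML text with `depth` nesting levels below the root, `fanout`
    children per element, `attrs` attributes per element, and `text_len`
    characters of text in each leaf element."""
    def fill(parent, level):
        for i in range(fanout):
            child = ET.SubElement(parent, "e%d" % level)
            for a in range(attrs):
                child.set("a%d" % a, str(i))
            if level + 1 < depth:
                fill(child, level + 1)        # nest one level deeper
            else:
                child.text = "x" * text_len   # leaf carries the value
    root = ET.Element("doc")
    if depth > 0:
        fill(root, 0)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sweep a few shapes: many small flat documents up to deeper, wider ones.
    for depth in (1, 3):
        for fanout in (2, 5):
            text = make_payload(depth, fanout, attrs=2, text_len=16)
            count = sum(1 for _ in ET.fromstring(text).iter())
            print("depth=%d fanout=%d elements=%d bytes=%d"
                  % (depth, fanout, count, len(text.encode("utf-8"))))
```

Each generated document would then be fed through the workloads above, once 
per format/API combination being compared.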

Package specific combinations to mirror certain application types, such as 
web services of various kinds (financial, medical, messaging, search, quote, 
commerce, etc.).
Use access patterns such as HTTP/1.0, HTTP/1.1 with pipelining, BEEP 
asynchronous channelized, pipelined, tagged requests, etc.

This is what I mean.
sdw

>   - Dennis


sdw

-- 
swilliams@hpti.com http://www.hpti.com Per: sdw@lig.net http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw






 


Copyright 2001 XML.org. This site is hosted by OASIS