
   Re: [xml-dev] XML Performance in Client-Server Interactions


hi roger,

we now use xml messages to transmit updates between servers in a 
distributed application - in this case it is a distribution/warehouse 
system with multiple warehouses (actually that's only part of it, but it 
is a good example of what you're talking about).

the message volume is thousands per day. here are my observations.

1. i haven't measured it because it works more than fast enough. it 
appears to the users as real time. eg in this transaction: staff pick 
job, local data updated, message to office server prepared, message 
transmitted, message received, message processed and database updated - 
it happens so quickly that it appears instantaneous.
2. there are longer messages - send an invoice copy to server (pdf, 
base64 encoded, embedded into xml message) - approximate size 50k - 
still less than 1 second from transaction start in warehouse to complete 
at server.
3. it's even faster the other way because the server is on a 2mbit 
symmetric service, while the warehouse is on 1.5mbit/256kbit adsl (yes 
that's about normal for oz :( )
4. the architecture is this: database - unibase ;) ; message 
transmission - part of unibase, but essentially it's a socket with a 
message server at the other end, store and forward, a bit like email; 
messages - an xml format consisting of database manipulation commands 
and a facility to execute system commands.
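
to make point 2 concrete, here is a rough python sketch of embedding a base64 encoded pdf in an xml message. this is not the unibase message format: the element names, the attributes and the `build_invoice_message` / `extract_pdf` helpers are made up for illustration. the thing it shows is the size cost: base64 turns every 3 bytes into 4 characters, so the payload grows by about a third before any markup is added.

```python
import base64
import xml.etree.ElementTree as ET


def build_invoice_message(invoice_id: str, pdf_bytes: bytes) -> bytes:
    """Wrap a PDF in a hypothetical <message> envelope (illustrative only)."""
    msg = ET.Element("message", {"type": "invoice-copy", "id": invoice_id})
    doc = ET.SubElement(
        msg, "document", {"encoding": "base64", "mime": "application/pdf"}
    )
    # base64 output is plain ASCII, so it is safe as XML character data
    doc.text = base64.b64encode(pdf_bytes).decode("ascii")
    return ET.tostring(msg, encoding="utf-8")


def extract_pdf(message: bytes) -> bytes:
    """Receiving side: parse the envelope and decode the embedded PDF."""
    root = ET.fromstring(message)
    return base64.b64decode(root.find("document").text)


if __name__ == "__main__":
    pdf = b"%PDF-1.4\n" + bytes(range(256)) * 146  # stand-in bytes, not a real pdf
    wire = build_invoice_message("INV-1", pdf)
    assert extract_pdf(wire) == pdf
    print("pdf bytes:", len(pdf))
    print("message bytes:", len(wire))
    print("overhead ratio:", round(len(wire) / len(pdf), 2))
```
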

so, for commercial real time processing it all works as expected without 
binary stuff. i'm very pleased with the xml message structure stuff. we 
used it to replace an earlier non-xml message system and this one seems 
just as fast, but much more flexible.

i know we are going to have trouble when we extend this from australia 
to china - new year project - because the latency on the internet to 
china is about 200ms one way (400ms round trip). this is tragic for synchronised 
messages. stuff like h323 works fine (2mb both ends) because it doesn't 
synchronise end to end during a conversation, but things like email and 
http slow to a crawl at each synchronisation point. this is a way bigger 
issue than any xml message performance issues and i'm spending a lot of 
time (and therefore money) trying to get routing tables etc changed to 
fix it.
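
to put rough numbers on this, here is a minimal python sketch. it assumes each synchronisation point costs one full round trip, and that the payload then needs half a round trip plus its serialisation time on the slowest link. the function name, the 2k message size, the 20ms local round trip and the count of 4 synchronisation points are made-up values for illustration, not measurements from our system.

```python
def transfer_time(size_bytes: int, bandwidth_bps: float,
                  round_trip_s: float, sync_points: int) -> float:
    """Estimate seconds to deliver one message under a simple latency model."""
    handshakes = sync_points * round_trip_s       # each sync point waits a full round trip
    propagation = round_trip_s / 2                # payload travels one way
    serialisation = size_bytes * 8 / bandwidth_bps  # time to push the bits onto the link
    return handshakes + propagation + serialisation


if __name__ == "__main__":
    size = 2_000        # bytes, assumed small update message
    uplink = 256_000    # bits per second, the adsl uplink
    for label, rtt in (("local (assumed 20ms rtt)", 0.020),
                       ("china (400ms rtt)", 0.400)):
        for syncs in (0, 4):
            t = transfer_time(size, uplink, rtt, syncs)
            print(f"{label}, {syncs} sync points: {t:.3f}s")
```

in this model the message size only affects the serialisation term, so shrinking the xml does nothing for the round trips. that is why the number of synchronisation points matters more than the markup once the latency gets large.
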

if anything is going to affect the use of xml in other 
communication areas it's going to be this synchronisation/latency 
problem. there's not much point having mega mips of processing power when 
the comms lines take so long to synchronise.

we're about to write a version 2 of our message transmission system to 
add some of the high reliability stuff. i'll build performance 
measurement into it so next time you ask i can give you some real 
figures, but i don't think the processing or transmission times are 
really the issue (unless the processing is slow to begin with) - it's 
other simpler issues that need to be looked at.

regards

rick

Roger L. Costello wrote:

>Hi Folks,
> 
>I am interested in knowing the state-of-the-art practice
>for enhancing the performance of XML-based client-server interactions.
> 
>Let us consider the process of a client sending XML to a server.
>Below I identify 3 "parts" to this process:
> 
>   Part 1: Client prepares the XML
> 
>   Part 2: Transmittal of the XML
> 
>   Part 3: Server processes the XML
> 
>Now let us consider each part in turn, with the goal of determining
>the state-of-the-art practice for enhancing the performance of each
>part.
> 
>Part 1: Client prepares the XML
> 
>At some point the client decides to compose and prepare XML for
>transmittal to the server.
> 
>Compose the XML
> 
>The method employed to compose XML is highly variable.  For example, XML
>could be composed from a Java program, or from a database query.  I will
>restrict this investigation just to considering XML composition from a
>database query.
> 
>The time required to compose XML from a database query will vary depending
>on which database is used: Oracle, SQL Server, MySQL, native XML versus 
>relational, etc.
> 
>Question: has anyone done a study comparing the time required to compose XML
>by the different databases?
> 
>Prepare the XML
> 
>Oftentimes the client will choose to validate the XML prior to transmittal.
>Validating XML could potentially take a significant amount of time.  The
>time required will vary depending upon these factors:
> 
>- Validation language: which language you use (DTD, XML Schemas, RelaxNG,
>  Schematron, OASIS CAM) can determine how long the validation will take.
> 
>- Parser: which parser you use (e.g., Apache Xerces, XML Spy, etc) can also
>  impact the time required to validate.
> 
>Question: has anyone done a study comparing validation times across
>validation languages and validation times across parsers?
> 
>Part 2: Transmittal of the XML
> 
>There is a delay between the moment the client sends the XML to
>the moment the server receives the XML.  
> 
>Assertion: the dominating factor in determining the length of the 
>delay is the size of the XML[1]. Small XML chunks get from client 
>to server quicker than large XML chunks.
> 
>What are the options for reducing the delay?  I am aware of 4 techniques:
> 
>   1. Compression
>   2. Binary encoding
>   3. Streaming
>   4. Minimize markup
> 
>Technique 1: Compression
> 
>There are numerous XML compression tools.  I will list 2 such
>tools here:
> 
>- XMill
>- Bzip
> 
>Technique 2: Binary encoding
> 
>The W3C has an XML Binary Characterization (XBC) Working Group that is
>actively working to define a standard binary encoding for XML.  I believe
>that the fruits of their labor will not be usable for several years.
> 
>Technique 3: Streaming
> 
>The idea of both HTML streaming and XML streaming is to break the data
>to be transmitted into small chunks and then transmit one chunk at a
>time.
> 
>The SAX event-based model is a form of streaming.
> 
>Question: is it viable to use SAX in a client-server interaction?  For
>example, if you are transmitting a SOAP message would it be reasonable to
>stream the SOAP?  Is there such a thing as "SOAP Streaming"?
> 
>Question: is the streaming technique viable for Web Services?
> 
>Technique 4: Minimize markup
> 
>Assertion: XML tags are the root cause of the increase in size of the
>XML data.
> 
>In recognition of this, one solution is to design your XML to minimize
>the number of tags used.  One approach for doing this is to maximize the
>use of attributes[2].
> 
>Question: is the "attribute heavy" approach an effective approach for
>reducing delay?  Is it a good approach?
> 
>Question: all 4 techniques above attempt to reduce the delay via reducing
>the "size" of the data.  Are there other things that can be done to the
>data that would reduce the delay?
> 
>Part 3: Server processes the XML
> 
>The server has now received the XML.  The server may choose to validate
>it.  In Part 1 above we discussed the impact on time due to validation.
> 
>After validating, the server "processes" the XML.  Clearly, what it means
>to "process" XML is highly variable.  I shall restrict the discussion just
>to storing the XML into a database.  This is the mirror of that considered
>in Part 1, where we were interested in the time required to construct XML
>from a database query.  The same issues arise: what database is being
>used?  Is the database a native XML database or a relational database?
> 
>Question: has anyone done a performance analysis of storing XML into a
>database?
> 
>Summary
> 
>Above I discussed the delays introduced when a client sends XML to a server.
>
>Below is a summary of all the delays:
> 
>database ---> XML ---> validate ---> transmit ---> validate ---> database
>          T1       T2            T3            T2            T4
> 
>The total time for all the delays is: T1 + 2 * T2 + T3 + T4
> 
>Have I missed any steps/delays?  /Roger
> 
>[1] Obviously there are many factors other than the size of the data
>which affect the delay, such as network problems.  Those are problems
>that the client has no control over. I am focused on the delays due 
>to the information itself (which the client does have control over).
> 
>[2] Whereas elements have a start-tag/end-tag pair, attributes don't have
>the concept of an "end attribute tag".  Thus, by using attributes
>you can effectively reduce by half the amount of markup. 
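
the attribute question in footnote [2] is easy to test for a particular message shape. the python sketch below builds the same made-up pick records (the `sku`, `qty` and `bin` fields are invented for illustration) in element style and in attribute style, then compares the sizes raw and after zlib compression. zlib here is only a stand-in for a general purpose compressor, not one of the tools named above.

```python
import zlib
import xml.etree.ElementTree as ET

# 200 invented records; real messages will have different field mixes
ROWS = [{"sku": f"A{i:04d}", "qty": str(i % 7 + 1), "bin": f"R{i % 20}"}
        for i in range(200)]


def element_style(rows):
    """One child element per field: <pick><sku>..</sku>...</pick>"""
    root = ET.Element("picks")
    for row in rows:
        pick = ET.SubElement(root, "pick")
        for key, value in row.items():
            ET.SubElement(pick, key).text = value
    return ET.tostring(root)


def attribute_style(rows):
    """One attribute per field: <pick sku=".." qty=".." bin=".." />"""
    root = ET.Element("picks")
    for row in rows:
        ET.SubElement(root, "pick", row)
    return ET.tostring(root)


if __name__ == "__main__":
    for name, build in (("elements", element_style),
                        ("attributes", attribute_style)):
        raw = build(ROWS)
        packed = zlib.compress(raw)
        print(f"{name}: raw {len(raw)} bytes, zlib {len(packed)} bytes")
```

the raw sizes show how much markup the attribute form saves for this shape of data. the compressed sizes are the ones to look at if the link is compressed anyway, since repeated tag names tend to compress well and the gap may narrow.
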

begin:vcard
fn:Rick  Marshall
n:Marshall;Rick 
email;internet:rjm@zenucom.com
tel;cell:+61 411 287 530
x-mozilla-html:TRUE
version:2.1
end:vcard





 
