   XML Performance in Client-Server Interactions

  • To: <xml-dev@lists.xml.org>
  • Subject: XML Performance in Client-Server Interactions
  • From: "Roger L. Costello" <costello@mitre.org>
  • Date: Thu, 11 Nov 2004 16:44:29 -0500
  • Thread-index: AcTIN58OlZe3/1IVTbS2K7UPzABRuw==

Hi Folks,
 
I am interested in knowing the state-of-the-art practice
for enhancing the performance of XML-based client-server interactions.
 
Let us consider the process of a client sending XML to a server.
Below I identify 3 "parts" to this process:
 
   Part 1: Client prepares the XML
 
   Part 2: Transmittal of the XML
 
   Part 3: Server processes the XML
 
Now let us consider each part in turn, with the goal of determining
the state-of-the-art practice for enhancing the performance of each
part.
 
Part 1: Client prepares the XML
 
At some point the client decides to compose and prepare XML for transmittal
to the server.
 
Compose the XML
 
The method employed to compose XML is highly variable.  For example, XML
could be composed from a Java program, or from a database query.  I will
restrict this investigation to XML composition from a database query.
 
The time required to compose XML from a database query will vary depending
on which database is used: Oracle, SQL Server, MySQL, native XML versus 
relational, etc.
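
To make the measurement concrete, here is a minimal sketch of how the
compose step could be timed.  It assumes Python with the standard-library
sqlite3 and ElementTree modules; the "orders" table, its columns, and the
compose_xml function are hypothetical and only there to make the sketch
runnable.  A real comparison would swap in each database under test.

```python
import sqlite3
import time
import xml.etree.ElementTree as ET


def compose_xml(conn: sqlite3.Connection) -> bytes:
    """Build an XML document from the rows of a (hypothetical) 'orders' table."""
    root = ET.Element("orders")
    for order_id, customer, total in conn.execute(
        "SELECT id, customer, total FROM orders ORDER BY id"
    ):
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(root, encoding="utf-8")


# Hypothetical in-memory data set, just to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"customer-{i}", i * 1.5) for i in range(1000)],
)

start = time.perf_counter()
xml_bytes = compose_xml(conn)
t1 = time.perf_counter() - start  # elapsed time of the compose step
print(f"composed {len(xml_bytes)} bytes in {t1:.4f} s")
```

The number measured here includes both the query and the serialization, so
it says as much about the client code as about the database.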
 
Question: has anyone done a study comparing the time required to compose XML
by the different databases?
 
Prepare the XML
 
Oftentimes the client will choose to validate the XML prior to transmittal.
Validating XML could potentially take a significant amount of time.  The
time required will vary depending upon these factors:
 
- Validation language: the language you use (DTD, XML Schema, RELAX NG,
  Schematron, OASIS CAM) can determine how long the validation will take.
 
- Parser: the parser you use (e.g., Apache Xerces, XML Spy, etc.) can also
  impact the time required to validate.
 
Question: has anyone done a study comparing validation times across
validation languages and validation times across parsers?
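
A comparison like this could be driven by a small timing harness.  The
sketch below is only an illustration under stated assumptions: the Python
standard library has no schema validator, so the two callables shown
(ElementTree and minidom) only check well-formedness.  The time_call
helper and the sample document are hypothetical; a real validator would be
plugged in as another callable.

```python
import time
import xml.dom.minidom
import xml.etree.ElementTree as ET


def time_call(func, data: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time (seconds) of func(data) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical test document; substitute a representative real message.
sample = b"<items>" + b"".join(
    b"<item id='%d'><name>n%d</name></item>" % (i, i) for i in range(2000)
) + b"</items>"

# Each entry is a callable taking the document bytes.  These two only
# parse (well-formedness); a schema validator would be added the same way.
checkers = {
    "ElementTree": ET.fromstring,
    "minidom": xml.dom.minidom.parseString,
}

timings = {name: time_call(func, sample) for name, func in checkers.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```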
 
Part 2: Transmittal of the XML
 
There is a delay between the moment the client sends the XML and
the moment the server receives the XML.  
 
Assertion: the dominating factor in determining the length of the 
delay is the size of the XML[1]. Small XML chunks get from client 
to server more quickly than large XML chunks.
 
What are the options for reducing the delay?  I am aware of 4 techniques:
 
   1. Compression
   2. Binary encoding
   3. Streaming
   4. Minimize markup
 
Technique 1: Compression
 
There are numerous XML compression tools.  I will list 2 such
tools here:
 
- XMill (XML-specific)
- bzip2 (general-purpose)
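
As a rough way to see what compression buys, the sketch below compares
the raw size of a document with its compressed size.  It assumes Python's
standard gzip and bz2 modules; XMill is not part of the standard library,
so it is not shown.  The sample document is hypothetical and highly
repetitive, so real messages will likely compress less well.

```python
import bz2
import gzip

# Hypothetical sample document; substitute a representative real message.
sample = ("<orders>" + "".join(
    f"<order><id>{i}</id><customer>customer-{i}</customer></order>"
    for i in range(1000)
) + "</orders>").encode("utf-8")

sizes = {
    "raw": len(sample),
    "gzip": len(gzip.compress(sample)),
    "bz2": len(bz2.compress(sample)),
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / sizes['raw']:.1%} of raw)")

# Round trip: the server must recover exactly what the client sent.
restored = bz2.decompress(bz2.compress(sample))
```

Note that compressing and decompressing take time on both ends, so the
saving in transmittal has to be weighed against that extra processing.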
 
Technique 2: Binary encoding
 
The W3C has an XML Binary Characterization (XBC) Working Group that is
actively working to define a standard binary encoding for XML.  I believe
that the fruits of their labor will not be usable for several years.
 
Technique 3: Streaming
 
The idea of both HTML streaming and XML streaming is to break the data to
be transmitted into small chunks and then transmit the chunks one at a time.
 
The SAX event-based model is a form of streaming.
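
To illustrate the receiving side, here is a minimal sketch assuming
Python's standard xml.sax module, whose default parser accepts data
incrementally through feed().  The ItemCounter handler, the chunks helper
(a stand-in for reading from a socket), and the sample document are all
hypothetical.

```python
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Counts <item> elements as their start tags arrive."""

    def __init__(self):
        super().__init__()
        self.items_seen = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.items_seen += 1


def chunks(data: bytes, size: int):
    """Stand-in for a network socket: yield the document in small pieces."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


document = b"<items>" + b"<item>x</item>" * 100 + b"</items>"

handler = ItemCounter()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

progress = []
for chunk in chunks(document, 64):
    parser.feed(chunk)  # hand each chunk to the parser as soon as it arrives
    progress.append(handler.items_seen)
parser.close()  # signals end of document and flushes any pending events

print(f"{handler.items_seen} items handled across {len(progress)} chunks")
```

The point of the sketch is that the receiver can start handling elements
before the whole document has arrived, rather than waiting for the last
byte.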
 
Question: is it viable to use SAX in a client-server interaction?  For
example, if you are transmitting a SOAP message would it be reasonable to
stream the SOAP?  Is there such a thing as "SOAP Streaming"?
 
Question: is the streaming technique viable for Web Services?
 
Technique 4: Minimize markup
 
Assertion: XML tags are the main cause of the increase in size of the XML
data.  
 
In recognition of this, one solution is to design your XML to minimize the
number of tags used.  One approach for doing this is to maximize the use of
attributes[2].
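
The effect on size is easy to measure.  The sketch below, assuming Python
and a hypothetical set of order records, encodes the same data once with
child elements and once with attributes, then reports the size of each
form both raw and gzipped.  I make no claim about what the numbers will
be for your data.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical records: (id, customer, total).
records = [(i, f"customer-{i}", f"{i * 1.5:.2f}") for i in range(1000)]

element_style = "<orders>" + "".join(
    f"<order><id>{i}</id><customer>{c}</customer><total>{t}</total></order>"
    for i, c, t in records
) + "</orders>"

attribute_style = "<orders>" + "".join(
    f'<order id="{i}" customer="{c}" total="{t}"/>' for i, c, t in records
) + "</orders>"

# Both forms must be well-formed and carry the same number of records.
element_count = len(ET.fromstring(element_style))
attribute_count = len(ET.fromstring(attribute_style))

for name, doc in (("elements", element_style), ("attributes", attribute_style)):
    raw = doc.encode("utf-8")
    print(f"{name}: {len(raw)} bytes raw, {len(gzip.compress(raw))} bytes gzipped")
```

Comparing the gzipped sizes as well as the raw sizes shows whether the
attribute-heavy design still matters once Technique 1 is applied.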
 
Question: is the "attribute heavy" approach an effective approach for
reducing delay?  Is it a good approach? 
 
Question: all 4 techniques above attempt to reduce the delay via reducing
the "size" of the data.  Are there other things that can be done to the
data that would reduce the delay?
 
Part 3: Server processes the XML
 
The server has now received the XML.  The server may choose to validate it.
In Part 1 above we discussed the impact on time due to validation.  
 
After validating, the server "processes" the XML.  Clearly, what it means to
"process" XML is highly variable.  I shall restrict the discussion just to
storing the XML into a database.  This is the mirror of what was considered
in Part 1, where we were interested in the time required to construct XML
from a database query.  The same issues arise: what database is being used?
Is the database a native XML database or a relational database?
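
For the relational case, the storing step could be timed along these
lines.  This is a minimal sketch assuming Python with the standard
sqlite3 and ElementTree modules; the "orders" table, the store_xml
function, and the received document are hypothetical, and the XML is
"shredded" into rows rather than stored whole.

```python
import sqlite3
import time
import xml.etree.ElementTree as ET

# Hypothetical message as it might arrive from the client.
received = "<orders>" + "".join(
    f'<order id="{i}"><customer>customer-{i}</customer>'
    f"<total>{i * 1.5:.2f}</total></order>"
    for i in range(1, 501)
) + "</orders>"


def store_xml(conn: sqlite3.Connection, xml_text: str) -> int:
    """Parse the document and insert one row per <order>; return the row count."""
    root = ET.fromstring(xml_text)
    rows = [
        (int(o.get("id")), o.findtext("customer"), float(o.findtext("total")))
        for o in root.iter("order")
    ]
    with conn:  # one transaction for the whole document
        conn.executemany(
            "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)

start = time.perf_counter()
stored = store_xml(conn, received)
t4 = time.perf_counter() - start  # elapsed time of the store step
print(f"stored {stored} rows in {t4:.4f} s")
```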
 
Question: has anyone done a performance analysis of storing XML into a
database?
 
Summary
 
Above I discussed the delays introduced when a client sends XML to a server.
Below is a summary of all the delays:
 
database ---> XML ---> validate ---> transmit ---> validate ---> database
          T1       T2            T3            T2            T4
 
The total time for all the delays is: T1 + 2 * T2 + T3 + T4
 
Have I missed any steps/delays?  /Roger
 
[1] Obviously there are many factors other than the size of the data
which affect the delay, such as network problems.  Those are problems
that the client has no control over. I am focused on the delays due 
to the information itself (which the client does have control over).
 
[2] Whereas elements have a start-tag/end-tag pair, attributes don't have
the concept of an "end attribute tag".  Thus, by using attributes
you can effectively reduce by half the amount of markup. 








 
