- To: xml-dev@lists.xml.org
- Subject: Limitations of XML as a remote service medium?
- From: Chimezie Ogbuji <chimezie@gmail.com>
- Date: Fri, 26 Aug 2005 16:53:35 -0400
I wanted some feedback on what seems (from having wrestled with the
issue for some time) like an architectural limitation of XML as a
means of transmitting remote procedure invocations. I wrote a bit about
it here (http://copia.ogbuji.net/blog/2005-08-19/BinaryEncodingAndXMLRPCs)
and how it relates to 4Suite here
(http://lists.fourthought.com/pipermail/4suite-dev/2005-August/003684.html).
I've generally avoided the whole SOAP/REST argument mostly because it
seemed more to do with two different approaches to remote services
than architectural limitations on either side, but I've recently come
across a scenario where I've had to consider the merits of both.
Essentially, I need to be able to manage XML documents remotely,
and the following general services/methods are necessary (in all the
methods, 'path' is the URI of the XML document in the repository to
perform the method on):
- xPath(path,expression)
  expression is the XPath expression to evaluate; the resulting nodeset,
  string, or number is returned
- fetchResource(path)
- xUpdate(path,updateSrc)
  updateSrc is the serialized XUpdate document to use to modify the resource
- delete(path)
- setContent(path,src)
  src is the serialized document to update the resource with
They are currently modelled as SOAP methods invoked at a single SOAP
endpoint. There are two practical issues with modeling it this way
that do seem to be specific to SOAP as the communication medium:
1) The redundancy of submitting any of these messages to a single
endpoint (identified by URL) along with an additional 'path' argument to
specify which resource the method applies to. All of these methods are
invoked on an identifiable (by URI) resource in the repository, so
this alone seems to suggest that REST is the preferred way to deploy
the services, since it's a resource-oriented architecture.
2) This is the real show stopper for me. Basically, having to submit
XML content within a SOAP message is problematic where the XML content
is quite large (which is the case in my scenario). This is less of an
issue with the xUpdate method (since most modifications serialized as
XUpdate documents will be dwarfed by the original documents by some
order of magnitude) than with the setContent method. The setContent
method can either embed the document in its entirety, like so:
<SOAP:Envelope>
  <SOAP:Body>
    <foo:setContent>
      <path>/path/to/foo</path>
      <src>
        <documentElement>
          .....
        </documentElement>
      </src>
    </foo:setContent>
  </SOAP:Body>
</SOAP:Envelope>
or encode the document in a portable binary format (base64 is what I
decided upon in my case) that can be embedded into an XML document
without violating well-formedness. The former option puts a strain on
the service endpoint, which then has to parse a large XML
document in order to decipher the message transmitted (which is only
a very small part of the entire document). This is an issue *only*
because the remote method invocation medium is XML, and so it seems
like the wrong architecture for this specific scenario: invoking
methods that transmit massive XML as parameters to a service. The
latter option (which is shown below) solves the problem of embedding an
XML document but then shifts more responsibility to the client as well
as the repository, since both have to be able to handle the
encoding/decoding (which doesn't scale very well when you consider
the verbosity of XML):
<SOAP:Envelope>
  <SOAP:Body>
    <foo:setContent>
      <path>/path/to/foo</path>
      <src>.. xml document serialization encoded as base64 ..</src>
    </foo:setContent>
  </SOAP:Body>
</SOAP:Envelope>
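As a rough illustration of what the base64 option costs, here is a minimal Python sketch (not taken from the actual service; the sample document and the helper name base64_overhead are made up for the example). Base64 maps every 3 bytes to 4 characters, so the payload grows by about a third before the envelope is added, and the receiver has to decode it before it can parse the document at all.

```python
import base64

def base64_overhead(xml_bytes: bytes) -> float:
    """Return encoded size divided by original size for a serialized document."""
    encoded = base64.b64encode(xml_bytes)
    return len(encoded) / len(xml_bytes)

# Hypothetical stand-in for a large document in the repository.
sample = b"<documentElement>" + b"<item>value</item>" * 10000 + b"</documentElement>"
encoded = base64.b64encode(sample)

# The receiver must decode the <src> payload before it can parse the document.
decoded = base64.b64decode(encoded)

print(f"original: {len(sample)} bytes, base64: {len(encoded)} bytes, "
      f"ratio: {base64_overhead(sample):.2f}")
```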
On the flip side, however, there are some issues as well with
deploying this scenario within the REST framework. First though,
below is an idea of how each of these services would be ported:
- fetchResource(path) ==> HTTP GET submitted to the resource directly
- xPath(path,expression) ==> HTTP GET submitted to the resource with
an xpath=... URI argument (in order to distinguish it from the previous
request)
- xUpdate(path,updateSrc) ==> HTTP POST submitted to the resource with
the XUpdate document as the request body
- delete(path) ==> HTTP DELETE submitted to the resource directly
- setContent(path,src) ==> HTTP PUT submitted to the resource
directly with the new XML serialization submitted as the body of the
request
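The mapping above could be sketched on the client side roughly as follows. This is only an illustration in Python: the base URL, the build_request helper, and the application/xml content type are assumptions, and the requests are built but never sent.

```python
import urllib.request
from urllib.parse import urlencode

BASE = "http://repo.example.org"  # hypothetical repository root
XML = {"Content-Type": "application/xml"}  # assumed content type

def build_request(operation, path, body=None, expression=None):
    """Build (but do not send) the HTTP request for one repository operation."""
    url = BASE + path
    if operation == "fetchResource":
        return urllib.request.Request(url, method="GET")
    if operation == "xPath":
        # The expression travels as an xpath=... query argument.
        query = urlencode({"xpath": expression})
        return urllib.request.Request(url + "?" + query, method="GET")
    if operation == "xUpdate":
        # The XUpdate document is the request body.
        return urllib.request.Request(url, data=body, headers=XML, method="POST")
    if operation == "delete":
        return urllib.request.Request(url, method="DELETE")
    if operation == "setContent":
        # The new serialization is the request body; no envelope, no base64.
        return urllib.request.Request(url, data=body, headers=XML, method="PUT")
    raise ValueError("unknown operation: " + operation)

req = build_request("setContent", "/path/to/foo", body=b"<documentElement/>")
print(req.get_method(), req.full_url)
```

Note that the 'path' argument disappears into the request URL, and the document is sent as-is in the body.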
Seems straightforward and even simpler, but there are two questions/issues:
1) It seems xPath and fetchResource are distinct enough to require a
separate service for each, but the only HTTP method that matches their
semantics is GET. Is the use of xpath as a URI argument a good REST
principle in this case? It seems very hackish (for lack of a better
word), but I guess in the case of a pure GET you would be requesting a
complete 'representation' of the resource, whereas when an xpath
argument is appended, you would be requesting a 'subset' of the
complete representation (a separate representation in itself).
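A sketch of how a server might tell the two GETs apart, in Python. Everything here is an assumption for illustration: the in-memory REPOSITORY dict, the handle_get name, and the use of ElementTree, which only supports a limited subset of XPath (relative paths such as ./item); a real deployment would need a full XPath engine.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory repository keyed by resource path.
REPOSITORY = {"/path/to/foo": "<doc><item>a</item><item>b</item></doc>"}

def handle_get(request_uri):
    """Return the full document, or only the nodes matched by ?xpath=..."""
    parts = urlsplit(request_uri)
    source = REPOSITORY[parts.path]
    query = parse_qs(parts.query)
    if "xpath" not in query:
        # Plain GET: the complete representation.
        return source
    # GET with xpath argument: a subset of the representation.
    root = ET.fromstring(source)
    matches = root.findall(query["xpath"][0])
    return "".join(ET.tostring(node, encoding="unicode") for node in matches)

print(handle_get("/path/to/foo"))
print(handle_get("/path/to/foo?xpath=./item"))
```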
2) Are the semantics of PUT/POST distinct enough to justify using POST
for the xUpdate submission and PUT for the setContent submission?
Judging from a section of the HTTP spec, it seems so:
"The fundamental difference between the POST and PUT requests is
reflected in the different meaning of the Request-URI. The URI in a
POST request identifies the resource that will handle the enclosed
entity. That resource might be a data-accepting process, a gateway to
some other protocol, or a separate entity that accepts annotations.
In contrast, the URI in a PUT request identifies the entity enclosed
with the request -- the user agent knows what URI is intended and the
server MUST NOT attempt to apply the request to some other resource.
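If the server desires that the request be applied to a different URI,
it MUST send a 301 (Moved Permanently) response; the user agent MAY
then make its own decision regarding whether or not to redirect the
request."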
So in the POST scenario, the resource itself is identified as
capable of handling the 'enclosed entity' (in this case a
representation of some modifications to make to itself), and in the PUT
scenario the Request-URI identifies the document transmitted.
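The distinction can be sketched with two handlers over an in-memory store. This is a simplified Python illustration only: the repository dict and handler names are made up, and the "append each child" step is a hypothetical stand-in for real XUpdate processing, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET

repository = {}  # hypothetical store: Request-URI -> serialized document

def handle_put(request_uri, body):
    """PUT: the Request-URI names the enclosed entity, so store it there as-is."""
    repository[request_uri] = body

def handle_post(request_uri, body):
    """POST: the Request-URI names the resource that processes the entity."""
    doc = ET.fromstring(repository[request_uri])
    update = ET.fromstring(body)
    # Stand-in for XUpdate: append each child of the update to the document root.
    for child in update:
        doc.append(child)
    repository[request_uri] = ET.tostring(doc, encoding="unicode")

handle_put("/path/to/foo", "<doc/>")
handle_post("/path/to/foo", "<append><item>a</item></append>")
print(repository["/path/to/foo"])
```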
Thoughts?
Chimezie