> From: Roger L. Costello [mailto:costello@mitre.org]
<snip/>
> I feel like we are close to exhausting the issue of the best way to
> design an XML message. Recall the 2 approaches we have discussed:
>
> - Approach 1: should I colocate an indication of the desired action
> with the part of the message that the action applies to, or
> - Approach 2: should I separate the actions from the data?
I think it depends upon what you can get away with. The comments made by
others regarding the value of decoupling the action from the data were quite
valid. However, I would draw an analogy with the Composite [1] and Chain of
Responsibility [2] patterns from the OO world. A message may be a composite
of smaller messages as a way of constructing rich messages. There may also
be situations where it is convenient or useful to bundle a set of messages
together into a batch. HTTP pipelining is a valid alternative to the latter
in some instances, but there may be instances where collecting them into a
batch is more useful. For instance, if employing a message broker or doing
message queueing, it may be more convenient or more useful to queue or
publish one large message rather than 1000 small ones.
> I am hopeful that today we can come to an agreed-to best practice for
> this issue. At the very least, I would like to ensure that both
> approaches are completely understood, as well as the tradeoffs.
I'm not sure it is realistic to have one agreed-to best practice. I think
something more like the design pattern catalogs that arose in the OO world
would be better -- a collection of patterns with the pros and cons of each
approach noted.
> Yesterday Mark Baker and Michael Brennen were tossing around an example
> of an XML message to purchase a CD. Michael asserted that such a
> message would contain multiple actions:
>
> - a "buy this item" action,
> - a "use this credit card" action, and
> - a "ship to this address" action.
The intent (or action) of a message depends, though, upon what layer in the
architecture you are looking at, or what component in the processing chain
you ask. Submitting an HTML form has a "POST" action associated with it as
far as the HTTP server is concerned, but may have "buy this item" and "use
this credit card" actions associated with it from the perspective of the
application that processes the order (of which the HTTP server is blissfully
ignorant). There is also no reason why, IMO, a SOAP message could not have
one top-level "intent", then delegate portions of the message to various
components for processing (the Chain of Responsibility pattern [2]), each of
which may see an intent in its portion of the message, represented via an
element or attribute within the message.
I'm really arguing for flexibility, here, and the ability to pile on as many
layers as needed; not just one way of doing things.
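To make the delegation idea concrete, here is a rough Python sketch of a Chain of Responsibility over "action" attributes. The message, the top-level "submit-purchase-request" action, and the Handler class are all made up for illustration; this is not taken from any real integration.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: one top-level intent plus intents on nested parts.
MESSAGE = """
<message action="submit-purchase-request">
  <PurchaseOrder action="purchase-CD">
    <credit-card action="bill-credit-card"><type>Visa</type></credit-card>
    <delivery-address action="mail-item"><city>Boston</city></delivery-address>
  </PurchaseOrder>
</message>
"""


class Handler:
    """One link in the chain: handles a single action or passes it on."""

    def __init__(self, action, successor=None):
        self.action = action
        self.successor = successor

    def handle(self, element, log):
        if element.get("action") == self.action:
            # A real component would do its work here; the sketch only records it.
            log.append((self.action, element.tag))
            return True
        if self.successor is not None:
            return self.successor.handle(element, log)
        return False  # nobody in the chain claimed this element


def process(xml_text, chain):
    """Offer every element that carries an action to the chain, in document order."""
    log = []
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.get("action") is not None:
            chain.handle(element, log)
    return log


chain = Handler("submit-purchase-request",
                Handler("purchase-CD",
                        Handler("bill-credit-card",
                                Handler("mail-item"))))
log = process(MESSAGE, chain)
```

Each handler only knows about its own action, so another layer can be piled on by adding a link to the chain rather than by changing the existing components.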
> In previous discussions we have not looked at an example with multiple
> actions. Hopefully, such an example will provide insight into which
> approach is better. So, I would like to use this CD-purchase example to
> compare (once again) the 2 approaches.
>
> Here's the CD-purchase message where each action is colocated with the
> XML subtree that it applies to.
>
> <message>
> <body>
> <PurchaseOrder action="purchase-CD">
> <CD>
> <Title>Timeless Serenity</Title>
> <Author>Dyveke Spino</Author>
> <Date>1984</Date>
> <RecordingCompany>
> Dyveke Spino Productions
> </RecordingCompany>
> </CD>
> <credit-card action="bill-credit-card">
> <type>Visa</type>
> <name>John Doe</name>
> <number>1234 5678 9012 3456</number>
> <exp-date>11-03</exp-date>
> </credit-card>
> <billing-address>
> <street>101 Smith Rd</street>
> <city>Boston</city>
> <state>MA</state>
> <zip>03100</zip>
> </billing-address>
> <delivery-address action="mail-item">
> <street>101 Smith Rd</street>
> <city>Boston</city>
> <state>MA</state>
> <zip>03100</zip>
> </delivery-address>
> </PurchaseOrder>
> </body>
> </message>
>
> Note the 3 actions that are specified:
> - purchase-CD
> - bill-credit-card
> - mail-item
>
> The purchase-CD action is an attribute of the <PurchaseOrder> element.
> The bill-credit-card action is an attribute of the <credit-card>
> element. The mail-item action is an attribute of the <delivery-address>
> element.
You could still add a single top-level action (say "execute-transaction" or
"submit-purchase-request", or whatever) to facilitate dispatching by a
top-level component.
>
> Questions:
>
> 1. I am having a really hard time understanding the latter two actions.
> Of course I understand that a service would need to perform these two
> actions, but I fail to understand why a "client" would have to
> explicitly specify these actions in the XML message. Such "sub-actions"
> seem to be part of the semantics of the umbrella "purchase-CD" action.
> Can someone explain this to me? Or, is it just a poor example of a
> multi-action message? If so, please give me a better example.
I think this is a very simplistic example: it could easily be served by a
single top-level action, because each of the sub-actions could probably be
reliably inferred from that top-level action. But you
might want a server-side configuration that associates actions with
particular sub-elements so that you can employ lightweight, modular
components for processing. For instance, a component that processes the
order but does not know how to authorize a credit-card purchase, a separate
component that authorizes the credit-card purchase but does not process the
order, a component that notifies the shipping department of an order and
provides the shipping address, etc.
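As a rough illustration of that kind of server-side configuration, here is a Python sketch in which the client sends only the top-level action and the server maps sub-elements to components. The handler functions, the SUBELEMENT_HANDLERS table, and the trimmed-down message are assumptions for the example, not an existing implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down order: only the top-level action is sent.
ORDER = """
<PurchaseOrder action="purchase-CD">
  <CD><Title>Timeless Serenity</Title></CD>
  <credit-card><type>Visa</type></credit-card>
  <delivery-address><city>Boston</city></delivery-address>
</PurchaseOrder>
"""


def process_order(cd):
    # Knows about the ordered item, nothing about payment or shipping.
    return "order recorded: " + cd.findtext("Title")


def authorize_card(card):
    # Knows about payment, nothing about the order itself.
    return "authorization requested: " + card.findtext("type")


def notify_shipping(address):
    # Knows only where the item has to go.
    return "shipping notified: " + address.findtext("city")


# Server-side configuration: which component sees which sub-element.
SUBELEMENT_HANDLERS = {
    "CD": process_order,
    "credit-card": authorize_card,
    "delivery-address": notify_shipping,
}


def dispatch(xml_text):
    """Hand each configured child of the order to its component."""
    root = ET.fromstring(xml_text)
    return [SUBELEMENT_HANDLERS[child.tag](child)
            for child in root
            if child.tag in SUBELEMENT_HANDLERS]


results = dispatch(ORDER)
```

The sub-actions live in the server's table rather than in the message, so the client never has to spell them out.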
Where actions become relevant is when there is more than one way to
interpret a subelement and you want to have a consistent, modular way to
represent the data independently of the action that will be applied. Here's
an example modelled after the sorts of things that we are actually doing
(though the specific syntax is just off the top of my head and does not
necessarily match any of our existing integrations).
<UpdatePerson>
  <MatchCode>someID</MatchCode>
  <!-- unique identifier assigned by remote system for synchronization -->
  <FirstName>Roger</FirstName>
  <LastName>Costello</LastName>
  <Addresses>
    <RemoveAddress>
      <ObjectID>xxxxxx</ObjectID>
    </RemoveAddress>
    <AddAddress>
      <AddressType>home</AddressType>
      <Street>123 Main St.</Street>
      <City>Hometown</City>
      ...
    </AddAddress>
    <AddAddress>
      <AddressType>work</AddressType>
      ...
    </AddAddress>
  </Addresses>
</UpdatePerson>
The context in which we use something like this would be a situation where
there is a fair amount of collaborative workflow between our system and a
CRM system at a remote location. To support these collaborations, we define
a shared information model and rich messages that allow one or the other
system to synchronize the other system's view of the model. Typically, one
system is designated as the master and is the only one to make changes to
the information. The information model -- and the processing model -- is
abstracted from what is really going on. For instance, the information model
presented is hierarchical from the point of view of a message, though it is
stored in a relational db. The semantics of "RemoveAddress" is not that of
an RPC call or a simple DB delete. It simply means that the shared
information model must reflect that that address is no longer associated
with the parent Person entity. In our implementation, there are various ways
something like this might be handled. Many types of records are never
actually deleted. They just get a marker indicating they are deleted and
they don't show up in the UI anymore. But some others do actually get
deleted. It might not even be just one record, even though it looks like one
in the XML message. The AddAddress is similar. If this is a containment
relationship, this might actually add a record to the DB -- and in this
instance, the address would have to be added after the Person to ensure
referential integrity. If it is a many-to-many relationship (which is
actually the way we do addresses), it might just add a link to an existing
Address record (and likewise the RemoveAddress action might simply remove a
link rather than delete the record). If the Person-to-Address relationship
is a many-to-one relationship (not that you do that with people and
addresses, but just as an example), then adding the address would happen
*before* adding the person to ensure referential integrity. These
interpretations of "AddAddress" and "RemoveAddress" are up to us to define,
not the client. The client only cares what the shared information model
looks like after a command and all of its associated subcommands are
applied. This is not RPC, and there is no more coupling than that necessary
to support the requirements. If an integration does not require such
fine-grained synchronization of a shared information model, then we would
try to adhere to one top level action. This is actually what we used to do,
but we moved to this pattern of nested actions in order to support more
demanding requirements from customers that were not fitting well with such a
simplistic scheme.
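Here is a minimal Python sketch of one such interpretation, the many-to-many case, where AddAddress links the person to an address record and RemoveAddress only removes the link. The SharedModel class, the in-memory storage, and the IDs ("p-1", "a-1") are invented for the example and do not reflect our actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical message with made-up identifiers.
UPDATE = """
<UpdatePerson>
  <MatchCode>p-1</MatchCode>
  <FirstName>Roger</FirstName>
  <LastName>Costello</LastName>
  <Addresses>
    <RemoveAddress><ObjectID>a-1</ObjectID></RemoveAddress>
    <AddAddress>
      <AddressType>home</AddressType>
      <Street>123 Main St.</Street>
      <City>Hometown</City>
    </AddAddress>
  </Addresses>
</UpdatePerson>
"""


class SharedModel:
    """In-memory stand-in for the receiving side's storage."""

    def __init__(self):
        self.people = {}     # match code -> name fields
        self.addresses = {}  # address id -> address fields
        self.links = set()   # (match code, address id) pairs

    def _find_or_create_address(self, fields):
        # Reuse an identical existing record, otherwise create a new one.
        for address_id, existing in self.addresses.items():
            if existing == fields:
                return address_id
        n = len(self.addresses) + 1
        while "a-%d" % n in self.addresses:
            n += 1
        address_id = "a-%d" % n
        self.addresses[address_id] = fields
        return address_id

    def apply(self, xml_text):
        root = ET.fromstring(xml_text)
        code = root.findtext("MatchCode")
        self.people[code] = {"FirstName": root.findtext("FirstName"),
                             "LastName": root.findtext("LastName")}
        for command in root.find("Addresses"):
            if command.tag == "RemoveAddress":
                # Only the association goes away; the record itself stays.
                self.links.discard((code, command.findtext("ObjectID")))
            elif command.tag == "AddAddress":
                fields = {child.tag: child.text for child in command}
                self.links.add((code, self._find_or_create_address(fields)))


model = SharedModel()
model.addresses["a-1"] = {"AddressType": "home", "Street": "1 Old Rd",
                          "City": "Oldtown"}
model.links.add(("p-1", "a-1"))
model.apply(UPDATE)
```

The sender sees only the resulting model; whether a record is deleted, flagged, or merely unlinked stays a decision on the receiving side.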
If someone wants to show me a useful and practical way to collapse these
sorts of granular actions into a single top-level URI, I'm all ears. I'd be
happy to learn such an approach, but we have real world requirements that we
must fulfill -- and I would add that one of those requirements is to keep
things simple for customers, most of whom don't give a damn about the W3C or
the semantic web.
The above message, though, could be modelled in a more sophisticated manner
to afford looser coupling between actions and data. For instance, you could
have actions in one namespace and data in another. You could use "Address"
elements and have the action indicated by an attribute in another namespace,
or make the Address element the child of an action element. There is a lot
of flexibility, here, when you start viewing the message as a composite
rather than a monolithic whole.
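A quick Python sketch of the "actions in one namespace, data in another" variant follows. The namespace URIs, the a:action attribute, and the element names are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: one for actions, one for data.
ACT = "urn:example:actions"
DATA = "urn:example:data"

MESSAGE = f"""
<d:Person xmlns:d="{DATA}" xmlns:a="{ACT}" a:action="update">
  <d:Address a:action="remove"><d:ObjectID>a-1</d:ObjectID></d:Address>
  <d:Address a:action="add"><d:City>Hometown</d:City></d:Address>
  <d:Address><d:City>Elsewhere</d:City></d:Address>
</d:Person>
"""


def actions(xml_text):
    """List (local element name, action) for every element carrying an action."""
    root = ET.fromstring(xml_text)
    key = f"{{{ACT}}}action"  # ElementTree spells qualified names as {uri}local
    return [(element.tag.split("}")[1], element.get(key))
            for element in root.iter()
            if element.get(key) is not None]


found = actions(MESSAGE)
```

A component that only understands the data namespace can ignore the action attributes entirely, and the same Address structure serves for add, remove, or no action at all.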
> Let's turn to the other approach - separate the action from the data.
> One of the purported disadvantages of this approach was that it may be
> difficult/impossible to express actions of various parts of the
> message's data. However, yesterday Christian Nentwich proposed an
> elegant solution - using XLink/XPointers to link an action to its data.
Or, alternatively, just IDREFs.
> (I like this idea!). That's the approach I have taken below.
<snip/>
> 1. One of the arguments against this approach (separating the action
> from the data) is that it didn't allow "rich messages where multiple
> actions are specified". However, as we see here, with the XLink
> approach we can have the message richness. Thus, doesn't this approach
> have all the benefits of the other approach, without its disadvantages?
> Am I missing something?
No. I think this approach can support the same richness. It's essentially
the same as what I was speaking of; just different syntax. The syntax can
get messy, though. I think if I offered this to customers, many would balk
and want to make things simpler (and keep the actions close to the data with
which they are associated). Keep in mind, most of our customers don't even
want to use namespaces (though that is starting to change a bit; we have
started getting customers who are expressly interested in using namespaces
and XML Schema, and even have had one that explicitly asked for a SOAP
integration). We do what we can to appease our customers. For those who
don't mind the extra syntactic clunkiness, though, I'd say go for it! I'd be
happy to provide such an approach to a customer who wanted it.
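For what it's worth, the IDREF flavour of the separated approach might be resolved along these lines in Python. The <actions> section, the "name" and "ref" attributes, and the "id" values are hypothetical, and without a DTD or schema the "id" attributes are not formally of type ID; the sketch simply treats them that way.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: actions listed separately, pointing at data by id.
MESSAGE = """
<message>
  <actions>
    <action name="bill-credit-card" ref="cc"/>
    <action name="mail-item" ref="ship"/>
  </actions>
  <body>
    <credit-card id="cc"><type>Visa</type></credit-card>
    <delivery-address id="ship"><city>Boston</city></delivery-address>
  </body>
</message>
"""


def resolve(xml_text):
    """Pair each action with the tag of the element its ref points at."""
    root = ET.fromstring(xml_text)
    by_id = {element.get("id"): element
             for element in root.iter()
             if element.get("id") is not None}
    return [(action.get("name"), by_id[action.get("ref")].tag)
            for action in root.find("actions")]


pairs = resolve(MESSAGE)
```

The richness is the same as with colocated actions; the cost is the extra indirection a reader or a customer has to follow between the two halves of the message.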
[1] http://www.fh-konstanz.de/studium/ze/cim/projekte/osefa/patterns/compos/intent.htm
[2] http://www.fh-konstanz.de/studium/ze/cim/projekte/osefa/patterns/chain/intent.htm