OASIS Mailing List Archives




RE: "Uh, what do I need this for" (was RE: XML.COM: How I Learned to Love daBomb)

>> I have been developing for my company a kind of compiler that enables us to
>> embed XPath expressions in Java code. We can now access any random piece of
>> data within a document as easily as we would have done with an object model,
>> e.g. we can write things like invoice/line[5]/quantity in our Java code
>> instead of invoice.getLine(5).getQuantity().
>You can't write invoice.getLine(5).setQuantity(10), nor
>invoice.checkDeliveryAddressIsValid().

You're right, you can't write an equivalent expression with DOM or XPath.
But you can use an XPath expression to build an lvalue expression and call
methods on the result nodes. A compiler can generate Java code from a syntax
like :


=> compiled into something like (pseudo-code) :

// This block contains all XPath expression precompilation code
static {
	// precompile the XPath expression
	...
}

public void process(Document invoice) {
	NodeEnumeration nlP=PATH1.select(invoice);
	while(nlP.hasNext()) {
		...
	}
}

Our compiler does not generate such code yet, because in our first
version we opted for a read-only model : documents could be read but not
modified. This does not mean that we cannot modify data, just that we do it
by sending a new document with the modified data to the web service
responsible for data storage. We chose this principle because we thought it
would enable us to support SAX-like document processing. However, our next
compiler version (currently in development) will be able to generate code
like the above, since we dropped the idea of supporting SAX-like document
processing (web services are not meant to handle 20 MB XML documents).
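
As a minimal sketch of the "XPath expression as lvalue" idea, assuming the
standard javax.xml.xpath API rather than the code our compiler generates: the
class name, the method names and the sample path /invoice/line/quantity are
made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class InvoiceProcess {
    // Precompiled once, in the spirit of the static block in the pseudo-code.
    // The path is a hypothetical example, not one from our framework.
    static final XPathExpression QUANTITIES;
    static {
        try {
            QUANTITIES = XPathFactory.newInstance().newXPath()
                    .compile("/invoice/line/quantity");
        } catch (XPathExpressionException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Write access: the selected nodes are used as lvalues and overwritten.
    static void setAllQuantities(Document invoice, int value)
            throws XPathExpressionException {
        NodeList nodes = (NodeList) QUANTITIES.evaluate(invoice, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            nodes.item(i).setTextContent(Integer.toString(value));
        }
    }

    // Read access: quantity of the n-th line (1-based, as in XPath).
    static int quantityOfLine(Document invoice, int n)
            throws XPathExpressionException {
        String path = "/invoice/line[" + n + "]/quantity";
        Double q = (Double) XPathFactory.newInstance().newXPath()
                .evaluate(path, invoice, XPathConstants.NUMBER);
        return q.intValue();
    }

    public static void main(String[] args) throws Exception {
        Document invoice = parse("<invoice>"
                + "<line><quantity>3</quantity></line>"
                + "<line><quantity>7</quantity></line>"
                + "</invoice>");
        System.out.println(quantityOfLine(invoice, 2)); // prints 7
        setAllQuantities(invoice, 10);
        System.out.println(quantityOfLine(invoice, 2)); // prints 10
    }
}
```

The read path corresponds to invoice/line[n]/quantity, and the write path is
the closest equivalent of invoice.getLine(n).setQuantity(10) without any
generated object model.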

>> We then developed a model in which business processes were expressed in XML
>> documents, possibly containing Java code fragments, that are compiled into
>> Java code, then into bytecode by a standard Java compiler. The compilation
>> is dynamic, and is done for performance purposes. If a scripting language
>> seamlessly integrating XPath expressions existed, we could have used it.
>Sounds nice, actually!

Well, it is nice to use :). Our next step is to validate the XPath
expressions based on our knowledge of the schema of the manipulated
document. It's a bit as if the schema of a document were a class, and XPath
expressions were accessors. *BUT* we don't have any mapping layer between
the processing code and the XML document : the document *IS* the data to be

>> At the end, we get a dynamic processing language that can easily manipulate
>> dynamic XML data in a business-oriented way (we could not do the same thing
>> using XSLT, for example). The "dynamic" part here is important : it means
>> that if the data changes, we are able to promptly adapt the
>All the same things can be done more easily with simpler RPC models, though.

This is a leitmotif of this thread. I agree, everything is feasible using
CORBA or RPC or even byte exchange over raw sockets. For me, the
differentiating and crucial factors are :

- ease & cost of implementation
- ease & cost of maintenance
- ease & cost of interoperability in a heterogeneous environment (Microsoft +
Solaris + Linux + Tuxedo transactions + whatever...)
- performance

I have no proof here except for a strong general feeling after one year and
a half of using XML technologies. But my feeling is that XML-based solutions
win on the first three factors.

The performance factor is less easy to evaluate (aaaah, the joys of
comparative benchmarking...), but our experience is that our architecture is
scalable (e.g. what's cool about XML/HTTP protocols is that you can reuse
all HTTP load-balancing solutions), and its current performance level is

>> 1) We have been using this framework for one year and a half to implement
>> and consume web services. In fact, we first implemented some in January
>> 2000, using a proprietary SOAP-like protocol (XML over HTTP), implemented
>> both in Java and Microsoft technologies. As an example of the benefits of
>> this approach, we managed to integrate our 100% Java server running on
>> Solaris with 100% Microsoft code accessing an Exchange server running on
>> Windows NT. This is something that would have been very difficult and
>> expensive to implement using CORBA, DCOM or RPCs.
>DCOM, yes, since MS don't make implementations freely available. But CORBA or
>ONC RPC are *platform neutral*. Why do you think it would have been harder?

We implemented the exchange of structured calendar and address book data
between our 100% Java framework and the 100% Microsoft CDO Objects (a COM
component library used to manipulate Exchange data) in about one week with
two people. We would certainly not have been as efficient using CORBA or
RPCs, not to mention the maintenance of the project. I reckon it may be
feasible, but the task would have been so difficult for us that we might not
have done it at all otherwise.

>> 3) We don't have to worry about object model mappings to/from databases or
>> XML ; all data is represented as XML documents.
>The same can be done with any standard model; CORBA IDL will do this, as will
>ONC RPC. The object model mappings are there, so you use the syntax native to
>your language (thing.getFoo(5)), but it's automatically generated.

I have used O/R mappers and EJBs in two rather big projects. I have even
implemented a custom O/R mapper on another project. Yes, the code is
automagically generated. But the lifecycle of your application becomes
complicated, and the maintenance becomes awkward ("what, I just add a field
in this table and you have to rebuild the whole application ?"). Plus,
today, it is far easier to use XML to access some data (e.g. the Exchange
server mentioned above). So if I used a native object methodology, I would
have to use an O/R mapper on one side, then an O/XML mapper on the other.
That's why we thought about dropping the "O" part, and thinking only about
XML/database mapping issues.
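
To illustrate the maintenance point, here is a small sketch, assuming the
standard javax.xml.xpath API and a made-up customer document (neither comes
from our framework). The same accessor keeps working when a field is added to
the data, with no regeneration or rebuild of mapping classes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SchemaDrift {
    // Reads the customer name straight from the document, with no object
    // model in between. The path /customer/name is a hypothetical example.
    static String customerName(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate("/customer/name", doc);
    }

    public static void main(String[] args) throws Exception {
        // Original shape of the data.
        String v1 = "<customer><name>Ada</name></customer>";
        // Same data after someone "just adds a field".
        String v2 = "<customer><name>Ada</name><email>ada@example.org</email></customer>";

        System.out.println(customerName(v1)); // prints Ada
        System.out.println(customerName(v2)); // prints Ada
    }
}
```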