
Re: [xml-dev] Interoperability



From: "Joe English" <jenglish@flightlab.com>

> The main drawback to omitting <!DOCTYPE ...> declarations
> is that information about the document type has to be
> carried out-of-band (e.g., a note saying "the files
> in this directory use the TMML vocabulary, see
> <URL: http://tmml.sf.net/ >" in a README file).
> I suppose using XML namespaces would rectify this, but
> that's another barrel of worms.

I am not sure that is the barrel of worms that is easiest
and most fruitfully dealt with first.  (Can a barrel of worms
be fruitful?)

I believe we need to approach things from a completely
different angle, rather than just treating the symptoms:
"xmlns:* breaks DTDs", "DTDs (i.e. IDs) break databases",
"database influences complicate simple XML for
data interchange", "schema languages are too complicated",
"I don't need this therefore it should not exist", and so on.

I believe the root of the problem is that there is no
vendor-neutral way to distribute XML applications.
We have all sorts of formats for bits and pieces:
schemas, transformation scripts, stylesheets, digital
rights, not to mention the zillion proprietary plug-in
formats. 

Yet there is no way I can say to my friend
"Here is a file with all the resources needed for you
to work with DOCBOOK: you can just plug it into
your XML system (editor, composer, database,
web application, etc. etc.) and you can start using it
straight away."   

At the moment, we are little better off than in SGML days:
we have a choice of many more tools, but they still take
too much effort to set up. Once set up, we have
interoperability of data between our application and
someone else's, but the establishment costs are still
too high. So a lot of the potential of standard
generalized markup languages has not been realized yet,
because of this inflexibility.

There are three architectures we could consider:

1) URIs
----------
The WWW architecture would have all resources
related to a document always available on servers
at absolute locations. This is good for standard resources,
but it provides no mechanism for value-adding:
we have no format that allows an integrator to make
an "XML application" to distribute to clients.
Obviously it also breaks when you disconnect from the
web.

2) P2P
---------
Another alternative is a peer-to-peer system.
Because no peer-to-peer protocol has won the day
yet, this is only useful for vendor-specific tools
at the moment. (My company is using this method for
our forthcoming product, for example.)

3) Packaging
---------------- 
The other mechanism, and the one I think we need,
is a simple file format for packaging all the resources
for generic and value-added XML applications: an
XML application archive format ("XAR", as Gavin has
suggested).   The kind of thing I am suggesting
can be found at http://www.topologi.com/public/dzip.html
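
To make that concrete, here is the kind of thing a
hypothetical docbook.xar might hold (the file names,
directories and manifest below are purely illustrative,
not a proposed layout):

    docbook.xar
      manifest.xml              what each resource is, which tools use it
      catalog.xml               OASIS catalog for the packaged public ids
      dtd/docbook.dtd
      dtd/ent/                  public entity sets
      schema/docbook.xsd
      xslt/html.xsl
      xslt/fo.xsl
      vendors/microsoft.com/    installer script for a Windows tool
      vendors/topologi.com/     BeanShell script for our product

The point is not this particular layout, but that a single
opaque file can carry everything a tool needs.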

Why would this move us forward?

1) It gets rid of path problems. Users just need to get the
appropriate DZIP file to deploy and configure their
XML tools.

2) It provides the equivalent of RDDL but without
the fragility of the Web.  (An RDDL document could
point to an XAR, and an XAR could point to an RDDL
document.  A CATALOG still needs to be located
or packaged itself, so it does not solve this on its own.)

3) It allows us to transmit, out of band, any kind of
resource associated with a document type.

4) It allows us to move towards portable XML applications
(i.e. distributable applications which use XML data),
so that vendors or integrators can add their own
customizations for multiple tools, to suit the needs of
particular customers.

The XAR could contain not only the schema, DTD,
stylesheets and public entities, but also the scripts,
plug-ins or libraries used by different tools: e.g.
an MS installer script for some application underneath
the vendors/microsoft.com directory, a BeanShell
script for our product underneath vendors/topologi.com,
an RPM package for a Linux application elsewhere, or
Oracle's iFS type definitions to map the document
type into their DBMS.

Or it could contain product-specific versions of
stylesheets or DTDs so that we can provide a nice
one for the browser/tools of choice and a simpler
one for others.  Or an integrator could have their own
directory to distribute updated OmniMark scripts
along with the latest versions of the DTDs.

5) If a DOCTYPE declaration is missing, then there
is an alternative source for resolving entity references.
(CATALOGs could be integrated too.  A rough sketch of
how a tool might resolve entities from an XAR follows
this list.)

6) By taking care of a large class of problems for
application creation and deployment, it clarifies
the areas where XML and standards above it
can be improved: they can focus on all the parts 
left over. At the moment there is a hodgepodge of
issues that everyone will just get bogged down in.
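
To make points 1 and 5 a little more concrete, here is a
minimal sketch, in Python, of how a XAR-aware tool might
resolve a DTD or entity set from the package rather than
from a path or the network. The manifest.xml and its
<resource publicId="..." path="..."/> entries are invented
for illustration only; no such format is defined anywhere yet.

    # Sketch only: the manifest layout is an assumption, not a spec.
    import zipfile
    import xml.etree.ElementTree as ET

    class XarResolver:
        """Map public identifiers to resources packaged in .xar (zip) files."""

        def __init__(self):
            self._index = {}   # public id -> (xar path, member name)

        def register(self, xar_path):
            # Read the hypothetical manifest and index what the archive offers.
            with zipfile.ZipFile(xar_path) as xar:
                manifest = ET.fromstring(xar.read("manifest.xml"))
                for res in manifest.findall("resource"):
                    pid, member = res.get("publicId"), res.get("path")
                    self._index[pid] = (xar_path, member)

        def resolve(self, public_id):
            # Return the packaged bytes, or None so the parser falls back
            # to its usual catalog/URI resolution.
            entry = self._index.get(public_id)
            if entry is None:
                return None
            xar_path, member = entry
            with zipfile.ZipFile(xar_path) as xar:
                return xar.read(member)

A real tool would hook something like this into its parser's
entity resolver, alongside whatever CATALOG support it
already has.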

What would it take?

Step 1) Someone (OASIS? xml-dev?) develops a simple
spec (like DZIP) for XAR.
Step 2) Vendors add a standard directory location where
their applications look for XAR files and, if any exist,
read DTDs etc. from them. (They can also look for
vendor-specific plug-ins there.)
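
Step 2 need not be much code on the vendor side. Assuming
some conventional location (the ~/.xml/xar directory name
below is just an assumption, not a standard) and the
resolver sketched above, start-up could be as simple as:

    import glob, os

    XAR_DIR = os.path.expanduser("~/.xml/xar")  # hypothetical standard location

    resolver = XarResolver()                    # from the earlier sketch
    for path in glob.glob(os.path.join(XAR_DIR, "*.xar")):
        resolver.register(path)                 # DTDs etc. now resolve locally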

How would users use it?  

1) I want to edit DOCBOOK. I get docbook.xar from
docbook.org or from some value-adding vendor and stick it
in the correct directory (a kind application might even put
it in the correct place for me).
2) Er, that's it. My generic tools are XAR-aware and I
can start work immediately.

Nice dream, but it seems like such low-hanging fruit.

Cheers
Rick Jelliffe