   RE: transformations

  • From: Joshua Allen <joshuaa@microsoft.com>
  • To: "'Simon St.Laurent'" <simonstl@simonstl.com>,'XML-Dev Mailing list' <xml-dev@xml.org>
  • Date: Sun, 19 Nov 2000 14:46:13 -0800

>For some reason, however, these ideas haven't really picked up mindshare. 
>I find it somewhat ironic that 'style sheets' are now proposed as a tool 
>for large-scale conversions during conversations between businesses, and 
>even that a technology whose roots were in Dynamic HTML has grown into a 
>primary interface for manipulating XML documents.  These tools have done 

This is a bit of an exaggeration.  XSL and XSLT were different activities,
and XSLT *was* primarily directed at transformation.  XSLT attempted to
be a general-purpose semi-structured data transformation language, and
is still the best thing we've got (if you value standards).  XSLT/XPath
is a nice way to *transform* semi-structured data; many complaints arise
when people use XSLT as a "primary interface" into XPath.  I personally
feel that making selectNodes() and selectSingleNode() standard parts of
the DOM would free people from having to use XSLT where a traditional
programming language + XPath would do better.
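
To make the "traditional programming language + XPath" idea concrete, here
is a minimal sketch in Python. It assumes the standard library's ElementTree,
which supports only a limited XPath subset; findall() and find() stand in
roughly for selectNodes() and selectSingleNode(). The sample document and its
element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
DOC = """
<staff>
  <employee id="e1" dept="sales"><name>Ann</name></employee>
  <employee id="e2" dept="dev"><name>Bob</name></employee>
  <employee id="e3" dept="dev"><name>Cy</name></employee>
</staff>
"""

root = ET.fromstring(DOC)

# Rough analogue of selectNodes(): every element matching the path.
dev_names = [e.findtext("name")
             for e in root.findall("./employee[@dept='dev']")]

# Rough analogue of selectSingleNode(): the first match, or None.
first_sales = root.find("./employee[@dept='sales']")
```

The selection is one path expression; everything after it is ordinary
host-language code, with no stylesheet involved.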

QUILT/XML-Query fill in somewhat where XSLT lacks as a transformation
language, and I'm not smart enough yet to opine about the best
overlap between the two.  We should be suspicious about things like
"is-a" relationships being too tightly tied to the transform languages,
though.  Doesn't it make more sense to build in semantics as a separate
layer?  For example, I find it quite easy to transform XML in ragged
hierarchies (for example, employee org chart), where multiple
is-a relationships exist, using XSLT.  The is-a relationships are
expressed with attributes in the XML representing parent-child, sort
of like id/idref was intended to do.  IMO, id/idref was a mistake, though,
because it ties semantics to the XML syntax and even XSLT has to kowtow
to those semantics -- semantics which are hopelessly inadequate in
most cases.  If I have to model my own data anyway, what use is idref to me?
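
A sketch of the kind of ragged hierarchy I mean, written in Python rather
than XSLT to keep it short. The "reports-to" attribute, the employee names,
and both helper functions are made up for illustration; the point is only
that the parent-child relationship lives in an ordinary attribute that I
model myself, not in id/idref.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical org chart: Cy has two parents, so this is not a strict tree.
DOC = """
<org>
  <employee name="Ann"/>
  <employee name="Bob" reports-to="Ann"/>
  <employee name="Cy" reports-to="Ann Bob"/>
  <employee name="Dee" reports-to="Cy"/>
</org>
"""

def reports(doc, attr="reports-to"):
    """Map each parent name to the names that point at it via `attr`."""
    children = defaultdict(list)
    for e in ET.fromstring(doc).iter("employee"):
        for parent in e.get(attr, "").split():
            children[parent].append(e.get("name"))
    return dict(children)

def all_reports(children, name):
    """Everyone below `name`, breadth-first, each person listed once."""
    seen = []
    queue = list(children.get(name, []))
    while queue:
        n = queue.pop(0)
        if n not in seen:
            seen.append(n)
            queue.extend(children.get(n, []))
    return seen
```

Nothing here depends on what "reports-to" means; renaming the attribute
changes one argument and nothing else.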

The complaints against XSLT are starting to sound the way complaints
against SQL used to.  Sometimes you have to use cursors; that is true.
And sometimes you need a programming language.  SQL is general-purpose,
though, and lots of people can be expected to know it, so it's worth
expressing something in SQL when you can.  XML *needs* a generic transform
mechanism.  If the transform mechanism isn't uniform, easy to learn,
and standard, then we as an industry are going to blow it again and
lose yet another chance at reaping the benefits of self-describing data
on a large scale.

The first thing confusing people is the desire to transform into an
"unstructured" or "document" format.  When PaulT says "XSLT is Perl", 
it illuminates all of the hacks that people have to do in order to
massage their XML into some document format.  But in the same sense,
if people were to look at SQL as a method to get relational data to
output directly to HTML, PDF, or whatever, it would be considered
a failure.  Transforms in the sense of XML should be *only* about
moving from one semistructured format to another.  Moving from
"semi-structured" to "document" is something for which there are many
solutions and few standards (and dubious demand for standards).

The other confusion comes when people attempt to integrate
semantics with the transform language, as I pointed out above.
Data modeling can layer atop most transform mechanisms
pretty easily, and ultimately it becomes important to know
things like is-a and has-a, but that should all be arbitrary
as far as the transform mechanism is concerned.

So, when you get rid of the "document output" and "semantics",
then I think you have a good basis for arguing what a general
purpose transform mechanism should look like.  In fact, even
looking at things like this, XSLT suffers from the fact that
you have to do more work to "join" data or use relationships
between data.  This is because XSLT treats XML as a node-labeled 
graph, and to get edge-labeled relationships defined, you have
to do arbitrary things like id/idref.  Whether you use XLink,
id/idref, or something else to document these relationships, I
would argue that the transform language should be agnostic
to anything beyond the *existence* of the relationship.  The
transform should not care if it is an is-a or has-a.
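
As a rough illustration of that agnosticism, here is a small Python sketch
of a "join" that follows a named edge without knowing what the edge means.
The "key", "is-a", and "has-a" attribute names and the join() helper are
assumptions made up for this example, not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: relationships are plain attributes naming another key.
DOC = """
<data>
  <item key="truck" is-a="vehicle" has-a="engine"/>
  <item key="vehicle"/>
  <item key="engine"/>
</data>
"""

def join(root, edge, key="key"):
    """Pair each element with the element its `edge` attribute names.

    The function only checks that the relationship exists; the label
    passed in as `edge` carries no special meaning here.
    """
    index = {e.get(key): e for e in root.iter() if e.get(key) is not None}
    return [(e.get(key), index[e.get(edge)].get(key))
            for e in root.iter()
            if e.get(edge) in index]

root = ET.fromstring(DOC)
is_a_pairs = join(root, "is-a")
has_a_pairs = join(root, "has-a")
```

The same code path serves both relationships; any interpretation of is-a
versus has-a would sit in a layer above this.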

