   Re: [xml-dev] Greenspun's tenth law rears its head: was InnerXml is like


Bill de hÓra wrote,
> At this point, we might as well give in and use Lisp, being a natural
> fit for manipulating syntax trees. As for InnerXML, it looks like
> Lisp's read-from-string function.

Well, yes, we could do that. Or we could build applications from a 
mixture of Java and X/CDuce, or we could extend Java à la Xtatic, or we 
could take Joe English's advice and switch to Haskell, or whatever. But 
somehow _none_ of these seems to be a fully adequate solution, for all 
that each of them has a place.

I think the fundamental problem is that we have applications which span 
multiple domains, and no one programming language is appropriate to all 
of them. I think there are six common responses to this problem:

1. Build applications from components written in multiple languages
   (Java+XSLT, Java+C, C+C++, C+asm, flex/yacc/ANTLR+whatever).

2. Keep adding language extensions until all the problem domains are
   covered (Xtatic for C#, innumerable ML extensions).

3. Use a language with extensible syntax (Tcl, Haskell, early Smalltalk,
   Forth).

4. Support first-class embedding of elements of one language in another
   (HTML script/style, JSP/ASP, SQLJ, inline asm).

5. Support second-class embedding of elements of one language in
   another (printfs, regexps, InnerXml).

6. Use a primary-language API and live with the impedance mismatch
   (a sketch contrasting (5) and (6) follows this list).
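
To make the bottom of the list concrete, here's a minimal Java sketch 
of the contrast between (5) and (6); the class name and the XML 
fragment are invented for illustration. The first half embeds the XML 
as a host-language string and hands it to a parser, InnerXml-style; 
the second builds the same tree node by node through the DOM API, 
where the impedance mismatch shows up as verbosity.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class EmbeddingStyles {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // (5) Second-class embedding: the XML lives in a host-language
        // string, much as a regexp or printf format string would, and
        // only becomes a tree when the parser is called.
        String fragment = "<order id='42'><item sku='x-1'/></order>";
        Document fromString =
            builder.parse(new InputSource(new StringReader(fragment)));

        // (6) Pure host-language API: the same tree built node by node.
        Document fromApi = builder.newDocument();
        Element order = fromApi.createElement("order");
        order.setAttribute("id", "42");
        Element item = fromApi.createElement("item");
        item.setAttribute("sku", "x-1");
        order.appendChild(item);
        fromApi.appendChild(order);

        System.out.println(fromString.getDocumentElement().getTagName());
        System.out.println(fromApi.getDocumentElement().getTagName());
    }
}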

I'm not entirely sure what makes any of these seem to be either an 
appropriate or an inappropriate strategy in any particular situation. 
(1) works for parsing: people who write parsers for languages with 
non-trivial syntax almost invariably use grammar specification languages 
and parser generators. It also seems to work for XSLT (a minimal 
Java+XSLT pairing is sketched after this paragraph). (2) is unpopular 
outside of research environments. (3) kinda works, but is always 
somewhat compromised relative to a special purpose language. (4) seems 
to work well, despite being aesthetically and conceptually challenged. 
(5) works well where the embedded elements are sufficiently compact to 
be represented comfortably in host-language strings.  (6), I guess, is 
the conservative option, although it's not completely obvious exactly 
what is being conserved.
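
For what it's worth, the (1)-style Java+XSLT pairing mentioned above 
looks roughly like this with the standard javax.xml.transform API; the 
stylesheet and input file names are invented for the example.

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class JavaPlusXslt {
    public static void main(String[] args) throws Exception {
        // The document-processing half of the application lives in
        // XSLT (orders.xsl); Java just supplies the plumbing.
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource("orders.xsl"));
        t.transform(new StreamSource("orders.xml"),
                    new StreamResult(System.out));
    }
}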

It looks a little like there might be some correspondence with the 
granularity at which the problem domains can be separated. Reading down 
the list from top to bottom, the specialization of language for 
specific domains decreases. OTOH, the granularity of mixing becomes 
increasingly fine. That seems to make some sort of sense: parsing and 
pure document processing are relatively well-defined and in many 
applications to some extent disjoint from other application 
functionality, so here (1) wins. OTOH, IO is typically tightly interwoven 
with an application's functionality, and (5) and (6) are the typical 
strategies.

So I'm wondering, why is it that XML processing largely seems to occupy 
the two extreme positions: a specialized language at one end, and pure 
host-language APIs at the other?

Cheers,


Miles