Bill de hÓra wrote,
> At this point, we might as well give in and use Lisp, being a natural
> fit for manipulating syntax trees. As for InnerXML, it looks like
> Lisp's read-from-string function.
Well, yes we could do that. Or we could build applications from a
mixture of Java and X/CDuce, or we could extend Java à la Xtatic, or we
could take Joe English's advice and switch to Haskell, or whatever. But
somehow _none_ of these seem to be fully adequate solutions, for all
that each of them has a place.
I think the fundamental problem is that we have applications which span
multiple domains, and no one programming language is appropriate to all
of them. I think there are six common responses to this problem:
1. Build applications from components written in multiple languages
(Java+XSLT, Java+C, C+C++, C+asm, flex/yacc/ANTLR+whatever).
2. Keep adding language extensions until all the problem domains are
covered (Xtatic for C#, innumerable ML extensions).
3. Use a language with extensible syntax (Tcl, Haskell, early Smalltalk,
4. Support first-class embedding of elements of one language in another
(HTML script/style, JSP/ASP, SQLJ, inline asm).
5. Support second-class embedding of elements of one language in
another (printfs, regexps, InnerXml).
6. Use a primary language API and live with the impedance mismatch.
I'm not entirely sure what makes any of these seem to be either an
appropriate or an inappropriate strategy in any particular situation.
(1) works for parsing: people who write parsers for languages with
non-trivial syntax almost invariably use grammar specification languages
and parser generators. It also seems to work for XSLT. (2) is unpopular
outside of research environments. (3) kinda works, but is always
somewhat compromised relative to a special purpose language. (4) seems
to work well, despite being aesthetically and conceptually challenged.
(5) works well where the embedded elements are sufficiently compact to
be represented comfortably in host-language strings. (6), I guess, is
the conservative option, although it's not completely obvious exactly
what is being conserved.
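To make the difference between (5) and (6) concrete, here is a minimal
sketch in Python, assuming the standard library's xml.etree.ElementTree
and xml.sax.saxutils.escape. The function names and the book/title
vocabulary are made up for illustration; nothing here comes from the
thread itself.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_from_string(title: str) -> ET.Element:
    """Strategy (5): XML embedded in a host-language string (InnerXml style).

    The markup is compact and readable, but the host language sees only a
    string: well-formedness is checked at run time, and interpolated text
    has to be escaped by hand.
    """
    return ET.fromstring(f"<book><title>{escape(title)}</title></book>")


def build_from_api(title: str) -> ET.Element:
    """Strategy (6): the same tree built through the host-language API.

    Nothing needs escaping and the tree cannot be ill-formed, but the
    shape of the document is buried in a sequence of calls.
    """
    book = ET.Element("book")
    ET.SubElement(book, "title").text = title
    return book


if __name__ == "__main__":
    a = build_from_string("Lisp & XML")
    b = build_from_api("Lisp & XML")
    # Both routes should serialize to the same document.
    print(ET.tostring(a) == ET.tostring(b))
```

The string version stays comfortable only while the fragment is this
small; as the fragment grows, the escaping and the run-time parse errors
become the price of the compact notation.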
It looks a little like there might be some correspondence with the
granularity at which the problem domains can be separated. Reading down
the list from top to bottom, the specialization of language for
specific domains decreases. OTOH, the granularity of mixing becomes
increasingly fine. That seems to make some sort of sense: parsing and
pure document processing are relatively well-defined and in many
applications to some extent disjoint from other application
functionality: here (1) wins. OTOH, IO is typically tightly interwoven
with an application's functionality, and (5) and (6) are the typical
So I'm wondering, why is it that XML processing largely seems to occupy
the two extreme positions: a specialized language at one end, and pure
host-language APIs at the other?