> Proposals for XML syntax for procedural
> languages do pop up occasionally, and are smiled upon.
That's because there is something very wrong with the way these
proposals are presented. The contribution of an xml syntax for a
programming language is not at all to be found in the use of this
xml syntax by a human programmer.
In compiler construction the well-known syntax for a programming
language is usually called the "concrete syntax". The internal,
tree-like, representation of a program is called the "abstract syntax".
For example, the expression "e + 2" might be represented in an abstract
syntax tree like:
Plus
|- Var("e")
|- IntConst(2)
Or in other words:
<Plus>
<Var>e</Var>
<IntConst>2</IntConst>
</Plus>
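
To make the correspondence concrete, here is a minimal sketch in Python
(my choice of language for illustration; the class and function names are
made up for this example). It builds the abstract syntax tree for "e + 2"
as plain objects and serializes it to the xml form shown above, using only
the standard library.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Union


# Hypothetical node types mirroring the tree above; not taken from any
# real compiler.
@dataclass
class Var:
    name: str


@dataclass
class IntConst:
    value: int


@dataclass
class Plus:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, IntConst, Plus]


def to_xml(node: Expr) -> ET.Element:
    """Map each abstract syntax node to one xml element."""
    if isinstance(node, Var):
        elem = ET.Element("Var")
        elem.text = node.name
        return elem
    if isinstance(node, IntConst):
        elem = ET.Element("IntConst")
        elem.text = str(node.value)
        return elem
    if isinstance(node, Plus):
        elem = ET.Element("Plus")
        elem.append(to_xml(node.left))
        elem.append(to_xml(node.right))
        return elem
    raise TypeError(f"not an abstract syntax node: {node!r}")


# The expression "e + 2" as an abstract syntax tree.
tree = Plus(Var("e"), IntConst(2))
xml_text = ET.tostring(to_xml(tree), encoding="unicode")
print(xml_text)  # <Plus><Var>e</Var><IntConst>2</IntConst></Plus>
```

Nobody is expected to type the xml by hand here; it is produced from the
tree, which is the point.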
Sometimes there is a specification of this abstract syntax, and some
compilers even exchange abstract syntax trees between components in
some exchange format. The advantages of exchanging an abstract syntax
are comparable to the advantages of exchanging xml between software
components.
An xml syntax for a programming language should be compared to an
abstract syntax for a programming language, as already used in existing
compilers and program transformation systems.
The major contribution of having such an abstract syntax is that meta
components can exchange this representation instead of parsing the
concrete syntax again and again, which is difficult to do right. If
compilers finally start accepting abstract syntax representations of
source code, it will be much easier to implement meta tools.
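
As a rough illustration of such a meta tool, the Python sketch below
consumes the xml abstract syntax for "e + 2" directly. It never sees the
concrete syntax, so it needs no parser for the programming language
itself. The function names and the tiny three-node language are
assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def variables(elem):
    """Return the sorted names of all variables used in the expression."""
    return sorted({var.text for var in elem.iter("Var")})


def evaluate(elem, env):
    """Evaluate the expression, looking variables up in env."""
    if elem.tag == "IntConst":
        return int(elem.text)
    if elem.tag == "Var":
        return env[elem.text]
    if elem.tag == "Plus":
        left, right = list(elem)
        return evaluate(left, env) + evaluate(right, env)
    raise ValueError(f"unknown node: {elem.tag}")


# The abstract syntax as it might arrive from another component.
doc = ET.fromstring("<Plus><Var>e</Var><IntConst>2</IntConst></Plus>")
print(variables(doc))            # ['e']
print(evaluate(doc, {"e": 40}))  # 42
```

Both tools are a handful of lines because the hard part, parsing the
concrete syntax correctly, was done once by whoever produced the tree.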
It is just a matter of time until programming languages are
specified in both a concrete syntax _and_ an abstract syntax. In fact,
schemas in some xml schema language are abstract syntax definitions.
We've been doing this in the Stratego/XT project [1] for years now. The
concrete syntax of a language is specified in the SDF2 syntax definition
formalism [2]. Attributes in this concrete syntax definition specify an
abstract syntax as well. From this concrete syntax definition we generate
abstract syntax definitions. The grammar is used as a contract between
software components [3]. Software components exchange abstract syntax
trees in the ATerm format [4].
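
For readers who have not seen term notation, the Python sketch below
renders the xml tree for "e + 2" as a constructor application in the
style of the Var("e") notation used earlier. This is a simplified
illustration, not the ATerm library: it ignores annotations, lists and
sharing, and the rule that digit-only leaves become integers and all
other leaves become quoted strings is my own assumption.

```python
import xml.etree.ElementTree as ET


def to_term(elem):
    """Render an xml element as a constructor application, e.g. Var("e")."""
    children = list(elem)
    if children:
        args = ",".join(to_term(child) for child in children)
    else:
        text = (elem.text or "").strip()
        if text.lstrip("-").isdigit():
            # Assumption: digit-only leaves are integer constants.
            args = text
        else:
            # Everything else is emitted as a quoted, escaped string.
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            args = f'"{escaped}"'
    return f"{elem.tag}({args})"


doc = ET.fromstring("<Plus><Var>e</Var><IntConst>2</IntConst></Plus>")
term_text = to_term(doc)
print(term_text)  # Plus(Var("e"),IntConst(2))
```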
James Clark is already doing this as well, not just with the RELAX NG
concrete syntax and abstract syntax: he has also created DTDinst, an
abstract syntax for the horrible concrete DTD syntax.
Cheers,
Martin Bravenboer
[1] http://www.stratego-language.org/
[2] http://www.program-transformation.org/twiki/bin/view/Tools/SDFII
[3] http://www.cwi.nl/~mdejonge/papers/GrammarsAsContracts.ps
[4] http://www.stratego-language.org/twiki/bin/view/Stratego/ATerm