- To: xml-dev@lists.xml.org
- Subject: how to best integrate XML in a programming language
- From: Burak Emir <Burak.Emir@epfl.ch>
- Date: Wed, 18 Feb 2004 11:11:11 +0100
- User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4
Hi,
Related to my previous announcement (which lacked the URL
http://scala.epfl.ch), I have questions concerning the use of XML in
programming languages. To set the stage, here are three suggestions
that should connect the XML world to the world of general-purpose
programming languages like ML, C, C++ and Java (considering only
statically typed ones here).
I would be very happy if some experts on this list could comment on the
vision described below. Comparable work is being done at Microsoft
Research (http://www.extremetech.com/article2/0,3973,1441099,00.asp and
the article by Erik Meijer, Wolfram Schulte and Gavin Bierman, which was
discussed briefly on this list before) and, to some extent, in XQuery.
When it comes to types, a project again aimed at .NET is described by
Benjamin Pierce: http://www.cis.upenn.edu/~bcpierce/xtatic/outline.html
<point>
No matter whether you are more concerned about documents or data,
writing code in non-XML syntax makes life easier.
</point>
<point>
No matter whether you are more concerned about documents or data,
writing XML syntax to describe XML makes life easier.
</point>
<point>
No matter which programming language you prefer, if it cannot deal with
XML it will soon be forgotten or condemned to accommodate numerical
computations only.
</point>
The first point should be clear: you want to minimize the number of
symbols needed to express e.g. a function call while maintaining
readability. Mathematical notation like f(x) is shorter than
<apply fun="f"><var name="x"/></apply>.
The last point is speculation.
The second point demands more reasoning. As an example, consider how
servlets used to generate HTML pages (and a lot of scripts still do
something similar):
out.println("<html>");
out.println("<head>");
/* etc */
I believe it is better to write
out.println( <html>... </html>.serialize() );
because a compiler can check many of the well-formedness constraints. A
typo is thus less likely to break your neck at runtime. In addition, a
sufficiently sophisticated type system could even check whether your
value conforms to a type specified in a DTD, RELAX NG or XML Schema;
this would be something like "built-in data binding".
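To make the contrast with string printing concrete, here is a minimal sketch in plain Java, without XML literals. The classes Elem and TreeDemo are hypothetical and not part of any real library. Because the page is built as a tree, balanced tags and escaped text content follow from the construction itself; element names are not validated in this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal XML tree, for illustration only.
final class Elem {
    private final String name;
    private final List<Object> children = new ArrayList<>(); // each child is an Elem or a String

    Elem(String name, Object... kids) {
        this.name = name;
        for (Object k : kids) {
            children.add(k);
        }
    }

    // Escape the characters that are markup-significant in text content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Every start tag gets its matching end tag by construction.
    String serialize() {
        if (children.isEmpty()) {
            return "<" + name + "/>";
        }
        StringBuilder sb = new StringBuilder("<" + name + ">");
        for (Object c : children) {
            sb.append(c instanceof Elem ? ((Elem) c).serialize() : escape(c.toString()));
        }
        return sb.append("</").append(name).append(">").toString();
    }
}

class TreeDemo {
    // msg plays the role of an embedded code block inside the literal.
    static String page(String msg) {
        Elem html = new Elem("html",
            new Elem("head", new Elem("title", "demo")),
            new Elem("body", new Elem("b", msg)));
        return html.serialize();
    }

    public static void main(String[] args) {
        System.out.println(page("1 < 2"));
    }
}
```

A language with real XML literals would let the compiler reject a missing end tag outright; in this sketch the same mistake simply cannot be written down.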
In anticipation of several issues that arise when attempting to embed
XML things in a programming language, here is a non-exhaustive list:
- Where are XML literals allowed ?
(There must be a clear entry point at which the language spec links
to the W3C recommendation. This can be problematic if the language
already interprets symbols like < or /> as tokens. In XQuery this
problem does not arise, because a query language is by definition
declarative and result-centric: there, XML literals, possibly with
embedded blocks, are just the result of queries.)
- How to specify that XML literals may contain code blocks ?
(XQuery uses an escape mechanism with braces, <b> { msg } </b>, where
msg is a text variable in the current scope. What should one call such
an XML literal with embedded blocks: an XML document, a half-document,
an XML template, an XML form, or something else?)
- How to deal with entities ?
(In programming language syntax this would correspond to constants.)
- How to deal with namespaces ?
(A natural correspondence would be java packages, or C++ namespaces)
- Canonicalization or not ? How ?
(It is crucial that values in a programming language are unambiguous
representations. Insignificant whitespace should not make two values
unequal. After all, we do not want to impose any formatting constraints
on program code, and XML literals would be part of program code.)
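The whitespace point in the last item can be sketched with the standard Java DOM API. This is one possible reading, not W3C Canonical XML: it assumes that every whitespace-only text node is insignificant, which would be wrong for mixed content. The class WhitespaceEquality and its method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class WhitespaceEquality {
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed: " + xml, e);
        }
    }

    // Recursively remove text nodes that contain only whitespace.
    // Assumption: such nodes never carry meaning in the documents compared.
    static void stripBlankText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripBlankText(child);
            }
            child = next;
        }
    }

    // Two XML texts denote the same value if their trees are equal
    // once whitespace-only text nodes are dropped.
    static boolean sameValue(String a, String b) {
        Document da = parse(a);
        Document db = parse(b);
        stripBlankText(da);
        stripBlankText(db);
        return da.getDocumentElement().isEqualNode(db.getDocumentElement());
    }

    public static void main(String[] args) {
        System.out.println(sameValue("<a><b>x</b></a>", "<a>\n  <b>x</b>\n</a>"));
        System.out.println(sameValue("<a><b>x</b></a>", "<a><b>y</b></a>"));
    }
}
```

With this definition, reindenting an XML literal in program code does not change the value it denotes, while changing its text content does.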
Also in the other direction, things can get problematic:
- How does one write comments in XML literals ?
(that is an easy one, just use XML comments)
- Which type does the XML literal have ?
(This demands a class library that represents XML, which is ideally
not as bloated as DOM)
- How can one navigate in such an XML representation ?
(Ideally, an XPath like syntax or something comparable like pattern
matching is used)
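For comparison, this is roughly what XPath-style navigation looks like today through the standard javax.xml.xpath API in Java, where the path is only a string that gets checked at runtime. The class XPathDemo and the sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathDemo {
    // Return the string value of the first node selected by /html/head/title.
    static String titleOf(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("/html/head/title", new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate path", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<html><head><title>demo</title></head><body/></html>";
        System.out.println(titleOf(doc));
    }
}
```

A language-level path syntax or pattern matching could move the check of such a path from runtime to compile time, which is the motivation behind the question above.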
If you want to get a feel for how this could work, download Scala from
http://scala.epfl.ch. I am quite far from having reached definitive
answers to these questions; however, if such a programming language is
to be useful, good answers need to be found.
cheers,
Burak