- To: xml-dev@lists.xml.org
- Subject: Re: [xml-dev] how to best integrate XML in a programming language
- From: Burak Emir <Burak.Emir@epfl.ch>
- Date: Tue, 09 Mar 2004 15:22:54 +0100
- In-reply-to: <40334E40.4080900@cis.strath.ac.uk>
- References: <40333A3F.3090201@epfl.ch> <40334E40.4080900@cis.strath.ac.uk>
- User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4
Hello,
Thanks to all who replied. I will try to give a (biased) summary of the
last posts in this thread. Maybe there are more ideas, or issues
that I overlooked?
(A) Use XML syntax for XML data. Allow XML wherever data
expressions are allowed.
I disagree with Mark, who said
(http://lists.xml.org/archives/xml-dev/200402/msg00477.html):
>It is unrealistic that a program wants to embed a literal that is an
>entire XML document. It would be more likely that a program wants to
>output an XML document composed of XML literals and data values from
>the running program. Therefore, it is highly unlikely that the
>compiler could do much in the way of ensuring schema conformance.
First, I do not find it unrealistic: especially if XML syntax is used,
there is little difference between a source file and an XML document.
Also, we still have the possibility of parsing external documents and
storing them in a variable if things get too big. But I agree with the
second point.
(B) Allow data expressions and control statements to be embedded in the XML
This can happen either by giving a grammar, as XQuery does, or by adapting
the surface syntax (as in Scheme).
However, you need an escape character in both element content and
attribute content.
val content =
  <mytag>{ MyFormat.header("mytitle") }
    <p>
      Check <a href={obj.theLink}>this</a>
    </p>
    <p>{ " this should be escaped < > & at some point " }</p>
  </mytag>
(This one is like XQuery; James' example in Scheme used '(' instead.)
- This poses the minor problem of how to write the escape character
itself in text. In XQuery, one can just write {{ for {.
- Also, one should still be able to write comments (be they retained or
not). So one should use XML comments wherever XML is expected and
/* language comments */ where other expressions may appear.
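The two escaping concerns in (B) can be sketched as plain string functions. This is only an illustration: `EscapeSketch`, `escapeText` and `unescapeBraces` are hypothetical names, and the sketch assumes that embedded strings are escaped when the tree is serialized and that braces are doubled as in XQuery.

```scala
object EscapeSketch {
  // Escape the characters that must not appear raw in XML text.
  // '&' is replaced first so the entities produced below are not escaped again.
  def escapeText(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

  // Undo XQuery-style doubling: "{{" stands for a literal "{", "}}" for "}".
  def unescapeBraces(s: String): String =
    s.replace("{{", "{").replace("}}", "}")
}
```

With these, the string in the second `<p>` above would be written out with `&lt;`, `&gt;` and `&amp;` in place of the raw characters.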
(C) XML data should be typed
- the type should have an implicit conversion to String
- it should at least expose a labelled tree, i.e. class Element(
label:String, children:List[Element] ); ( S-expressions are roughly the
same )
- static typechecking against a DTD, XML Schema, or RELAX NG schema would
be good.
On the last point, it is not "highly unlikely" (Mark), but an emerging
research topic. Pointers to Benjamin Pierce's and Erik Meijer et al.'s
pages were in my initial post.
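A minimal sketch of the labelled tree from (C), with a conversion to String. The `Text` node and the `render` function are assumptions added for illustration; the `Element` class above only mentions a label and child elements.

```scala
object TreeSketch {
  sealed trait Node
  final case class Text(value: String) extends Node
  final case class Element(label: String, children: List[Node]) extends Node

  // Escape text so the rendered string is well-formed XML.
  private def escape(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

  // Explicit conversion to String; an implicit conversion could simply call this.
  def render(n: Node): String = n match {
    case Text(v)              => escape(v)
    case Element(label, Nil)  => s"<$label/>"
    case Element(label, kids) => s"<$label>${kids.map(render).mkString}</$label>"
  }
}
```

Attributes are left out to keep the sketch short; they would be one more field on `Element`.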
It is the subtle points that demand attention:
(Entities:)
Clearly, if one has variables, one does not need parsed entities
(James), i.e.
<p> { theStuff } </p> instead of <p> &theStuff; </p>
where 'theStuff' is in the current scope. However, should the same be
done with character entities?
- either one leaves them unexpanded, so they are just text
- or one has to expand them, in which case they have to be declared
somewhere. Where, if there is no DTD?
(Namespaces:)
- If the language has packages, then namespaces should be something like
that. However
- packages must be *renamable*, because namespaces in reality are URLs.
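One way to read "renamable" is that the prefix is only a local alias and names are compared by namespace URI. The sketch below assumes that reading; `NamespaceSketch`, `QName` and `resolve` are hypothetical names, and unprefixed names are deliberately not handled.

```scala
object NamespaceSketch {
  // A resolved name: namespace URI plus local part. The prefix is not stored.
  final case class QName(uri: String, local: String)

  // Resolve "prefix:local" against the prefix bindings currently in scope.
  // Returns None for unbound prefixes and for names without a prefix.
  def resolve(bindings: Map[String, String], prefixed: String): Option[QName] =
    prefixed.split(":", 2) match {
      case Array(prefix, local) => bindings.get(prefix).map(QName(_, local))
      case _                    => None
    }
}
```

Two scopes that bind different prefixes to the same URI then resolve `h:p` and `xhtml:p` to equal names.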
(Canonicalization:)
- should not be default, because cost outweighs benefit (James' post)
However
- some definition of structural equality is needed, which should
probably be "equal up to whitespace".
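"Equal up to whitespace" can be defined in more than one way. The sketch below assumes the simplest one: text nodes that contain only whitespace are ignored, and everything else must match exactly. All names are hypothetical.

```scala
object EqualitySketch {
  sealed trait Node
  final case class Text(value: String) extends Node
  final case class Element(label: String, children: List[Node]) extends Node

  // True for text nodes that consist of whitespace only.
  private def isBlank(n: Node): Boolean = n match {
    case Text(v) => v.trim.isEmpty
    case _       => false
  }

  // Recursively drop whitespace-only text nodes.
  def stripBlank(n: Node): Node = n match {
    case Element(label, kids) => Element(label, kids.filterNot(isBlank).map(stripBlank))
    case t: Text              => t
  }

  // Case classes compare structurally, so == on the stripped trees is enough.
  def equalUpToWhitespace(a: Node, b: Node): Boolean =
    stripBlank(a) == stripBlank(b)
}
```

This is much cheaper than full canonicalization, but it also treats whitespace in mixed content as insignificant, which may not always be wanted.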
(Navigation:)
- some XPath- or pattern-matching-based scheme is needed to decompose and
navigate trees
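A sketch of the pattern-matching option, using the same kind of labelled tree. `descendants` behaves roughly like XPath's descendant-or-self axis restricted to one label; the names are hypothetical and attributes are again left out.

```scala
object NavigationSketch {
  sealed trait Node
  final case class Text(value: String) extends Node
  final case class Element(label: String, children: List[Node]) extends Node

  // All elements with the given label, the root included, in document order.
  def descendants(n: Node, label: String): List[Element] = n match {
    case e @ Element(l, kids) =>
      val here = if (l == label) List(e) else Nil
      here ++ kids.flatMap(descendants(_, label))
    case Text(_) => Nil
  }

  // Concatenated text content of a node.
  def text(n: Node): String = n match {
    case Text(v)          => v
    case Element(_, kids) => kids.map(text).mkString
  }
}
```

For example, `descendants(doc, "a").map(text)` collects the link texts of a document `doc`.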
cheers,
Burak