Planet Stratego
You have a term and it don't look good, who're you gonna call?
June 10, 2009
Our paper on StringBorg is being published by Science of Computer Programming:
M. Bravenboer, E. Dolstra, and E. Visser. Preventing Injection Attacks with Syntax Embeddings. A Host and Guest Language Independent Approach. Science of Computer Programming, 2009.
StringBorg is a technique for embedding 'string' languages in general purpose languages in a safe way, to avoid injection attacks.
The paradigmatic example is the embedding of SQL queries, which typically is done using string literals as in the following example:
String userName = getParam("userName");
String password = getParam("password");
String query = "SELECT id FROM users "
+ "WHERE name = '" + userName + "' "
+ "AND password = '" + password + "'";
if (executeQuery(query).size() == 0)
throw new Exception("bad user/password");
In these approaches it is very easy to forget to escape SQL meta characters in the values obtained from the client. This opens the
door to an attack through a query that escapes from the programmed query.
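To make the attack concrete, here is a minimal sketch (not taken from the paper) of what the concatenated query turns into when the client supplies a crafted password. The class name `InjectionDemo` and the helper `buildQuery` are hypothetical; only the query text follows the example above.

```java
// Hypothetical demo: shows how string concatenation lets input rewrite a query.
class InjectionDemo {
    // Builds the query by concatenation, as in the example above.
    static String buildQuery(String userName, String password) {
        return "SELECT id FROM users "
             + "WHERE name = '" + userName + "' "
             + "AND password = '" + password + "'";
    }

    public static void main(String[] args) {
        // A well-behaved client.
        System.out.println(buildQuery("alice", "secret"));
        // A crafted password closes the string literal and appends an OR
        // clause, so the condition no longer depends on the stored password.
        System.out.println(buildQuery("alice", "' OR '1' = '1"));
    }
}
```

The second query ends in `AND password = '' OR '1' = '1'`, which is a different syntactic structure from the one the programmer wrote.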
StringBorg prevents such attacks by syntactically embedding the query language in the host language. For example, the query
above can then be written as follows:
SQL q = <| SELECT id FROM users
WHERE name = ${userName} AND password = ${password} |>;
if (executeQuery(q.toString()).size() == 0) ...
Now, the syntax of the query is checked statically. But more importantly, at run-time the query is constructed by a query
API that ensures that the query constructed has the same syntactic structure as the one defined by the programmer.
Furthermore, it enforces escaping meta-characters in values spliced into the query, thus guaranteeing that no injection
attacks can occur.
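The idea of a query API that escapes spliced values can be sketched by hand. The following is not the code StringBorg generates; it is a hypothetical stand-in (the class `SqlSketch` and its methods `fragment`, `value`, and `then` are made up for illustration) that assumes values are spliced as SQL string literals and escaped by doubling single quotes.

```java
// Hypothetical stand-in for a generated query API; not StringBorg's actual code.
final class SqlSketch {
    private final String text;

    private SqlSketch(String text) { this.text = text; }

    // Wraps a fixed query fragment written by the programmer.
    static SqlSketch fragment(String sql) { return new SqlSketch(sql); }

    // Splices a client-supplied value as a string literal, doubling single
    // quotes so the value cannot terminate the literal early.
    SqlSketch value(String raw) {
        return new SqlSketch(text + "'" + raw.replace("'", "''") + "'");
    }

    // Appends another fixed fragment.
    SqlSketch then(String sql) { return new SqlSketch(text + sql); }

    @Override public String toString() { return text; }

    public static void main(String[] args) {
        SqlSketch q = fragment("SELECT id FROM users WHERE name = ")
            .value("alice")
            .then(" AND password = ")
            .value("' OR '1' = '1");
        // The crafted password stays inside a single string literal.
        System.out.println(q);
    }
}
```

Quote doubling is only one escaping rule; a real implementation would have to follow the escaping conventions of the specific SQL dialect in use.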
The paper does not just provide a solution for embedding SQL in Java, but offers a generic approach for embedding
any guest language in any host language with little more effort than providing syntax definitions for host and guest language.
Abstract: Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation, the concatenation of constants and client-supplied strings. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages (e.g. SQL) into that of the host language (e.g. Java) and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate. This approach is generic, meaning that it can be applied with relative ease to any combination of context-free host and guest languages.
June 10, 2009 08:38
May 30, 2009
For the participants of our hands-on tutorial on Creating DSLs with Stratego/XT at the Code Generation 2009 conference, we (i.e. Rob Vermaas) created a VirtualBox image with all the software needed during the tutorial.
In particular, it contains a full installation of Stratego/XT with compiler, libraries, and auxiliary packages such as java-front.
In addition, the image contains a built-from-source installation of the WebDSL language for building web applications, and an installation of Tomcat for deploying created web apps.
The image and instructions for its installation are available from
http://strategoxt.org/Stratego/CodeGeneration2009Tutorial
During the tutorial additional material will be handed out with concrete exercises.
The virtual machine is also useful for exploring Stratego/XT and WebDSL outside the context of this particular tutorial.
May 30, 2009 17:41
May 23, 2009
Spoofax/IMP is a toolset for the creation of interactive development environments for custom languages based on domain-specific languages for editor services. The toolset is especially aimed at the developers of domain-specific languages, allowing them to provide IDE support for their specialist language under development. An important feature of Spoofax/IMP is the support for language composition, i.e. for languages consisting of multiple, syntactically different, sub-languages. Furthermore, the toolset allows the customization of heuristically generated editor services without losing the ability to regenerate these services when a language evolves.
At LDTA 2009 we presented a paper about Spoofax/IMP. The final version of that paper is now finished, and a pre-print is available.
L. C. L. Kats, K. T. Kalleberg, and E. Visser. Domain-Specific Languages for Composable Editor Plugins. In T. Ekman and J. Vinju, editors, Proceedings of the Ninth Workshop on Language Descriptions, Tools, and Applications (LDTA 2009), Electronic Notes in Theoretical Computer Science. Elsevier Science Publishers, April 2009. [pdf]
Abstract:
Modern IDEs increase developer productivity by incorporating many
different kinds of editor services. These can be purely syntactic,
such as syntax highlighting, code folding, and an outline for
navigation; or they can be based on the language semantics, such as
in-line type error reporting and resolving identifier declarations.
Building all these services from scratch requires both the extensive
knowledge of the sometimes complicated and highly interdependent APIs
and extension mechanisms of an IDE framework, and an in-depth
understanding of the structure and semantics of the targeted language.
This paper describes Spoofax/IMP, a meta-tooling suite that provides
high-level domain-specific languages for describing editor services,
relieving editor developers from much of the framework-specific
programming. Editor services are defined as composable modules of
rules coupled to a modular SDF grammar. The composability provided by
the SGLR parser and the declaratively defined services allows embedded
languages and language extensions to be easily formulated as
additional rules extending an existing language definition. The
service definitions are used to generate Eclipse editor plugins. We
discuss two examples: an editor plugin for WebDSL, a domain-specific
language for web applications, and the embedding of WebDSL in
Stratego, used for expressing the (static) semantic rules of WebDSL.
May 23, 2009 11:29
May 10, 2009
We just got the notification that our submission to OOPSLA 2009 has been accepted.
The paper presents a solution to error recovery for the SGLR parsing algorithm. Here's the full citation and abstract (pre-print will follow later):
Lennart C. L. Kats, Maartje de Jonge, Emma Nilsson-Nyman, and Eelco Visser.
"Providing Rapid Feedback in Generated Modular Language Environments. Adding Error Recovery to Scannerless Generalized-LR Parsing"
In Gary T. Leavens, editor, Proceedings of the 24th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2009), New York, NY, USA, October 2009. ACM. (to appear).
Abstract: Integrated Development Environments (IDEs) increase programmer
productivity, providing rapid, interactive feedback based on the
syntax and semantics of a language. A heavy burden lies on developers
of new languages to provide adequate IDE support. Code generation
techniques provide a viable, efficient approach to semi-automatically
produce IDE plugins. Key components for the realization of plugins are
the language's grammar and parser. For embedded languages and
language extensions, constituent IDE plugin modules and their grammars
can be combined. Unlike conventional parsing algorithms, scannerless
generalized-LR parsing supports the full set of context-free grammars,
which is closed under composition, and hence can parse language
embeddings and extensions composed from separate grammar modules. To
apply this algorithm in an interactive environment, this paper
introduces a novel error recovery mechanism, which allows it to be
used with files with syntax errors -- common in interactive
editing. Error recovery is vital for providing rapid feedback in case
of syntax errors, as most IDE services depend on the parser -- from
syntax highlighting to semantic analysis and cross-referencing. We
base our approach on the principles of island grammars, and
automatically generate new productions for existing grammars, making
them more permissive of their inputs. To cope with the added
complexity of these grammars, we adapt the parser to support
backtracking. We evaluate the recovery quality and performance of our
approach using a set of composed languages, based on Java and
Stratego.
May 10, 2009 22:19
May 02, 2009
I have been playing around for a couple of minutes with yUML, an online service by Tobin Harris for creating UML diagrams using a textual input language.
The diagram below is generated while you load this page.
The input needed to generate the diagram is the following list of relations:
[Publication]++->*[Author],
[AbstractAuthor]^[Author],
[AbstractAuthor]*->1[Person],
[AbstractAuthor]*->1[Affiliation],
[Person]*->*[Publication],
[Publication]^[PrintPublication],
[PrintPublication]^[Article],
[PrintPublication]^[InProceedings],
[Publication]^[PublishedVolume],
[PublishedVolume]^[Proceedings],
[InProceedings]->[Proceedings],
[AbstractAuthor]^[Editor],
[PublishedVolume]++->*[Editor],
[Person]*->*[PublishedVolume],
[PublishedVolume]^[Book],
[PrintPublication]^[InCollection],
[Book]->*[InCollection] .
This diagram documents a (small) subset of the data model underlying the
researchr.org application for bibliography sharing and reviewing.
May 02, 2009 09:28
April 30, 2009
I have been invited to give a talk at the
Sixth International Workshop on Web Information Systems Modeling (WISM 2009),
which will be held in June in Amsterdam (co-located with CAiSE 2009).
Here's the abstract I wrote for the talk.
Abstract:
In this talk I give an overview of the design and application of
WebDSL, a domain-specific language for data centric web applications.
WebDSL linguistically integrates the definition of data models, user
interfaces, actions, access control rules, data validation rules,
styling rules, and workflow definitions. While maintaining separation
between these concerns through specialized sub-languages, linguistic
integration ensures static consistency checking and correct code
generation. The language allows developers to concentrate on the
essential design of web applications, abstracting from accidental
complexity, such as the details of data persistence. The combination
of high-level and low-level constructs ensures high expressivity,
while supporting customization to application requirements. The
application of WebDSL is illustrated using the researchr.org
application for bibliography sharing and reviewing.
April 30, 2009 09:38
April 13, 2009
It was probably unavoidable. Tweetie app on iPod makes it easy. It seems I'm tweeting.
Blogging for the lazy. Let's see if I have more to say, or more frequently at least, on twitter than on this blog.
April 13, 2009 12:23
April 06, 2009
At the Code Generation 2009 conference,
Lennart Kats and I will give a hands-on tutorial
about building DSLs with Stratego/XT. The program says it thus:
Stratego/XT is a state-of-the-art language and toolset for the development of domain-specific language implementations. In this session the participants learn to use Stratego/XT, by developing a generator for a small DSL for web applications generating PHP. The tutorial covers declarative syntax definition with SDF, code generation by model transformation, and model-to-model transformation by rewriting. The tutorial is based on a course in model-driven software development developed at Delft University of Technology.
NB Since this is a hands-on session places are strictly limited. Please let us know whether you plan to attend this session when you book your conference place. Places will be allocated on a first-come first-served basis.
April 06, 2009 20:35
March 22, 2009
After some recent activity on the psat-dev mailing list I became aware of the (lack of) available builds for php-sat. Even though the development speed is not what I would want it to be (so many fun things to do, so few hours in a day!) I still believe it is important to release early and often.
Fortunately, Eelco Dolstra had some time to migrate php-front, php-sat and php-tools to Hydra, the new Nix-based continuous build system. After some tweaking we now again have access to unstable builds for all PSAT projects. Go Hydra!

March 22, 2009 13:10
March 05, 2009
These days I am writing a book on ‘domain-specific language engineering’ for use
in a master’s course at Delft University. The book is about the design and implementation
of domain-specific languages, i.e. the definition of their syntax, static
semantics, and code generators. But it also contains a dose of linguistic reflection,
by studying the phenomenon of defining languages, and of defining languages for
defining languages (which is called meta-modelling these days).
Writing chapters on syntax definition and modeling of languages takes me
back to my days as a PhD student at the University of Amsterdam. Our quarters
were in the university building at the Watergraafsmeer, which was connected to the
CWI building via a bridge. Since the ASF+SDF group of Paul Klint was divided
over the two locations, meetings required a walk to the other end of the building.
So, I would regularly wander to the CWI part of the building to chat.
[While the third application we learned to use in our Unix course in 1989 was talk, with which
one could synchronously talk with someone else on the internet (the first application was probably
csh and the second email), face-to-face meetings were still the primary mode of communication;
as opposed to the use of IRC to talk to one’s officemate.]
Often I
would look into Jan Heering’s office to say hi, and more often than not would end
up spending the rest of the afternoon discussing research and meta-research.
One of the recurring topics in these conversations was the importance of examples.
Jan was fascinated by the notion of ‘programming by example’, i.e. deriving
a program from a bunch of examples of its expected behaviour, instead of a rigorous
and complete definition for all cases. But the other use of examples was for
validation, a word I didn’t learn until long after writing my thesis.
The culture of the day (and probably location?) was heavily influenced by
mathematics and theoretical computer science. The game was the definition, preferably
algebraically, of the artifacts of interest, and then, possibly proving interesting
properties. The application minded would actually implement stuff. As a language
engineer I was mostly interested in making languages with cool features. The motivation
for these features was often highly abstract. The main test example driving
much of the work on the ASF+SDF MetaEnvironment was creating an interactive
environment for the Pico language (While with variable declarations). The idea
being that once an environment for Pico was realized, creating one for a more realistic
language would be a matter of scaling up the Pico definition (mere engineering).
Actually making a language (implementation) and using that to write programs
would be a real test. To be fair, there were specifications of larger languages undertaken,
such as ones of (mini-) ML [8] and Pascal [6]. As a student I had developed a
specification of the syntax and static semantics of the object-oriented programming
language Eiffel [16], but that was so big it was not usable at the Sun workstations
we had at that time.
Time and again, Jan Heering would stress the importance of real examples to
show the relevance of a technique and/or to discover the requirements for a design.
While I thought it was a cool idea, I didn’t have examples. At least not to sell the
design and implementation of SDF2, the syntax definition formalism that turned
out to be the main contribution of my PhD thesis [20].
Continue reading "Example-Driven Research"
March 05, 2009 10:20
February 25, 2009
If you're doing research into domain-specific languages, model-driven engineering, or program generation, your agenda for the coming months is set. Early October the three main conferences on these topics are co-located in Denver. The deadlines are somewhat spread, so you should be able to submit a paper to each conference:
May 10:
Model Driven Engineering Languages and Systems (MODELS'09)
May 18:
Generative Programming and Component Engineering (GPCE'09)
July 10:
Software Language Engineering (SLE 2009)
I'm looking forward to your submission, and to meeting you in Denver.
February 25, 2009 14:50
February 01, 2009
Lennart Kats, Eelco Visser and I just got a paper accepted to LDTA'09. The paper is about declarative languages for describing programming editors. The main part of it is Lennart's work, but it's running on top of the Spoofax transformation infrastructure. The idea is simple: You don't want to fight with Java, complicated APIs and complicated XML when you implement an Eclipse-based editor for your DSL. Instead, you describe your language's grammar with SDF, provide some auxiliary information using our declarative editor languages, and Spoofax/IMP does the rest by generating the editor engine for you.
The abstract explains it in the usual academic style:
Modern IDEs increase developer productivity by incorporating many different kinds of editor services. These can be purely syntactic, such as syntax highlighting, code folding, and an outline for navigation; or they can be based on the language semantics, such as in-line type error reporting and resolving identifier declarations. Building all these services from scratch requires both the extensive knowledge of the sometimes complicated and highly interdependent APIs and extension mechanisms of an IDE framework, and an in-depth understanding of the structure and semantics of the targeted language.
This paper describes Spoofax/IMP, a meta-tooling suite that provides high-level domain-specific languages for describing editor services, relieving editor developers from much of the framework-specific programming. Editor services are defined as composable modules of rules coupled to a modular SDF grammar. The composability provided by the SGLR parser and the declaratively defined services allows embedded languages and language extensions to be easily formulated as additional rules extending an existing language definition. The service definitions are used to generate Eclipse editor plugins.
We discuss two examples: an editor plugin for WebDSL, a domain-specific language for web applications, and the embedding of WebDSL in Stratego, used for expressing the semantic rules of WebDSL.
Once I get bibtex-tools running on 64bit again, I'll link to the bib and pdf.
February 01, 2009 18:41
January 17, 2009

Like last year, I'm going to FOSDEM. Last year, I traveled from Paris with the express train (great experience). This year, it's back to planes again. Not looking forward to the security hysteria.
If you're interested in meeting me there, don't hesitate to fire off an e-mail.
January 17, 2009 21:55
December 19, 2008
As mentioned before, we've been doing some real parsing research to better support parsers for extensible languages. Parse table composition provides separate compilation for syntax components such that syntax extensions can be provided as plugins to a compiler for a base language. Due to various distractions last Summer I seem to have forgotten to blog about the paper that Martin Bravenboer and I got accepted at the first international conference on Software Language Engineering (which Martin was looking forward to).
M. Bravenboer and E. Visser. Parse Table Composition. Separate Compilation and Binary Extensibility of Grammars. In D. Gasevic and E. van Wyk, editors, First International Conference on Software Language Engineering (SLE 2008). To appear in Lecture Notes in Computer Science, Heidelberg, 2009. Springer. [pdf]

Abstract:
Module systems, separate compilation, deployment of binary
components, and dynamic linking have enjoyed wide acceptance in
programming languages and systems. In contrast, the syntax of
languages is usually defined in a non-modular way, cannot be
compiled separately, cannot easily be combined with the syntax of
other languages, and cannot be deployed as a component for later
composition. Grammar formalisms that do support modules use whole
program compilation.
Current extensible compilers focus on source-level extensibility,
which requires users to compile the compiler with a specific
configuration of extensions. A compound parser needs to be
generated for every combination of extensions. The generation of
parse tables is expensive, which is a particular problem when the
composition configuration is not fixed to enable users to choose
language extensions.
In this paper we introduce an algorithm for parse table
composition to support separate compilation of grammars to
parse table components. Parse table components can be
composed (linked) efficiently at runtime, i.e. just before
parsing. While the worst-case time complexity of parse table
composition is exponential (like the complexity of parse table
generation itself), for realistic language combination scenarios
involving grammars for real languages, our parse table composition
algorithm is an order of magnitude faster than computation of the
parse table for the combined grammars.
The experimental parser generator is available online.
December 19, 2008 09:50
December 16, 2008
A few weeks ago an e-mail from Didier Garcin popped up on the Stratego mailing list. He explained that he had written a Python script that could visualize an abstract syntax signature. The script was also sent to the list and I finally had some time to check it out.
It turned out that I had almost everything installed to use the script; only the pydot dependency needed some work. This was mostly because the script only seems to work with version 0.9.10 of this library. After getting the script started it was really simple to generate the signature from the PHP-Front grammar.
So, here is the one for PHP4:

And the one for PHP5:

The first thing I noticed were the big rectangles in both versions. These rectangles are the statements (the smaller one) and the expressions grouped together. I also noticed that in both versions the bottom of the graph (which corresponds to the smallest units in the language) looks the most complicated. This corresponds very well with the amount of effort put into the modeling of this part of the language.
Comparing the two grammars, we can see that the latest version is the most complicated one. Furthermore, if we compare both graphs to the graphs shown in this post we can see that the Java graph appears to be the most similar one.
Actually, I do not think these images show anything, but please explain it to me if you think I just don't see it. Anyway, at least we have some nice pictures now :)
P.S. for those who are interested, more detailed images (in svg-format) are available here.
December 16, 2008 21:36
December 15, 2008
Last Summer I attended the Code Generation 2008 conference in Cambridge to give a tutorial on WebDSL, as a case study in domain-specific language engineering. The conference was an interesting change from the usual academic conferences I visit, in that the majority of the audience were from industry. It was good to see the interest in code generation in industry, but also disconcerting to observe the gap between academic research and industrial practice; but more about that some other time.

During the conference I was interviewed by Laurence Tratt for Software Engineering Radio about parsing.
The interview podcast recently appeared as Episode 118.
It was a long time ago (1997) that I defended my PhD thesis, which was mostly about syntax definition and parsing.
In particular, I introduced SDF2, which radically integrates lexical and context-free syntax, and the SGLR parsing algorithm for parsing arbitrary 'character-level' context-free grammars.
Since finishing my thesis I have done quite a bit of `applied parsing research', using SDF and SGLR for applications such as
meta-programming with concrete object syntax and
DSL embedding, but I don't consider myself a hard-core parsing researcher any more.
So I had to dig deep in my memory to talk about Noam Chomsky's language hierarchy, grammars as string rewrite systems, and parsing algorithms. I find the result a bit awkward to listen to, but people assure me that is because it is my own voice I'm listening to.
In the meantime my relation to parsing is changing again.
While SDF/SGLR still provides the best approach to declarative definition of composite languages (in my opinion at least),
it has some fundamental limitations which have never been addressed.
A first step in addressing these limitations was taken in the SLE 2008 paper with Martin Bravenboer on parse table
composition (see upcoming blog) to provide separate compilation for grammars.
With a new PhD student starting in the new year, I hope to address other limitations such as the lack of error recovery.
December 15, 2008 21:11
December 12, 2008
The paper "Decorated Attribute Grammars" by Lennart Kats, Tony Sloane and Eelco Visser has been accepted for presentation at the International Conference on Compiler Construction (CC 2009) to be held in March 2009 in York (UK).
[pdf]

Abstract:
Attribute grammars are a powerful specification formalism for
tree-based computation, particularly for software language
processing. Various extensions have been proposed to abstract
over common patterns in attribute grammar
specifications. These include
various forms of copy rules to support non-local dependencies,
collection attributes, and expressing dependencies that are
evaluated to a fixed point. Rather than implementing
extensions natively in an attribute evaluator, we propose
attribute decorators that describe an abstract
evaluation mechanism for attributes, making it possible to
provide such extensions as part of a library of
decorators. Inspired by strategic programming, they are
specified using generic traversal operators. To demonstrate
their effectiveness, we describe how to employ decorators in
name, type, and flow analysis.
The ideas have been implemented in Aster, an extension of Stratego
with reference attribute grammars.
December 12, 2008 09:45
December 04, 2008
I'm working on the specification of pointer analysis for Java using Datalog. Basically, a pointer analysis computes for each variable in a program the set of objects it may point to at run-time.
For this purpose I need to express parts of the JVM Spec in Datalog as well. As a simple example, the following Datalog rules define when a class is a subclass of another class.
/**
* JVM Spec:
* - A class A is a subclass of a class C if A is a direct
* subclass of C
*/
Subclass(?c, ?a) <-
DirectSubclass[?a] = ?c.
/**
* JVM Spec:
* - A class A is a subclass of a class C if there is a direct
* subclass B of C and class A is a subclass of B
*/
Subclass(?c, ?a) <-
Subclass(?b, ?a),
DirectSubclass[?b] = ?c.
As you can see, this is remarkably close to the original specification (quoted in comments). You can clearly see the relationship between the spec and the code, even if you are not familiar with Datalog.
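For readers more at home in Java than in Datalog, the two rules can be read as following the chain of direct superclasses. The sketch below is only an illustration of that reading, not part of the actual specification: the class `SubclassSketch`, the method `superclasses`, and the map `directSuper` (standing in for `DirectSubclass`) are hypothetical names.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: derives the Subclass relation from DirectSubclass.
class SubclassSketch {
    // directSuper maps a class name to the name of its direct superclass,
    // playing the role of DirectSubclass[?a] = ?c in the rules above.
    // Returns every c for which Subclass(c, a) holds.
    static Set<String> superclasses(Map<String, String> directSuper, String a) {
        Set<String> result = new LinkedHashSet<>();
        String c = directSuper.get(a);
        // Follow the chain of direct superclasses; add() returns false on a
        // repeat, which guards against cyclic (malformed) input.
        while (c != null && result.add(c)) {
            c = directSuper.get(c);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> directSuper = new HashMap<>();
        directSuper.put("java.lang.Integer", "java.lang.Number");
        directSuper.put("java.lang.Number", "java.lang.Object");
        System.out.println(superclasses(directSuper, "java.lang.Integer"));
    }
}
```

The Datalog version states the same relation declaratively and leaves the evaluation order to the engine, which is what keeps it so close to the wording of the JVM Spec.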
Recently, I was working on the specification of the checkcast instruction. This instruction performs the run-time check if an object can be cast to some type. The JVM Spec for checkcast first defines some variables:
The following rules are used to determine whether an objectref that
is not null can be cast to the resolved type: if S is the class of
the object referred to by objectref and T is the resolved class,
array, or interface type, checkcast determines whether objectref can
be cast to type T as follows:
So, this basically says that we're checking the cast (T) S.
The first rule for this cast is straightforward:
If S is an ordinary (nonarray) class, then:
- If T is a class type, then S must be the same class as T, or a
subclass of T.
- If T is an interface type, then S must implement interface
T.
Well, if you're somewhat familiar with Java, or object-oriented
programming, then this part is obvious. Again, the specification in
Datalog is easy:
CheckCast(?s, ?s) <-
ClassType(?s).
CheckCast(?s, ?t) <-
Subclass(?t, ?s).
CheckCast(?s, ?t) <-
ClassType(?s),
Superinterface(?t, ?s).
However, the next alternative in the specification is confusing:
If S is an interface type, then:
- If T is a class type, then T must be Object.
- If T is an interface type, then T must be the same interface as
S or a superinterface of S.
The specification is crystal clear, but how can S ever be an interface
type? S is the type of the object that is being cast, and how can an
object ever have a run-time type that is an interface? Of course, the
static type of an expression can be an interface, but we're talking
about the run-time here!
I searched the web, which only resulted in a few hits. There was one question on a Sun forum years ago, where the one answer didn't make a lot of sense.
It turns out that this is indeed an `impossible' case. The reason why
this item is in the specification, is because checkcast is recursively
defined for arrays:
If S is a class representing the array type SC[], that is, an array of
components of type SC, then:
- ...
- If T is an array type TC[], that is, an array of components of
type TC, then one of the following must be true:
- ...
- TC and SC are reference types, and type SC can be cast to TC
by recursive application of these rules.
So, if you have an object of type List[] that is cast to
a Collection[], then the rules for checkcast get
recursively invoked for the types S = List and T =
Collection. Notice that List is an interface, but an object can
have type List[] at run-time. I have not verified this with the JVM
Spec maintainers, but as far as I can see, this is the only reason why
the rule for interface types is there.
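The situation is easy to reproduce in plain Java. The following small demo (the class `InterfaceArrayCast` and the helper `hasInterfaceComponent` are hypothetical names) creates an array object whose run-time component type is an interface and then casts it to an array of a superinterface.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical demo of the List[] to Collection[] case discussed above.
class InterfaceArrayCast {
    // True if the run-time component type of the given array is an interface.
    // Assumes the argument really is an array.
    static boolean hasInterfaceComponent(Object array) {
        return array.getClass().getComponentType().isInterface();
    }

    public static void main(String[] args) {
        // No object has run-time class List, but an array object can have
        // run-time class List[].
        Object o = new List<?>[0];
        System.out.println(hasInterfaceComponent(o));

        // The downcast from Object requires a run-time check; for the
        // component types, S = List (an interface) and T = Collection.
        Collection<?>[] c = (Collection<?>[]) o;
        System.out.println(c.length);
    }
}
```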
Just to show a little bit more of my specifications, here is the rule
for the array case I just quoted from the JVM Spec:
CheckCast(?s, ?t) <-
ComponentType[?s] = ?sc,
ComponentType[?t] = ?tc,
ReferenceType(?sc),
ReferenceType(?tc),
CheckCast(?sc, ?tc).
Isn't it beautiful how this exactly corresponds to the formal
specification?
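To make the recursion concrete, here is a minimal Python sketch of the same rules. It is not the Datalog specification itself: the tiny type hierarchy is hand-written and heavily simplified (for instance, ArrayList is given Object as its direct superclass), the function names are my own, and only reference types are modelled.

```python
# Toy model of the checkcast rules; the hierarchy is an assumed, simplified
# stand-in for the real Java class library, not data extracted from a jar.
SUPERCLASS = {  # class -> direct superclass
    "Object": None,
    "Number": "Object",
    "Integer": "Number",
    "ArrayList": "Object",
}
CLASS_INTERFACES = {  # class -> directly implemented interfaces
    "Number": ["Serializable"],
    "ArrayList": ["List", "Serializable"],
}
INTERFACE_SUPERS = {  # interface -> directly extended interfaces
    "Collection": [],
    "List": ["Collection"],
    "Serializable": [],
    "Cloneable": [],
}


def is_class(t):
    return t in SUPERCLASS


def superclasses(c):
    """Yield the proper superclasses of class c, nearest first."""
    c = SUPERCLASS[c]
    while c is not None:
        yield c
        c = SUPERCLASS[c]


def superinterfaces(t):
    """All interfaces that t implements or extends, transitively."""
    if is_class(t):
        todo = list(CLASS_INTERFACES.get(t, []))
        for sc in superclasses(t):
            todo.extend(CLASS_INTERFACES.get(sc, []))
    else:
        todo = list(INTERFACE_SUPERS[t])
    seen = set()
    while todo:
        i = todo.pop()
        if i not in seen:
            seen.add(i)
            todo.extend(INTERFACE_SUPERS[i])
    return seen


def check_cast(s, t):
    """True if an object with run-time type s can be cast to type t."""
    if s.endswith("[]"):
        if t.endswith("[]"):
            # Array case: recurse on the component types.
            return check_cast(s[:-2], t[:-2])
        # An array can otherwise only be cast to Object or to the
        # interfaces that every array implements.
        return t in ("Object", "Cloneable", "Serializable")
    if t.endswith("[]"):
        return False
    if is_class(s):
        if is_class(t):
            return s == t or t in superclasses(s)
        return t in superinterfaces(s)
    # s is an interface: only reachable through the array recursion above.
    if is_class(t):
        return t == "Object"
    return s == t or t in superinterfaces(s)


print(check_cast("List[]", "Collection[]"))  # True
```

The interface branch at the end is the `impossible' case: it is never reached with a top-level S, only when the array rule recurses on component types such as List and Collection.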
Unfortunately, even formal specifications can have errors, so I also
specified a large testsuite that checks the specifications with
concrete code. Here are some of the tests for CheckCast.
test Casting to self
using database tests/hello/Empty.jar
assert
CheckCast("java.lang.Integer", "java.lang.Integer")
test Casting to superclasses
using database tests/hello/Empty.jar
assert
CheckCast("java.lang.Integer", "java.lang.Number")
CheckCast("java.lang.Integer", "java.lang.Object")
test Cast ArrayList to various superinterfaces
using database tests/hello/Arrays.jar
assert
CheckCast("java.util.ArrayList", "java.util.List")
CheckCast("java.util.ArrayList", "java.util.Collection")
CheckCast("java.util.ArrayList", "java.io.Serializable")
test Cast class[] to implemented interface[]
using database tests/hello/Arrays.jar
assert
CheckCast("java.util.ArrayList[]", "java.util.List[]")
CheckCast("java.lang.Integer[]", "java.io.Serializable[]")
test Cast interface[] to superinterface[]
using database tests/hello/Arrays.jar
assert
CheckCast("java.util.List[]", "java.util.Collection[]")
The tests are specified in a little domain-specific language for
unit-testing Datalog that I implemented, initially for IRIS and later for LogicBlox. This tool is similar to
parse-unit,
a tool I wrote earlier for testing parsers in Stratego/XT. The concise syntax
of a test encourages you to write a lot of tests. Domain-specific
languages rock for this purpose!
December 04, 2008 15:03
July 11, 2008
The paper "WebWorkFlow: An Object-Oriented Workflow Modeling Language for Web Applications" by Zef Hemel, Ruben Verhaaf and Eelco Visser has been accepted for presentation at the conference on Model Driven Engineering Languages and Systems (MODELS 2008) to be held in Toulouse, France at the end of September 2008.
Abstract:
Workflow languages are designed for the high-level description
of processes and are typically not suitable for the generation
of complete applications.
In this paper, we present WebWorkFlow, an object-oriented
workflow modeling language for the high-level description of
workflows in web applications.
Workflow descriptions define procedures operating on domain
objects. Procedures are composed using sequential and
concurrent process combinators.
WebWorkFlow is an embedded language, extending WebDSL, a
domain-specific language for web application development, with
workflow abstractions.
The extension is implemented by means of model-to-model
transformations.
Rather than providing an exclusive workflow language,
WebWorkFlow supports interaction with the underlying WebDSL
language. WebWorkFlow supports most of the basic workflow
control patterns.
July 11, 2008 19:39
The paper "Heterogeneous Coupled Evolution" by Sander Vermolen and Eelco Visser has been accepted for presentation at the conference on Model Driven Engineering Languages and Systems (MODELS 2008) to be held in Toulouse, France at the end of September 2008.
Abstract: As most software artifacts, meta-models can evolve. Their evolution
requires conforming models to co-evolve along with them. Coupled
evolution supports this. Its applicability is not limited to the
modeling domain. Other domains are for example evolving grammars or
database schema. Existing approaches to coupled evolution focus on a
single, homogeneous domain. They solve the co-evolution problems
locally and repeatedly. In this paper we will present a systematic,
heterogeneous approach to coupled evolution. It provides an
automatically derived domain specific transformation language; a means
of executing transformations at the top level; a derivation of the
coupled bottom level transformation; and the ability to generically
abstract from elementary transformations. The feasibility of the
architecture is evaluated by applying it to data model evolution as
well as grammar evolution.
July 11, 2008 19:35
July 01, 2008
This year is special. There is a new and exciting conference: the International Conference on Software Language Engineering (SLE). The deadline for submission of papers is July 14th, which is coming up soon! Before I start raving about the topics covered by this conference, here is the disclaimer: I'm on the program committee of this conference, and as such I believe it's my duty to advertise the conference.
Anyway, if done right, this conference has the potential to become a major and prestigious conference. The conference fills a clear gap: the topics of software language engineering do not exactly fit in major programming language conferences like OOPSLA, PLDI, POPL, and ECOOP. Nor do they fit exactly in the area of compiler construction (CC). CC typically does not accept more engineering or methodology-oriented papers. For OOPSLA and ECOOP the work more or less has to be in the context of object-oriented programming, for POPL it immediately has to be a principle (whatever that is), and for PLDI there are usually just a few slots available for papers that don't do something with memory management, garbage collection, program analysis, or concurrency. Personally, I've been pretty successful at getting papers in the area of software language engineering accepted at OOPSLA, but a full conference devoted to this topic is much better!
Another reason why I think that this conference has a lot of potential is that if I look at the list of topics of interest in the call for papers, then I can only think of one summary: everything that's fun! I'm convinced I'm not the only one who thinks these topics are fun. When talking to colleagues, I notice again and again most of us just love languages. The engineering of those languages is an issue for almost all computer scientists and many programmers in industry, and this conference will be the most obvious target for papers about this!
Also, the formalisms used for the specification and implementation of (domain-specific) languages are still very much an open research topic. Standardization of languages is still far from perfect, as discussed by many posts on this blog. Also, new language implementation techniques are being proposed all the time, and extensible compilers for developing language extensions are more popular than ever. Not to mention the increasing interest in using domain-specific languages to help solve the software development problems we're facing.
Earlier in this post I wrote that this conference has major potential if done right. There are a few risks. First, the conference has been started by two relatively small communities: ATEM and LDTA. I think the conference should attract a much larger community than the union of those two communities. I hope lots of people outside of the ATEM and LDTA communities will consider submitting a paper. Second, this year the conference is co-located with MODELS. Many programming language people are slightly allergic to model-driven engineering. I hope they will realize that this conference is not specifically a model-driven conference. Finally, the whole setup of the conference should be international and varied. I'm sorry to say that at this point I'm not entirely happy with the choice of keynote speakers. This is nothing personal: I respect both keynote speakers, but the particular combination of the two speakers is a bit unfortunate. First, they are both Dutch. Second, neither of them is extremely well-known in the communities of OOPSLA, PLDI, or ECOOP. I hope that this will not affect the potential of this interesting conference.
Now go work on your submission!
July 01, 2008 08:02
June 03, 2008
I just got a call for papers for the International e-Conference on Computer Science 2008 (IeCCS 2008). The IeCCS conference organizers and committee members are one of a kind! The submission deadline for papers is the 20th of June. Notification of acceptance is 25th of June. That's 5 days for reviewing. The camera ready deadline is 27th of June. That's 2 days for revising your paper. You've got to love the efficiency of these people! The 2007 edition of this conference has a program of 11 pages of accepted papers. When I review papers, I rarely get more than 2 done per day. If the IeCCS committee want to accept a similar number of papers this year, then they'd better make sure to get enough coffee (or tea, as advised in my Ph.D. thesis).
Now, if you are a computer science researcher, such emails are hardly surprising. I delete several of them every week. Everybody is aware of conferences with questionable reviewing practices (see the SCIgen paper generator). What surprised me about the IeCCS call for papers is that there is actually a researcher on the committee who I vaguely know from when I was a student. So, I searched the web a bit to see how obvious the evidence is that the reviewing practices of IeCCS are questionable. Interestingly, I could find only one reference that mentions IeCCS as a conference you'd better not submit to. It's an interesting presentation by somebody at the PSU.
It seems that lists of conferences with a dubious reputation (also known as fake conferences) are impossible to keep up. I've seen a few lists in the past, but they've all disappeared. What interests me is why those lists are taken down. The most well known list, by Arlindo Oliveira, was taken down after receiving threats by conference organizers. I've never quite understood that: how serious can such a threat be? Maybe they'll publish a random paper with my name? They'll put me on the program committee next year?
So well, here we go. Let's see what happens.
Notice: IeCCS is not fake. It is very real!
June 03, 2008 07:14
May 26, 2008
The paper "The Nix Build Farm: A Declarative Approach to Continuous Integration" by Eelco Dolstra and Eelco Visser
has been accepted for presentation at the
International Workshop on Advanced Software Development Tools and Techniques
co-located with ECOOP 2008 in
The paper is the first to come out of the 3TU CEDICT
buildfarm project (which badly needs a webpage).
Abstract:
There are many tools to support continuous integration (the process of
automatically and continuously building a project from a version
management repository). However, they do not have good support for
variability in the build environment: dependencies such as compilers,
libraries or testing tools must typically be installed manually on all
machines on which automated builds are performed. The
Nix
package manager solves this problem: it has a purely functional
language for describing package build actions and their dependencies,
allowing the build environment for projects to be produced
automatically and deterministically. We have used Nix to build a
continuous integration tool, the
Nix build farm, that is in use
to continuously build and release a large set of projects.
May 26, 2008 20:44
May 24, 2008
This last week, I spent some free cycles hacking together a small project instantiation tool for Stratego/XT. It makes setting up a fresh Stratego project really simple by automatically populating the project space with a default directory layout, build system files and some minimal program and syntax samples.
To create a project p0, all you have to do is:
$ crap --new-project p0
This creates all the files necessary for a complete GNU Autotools-based build system, including a sample Stratego program (src/xmpl.str):
p0/
  Makefile.am
  README.Developer
  README
  AUTHORS
  bootstrap
  p0.spec.in
  NEWS
  p0.pc.in
  configure.ac
  ChangeLog
  xmpl/
    Makefile.am
  syn/
    Makefile.am
  tests/
    Makefile.am
  src/
    Makefile.am
    xmpl.str
Once this is done, you can configure and compile the project,
$ ./bootstrap
$ ./configure
$ make all
install it,
$ make install
and even run the example transformation program:
$ echo "foo" | /usr/local/bin/xmpl
"Hello, World!"
The example program expects an input on stdin [or in a file specified by the -i switch], and will always produce the output string "Hello, World!".
The crap tool is part of the strategoxt-utils package. You can also download a stand-alone snapshot of crap.
More comprehensive documentation is available in the wiki. The tool is still very rough, so any suggestions for improvements and bug reports are very welcome.
May 24, 2008 14:40
May 13, 2008
Just got word that my proposal for an OOPSLA
tutorial has been accepted.
So make sure to register for it if you are planning to attend the conference.
Abstract: Implementing web applications in an object-oriented language such as
Java using state-of-the-art frameworks produces robust software, but
involves a lot of boilerplate code. Domain-specific languages (DSLs)
increase the productivity of software engineers by replacing such
low-level boilerplate code by high-level models, from which code can
be generated. This tutorial shows how to find domain-specific
abstractions based on patterns in existing (reference) programs and
build domain-specific languages to capture these abstractions using
several DSLs for
DSL engineering:
SDF
for syntax definition and
Stratego/XT
for code generation. The approach is illustrated using the
design and implementation of
WebDSL,
a domain-specific language for
web applications, which provides abstractions for data models, page
definitions,
access control,
workflow, and styling. The tutorial will
show how
code generation by model transformation
is an important technique for separation of concerns in DSL implementations
for designing a DSL as a tower of abstractions, rather than as a
monolithic language.
May 13, 2008 19:28
May 11, 2008
About fifteen months ago I announced
my ``Domain-Specific Language Engineering'' project that would result in a tutorial for the GTTSE'07 summerschool. "This tutorial gives an overview of all aspects of DSL engineering: domain analysis, language design, syntax definition, code generation, deployment, and evolution, discussing research challenges on the way. The concepts are illustrated with DSLs for web applications built using several DSLs for DSL engineering: SDF for syntax definition, Stratego/XT for code generation, and Nix for software deployment." A rather bold statement, since at the time I didn't have a DSL yet. But I did manage to design and implement a first version of WebDSL before the summerschool in July 2007. I also wrote a paper for the participants' proceedings. That version discussed the design process from analyzing programming patterns in Seam/JSF/Java to the design and implementation of the DSL using SDF and Stratego. (I never got around to the deployment part; Sander van der Burg has by now developed Nix expressions for building and deploying a WebDSL application on a web server.)
Now I have finished the
final version of the paper
for the proceedings to be published by Springer.
The discussion of the WebDSL design and implementation has been much improved.
But more importantly, the paper has two introductory sections about the 'domain-specific language engineering' process
and three discussion sections evaluating WebDSL as a web engineering solution, discussing related DSL engineering approaches,
and research challenges for language engineering.
The resulting paper counts 85 pages (LNCS format) and 109 references.
And still, it is only scratching the surface.
There is so much more to say about DSL design and implementation.
The last couple of months I have been teaching a course on
program transformation and generation
in which we have explored paradigms for expressing analysis and transformation.
During this time I have been contemplating how to structure the course entitled
"model-driven software development"
that I'll be teaching next year to a much larger group of master's students.
I haven't resolved the issue yet.
But my project for the next 10 months will be to write a comprehensive set of lecture notes (let's call it a book) on
"domain-specific language engineering"
covering methods and techniques for designing and implementing
software languages,
in particular, of course domain-specific languages. In all likelihood, languages for the web will again play a central role as examples.
E. Visser.
WebDSL: A Case Study in Domain-Specific Language Engineering.
In R. Laemmel, J. Saraiva, and J. Visser, editors,
Generative and Transformational Techniques in Software Engineering
(GTTSE 2007), Lecture Notes in Computer Science. Springer,
2008. Tutorial for International Summer School GTTSE 2007.
[pdf]
Abstract:
The goal of domain-specific languages (DSLs) is to increase the
productivity of software engineers by abstracting from low-level
boilerplate code. Introduction of DSLs in the software development
process requires a smooth workflow for the production of DSLs
themselves. This requires technology for designing and implementing
DSLs, but also a methodology for using that technology. That is, a
collection of guidelines, design patterns, and reusable DSL components
that show developers how to tackle common language design and
implementation issues. This paper presents a case study in
domain-specific language engineering. It reports on a project in which
the author designed and built WebDSL, a DSL for web applications with
a rich data model, using several DSLs for DSL engineering: SDF for
syntax definition and Stratego/XT for code generation. The paper
follows the stages in the development of the DSL. The contributions of
the paper are three-fold. (1) A tutorial in the application of the
specific SDF and Stratego/XT technology for building DSLs. (2) A
description of an incremental DSL development process. (3) A
domain-specific language for web-applications with rich data models.
The paper concludes with a survey of related approaches.
May 11, 2008 14:18
May 08, 2008
-------------------------------------------------------------------------
Dear author,
We are pleased to inform you that your paper entitled
Building Program Optimizers with Rewriting [Strategies]
has been accepted for presentation at ICFP'98. In a next message, you
will receive reviews from the Program Committee that we hope you can
use to improve the final draft, which is due on July 14th. We will
also be sending you an ACM Copyright Release form, which must be
signed and returned by the same deadline.
Congratulations, and thank you for your submittal to ICFP'98.
Paul Hudak
Christian Queinnec
co-Chairs, ICFP'98 Program Committee
-------------------------------------------------------------------------
The text of an email from June 21, 1998 that announced the acceptance at ICFP'98 of the first paper on
Stratego written with Zino Benaissa and
Andrew Tolmach.
The implementation of the language for the paper marked the first version of the language and compiler.
The idea of traversal strategies had been tried before, embedded in ASF+SDF.
Just before the conference in September I managed to bootstrap the compiler.
The language did not have a name in that paper yet.
And the reviews were not very enthusiastic.
May 08, 2008 18:54
April 04, 2008
The paper "Declarative Access Control for WebDSL: Combining Language Integration and Separation of Concerns" by Danny Groenewegen and Eelco Visser has been accepted for presentation at the International Conference on Web Engineering (ICWE'08), which will be held in July 2008 in Yorktown Heights, New York. [pdf]
I'm especially proud of this acceptance as it is (1) based on the Master's thesis work of Danny Groenewegen,
and (2) the first paper about WebDSL to be accepted in the web engineering research community. (And also the first attempt; the other two papers featuring WebDSL appear in transformation venues.)
Abstract:
In this paper, we present the extension of WebDSL, a domain-specific
language for web application development, with abstractions for
declarative definition of access control. The extension supports the
definition of a wide range of access control policies concisely and
transparently as a separate concern. In addition to regulating the
access to pages and actions, access control rules are used to infer
navigation options not accessible to the current user, preventing the
presentation of inaccessible links. The extension is an illustration
of a general approach to the design of domain-specific languages for
different technical domains to support separation of concerns in
application development, while preserving linguistic integration. This
approach is realized by means of a transformational semantics that
weaves separately defined aspects into an integrated implementation.
April 04, 2008 20:27
March 19, 2008
The paper "Code Generation by Model Transformation" by Zef Hemel, Lennart Kats, and Eelco Visser was accepted for presentation at the International Conference on Model Transformation (ICMT'08).

Abstract: The realization of model-driven software development requires
effective techniques for implementing code generators. In this paper,
we present a case study of code generation by model transformation
with Stratego, a high-level transformation language based on the
paradigm of rewrite rules with programmable strategies that integrates
model-to-model, model-to-code, and code-to-code transformations. The
use of concrete object syntax guarantees syntactic correctness of code
patterns, and supports the subsequent transformation of generated
code. The composability of strategies supports two dimensions of
transformation modularity. Vertical modularity is achieved by
designing a generator as a pipeline of model-to-model transformations
that gradually transforms a high-level input model to an
implementation. Horizontal modularity is achieved by supporting the
definition of plugins which implement all aspects of a language
feature. We discuss the application of these techniques in the
implementation of WebDSL, a domain-specific language for dynamic web
applications with a rich data model.
March 19, 2008 11:14
March 18, 2008
Today we had a follow-up on last week's discussion about an attribute grammar extension of Stratego. By now, Nicolas Pierron has created a proper extension of Stratego with attribute equations and made a translation to basic Stratego in combination with the Transformers run-time extension for attribute evaluation support. Next up is the port of the copy rules generator that makes writing attribute equations much less verbose. In the meantime Lennart Kats is working on a JastAdd-style implementation and Tony Sloane on an Eli-style (static scheduling) implementation. With these implementations in place we will be able to do some proper exploration of the combination of attribute evaluation and rewriting (strategies). I can't wait to make an implementation of the WebDSL typechecker using the attribute extension. To be continued.
March 18, 2008 11:21