Planet Stratego

You have a term and it don't look good, who're you gonna call?

June 10, 2009

Eelco Visser

Preventing injection attacks with syntax embeddings

Our paper on StringBorg is being published by Science of Computer Programming:

M. Bravenboer, E. Dolstra, and E. Visser. Preventing Injection Attacks with Syntax Embeddings. A Host and Guest Language Independent Approach. Science of Computer Programming, 2009.
StringBorg is a technique for embedding 'string' languages in general purpose languages in a safe way, to avoid injection attacks. The paradigmatic example is the embedding of SQL queries, which typically is done using string literals as in the following example:
  String userName = getParam("userName");
  String password = getParam("password");
  String query = "SELECT id FROM users "
               + "WHERE name = '" + userName + "' "
               + "AND password = '" + password + "'";
  if (executeQuery(query).size() == 0)
    throw new Exception("bad user/password");
With this approach it is very easy to forget to escape SQL meta-characters in the values obtained from the client. This opens the door to an injection attack, in which a crafted value escapes from the programmed query and changes its structure. StringBorg prevents such attacks by syntactically embedding the query language in the host language. For example, the query above can then be written as follows:
  SQL q = <| SELECT id FROM users
             WHERE name = ${userName} AND password = ${password} |>;
  if (executeQuery(q.toString()).size() == 0) ...
Now, the syntax of the query is checked statically. But more importantly, at run-time the query is constructed by a query API that ensures that the query constructed has the same syntactic structure as the one defined by the programmer. Furthermore, it enforces escaping meta-characters in values spliced into the query, thus guaranteeing that no injection attacks can occur.
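To make the run-time half of this concrete, here is a minimal hand-written Java sketch of what escaping a spliced value amounts to. It is not the code StringBorg generates, nor its API: the class and method names (QuerySketch, quote, loginQuery) are hypothetical, and the only escaping shown is the standard SQL rule of doubling single quotes inside a string literal.

```java
// Hypothetical illustration only; not the code or API that StringBorg generates.
final class QuerySketch {
    /** Renders a value as an SQL string literal, doubling embedded single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds the login query from the example with both client values escaped. */
    static String loginQuery(String userName, String password) {
        return "SELECT id FROM users WHERE name = " + quote(userName)
             + " AND password = " + quote(password);
    }

    public static void main(String[] args) {
        // A classic injection attempt now stays inside the string literal.
        System.out.println(loginQuery("alice", "' OR '1'='1"));
    }
}
```

The difference with the embedding approach is that here the programmer still has to remember to call quote at every splice point, and some SQL dialects treat further characters specially. The embedding is meant to take that burden away by generating the escaping calls from the grammar.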

The paper does not just provide a solution for embedding SQL in Java, but offers a generic approach for embedding any guest language in any host language with little more effort than providing syntax definitions for host and guest language.

Abstract: Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation, the concatenation of constants and client-supplied strings. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages (e.g. SQL) into that of the host language (e.g. Java) and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate. This approach is generic, meaning that it can be applied with relative ease to any combination of context-free host and guest languages.

June 10, 2009 08:38

May 30, 2009

Eelco Visser

Stratego/XT Machine for Code Generation 2009

For the participants of our hands-on tutorial on Creating DSLs with Stratego/XT at the Code Generation 2009 conference, we (i.e. Rob Vermaas) created a VirtualBox image with all the software needed during the tutorial. In particular, it contains a full installation of Stratego/XT with compiler, libraries, and auxiliary packages such as java-front. In addition, the image contains a built-from-source installation of the WebDSL language for building web applications, and an installation of Tomcat for deploying the web apps you create.

The image and instructions for its installation are available from
http://strategoxt.org/Stratego/CodeGeneration2009Tutorial

During the tutorial additional material will be handed out with concrete exercises.

The virtual machine is also useful for exploring Stratego/XT and WebDSL outside the context of this particular tutorial.

May 30, 2009 17:41

May 23, 2009

Eelco Visser

Domain-Specific Languages for Composable Editor Plugins

Spoofax/IMP is a toolset for the creation of interactive development environments for custom languages based on domain-specific languages for editor services. The toolset is especially aimed at the developers of domain-specific languages, allowing them to provide IDE support for their specialist language under development. An important feature of Spoofax/IMP is the support for language composition, i.e. for languages consisting of multiple, syntactically different, sub-languages. Furthermore, the toolset allows the customization of heuristically generated editor services without losing the ability to regenerate these services when a language evolves.

At LDTA 2009 we presented a paper about Spoofax/IMP. The final version of that paper is now finished, and a pre-print is available.

L. C. L. Kats, K. T. Kalleberg, and E. Visser. Domain-Specific Languages for Composable Editor Plugins. In T. Ekman and J. Vinju, editors, Proceedings of the Ninth Workshop on Language Descriptions, Tools, and Applications (LDTA 2009), Electronic Notes in Theoretical Computer Science. Elsevier Science Publishers, April 2009. [pdf]

Abstract: Modern IDEs increase developer productivity by incorporating many different kinds of editor services. These can be purely syntactic, such as syntax highlighting, code folding, and an outline for navigation; or they can be based on the language semantics, such as in-line type error reporting and resolving identifier declarations. Building all these services from scratch requires both the extensive knowledge of the sometimes complicated and highly interdependent APIs and extension mechanisms of an IDE framework, and an in-depth understanding of the structure and semantics of the targeted language. This paper describes Spoofax/IMP, a meta-tooling suite that provides high-level domain-specific languages for describing editor services, relieving editor developers from much of the framework-specific programming. Editor services are defined as composable modules of rules coupled to a modular SDF grammar. The composability provided by the SGLR parser and the declaratively defined services allows embedded languages and language extensions to be easily formulated as additional rules extending an existing language definition. The service definitions are used to generate Eclipse editor plugins. We discuss two examples: an editor plugin for WebDSL, a domain-specific language for web applications, and the embedding of WebDSL in Stratego, used for expressing the (static) semantic rules of WebDSL.

May 23, 2009 11:29

May 10, 2009

Eelco Visser

Adding Error Recovery to Scannerless Generalized-LR Parsing

We just got the notification that our submission to OOPSLA 2009 has been accepted. The paper presents a solution to error recovery for the SGLR parsing algorithm. Here's the full citation and abstract (pre-print will follow later):

Lennart C. L. Kats, Maartje de Jonge, Emma Nilsson-Nyman, and Eelco Visser. "Providing Rapid Feedback in Generated Modular Language Environments. Adding Error Recovery to Scannerless Generalized-LR Parsing." In Gary T. Leavens, editor, Proceedings of the 24th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2009), New York, NY, USA, October 2009. ACM. (to appear).

Abstract: Integrated Development Environments (IDEs) increase programmer productivity, providing rapid, interactive feedback based on the syntax and semantics of a language. A heavy burden lies on developers of new languages to provide adequate IDE support. Code generation techniques provide a viable, efficient approach to semi-automatically produce IDE plugins. Key components for the realization of plugins are the language's grammar and parser. For embedded languages and language extensions, constituent IDE plugin modules and their grammars can be combined. Unlike conventional parsing algorithms, scannerless generalized-LR parsing supports the full set of context-free grammars, which is closed under composition, and hence can parse language embeddings and extensions composed from separate grammar modules. To apply this algorithm in an interactive environment, this paper introduces a novel error recovery mechanism, which allows it to be used with files with syntax errors -- common in interactive editing. Error recovery is vital for providing rapid feedback in case of syntax errors, as most IDE services depend on the parser -- from syntax highlighting to semantic analysis and cross-referencing. We base our approach on the principles of island grammars, and automatically generate new productions for existing grammars, making them more permissive of their inputs. To cope with the added complexity of these grammars, we adapt the parser to support backtracking. We evaluate the recovery quality and performance of our approach using a set of composed languages, based on Java and Stratego.

May 10, 2009 22:19

May 02, 2009

Eelco Visser

A Textual DSL for creating Visual Diagrams

I have been playing around for a couple of minutes with yUML, an online service by Tobin Harris for creating UML diagrams using a textual input language. The diagram below is generated while you load this page. The input needed to generate the diagram is the following list of relations:

  [Publication]++->*[Author], 
  [AbstractAuthor]^[Author], 
  [AbstractAuthor]*->1[Person], 
  [AbstractAuthor]*->1[Affiliation], 
  [Person]*->*[Publication], 
  [Publication]^[PrintPublication], 
  [PrintPublication]^[Article], 
  [PrintPublication]^[InProceedings], 
  [Publication]^[PublishedVolume], 
  [PublishedVolume]^[Proceedings], 
  [InProceedings]->[Proceedings], 
  [AbstractAuthor]^[Editor], 
  [PublishedVolume]++->*[Editor], 
  [Person]*->*[PublishedVolume], 
  [PublishedVolume]^[Book], 
  [PrintPublication]^[InCollection], 
  [Book]->*[InCollection] .
This diagram documents a (small) subset of the data model underlying the researchr.org application for bibliography sharing and reviewing.
[yUML class diagram generated from the relations above]

May 02, 2009 09:28

April 30, 2009

Eelco Visser

WebDSL: A Domain-Specific Language for Web Applications

I have been invited to give a talk at the Sixth International Workshop on Web Information Systems Modeling (WISM 2009), which will be held in June in Amsterdam (co-located with CAiSE 2009). Here's the abstract I wrote for the talk.

Abstract: In this talk I give an overview of the design and application of WebDSL, a domain-specific language for data-centric web applications. WebDSL linguistically integrates the definition of data models, user interfaces, actions, access control rules, data validation rules, styling rules, and workflow definitions. While maintaining separation between these concerns through specialized sub-languages, linguistic integration ensures static consistency checking and correct code generation. The language allows developers to concentrate on the essential design of web applications, abstracting from accidental complexity, such as the details of data persistence. The combination of high-level and low-level constructs ensures high expressivity, while supporting customization to application requirements. The application of WebDSL is illustrated using the researchr.org application for bibliography sharing and reviewing.

April 30, 2009 09:38

April 13, 2009

Eelco Visser

Blog Abstraction (Twitter)

It was probably unavoidable. Tweetie app on iPod makes it easy. It seems I'm tweeting.

Blogging for the lazy. Let's see if I have more to say, or more frequently at least, on twitter than on this blog.

April 13, 2009 12:23

April 06, 2009

Eelco Visser

Creating Domain-Specific Languages with Stratego/XT

At the Code Generation 2009 conference, Lennart Kats and I will give a hands-on tutorial about building DSLs with Stratego/XT. The program says it thus:

Stratego/XT is a state-of-the-art language and toolset for the development of domain-specific language implementations. In this session the participants learn to use Stratego/XT, by developing a generator for a small DSL for web applications generating PHP. The tutorial covers declarative syntax definition with SDF, code generation by model transformation, and model-to-model transformation by rewriting. The tutorial is based on a course in model-driven software development developed at Delft University of Technology.

NB Since this is a hands-on session places are strictly limited. Please let us know whether you plan to attend this session when you book your conference place. Places will be allocated on a first-come first-served basis.

April 06, 2009 20:35

March 22, 2009

Eric Bouwers

Migration to the nine-headed monster

After some recent activity on the psat-dev mailing list I became aware of the (lack of) available builds for php-sat. Even though the development speed is not what I would want it to be (so many fun things to do, so few hours in a day!) I still believe it is important to release early and often.

Fortunately, Eelco Dolstra had some time to migrate php-front, php-sat and php-tools to Hydra, the new Nix-based continuous build system. After some tweaking we now again have access to unstable builds for all PSAT projects. Go Hydra!

March 22, 2009 13:10

March 05, 2009

Eelco Visser

Example-Driven Research

These days I am writing a book on ‘domain-specific language engineering’ for use in a master’s course at Delft University. The book is about the design and implementation of domain-specific languages, i.e. the definition of their syntax, static semantics, and code generators. But it also contains a dose of linguistic reflection, by studying the phenomenon of defining languages, and of defining languages for defining languages (which is called meta-modelling these days).

Writing chapters on syntax definition and modeling of languages takes me back to my days as a PhD student at the University of Amsterdam. Our quarters were in the university building at the Watergraafsmeer, which was connected to the CWI building via a bridge. Since the ASF+SDF group of Paul Klint was divided over the two locations, meetings required a walk to the other end of the building. So, I would regularly wander to the CWI part of the building to chat. [While the third application we learned to use in our Unix course in 1989 was talk, with which one could synchronously talk with someone else on the internet (the first application was probably csh and the second email), face-to-face meetings were still the primary mode of communication; as opposed to the use of IRC to talk to one’s officemate.] Often I would look into Jan Heering’s office to say hi, and more often than not would end up spending the rest of the afternoon discussing research and meta-research.

One of the recurring topics in these conversations was the importance of examples. Jan was fascinated by the notion of ‘programming by example’, i.e. deriving a program from a bunch of examples of its expected behaviour, instead of a rigorous and complete definition for all cases. But the other use of examples was for validation, a word I didn’t learn until long after writing my thesis.

The culture of the day (and probably location?) was heavily influenced by mathematics and theoretical computer science. The game was the definition, preferably algebraically, of the artifacts of interest, and then, possibly, proving interesting properties. The application-minded would actually implement stuff. As a language engineer I was mostly interested in making languages with cool features. The motivation for these features was often highly abstract. The main test example driving much of the work on the ASF+SDF MetaEnvironment was creating an interactive environment for the Pico language (While with variable declarations). The idea was that once an environment for Pico was realized, creating one for a more realistic language would be a matter of scaling up the Pico definition (mere engineering). Actually making a language (implementation) and using that to write programs would be a real test. To be fair, there were specifications of larger languages undertaken, such as ones of (mini-) ML [8] and Pascal [6]. As a student I had developed a specification of the syntax and static semantics of the object-oriented programming language Eiffel [16], but that was so big it was not usable on the Sun workstations we had at that time.

Time and again, Jan Heering would stress the importance of real examples to show the relevance of a technique and/or to discover the requirements for a design. While I thought it was a cool idea, I didn’t have examples. At least not to sell the design and implementation of SDF2, the syntax definition formalism that turned out to be the main contribution of my PhD thesis [20].

March 05, 2009 10:20

February 25, 2009

Eelco Visser

MODELS + GPCE + SLE in Denver

If you're doing research into domain-specific languages, model-driven engineering, or program generation, your agenda for the coming months is set. In early October the three main conferences on these topics are co-located in Denver. The deadlines are somewhat spread, so you should be able to submit a paper to each conference:

May 10: Model Driven Engineering Languages and Systems (MODELS'09)

May 18: Generative Programming and Component Engineering (GPCE'09)

July 10: Software Language Engineering (SLE 2009)

I'm looking forward to your submission, and to meeting you in Denver.

February 25, 2009 14:50

February 01, 2009

Karl Trygve Kalleberg

Domain-Specific Languages for Composable Editor Plugins

Lennart Kats, Eelco Visser and I just got a paper accepted to LDTA'09. The paper is about declarative languages for describing programming editors. The main part of it is Lennart's work, but it's running on top of the Spoofax transformation infrastructure. The idea is simple: You don't want to fight with Java, complicated APIs and complicated XML when you implement an Eclipse-based editor for your DSL. Instead, you describe your language's grammar with SDF, provide some auxiliary information using our declarative editor languages, and Spoofax/IMP does the rest by generating the editor engine for you.

The abstract explains it in the usual academic style:

Modern IDEs increase developer productivity by incorporating many different kinds of editor services. These can be purely syntactic, such as syntax highlighting, code folding, and an outline for navigation; or they can be based on the language semantics, such as in-line type error reporting and resolving identifier declarations. Building all these services from scratch requires both the extensive knowledge of the sometimes complicated and highly interdependent APIs and extension mechanisms of an IDE framework, and an in-depth understanding of the structure and semantics of the targeted language.

This paper describes Spoofax/IMP, a meta-tooling suite that provides high-level domain-specific languages for describing editor services, relieving editor developers from much of the framework-specific programming. Editor services are defined as composable modules of rules coupled to a modular SDF grammar. The composability provided by the SGLR parser and the declaratively defined services allows embedded languages and language extensions to be easily formulated as additional rules extending an existing language definition. The service definitions are used to generate Eclipse editor plugins.

We discuss two examples: an editor plugin for WebDSL, a domain-specific language for web applications, and the embedding of WebDSL in Stratego, used for expressing the semantic rules of WebDSL.

Once I get bibtex-tools running on 64-bit again, I'll link to the bib and pdf.

February 01, 2009 18:41

January 17, 2009

Karl Trygve Kalleberg

FOSDEM 2009

I'm going to FOSDEM

Like last year, I'm going to FOSDEM. Last year, I traveled from Paris with the express train (great experience). This year, it's back to planes again. Not looking forward to the security hysteria.

If you're interested in meeting me there, don't hesitate to fire off an e-mail. 

January 17, 2009 21:55

December 19, 2008

Eelco Visser

Parse Table Composition

As mentioned before, we've been doing some real parsing research to better support parsers for extensible languages. Parse table composition provides separate compilation for syntax components such that syntax extensions can be provided as plugins to a compiler for a base language. Due to various distractions last Summer I seem to have forgotten to blog about the paper that Martin Bravenboer and I got accepted at the first international conference on Software Language Engineering (which Martin was looking forward to).

M. Bravenboer and E. Visser. Parse Table Composition. Separate Compilation and Binary Extensibility of Grammars. In D. Gasevic and E. van Wyk, editors, First International Conference on Software Language Engineering (SLE 2008). To appear in Lecture Notes in Computer Science, Heidelberg, 2009. Springer. [pdf]

Abstract: Module systems, separate compilation, deployment of binary components, and dynamic linking have enjoyed wide acceptance in programming languages and systems. In contrast, the syntax of languages is usually defined in a non-modular way, cannot be compiled separately, cannot easily be combined with the syntax of other languages, and cannot be deployed as a component for later composition. Grammar formalisms that do support modules use whole program compilation.

Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. A compound parser needs to be generated for every combination of extensions. The generation of parse tables is expensive, which is a particular problem when the composition configuration is not fixed to enable users to choose language extensions.

In this paper we introduce an algorithm for parse table composition to support separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, i.e. just before parsing. While the worst-case time complexity of parse table composition is exponential (like the complexity of parse table generation itself), for realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computation of the parse table for the combined grammars.

The experimental parser generator is available online.

December 19, 2008 09:50

December 16, 2008

Eric Bouwers

Visualizing the PHP grammar

A few weeks ago an e-mail from Didier Garcin popped up on the Stratego mailing list. He explained that he had written a Python script that could visualize an abstract syntax signature. The script was also sent to the list and I finally had some time to check it out.

It turned out that I had almost everything installed to use the script; only the pydot dependency needed some work. This was mostly because the script only seems to work with the 0.9.10 version of this library. After getting the script started it was really simple to generate the signature from the PHP-Front grammar.

So, here is the one for PHP4:
PHP4 AST visualization
And the one for PHP5:
PHP5 AST visualization

The first thing I noticed were the big rectangles in both versions. These rectangles are the statements (smaller one) and expressions grouped together. What I also noticed was that in both versions the bottom of the graph (which corresponds to the smallest units in the language) looks the most complicated. This corresponds very well with the amount of effort put into the modeling of this part of the language.

Comparing both grammars to each other we can see that the latest version is the most complicated one. Furthermore, if we compare both graphs to the graphs shown in this post we can see that the Java-graph appears to be the most similar one.

Actually, I do not think these images show anything, but please explain it to me when you think I just don't see it. Anyway, at least we have some nice pictures now :)

P.S. for those who are interested, more detailed images (in svg-format) are available here.

December 16, 2008 21:36

December 15, 2008

Eelco Visser

Talking about Parsing

Last Summer I attended the Code Generation 2008 conference in Cambridge to give a tutorial on WebDSL, as a case study in domain-specific language engineering. The conference was an interesting change from the usual academic conferences I visit, in that the majority of the audience were from industry. It was good to see the interest in code generation in industry, but also disconcerting to observe the gap between academic research and industrial practice; but more about that some other time.

During the conference I was interviewed by Laurence Tratt for Software Engineering Radio about parsing. The interview podcast recently appeared as Episode 118.

It was a long time ago (1997) that I defended my PhD thesis, which was mostly about syntax definition and parsing. In particular, I introduced SDF2, which radically integrates lexical and context-free syntax, and the SGLR parsing algorithm for parsing arbitrary 'character-level' context-free grammars. Since finishing my thesis I have done quite a bit of 'applied parsing research', using SDF and SGLR for applications such as meta-programming with concrete object syntax and DSL embedding, but I don't consider myself a hard-core parsing researcher any more. So I had to dig deep in my memory to talk about Noam Chomsky's language hierarchy, grammars as string rewrite systems, and parsing algorithms. I find the result a bit awkward to listen to, but people assure me that is because it is my own voice I'm listening to.

In the meantime my relation to parsing is changing again. While SDF/SGLR still provides the best approach to declarative definition of composite languages (in my opinion at least), it has some fundamental limitations which have never been addressed. A first step in addressing these limitations was taken in the SLE 2008 paper with Martin Bravenboer on parse table composition (see upcoming blog) to provide separate compilation for grammars. With a new PhD student starting in the new year, I hope to address other limitations such as the lack of error recovery.

December 15, 2008 21:11

December 12, 2008

Eelco Visser

Decorated Attribute Grammars

The paper "Decorated Attribute Grammars" by Lennart Kats, Tony Sloane and Eelco Visser has been accepted for presentation at the International Conference on Compiler Construction (CC 2009) to be held in March 2009 in York (UK). [pdf]

Abstract: Attribute grammars are a powerful specification formalism for tree-based computation, particularly for software language processing. Various extensions have been proposed to abstract over common patterns in attribute grammar specifications. These include various forms of copy rules to support non-local dependencies, collection attributes, and expressing dependencies that are evaluated to a fixed point. Rather than implementing extensions natively in an attribute evaluator, we propose attribute decorators that describe an abstract evaluation mechanism for attributes, making it possible to provide such extensions as part of a library of decorators. Inspired by strategic programming, they are specified using generic traversal operators. To demonstrate their effectiveness, we describe how to employ decorators in name, type, and flow analysis.

The ideas have been implemented in Aster, an extension of Stratego with reference attribute grammars.

December 12, 2008 09:45

December 04, 2008

Martin Bravenboer

Why the JVM Spec defines checkcast for interface types

I'm working on the specification of pointer analysis for Java using Datalog. Basically, a pointer analysis computes for each variable in a program the set of objects it may point to at run-time.

For this purpose I need to express parts of the JVM Spec in Datalog as well. As a simple example, the following Datalog rules define when a class is a subclass of another class.

/**
 * JVM Spec:
 * - A class A is a subclass of a class C if A is a direct 
 *   subclass of C
 */
Subclass(?c, ?a) <-
  DirectSubclass[?a] = ?c.

/**
 * JVM Spec:
 * - A class A is a subclass of a class C if there is a direct
 *   subclass B of C and class A is a subclass of B
 */
Subclass(?c, ?a) <-
  Subclass(?b, ?a),
  DirectSubclass[?b] = ?c.

As you can see, this is remarkably close to the original specification (quoted in comments). You can clearly see the relationship between the spec and the code, even if you are not familiar with Datalog.
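For readers who do not know Datalog, the same relation can be sketched in plain Java. This is only an illustration under stated assumptions: the class SubclassSketch, the map DIRECT_SUPERCLASS and the method isSubclass are hypothetical names, and the two sample facts stand in for what the analysis would read from a real program.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the two Subclass rules; all names are made up.
final class SubclassSketch {
    /** Plays the role of DirectSubclass[?a] = ?c: maps a class to its direct superclass. */
    static final Map<String, String> DIRECT_SUPERCLASS = new HashMap<>();
    static {
        // Two sample facts, standing in for facts extracted from a program.
        DIRECT_SUPERCLASS.put("java.lang.Integer", "java.lang.Number");
        DIRECT_SUPERCLASS.put("java.lang.Number", "java.lang.Object");
    }

    /** True if a is a direct or indirect subclass of c, like Subclass(?c, ?a). */
    static boolean isSubclass(String c, String a) {
        // Walk up the chain of direct superclasses, starting from a.
        String b = DIRECT_SUPERCLASS.get(a);
        while (b != null) {
            if (b.equals(c)) {
                return true;
            }
            b = DIRECT_SUPERCLASS.get(b);
        }
        return false;
    }
}
```

The loop computes the same transitive closure as the two rules, but it has to spell out the traversal order by hand, whereas the Datalog version only states the two cases from the specification.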

Recently, I was working on the specification of the checkcast instruction. This instruction performs the run-time check if an object can be cast to some type. The JVM Spec for checkcast first defines some variables:

The following rules are used to determine whether an objectref that is not null can be cast to the resolved type: if S is the class of the object referred to by objectref and T is the resolved class, array, or interface type, checkcast determines whether objectref can be cast to type T as follows:

So, this basically says that we're checking the cast (T) S.

The first rule for this cast is straightforward:

If S is an ordinary (nonarray) class, then:
Well, if you're somewhat familiar with Java, or object-oriented programming, then this part is obvious. Again, the specification in Datalog is easy:
CheckCast(?s, ?s) <-
  ClassType(?s).

CheckCast(?s, ?t) <-
  Subclass(?t, ?s).

CheckCast(?s, ?t) <-
  ClassType(?s),
  Superinterface(?t, ?s).
However, the next alternative in the specification is confusing:
If S is an interface type, then:

The specification is crystal clear, but how can S ever be an interface type? S is the type of the object that is being cast, and how can an object ever have a run-time type that is an interface? Of course, the static type of an expression can be an interface, but we're talking about the run-time here!

I searched the web, which only resulted in a few hits. There was one question on a Sun forum years ago, where the one answer didn't make a lot of sense.

It turns out that this is indeed an 'impossible' case. The reason this item is in the specification is that checkcast is defined recursively for arrays:

If S is a class representing the array type SC[], that is, an array of components of type SC, then:

So, if you have an object of type List[] that is cast to a Collection[], then the rules for checkcast get recursively invoked for the types S = List and T = Collection. Notice that List is an interface, but an object can have type List[] at run-time. I have not verified this with the JVM Spec maintainers, but as far as I can see, this is the only reason why the rule for interface types is there.

Just to show a little bit more of my specifications, here is the rule for the array case I just quoted from the JVM Spec:

CheckCast(?s, ?t) <-
  ComponentType[?s] = ?sc,
  ComponentType[?t] = ?tc,
  ReferenceType(?sc),
  ReferenceType(?tc),
  CheckCast(?sc, ?tc).

Isn't it beautiful how this exactly corresponds to the formal specification?
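To trace how the array rule reaches the interface case, here is a small Python sketch of the checkcast rules discussed above. It is an illustration under assumptions, not the Datalog specification: the tables `SUPERCLASS`, `DIRECT_INTERFACES` and `INTERFACE_TYPES` are a hypothetical, simplified hierarchy (the real ArrayList has intermediate abstract superclasses), the function names are made up, and arrays of primitive types are left out.

```python
# Hypothetical, simplified hierarchy; only the names come from the post.
SUPERCLASS = {"java.util.ArrayList": "java.lang.Object"}
DIRECT_INTERFACES = {
    "java.util.ArrayList": ["java.util.List"],
    "java.util.List": ["java.util.Collection"],
}
INTERFACE_TYPES = {"java.util.List", "java.util.Collection"}
# Interfaces implemented by every array type.
ARRAY_INTERFACES = {"java.lang.Cloneable", "java.io.Serializable"}


def superclasses(name):
    """All direct and indirect superclasses of a class."""
    result = set()
    cur = SUPERCLASS.get(name)
    while cur is not None:
        result.add(cur)
        cur = SUPERCLASS.get(cur)
    return result


def superinterfaces(name):
    """All interfaces reachable from a class or interface."""
    result, todo = set(), [name]
    while todo:
        cur = todo.pop()
        for iface in DIRECT_INTERFACES.get(cur, ()):
            if iface not in result:
                result.add(iface)
                todo.append(iface)
        sup = SUPERCLASS.get(cur)
        if sup is not None:
            todo.append(sup)
    return result


def check_cast(s, t):
    """Can a non-null object of run-time type s be cast to type t?"""
    if s.endswith("[]"):
        if t == "java.lang.Object" or t in ARRAY_INTERFACES:
            return True
        if t.endswith("[]"):
            # Array case: recurse on the component types (reference types only).
            return check_cast(s[:-2], t[:-2])
        return False
    if s in INTERFACE_TYPES:
        # Only reachable through the array recursion above.
        return t == "java.lang.Object" or t == s or t in superinterfaces(s)
    # Ordinary (nonarray) class.
    return t == s or t in superclasses(s) or t in superinterfaces(s)
```

Calling `check_cast("java.util.List[]", "java.util.Collection[]")` strips one array level and lands in the interface branch with S = List and T = Collection, which is the only path into that branch in this sketch.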

Unfortunately, even formal specifications can have errors, so I also specified a large testsuite that checks the specifications with concrete code. Here are some of the tests for CheckCast.

test Casting to self
  using database tests/hello/Empty.jar
  assert
    CheckCast("java.lang.Integer", "java.lang.Integer")

test Casting to superclasses
  using database tests/hello/Empty.jar
  assert
    CheckCast("java.lang.Integer", "java.lang.Number")
    CheckCast("java.lang.Integer", "java.lang.Object")

test Cast ArrayList to various superinterfaces
  using database tests/hello/Arrays.jar
  assert
    CheckCast("java.util.ArrayList", "java.util.List")
    CheckCast("java.util.ArrayList", "java.util.Collection")
    CheckCast("java.util.ArrayList", "java.io.Serializable")

test Cast class[] to implemented interface[]
  using database tests/hello/Arrays.jar
  assert
    CheckCast("java.util.ArrayList[]", "java.util.List[]")
    CheckCast("java.lang.Integer[]", "java.io.Serializable[]")

test Cast interface[] to superinterface[]
  using database tests/hello/Arrays.jar
  assert
    CheckCast("java.util.List[]", "java.util.Collection[]")

The tests are specified in a little domain-specific language for unit-testing Datalog that I implemented, initially for IRIS and later for LogicBlox. This tool is similar to parse-unit, a tool I wrote earlier for testing parsers in Stratego/XT. The concise syntax of a test encourages you to write a lot of tests. Domain-specific languages rock for this purpose!

December 04, 2008 15:03

July 11, 2008

Eelco Visser

WebWorkFlow: An Object-Oriented Workflow Modeling Language for Web Applications

The paper "WebWorkFlow: An Object-Oriented Workflow Modeling Language for Web Applications" by Zef Hemel, Ruben Verhaaf and Eelco Visser has been accepted for presentation at the conference on Model Driven Engineering Languages and Systems (MODELS 2008) to be held in Toulouse, France at the end of September 2008.

Abstract: Workflow languages are designed for the high-level description of processes and are typically not suitable for the generation of complete applications. In this paper, we present WebWorkFlow, an object-oriented workflow modeling language for the high-level description of workflows in web applications. Workflow descriptions define procedures operating on domain objects. Procedures are composed using sequential and concurrent process combinators. WebWorkFlow is an embedded language, extending WebDSL, a domain-specific language for web application development, with workflow abstractions. The extension is implemented by means of model-to-model transformations. Rather than providing an exclusive workflow language, WebWorkFlow supports interaction with the underlying WebDSL language. WebWorkFlow supports most of the basic workflow control patterns.

July 11, 2008 19:39

Heterogeneous Coupled Evolution

The paper "Heterogeneous Coupled Evolution" by Sander Vermolen and Eelco Visser has been accepted for presentation at the conference on Model Driven Engineering Languages and Systems (MODELS 2008) to be held in Toulouse, France at the end of September 2008.

Abstract: As most software artifacts, meta-models can evolve. Their evolution requires conforming models to co-evolve along with them. Coupled evolution supports this. Its applicability is not limited to the modeling domain. Other domains are for example evolving grammars or database schema. Existing approaches to coupled evolution focus on a single, homogeneous domain. They solve the co-evolution problems locally and repeatedly. In this paper we will present a systematic, heterogeneous approach to coupled evolution. It provides an automatically derived domain specific transformation language; a means of executing transformations at the top level; a derivation of the coupled bottom level transformation; and the ability to generically abstract from elementary transformations. The feasibility of the architecture is evaluated by applying it to data model evolution as well as grammar evolution.

July 11, 2008 19:35

July 01, 2008

Martin Bravenboer

New Conference on Software Language Engineering

This year is special. There is a new and exciting conference: the International Conference on Software Language Engineering (SLE). The deadline for submission of papers is July 14th, which is coming up soon! Before I start raving about the topics covered by this conference, here is the disclaimer: I'm on the program committee of this conference, and as such I believe it's my duty to advertise the conference.

Anyway, if done right, this conference has the potential to become a major and prestigious conference. The conference fills a clear gap: the topics of software language engineering do not exactly fit in major programming language conferences like OOPSLA, PLDI, POPL, and ECOOP. Nor do they fit exactly in the area of compiler construction (CC). CC does typically not accept more engineering or methodology-oriented papers. For OOPSLA and ECOOP the work more or less has to be in the context of object-oriented programming, for POPL it immediately has to be a principle (whatever that is), and for PLDI there are usually just a few slots available for papers that don't do something with memory management, garbage collection, program analysis, or concurrency. Personally, I've been pretty successful at getting papers in the area of software language engineering accepted at OOPSLA, but a full conference devoted to this topic is much better!

Another reason why I think that this conference has a lot of potential is that if I look at the list of topics of interest in the call for papers, then I can only think of one summary: everything that's fun! I'm convinced I'm not the only one who thinks these topics are fun. When talking to colleagues, I notice again and again most of us just love languages. The engineering of those languages is an issue for almost all computer scientists and many programmers in industry, and this conference will be the most obvious target for papers about this!

Also, the formalisms used for the specification and implementation of (domain-specific) languages are still very much an open research topic. Standardization of languages is still far from perfect, as discussed by many posts on this blog. Also, new language implementation techniques are being proposed all the time, and extensible compilers for developing language extensions are more popular than ever. Not to mention the increasing interest in using domain-specific languages to help solve the software development problems we're facing.

Earlier in this post I wrote that this conference has major potential if done right. There are a few risks. First, the conference has been started by two relatively small communities: ATEM and LDTA. I think the conference should attract a much larger community than the union of those two communities. I hope lots of people outside of the ATEM and LDTA communities will consider submitting a paper. Second, this year the conference is co-located with MODELS. Many programming language people are slightly allergic to model-driven engineering. I hope they will realize that this conference is not specifically a model-driven conference. Finally, the whole setup of the conference should be international and varied. I'm sorry to say that at this point I'm not entirely happy with the choice of keynote speakers. This is nothing personal: I respect both keynote speakers, but the particular combination of the two speakers is a bit unfortunate. First, they are both Dutch. Second, neither of them is extremely well-known in the communities of OOPSLA, PLDI, or ECOOP. I hope that this will not affect the potential of this interesting conference.

Now go work on your submission!

July 01, 2008 08:02

June 03, 2008

Martin Bravenboer

Dubious Conferences: How do they threaten people?

I just got a call for papers for the International e-Conference on Computer Science 2008 (IeCCS 2008). The IeCCS conference organizers and committee members are one of a kind! The submission deadline for papers is the 20th of June. Notification of acceptance is 25th of June. That's 5 days for reviewing. The camera ready deadline is 27th of June. That's 2 days for revising your paper. You've got to love the efficiency of these people! The 2007 edition of this conference has a program of 11 pages of accepted papers. When I review papers, I rarely get more than 2 done per day. If the IeCCS committee want to accept a similar number of papers this year, then they'd better make sure to get enough coffee (or tea, as advised in my Ph.D. thesis).

Now, if you are a computer science researcher, such emails are hardly surprising. I delete several of them every week. Everybody is aware of conferences with questionable reviewing practices (see the SCIgen paper generator). What surprised me about the IeCCS call for papers is that there is actually a researcher on the committee who I vaguely know from when I was a student. So, I searched the web a bit to see how obvious the evidence is that the reviewing practices of IeCCS are questionable. Interestingly, I could find only one reference that mentions IeCCS as a conference you'd better not submit to. It's an interesting presentation by somebody at PSU.

It seems that lists of conferences with a dubious reputation (also known as fake conferences) are impossible to keep up. I've seen a few lists in the past, but they've all disappeared. What interests me is why those lists are taken down. The most well known list, by Arlindo Oliveira, was taken down after receiving threats by conference organizers. I've never quite understood that: how serious can such a threat be? Maybe they'll publish a random paper with my name? They'll put me on the program committee next year?

So well, here we go. Let's see what happens.

Notice: IeCCS is not fake. It is very real!

June 03, 2008 07:14

May 26, 2008

Eelco Visser

The Nix Build Farm: A Declarative Approach to Continuous Integration

The paper "The Nix Build Farm: A Declarative Approach to Continuous Integration" by Eelco Dolstra and Eelco Visser has been accepted for presentation at the International Workshop on Advanced Software Development Tools and Techniques co-located with ECOOP 2008 in

The paper is the first to come out of the 3TU CEDICT buildfarm project (which badly needs a webpage).

Abstract: There are many tools to support continuous integration (the process of automatically and continuously building a project from a version management repository). However, they do not have good support for variability in the build environment: dependencies such as compilers, libraries or testing tools must typically be installed manually on all machines on which automated builds are performed. The Nix package manager solves this problem: it has a purely functional language for describing package build actions and their dependencies, allowing the build environment for projects to be produced automatically and deterministically. We have used Nix to build a continuous integration tool, the Nix build farm, that is in use to continuously build and release a large set of projects.

May 26, 2008 20:44

May 24, 2008

Karl Trygve Kalleberg

Create-a-Project: Creating Stratego/XT projects the simple way

This last week, I spent some free cycles hacking together a small project instantiation tool for Stratego/XT. It makes setting up a fresh Stratego project really simple by automatically populating the project space with a default directory layout, build system files and some minimal program and syntax samples.

To create a project p0, all you have to do is:

$ crap --new-project p0

This creates all the files necessary for a complete GNU Autotools-based build system, including a sample Stratego program (src/xmpl.str):

p0/
   Makefile.am
   README.Developer
   README
   AUTHORS
   bootstrap
   p0.spec.in
   NEWS
   p0.pc.in
   configure.ac
   ChangeLog
   xmpl/
        Makefile.am
   syn/
       Makefile.am
   tests/
         Makefile.am
   src/
       Makefile.am
       xmpl.str

Once this is done, you can configure and compile the project,

$ ./bootstrap
$ ./configure
$ make all

install it,

$ make install

and even run the example transformation program:

$ echo "foo" | /usr/local/bin/xmpl
"Hello, World!"

The example program expects an input on stdin [or in a file specified by the -i switch], and will always produce the output string "Hello, World!".

The crap tool is part of the strategoxt-utils package. You can also download a stand-alone snapshot of crap.

More comprehensive documentation is available in the wiki. The tool is still very rough, so any suggestions for improvements and bug reports are very welcome.

May 24, 2008 14:40

May 13, 2008

Eelco Visser

OOPSLA'08 tutorial: Building Domain-Specific Languages for the Web

Just got word that my proposal for an OOPSLA tutorial has been accepted. So make sure to register for it if you are planning to attend the conference.

Abstract: Implementing web applications in an object-oriented language such as Java using state-of-the-art frameworks produces robust software, but involves a lot of boilerplate code. Domain-specific languages (DSLs) increase the productivity of software engineers by replacing such low-level boilerplate code by high-level models, from which code can be generated. This tutorial shows how to find domain-specific abstractions based on patterns in existing (reference) programs and build domain-specific languages to capture these abstraction using several DSLs for DSL engineering: SDF for syntax definition and Stratego/XT for code generation. The approach is illustrated using the design and implementation of WebDSL, a domain-specific language for web applications, which provides abstractions for data models, page definitions, access control, workflow, and styling. The tutorial will show how code generation by model transformation is an important technique for separation of concerns in DSL implementations for designing a DSL as a tower of abstractions, rather than as a monolithic language.

May 13, 2008 19:28

May 11, 2008

Eelco Visser

WebDSL: A Case Study in Domain-Specific Language Engineering

About fifteen months ago I announced my "Domain-Specific Language Engineering" project that would result in a tutorial for the GTTSE'07 summer school. "This tutorial gives an overview of all aspects of DSL engineering: domain analysis, language design, syntax definition, code generation, deployment, and evolution, discussing research challenges on the way. The concepts are illustrated with DSLs for web applications built using several DSLs for DSL engineering: SDF for syntax definition, Stratego/XT for code generation, and Nix for software deployment." A rather bold statement, since at the time I didn't have a DSL yet. But I did manage to design and implement a first version of WebDSL before the summer school in July 2007. I also wrote a paper for the participants' proceedings. That version discussed the design process from analyzing programming patterns in Seam/JSF/Java to the design and implementation of the DSL using SDF and Stratego. (I never got around to the deployment part; Sander van der Burg has by now developed Nix expressions for building and deploying a WebDSL application on a web server.)

Now I have finished the final version of the paper for the proceedings to be published by Springer. The discussion of the WebDSL design and implementation has been much improved. But more importantly, the paper has two introductory sections about the 'domain-specific language engineering' process and three discussion sections evaluating WebDSL as a web engineering solution, discussing related DSL engineering approaches, and research challenges for language engineering. The resulting paper counts 85 pages (LNCS format) and 109 references. And still, it is only scratching the surface.

There is so much more to say about DSL design and implementation. The last couple of months I have been teaching a course on program transformation and generation in which we have explored paradigms for expressing analysis and transformation. During this time I have been contemplating how to structure the course entitled "model-driven software development" that I'll be teaching next year to a much larger group of master's students. I haven't resolved the issue yet. But my project for the next 10 months will be to write a comprehensive set of lecture notes (let's call it a book) on "domain-specific language engineering" covering methods and techniques for designing and implementing software languages, in particular, of course domain-specific languages. In all likelihood, languages for the web will again play a central role as examples.

E. Visser. WebDSL: A Case Study in Domain-Specific Language Engineering. In R. Laemmel, J. Saraiva, and J. Visser, editors, Generative and Transformational Techniques in Software Engineering (GTTSE 2007), Lecture Notes in Computer Science. Springer, 2008. Tutorial for International Summer School GTTSE 2007. [pdf]

Abstract: The goal of domain-specific languages (DSLs) is to increase the productivity of software engineers by abstracting from low-level boilerplate code. Introduction of DSLs in the software development process requires a smooth workflow for the production of DSLs themselves. This requires technology for designing and implementing DSLs, but also a methodology for using that technology. That is, a collection of guidelines, design patterns, and reusable DSL components that show developers how to tackle common language design and implementation issues. This paper presents a case study in domain-specific language engineering. It reports on a project in which the author designed and built WebDSL, a DSL for web applications with a rich data model, using several DSLs for DSL engineering: SDF for syntax definition and Stratego/XT for code generation. The paper follows the stages in the development of the DSL. The contributions of the paper are three-fold. (1) A tutorial in the application of the specific SDF and Stratego/XT technology for building DSLs. (2) A description of an incremental DSL development process. (3) A domain-specific language for web-applications with rich data models. The paper concludes with a survey of related approaches.

May 11, 2008 14:18

May 08, 2008

Eelco Visser

Stratego: 10 years

-------------------------------------------------------------------------
Dear author,

We are pleased to inform you that your paper entitled

     Building Program Optimizers with Rewriting [Strategies]

has been accepted for presentation at ICFP'98. In a next message, you
will receive reviews from the Program Committee that we hope you can
use to improve the final draft, which is due on July 14th. We will
also be sending you an ACM Copyright Release form, which must be
signed and returned by the same deadline.

Congratulations, and thank you for your submittal to ICFP'98.

  Paul Hudak
  Christian Queinnec
  co-Chairs, ICFP'98 Program Committee
-------------------------------------------------------------------------
This is the text of an email from June 21, 1998 that announced the acceptance at ICFP'98 of the first paper on Stratego, written with Zino Benaissa and Andrew Tolmach. The implementation of the language for the paper marked the first version of the language and compiler. The idea of traversal strategies had been done before, embedded in ASF+SDF. Just before the conference in September I managed to bootstrap the compiler. The language did not have a name in that paper yet. And the reviews were not very enthusiastic.

May 08, 2008 18:54

April 04, 2008

Eelco Visser

Declarative Access Control for WebDSL

The paper "Declarative Access Control for WebDSL: Combining Language Integration and Separation of Concerns" by Danny Groenewegen and Eelco Visser has been accepted for presentation at the International Conference on Web Engineering (ICWE'08), which will be held in July 2008 in Yorktown Heights, New York. [pdf]

I'm especially proud of this acceptance as it is (1) based on the Master's thesis work of Danny Groenewegen, and (2) the first paper about WebDSL to be accepted in the web engineering research community. (And also the first attempt; the other two papers featuring WebDSL appear in transformation venues.)

Abstract: In this paper, we present the extension of WebDSL, a domain-specific language for web application development, with abstractions for declarative definition of access control. The extension supports the definition of a wide range of access control policies concisely and transparently as a separate concern. In addition to regulating the access to pages and actions, access control rules are used to infer navigation options not accessible to the current user, preventing the presentation of inaccessible links. The extension is an illustration of a general approach to the design of domain-specific languages for different technical domains to support separation of concerns in application development, while preserving linguistic integration. This approach is realized by means of a transformational semantics that weaves separately defined aspects into an integrated implementation.

April 04, 2008 20:27

March 19, 2008

Eelco Visser

Code Generation by Model Transformation

The paper "Code Generation by Model Transformation" by Zef Hemel, Lennart Kats, and Eelco Visser was accepted for presentation at the International Conference on Model Transformation (ICMT'08).

Abstract: The realization of model-driven software development requires effective techniques for implementing code generators. In this paper, we present a case study of code generation by model transformation with Stratego, a high-level transformation language based on the paradigm of rewrite rules with programmable strategies that integrates model-to-model, model-to-code, and code-to-code transformations. The use of concrete object syntax guarantees syntactic correctness of code patterns, and supports the subsequent transformation of generated code. The composability of strategies supports two dimensions of transformation modularity. Vertical modularity is achieved by designing a generator as a pipeline of model-to-model transformations that gradually transforms a high-level input model to an implementation. Horizontal modularity is achieved by supporting the definition of plugins which implement all aspects of a language feature. We discuss the application of these techniques in the implementation of WebDSL, a domain-specific language for dynamic web applications with a rich data model.

March 19, 2008 11:14

March 18, 2008

Eelco Visser

Attribute Grammars in Stratego II

Today we had a follow-up on last week's discussion about an attribute grammar extension of Stratego. By now, Nicolas Pierron has created a proper extension of Stratego with attribute equations and made a translation to basic Stratego in combination with the Transformers run-time extension for attribute evaluation support. Next up is the port of the copy rules generator that makes writing attribute equations much less verbose. In the meantime Lennart Kats is working on a JastAdd-style implementation and Tony Sloane on an Eli-style (static scheduling) implementation. With these implementations in place we will be able to do some proper exploration of the combination of attribute evaluation and rewriting (strategies). I can't wait to make an implementation of the WebDSL typechecker using the attribute extension. To be continued.

March 18, 2008 11:21