Planet Stratego

You have a term and it don't look good, who're you gonna call?

July 08, 2010

Eelco Visser

TOOLS 2010 Portraits

I finally got back into the photo processing game by mastering selection with Adobe Bridge; first make a selection of photos and then only process the ones with highest rank; separation of duties. I shot some photos at last week's TOOLS 2010 conference, mostly during the dinners, since I was busy at other times. I uploaded the selection to a flickr set; here is a selection from the selection:

Jean Bezivin

Bertrand Meyer

Michael Ernst

Florian Heidenreich

Dennis Wagelaar

July 08, 2010 07:37

July 01, 2010

Eelco Visser

Lambdas in Spoofax

Today I've been participating in the Transformation Tool Contest 2010, which is held as part of the TOOLS 2010 conference in Malaga. The plan for my participation was the use of researchr for use in the evaluation of the results of the contest. Then it turned out that the example problem for the live part of the contest was evaluation of lambda terms. An excellent problem to tackle with Spoofax. So after doing some last improvements of researchr, I spent the afternoon creating a little lambda language environment.

The source code of the project is part of the Spoofax contributions directory.

Here is the Eclipse editor:

In the figure example.lam is the input for the transformation in concrete syntax; example.aterm is the abstract syntax for example.lam; example.nf.aterm is the result of normalizing example.aterm; finally, example.nf.lam is the result of pretty-printing the normal form. The result should be the correct addition of the lambda numeral for 2 with itself.

The syntax of the language is defined in SDF:
The Lambda.sdf module contains the definition of the concrete syntax. The Lambda.str module is automatically derived from Lambda.sdf and defines the abstract syntax (as an algebraic signature in Stratego).

The transformation is defined in Stratego. Here are the rules for pretty-printing and beta reduction:

Renaming rules are used to introduce the bindings of bound variables. The check rule verifies that no two definitions for the same name are given. The warning rule finds free variables and gives a warning in the editor.

Here's a screencast with a little demo to give an impression of the environment:

download

July 01, 2010 15:12

June 29, 2010

Eelco Visser

mobl

In my keynote talk at the Model-Driven Service Engineering workshop in Malaga (co-located with ICMT and TOOLS) I gave a presentation of mobl, the domain-specific language for mobile (web) application development that we (Zef Hemel) are developing.

June 29, 2010 14:58

June 21, 2010

Eelco Visser

Pure and Declarative Syntax Definition: Paradise Lost and Regained

"In the beginning were the words, and the words were trees, and the trees were words. All words were made through grammars, and without grammars was not any word made that was made. Those were the days of the garden of Eden. And there where language engineers strolling through the garden. They made languages which were sets of words by making grammars full of beauty. And with these grammars, they turned words into trees and trees into words. And the trees were natural, and pure, and beautiful, as were the grammars.

Among them were software engineers who made software as the language engineers made languages. And they dwelt with them and they were one people. The language engineers were software engineers and the software engineers were language engineers. And the language engineers made language software. They made recognizers to know words, and generators to make words, and parsers to turn words into trees, and formatters to turn trees into words.

But the software they made was not as natural, and pure, and beautiful as the grammars they made. So they made software to make language software and began to make language software by making syntax definitions. And the syntax definitions were grammars and grammars were syntax definitions. With their software, they turned syntax definitions into language software. And the syntax definitions were language software and language software were syntax definitions. And the syntax definitions were natural, and pure, and beautiful, as were the grammars."

Thus starts our essay on declarative syntax definition, which has been accepted for presentation at the Onward! 2010 conference:
Lennart C. L. Kats, Eelco Visser, Guido Wachsmuth. Pure and Declarative Syntax Definition: Paradise Lost and Regained. In Proceedings of Onward! 2010. ACM, 2010. (draft of final version)

Abstract Syntax definitions are pervasive in modern software systems, and serve as the basis for language processing tools like parsers and compilers. Mainstream parser generators pose restrictions on syntax definitions that follow from their implementation algorithm. They hamper evolution, maintainability, and compositionality of syntax definitions. The pureness and declarativity of syntax definitions is lost. We analyze how these problems arise for different aspects of syntax definitions, discuss their consequences for language engineers, and show how the pure and declarative nature of syntax definitions can be regained.

June 21, 2010 18:55

June 18, 2010

Eelco Visser

Building DSLs with Spoofax Language Workbench

The slides of my keynote talk at Code Generation 2010 with an appendix about model-to-model transformation and transformation strategies.

June 18, 2010 10:38

June 13, 2010

Eelco Visser

Strategies for Design & Implementation of Domain-Specific Languages

In the fifth lecture in the Model-Driven Software Development course at TU Delft I talked about strategies for design and implementation of domain-specific languages, illustrated with a sketch of the mapping from WebDSL to Java.

June 13, 2010 09:59

June 11, 2010

Eelco Visser

Encapsulating Software Platform Logic by Aspect-Oriented Programming

The Stratego transformation language was originally developed on and for the C/Linux platform. A couple of years ago we started porting the language and the Stratego/XT infrastructure to Java, which eventually enabled us to build the Spoofax Language Workbench. The back-end of the Stratego compiler was fairly unproblematic; the back-end for C is a bit over 1000 lines of code translating the compact Stratego Core language. Porting this was a matter of replacing the translation scheme. However, the extensive library, partly consisting on primitives implemented in C, depended on the Linux platform. To adapt the library to the Java platform, we extended Stratego with a simple aspect mechanism. Using override and proceed stragey definitions from libraries can be replaced with new definitions. The aspect extension and its application for achieving portability is discussed in the following paper (which has been accepted for the SCAM 2010 conference):

Lennart C. L. Kats, Eelco Visser. Encapsulating Software Platform Logic by Aspect-Oriented Programming: A Case Study in Using Aspects for Language Portability. In Cristina Marinescu, Jurgen J. Vinju, editors, Proceedings of the Tenth IEEE International Working Conference on Source Code Analysis and Manipulation 2010. 2010.

Abstract: Software platforms such as the Java Virtual Machine or the CLR .NET virtual machine have their own ecosystem of a core programming language or instruction set, libraries, and developer community. Programming languages can target multiple software platforms to increase interoperability or to boost performance. Introducing a new compiler backend for a language is the first step towards targeting a new platform, translating the language to the platform's language or instruction set. Programs written in modern languages generally make extensive use of APIs, based on the runtime system of the software platform, introducing additional portability concerns. They may use APIs that are implemented by platform-specific libraries. Libraries may perform platform-specific operations, make direct native calls, or make assumptions about performance characteristics of operations or about the file system. This paper proposes to use aspect weaving to invasively adapt programs and libraries to address such portability concerns, and identifies four classes of aspects for this purpose. We evaluate this approach through a case study where we retarget the Stratego program transformation language towards the Java Virtual Machine.

June 11, 2010 13:18

June 09, 2010

Eelco Visser

Web Abstractions

The third and fourth lectures in the course on model-driven software development that I teach at TU Delft. I talked about domain-specific abstractions in the domain of web programming, as incarnated in the design of WebDSL.

The material is based on the following publications:

June 09, 2010 23:15

June 08, 2010

Eelco Visser

Code Generation by Model Transformation

The journal version of our ICMT 2008 paper has finally and officially been published in the Software and Systems Modeling journal.

Zef Hemel, Lennart C. L. Kats, Danny M. Groenewegen, Eelco Visser. Code generation by model transformation: a case study in transformation modularity. Software and Systems Modeling, 9(3):375-402, June 2010.

The big contribution with respect to the earlier paper is a new approach to type checking with Stratego. Instead of defining a type checker as a single traversal that takes care of name resolution, type analysis, error checking and generation of error messages, these aspects are defined separately and can also be used separately. This allows a much cleaner style for defining type checkers and it enables a smooth combination of type analysis and normalizing transformations. The style also turned out to work very well in Spoofax setting where analyses such as name resolution are used for multiple purposes in the IDE.

The paper is published under the open access regime, so you can read it even without subscription to the journal.

Abstract The realization of model-driven software development requires effective techniques for implementing code generators for domain-specific languages. This paper identifies techniques for improving separation of concerns in the implementation of generators. The core technique is code generation by model transformation, that is, the generation of a structured representation (model) of the target program instead of plain text. This approach enables the transformation of code after generation, which in turn enables the extension of the target language with features that allow better modularity in code generation rules. The technique can also be applied to ‘internal code generation’ for the translation of high-level extensions of a DSL to lower-level constructs within the same DSL using model-to-model transformations. This paper refines our earlier description of code generation by model transformation with an improved architecture for the composition of model-to-model normalization rules, solving the problem of combining type analysis and transformation. Instead of coarse-grained stages that alternate between normalization and type analysis, we have developed a new style of type analysis that can be integrated with normalizing transformations in a fine-grained manner. The normalization strategy has a simple extension interface and integrates non-local, context-sensitive transformation rules. We have applied the techniques in a realistic case study of domain-specific language engineering, i.e. the code generator for WebDSL, using Stratego, a high-level transformation language that integrates model-to-model, model-to-code, and code-to-code transformations.

June 08, 2010 11:39

May 31, 2010

Eelco Visser

Domain Analysis & Data Modeling

Second lecture in the course on model-driven software development that I teach at TU Delft. I talked about domain analysis, interaction design, and domain-driven design.

May 31, 2010 20:30

May 28, 2010

Eelco Visser

Spoofax 0.5

We're pleased to announce the 0.5 release of the Spoofax language workbench, an Eclipse plugin that seamlessly integrates Java versions of Stratego and SDF into Eclipse. Spoofax can be used to develop new languages and transformations based on SDF and Stratego in the Eclipse environment.

Stratego and SDF have traditonally been implemented using C, but to increase portability we have developed Java versions of the Stratego compiler and the JSGLR parser for SDF. These new implementations are seamlessly integrated into the Spoofax environment, but can also be used as stand-alone tools.

IDE support has become essential for developers to be productive with programming languages. Spoofax provides IDE support for Stratego and SDF for developers of languages and transformations. It also aids in the development of IDE support for new languages: from the first version of an !SDF grammar, an editor can be created for the language and used side-by-side with the definition in Eclipse. Using Stratego, the editor can be enhanced with transformations and semantic editor services such as reference resolving and content completion. The screenshot below illustrates some of the IDE features supported by editors created with Spoofax:
Spoofax editor features

Spoofax can be downloaded from spoofax.org or strategoxt.org/Spoofax. When installed in Eclipse, the plugin provides a "New project" wizard that creates a new skeleton project illustrating some of the Spoofax features. The website also includes a tour further showcasing the features of the workbench. For migrating C-based Stratego projects to Spoofax, please read our FAQ or contact us in case of other questions.

An overview of the architecture of Spoofax and how Spoofax can be used in the development of new languages and IDE services is given in the paper The Spoofax Language Workbench. Rules for Declarative Specification of Languages and IDEs by Lennart Kats and Eelco Visser, accepted for publication at SPLASH/OOPSLA 2010. Further documentation can be found on the Spoofax website.

May 28, 2010 21:41

May 03, 2010

Eelco Visser

Performing Systematic Literature Reviews with Researchr

Eelco Visser. Performing Systematic Literature Reviews with Researchr: Tool Demonstration. Technical Report TUD-SERG-2010-010, Software Engineering Research Group, Delft University of Technology, Delft, The Netherlands, May 2010. [researchr]

In this paper, I describe the integrated workflow for performing systematic reviews with researchr, a web application for management of bibliographic data. Researchr semantically links publications to authors, journals, proceedings, and conferences, supporting reliable browsing. Publications can be classified using public (shared) tags. Researchr has over a million publication records, mainly in computer science. The core of the collection is based on the DBLP database (as provided via its XML export), but is extended with contributions from users. Researchr is open for contributions; users can contribute missing publications and can make corrections to publication records in the database and add missing information such as abstracts and citations. The quality of such modifications is guarded by a reputation system. Users can use researchr to provide a profile of their research with publications. More importantly, the site supports literature reviews, by creating bibiographies, collections of publications about a topic of choice. In this paper I describe the elements of the systematic reviewing workflow in researchr. Performing a systematic review consists of creating a bibliography, defining and executing a search strategy, defining classification schemes, and reviewing and classifying papers.

May 03, 2010 13:13

April 30, 2010

Eelco Visser

Model-Driven Software Development: Introduction

I'll be posting my slides from a course on model-driven software development that I'm teaching at Delft University of Technology. Here's the first, introducing the course.

April 30, 2010 19:35

April 12, 2010

Karl Trygve Kalleberg

English as a Programming Language, part I

Together with Einar, I’ve been deeply immersed in domain analysis for our upcoming trading language for the past month — i.e., analyzing the domain of trading. One of the crucial things we’ve been focusing on, is the rich vocabulary you are likely to hear when listening to traders working together. Their vocabulary captures how trading ideas are transmitted between humans. As the philosopher Gadamer might have argued, we think in words. And traders have invested time and effort into learning “the trading words”, because with language comes the knowledge and vice versa.

We want to capitalize on their investment by exposing the raw computing power of modern computers using a familiar language. If our target group were mathematicians, the language and notation of choice would be + - / * f(x), and all the other weird symbols we learned in school (and many we didn’t learn). If our target group were programmers, formal languages like Java or C++ would be an good starting point for inventing a new variant for trading.

But our future users are traders first, then a little bit of mathematician, then programmers as a distant third. With this in mind, we have done some serious consideration into using natural language as a basis for expressing trading rules. As part of this consideration, I did a (perhaps too short, but very enlightening) survey of the state of the art of using English as a programming language. I’m sharing my findings here, since they might be of broader interest.

Background

Let me give you a very concrete idea of what we want: Our motivation for using English as a programming language is to be able to support trading rules written almost exactly like regular English sentences, e.g.

if seeing more than 3 news headlines
   from distinct sources
   containing 'Hurricane' and 'US'
   within 1 minute:
    flatten positions

Again, our users (c.f. Takuya) are domain experts, but not programmers. They will mostly likely be reading code more often than they write it. They will not be coding every day, only now and again. They’re usually not interested in spending all that much time getting into that specific frame of mind for solving complex problems that regular programmers love so much (you know what I mean only too well, admit it).

As anybody will tell you, the English language is rather unbounded and pretty rife with ambiguities. I’ve come across a wide range of efforts to manage this problem, and I’ll share some highlights from the various efforts here.

Natural Language Programming

One of the major selling points of the Natural Language Programming technique is the ability to write technical documents shared by computers and humans. The technique exposes to its user an iterative process whereby rules for turning sentences into code are formulated. This multi-step process goes as follows:

  1. Define your taxonomy of concepts — i.e. names of interest in your domain. In our case: prices, instruments, pips, pricing, trends, support, resistance, etc.
  2. Define top-level sentences using said concepts, e.g. define support if bottom is at same level within X minutes.
  3. Define top-level sentences in terms of sequences of other sentences which also include said concepts.
  4. When you reach the most primitive sentences, you define a mapping from these to code in an (any) existing programming language, somewhat like a template system.
  5. Repeat until there is a mapping for all paths from the top-level sentences to code.

The example provided at Wikipedia is more detailed:

If  U_ is 'smc01-control',  then do the following.
Define surface weights Alpha as "[0.5, 0.5]". Initialise matrix Phi 
   as a 'unit matrix'.
Define J as the 'inertia matrix' of Spc01.
Compute matrix J2 as the inverse of J.
Compute position velocity error Ve and angular velocity error Oe 
   from dynamical state X, guidance reference Xnow.
Define the joint sliding surface G2 from the position velocity error 
   Ve and angular velocity error Oe using the surface weights Alpha.
Compute the smoothed sign function SG2 from the joint sliding 
   surface G2 with sign threshold 0.01.
Compute special dynamical force F from dynamical state  X and 
   surface weights Alpha.
Compute control torque T and control force U from matrix J2, surface 
   weights Alpha, special dynamical force F , smoothed sign function SG2.
Finish conditional actions.

The technique is supported by a system called System English. There is a page listing a few more examples, all of them in the style of the example above.

I must admit that this is a rather intriguing technique. My present reservation is that all the examples are rather verbose, and a dreary read. The documents (i.e., programs) are very repetitive, with a lot of sentences starting with either define or compute. This might be perfectly fine for technical documentation. I’ll not make any hasty judgments about its applicability to trading before actually having tested it for our domain, though.

Maude

It might come as a surprise to find Maude on this list, but let me explain why I included it. Maude is a language for equational and rewriting logic specification and programming. That sounds scary, but let’s focus on the parts of Maude which allow you build sentence fragments, the operators.

op add _ to _  : Num List -> List

This defines the operator (sentence fragment) add to. Assuming list is a list, the definition allows you to write

add 1 to list

which (provided that the add to operator does its job correctly), will add 1 to a list of numbers.

Since you are free to define any operator, you can also start defining your own mathematical operators

op _ + _ : Num Num -> Num
op _ * _ : Num Num -> Num

This allows you to write expressions a * b + c, as you might expect. There is one crucial point here, however. The precedence of the operators are not fixed. You are free to define these yourself. This means that the expression just given, will be ambiguous. It’s not (yet) defined whether it should be taken to mean (a * b) + c or a * (b + c). As you might guess, these are wrinkles that are rather important to iron out.

Maude comes to the rescue with a precedence annotation:

op _ + _ : Num Num -> Num [prec 35]
op _ * _ : Num Num -> Num [prec 25]

Think of this as a little bit of grammatic chemistry: the operator with the lower precedence will bind stronger to its arguments. With the precedence annotation in place, the expression a * b + c is now always interpreted as (a * b) + c, because * binds stronger than ‘+’.

The operator + brings another ambiguity to the table: associativity. When writing a + b + c, should it be taken to mean (a + b) + c or a + (b + c). The precedence cannot help us, since it will always be the same for the two “competing” pluses (plus competing against itself). As it happens, for the + operator, it really doesn’t matter. Both alternatives are the same. In mathematics, an operator with this property is called associative. Maude provides the [assoc] annotation for marking an operator with this property. Commutativity (that x op y is the same as y op x) is marked by [comm].

However, there may be cases with an operator is not associative, such as the divide operator. The Maude 2.0 Primer suggests that vs is also such an operator:

op _ vs _ Player Player -> Player [comm]

Let’s assume we’re back in the early 90s and that the following battle is at hand: Chun-Li vs Ryu vs Ken. If we read this is as (Chun-Li vs Ryu) vs Ken the winner of Chun-Li and Ryu will face Ken. In the interpretation Chun-Li vs (Ryu vs Ken), Chun-Li will face the winner of Ken and Ryu. Clearly not the same meaning.

Again, Maude comes to the rescue with an annotation: gather. Gather takes a number of flags equal to the number of arguments for your operator (in this case two). Each gather flag tells us something about the precedence of the operator found at  argument. A flag may be either of e (strictly less), E (less or equal), & (any). Confuzzled? Let’s try it out.

op _ vs _ Player Player -> Player [comm] [prec 33] [gather(e E)]

This defines that the first argument to vs must be of precedence strictly less than that of vs. Since the precedence of vs can never be strictly less than itself; (Chun-Li vs Ryu) vs Ken is therefore not allowed — the only valid interpretation left is Chun-Li vs (Ryu vs Ken). With that ambiguity out of the way, we can proceed to the real action. Fight!

Clearly, these mechanisms make Maude very expressive. In practice, Maude allows programmers to fairly freely define their own language with a rather natural looking syntax for any library they might want to design. The downside is that mastering Maude’s disambiguation mechanics is an art unto its own.

Action Semantics

Another little language with a very English-like syntax I’ve come across in previous travels is that of Peter D. Mosses’ formalism for defining semantics for programming languages. I.e., a “programming language” for defining the meaning of programming languages. The following, which tells how the identifier I is to be evaluated, should provide some flavor:

evaluate I = give the expressible bound to I 
             or 
             enact the abstraction bound to I

While not exactly in widespread use, it has been used to define semantics for some non-trivial languages, such as the dynamic semantics for Pascal.

Mosses and colleagues have written research tools to fully generate compilers based on language definitions in the action semantics formalism. The value proposition of Action Semantics is pretty clear: it allows you define your programming language at a high level, and then generate language tools such as interpreters and compilers (almost) for free — provided your language fits into the AS formalism.

Metafor

Metafor was a research prototype of a user interface for visualizing interactively typed stories as code (not available for general consumption). The idea was that users type English sentences describing their domain. Metafor then continually analyzes the typed story and maintains a “Python-view”. The Python view contains the structure of the code.

For example, when the user types the code:

There is a bar with a bartender who makes drinks.

Metafor would generate the following code:

class bar:
    the_bartender = bartender()
class bartender:
    def make(drink): pass

The “translation” from English to code is done using a slew of natural language processing tricks, including a natural language parser and a large database of  English “common sense” sentences from which “knowledge” about the various words is extracted.

As far as I know, the system was only ever able to generate the outline of your program. The larger, and perhaps more interesting, problem of writing program logic in English is still left as an exercise to the reader.

An improved version of Metafor be useful for doing structural/architectural design of code and classes, but the project seems discontinued. Fortunately, some of the appealing properties of Metafor are available in the Inform-series of adventure game/interactive fiction-creators.

Inform 7

The Inform system might be seen as a continuation of the good old Infocom-style of games, but on steroids and done really well. Inform is a design system for interactive fiction based on natural language. It comes with an “IDE” for developing interactive fiction, or text-based adventures, if you will.

You can easily define locations and entities:

The wood-slatted crate is in the Gazebo.
The crate is a container.
Mr Jones wears a top hat.
The crate contains a croquet mallet.

In the nomenclature of Inform, these are assertions. A number of built-in verbs are used to state assertions: to beto haveto carryto wearto contain and to support. It is possible for users to extend this set of verbs.

Based on the above sentences, the system will create the objects ‘crate’, ‘Mr Jones’ and ‘croquet mallet’. It will even place the croquet mallet inside the crate, so that the player must open it to find the mallet.

All in all, the system seems very well crafted and is the most impressive example of natural language programming I have seen to date. Again, I have no practical experience using it, so I cannot offer any advice on how expressive such and approach be might be for areas other than interactive fiction.

AppleScript

Perhaps the English-like language with the most widespread use, is that of AppleScript, as found on all Apple computers today. AppleScript was designed in the early 90s and is a descendent of the HyperCard hype of the late 80s. According to its creators, AppleScript is aimed at “casual programmers”. This includes programmers of all experience levels who are primarily out to solve problems, not write programs.

A simple AppleScript program for manipulating Excel:

tell application ‘‘ Excel ’’ on machine x
    put 3.14 into cell 1 of row 2 of window 1
end

Another script for doing a bit of cut-n-pasting:

copy the name of the first window of application ‘‘ Excel ’’
to the end of the first paragraph of app ‘‘ Scriptable Text Editor ’’

I came across this excellent report: AppleScript. William R. Cook, University of Texas at Austin With contributions from Warren Harris, Kurt Piersol, Dave Curbow, Donn Denman, Edmund Lai, Ron Lichty, Larry Tesler, Mitchell Gass & Donald Olson September 29, 2006.

This report is a really interesting read to anyone who considers using English as a programming language. To set the stage, let’s start with their conclusion:

Many of the current problems in AppleScript can be traced to the use of syntax based on natural language; however, the ability to create pluggable dialects may provide a solution in the future, by creating a new syntax based on conventional programming languages.

The English-like syntax, initially touted as a great way to capture the minds and souls of non-programmer and programmer alike, turned out to be the major Achilles heel of the language. Hmmm.

In the discussion, the authors go into a bit more detail:

The experiment in designing a language that resembled natural languages (English and Japanese) was not successful. It was assume that scripts should be presented in “natural language” so that average people could read and write them. […] In the end the syntactic variations and flexibility did more to confuse programmers than to help them out. The main problem is that AppleScript only appears to be a natural language. In fact is an artificial language, like any other programming language. […] It is easy to read AppleScript, but quite hard to write it.

When writing programs or scripts, users prefer a more conventional programming language structure. Later versions of AppleScript dropped support for dialects. In hindsight, we believe that AppleScript should have adopted the Professional Dialect that was developed but never shipped.

So, the main problem seems to stem from AppleScript not providing “English as we know it”. It has a surface syntax which reads pretty much like English. However, when trying express yourself in AppleScript, most of the grammar rules, idioms, proverbs and personal style you are accustomed to, is simply not there. Instead, you are forced to mould your ideas into a sort of Pidgin English, the AppleScript English, which doesn’t really behave like you expect. And the worst thing programmers know is when things don’t behave as they expect.

In retrospect, this might not come as a surprise. Many, if not most, programming languages are designed to be somewhat syntactically and grammatically minimal. There is often only one, or at most a few, ways of stating the basic things. Based on a set of core language constructs, you can build successively larger programs. But the set of core constructs behave rather simply and predictably. (Yes, Perl is a rather famous example of the opposite.)

The Osmosian Order

Osmosian Order is not as much a project as it appears to be a movement. According to their manifesto:

The Osmosian Order of Plain English Programmers is a group of like-minded developers and educators dedicated to the rescue of computer science from the pervasive fog of confusion engulfing it today.

They even have a plan:

Our initial goal is to see Plain English (and other natural-language variants, such as Plain Spanish and Plain German) adopted as de facto standard languages.

To that end, they have (allegedly) written a compiler and development environment for programming using Plain English. An example Plan English program is given below:

To create the background:
    Draw the screen's box with the white color.
    Loop.
    Pick a spot anywhere in the screen's box.
    Pick a color between the lightest gray color and the white color.
    Dab the color on the spot.
    If a counter is past 80000, break.
    If the counter is evenly divisible by 1000, refresh the screen.
    Repeat.
    Extract the background given the screen's box. 
       \or Create the background from the screen.
    Or something.

To finish a work:
    If the work is nil, exit.
    If the work is finished, exit.
    Create a picture given the work's URL.
    If the picture is nil, exit.
    Resize the picture to 5-1/2 inches by 5-1/2 inches.
    Center the picture in the screen's box.
    Draw the background.
    Draw the picture.
    Loop.
    Pick a spot anywhere near the picture's box.
    Mix a color given the spot.
    Dab the color on the spot.
    If a counter is past 20000, break.
    Repeat.
    Extract the work's painting given the picture's box.
    Destroy the picture.

Reputedly, The Order provides a fast compiler which is able to turn the stylized English above into an executable Windows program. The compiler isn’t freely downloadable, but their web page says you can get it by e-mail.

Tentative conclusion

Using English as a programming language is definitely not a no-brainer. There are some hard trade-offs to be juggled.

My present thinking is that if the price of predictability is the cost of learning a new language, people with extensive scripting needs will pay it gladly. If this cost is kept low by providing a set of core language constructs already known from other programming languages, getting started with scripting becomes trivial. While the ultimate goal will always be do-what-I-mean (and not as I think or say) computers, an intermediate step seems to be where most computer literates have a working knowledge of scripting. As long as most scripting languages resemble each other, it’s sort of like knowing any one of the Scandinavian languages — you can easily get by in all of Scandinavia knowing just one of them.

April 12, 2010 08:45

March 15, 2010

Eelco Visser

Talk about the Researchr Digital Library

At the recent IFIP WG 2.11 meeting in St Andrews, I gave a talk about researchr, the digital library application I have been working on in the past year. Here is the recording of that talk, which gives an introduction to the features of the tool.

Download: 768x576

March 15, 2010 11:04

March 14, 2010

Eelco Visser

Spoofax: The Language Workbench

At the recent IFIP WG 2.11 meeting in St Andrews, I gave a demonstration talk about the Spoofax language workbench. I made a recording. The talk is, eh, a little, eh, rough, but should be useful to give an impression of the basic features of Spoofax.

Download: 512x384, 768x576, 1024x768

March 14, 2010 22:20

March 13, 2010

Eelco Visser

The Unbearable Fragility of Software Configuration

Issue 52 of the Spoofax project in YellowGrass is a perfect illustration of the fragility of software configuration; the presence or absence of a line break makes the difference between a working and a failing system. What is worse is that the error is very hard to detect and requires a hunch from an expert. If the eclipse.ini file would have a proper syntax and static semantic constraints, the error could be caught right on load time, or preferably when editing the file in the first place. Configuration files are models in a domain-specific language that should be treated just like the classes of a Java program.

March 13, 2010 17:44

February 27, 2010

Eelco Visser

Publication Clouds

Previously researchr supported 'tag cloud' views for bibliographies providing a quick overview of the important and lesser important topics in a set of publications collected in a bibliography. Similarly, an 'author cloud' gives an overview of the authors of a bibliograpy, emphasizing authors with many papers.

Now clouds are not just provided for tags and authors, but also for year, publication type, and publication venue. Furthermore, cloud views are provided for all 'publication lists' in researchr: author profile, alias page (publications authored by author with same name), bibliography, tag, and also for the publications of the editions of a conference. The latter provides a nice overview of the community and topics of a conference series.

February 27, 2010 21:01

February 20, 2010

Eelco Visser

Coloring Language Aspects

I was looking for a way to illustrate the integration of different language aspects in WebDSL. After tedious experimentation with graphical editors to highlight the parts of a program, it dawned on me that the syntax coloring configuration of the Spoofax editor for WebDSL is the perfect tool for the job. Spoofax provides DSL for syntactic editor services, such as syntax coloring. The Spoofax generator, generates default configurations for the various services. For instance, for syntax coloring the following settings (left) are generated. To illustrate the role of data models, user interface templates, and expressions/actions I added the following customization of the default (right):

The result can be seen in the following fragments consisting of an entity declaration (data model) and a page definition with form and action:

I find it hard to choose good colors (e.g. yellow provides bad contrast on white), but this example nicely show how Spoofax allows assigning colors to syntactic categories, not just tokens.

February 20, 2010 20:17

February 19, 2010

Eelco Visser

WebDSL Applications

February 19, 2010 09:48

February 12, 2010

Eelco Visser

Building IDEs for Domain-Specific Languages with Spoofax/IMP

At the IFIP WG 2.11 meeting in March I'll give a talk on Spoofax/IMP; I'll try to record the talk and post it here.

Domain-specific languages are a key component of program generation. While we have ample experience building code generators and compilers, modern software developers expect integrated development environments such as Eclipse and Visual Studio to boost their productivity. To achieve the productivity gain promised by domain-specific languages, it is required that they come with strong IDE support. Since the production of DSLs cannot afford the effort that is put into IDEs for general-purpose languages, better tools are needed.

In this talk I present Spoofax/IMP, an Eclipse plugin for creating Eclipse plugins for custom (domain-specific) languages. Spoofax integrates several domain-specific languages for language definition. SDF supports modular, declarative definition of syntax with arbitrary context-free grammars integrating lexical syntax. A new implementation of the SGLR parser for SDF supports sophisticated error recovery. Stratego supports model transformation, code generation, static analysis, and refactoring with rewrite rules and programmable strategies. Editor service DSLs support declaration and customization of syntactic editor services such as syntax highlighting, folding, outline views, and semantic editor services such as error checking, cross references, content completion, and refactorings.

I'll introduce Spoofax by discussing a definition for a subset of the WebDSL web programming language. The talk will be partly slides and partly demonstration.

February 12, 2010 15:00

November 13, 2009

Eelco Visser

Composing Domain-Specific Languages

Eelco Visser Last week I was in Bergen (Norway) for the PhD thesis defense of Anya Bagge and the opening of the Bergen Language Design Laboratory. For the occasion of the opening a few people where asked to talk about the history and future of programming language design. Don Sannella talked about the failure of the Extended ML formal methods project. Bjarne Stroustrup talked about social issues of language design such as the (enormous) impact of C++ and the hassles of language standardization (and took my picture during the BLDL dinner). Paul Klint gave a flavour of his new Rascal meta-programming language. Horacio Bouzas and Carl Seger talked about the role of languages in oil exploration and chip design.

As a long time collaborator of the Bergen group, I also was asked to shine my light on language design. Instead of giving a straight up talk about Stratego or WebDSL, I decided to try to give a somewhat more general slant on the currently dominant theme of our research at TU Delft: composition of domain-specific languages. Today I extended the talk for delivery at the TU Eindhoven CS colloquium. I like the style of the talk and it gave rise to a good discussion about the trade-offs between internal and external DSLs afterwards. This is also the start of the development of a lectures series for my MDSD course in the Spring semester. Since there was an expression of interest, I've uploaded my slides. They're presentation-zenish, so there's not a whole lot of explanatory text and may not be useful without me talking about them. Feedback is welcome.

Abstract The history of programming languages shows a progressive development from low-level programming languages close to the machine, to high-level languages close to the problems being solved with software. Domain-specific languages take this a step further than general purpose programming languages by making assumptions about the class of applications for which the language is intended. Complete applications typically require programs in multiple (technical) domains, which can be catered for by separate domain-specific languages. While such separation of concerns is beneficial for domain expressivity, it often leads to loose coupling and lack of static verification. Hence, the design of individual DSLs needs to be complemented with their linguistic integration.

In this talk, I illustrate these ideas with the design of WebDSL, a domain-specific language for data centric web applications. WebDSL linguistically integrates the definition of data models, user interfaces, actions, access control rules, data validation rules, styling rules, and workflow definitions. While maintaining separation between these concerns through specialized sub-languages, linguistic integration ensures static consistency checking and correct code generation. The language allows developers to concentrate on the essential design of web applications, abstracting from accidental complexity, such as the details of data persistence. The combination of high-level and low-level constructs ensures high expressivity, while supporting customization to application requirements.

November 13, 2009 00:06

October 23, 2009

Eelco Visser

An Eclipse Editor for WebDSL

This afternoon I created an Eclipse editor for WebDSL from just its syntax definition using Spoofax/IMP in a matter of minutes. The editor provides syntactic editor services such as syntax highlighting, folding, outlining, error recovery; all out of the box. Here's a screenshot with an impression.

Note how my decision to include 'sections' in WebDSL turns out to play very nice with code folding. Spoofax/IMP also supports semantic error services such as displaying type check errors and cross-references, but that requires the WebDSL compiler to be refactored from a batch compiler to an incremental compiler.

October 23, 2009 23:16

October 18, 2009

Eelco Visser

Take the red PIL (or choose any other color you like)

The Platform Independent Language PIL now has its own website with a manual for the language and downloads of the compiler. You can use PIL as target for your domain-specific language and have it work on all PIL supported platforms. Currently only Java and Python are supported, but creation of PIL back-ends is cheap, probably much cheaper than making dedicated back-ends for your own DSL. You can get involved and contribute a new back-end for a new platform. To start we would like to have back-ends for PHP, C#, C, and Objective C, but any others are welcome too.

October 18, 2009 20:35

October 10, 2009

Eelco Visser

The Big Scheme of Things: Episode 2 -- Templates

While attending GPCE, SLE, and MoDELS in Denver, I recorded episode 2 of The Big Scheme of Things screencast series. Using a simple to-do application as example application, I illustrate various aspects of the use of WebDSL, a domain-specific language for web applications.

In this episode, I show the use of templates to divide a page definition into smaller fragments, the definition of action in templates, and the use of icons as actions.

Download: 512x384, 768x576, 1024x768

[This is as much an experiment in screencasting, as it is a WebDSL tutorial. Feedback is welcome. Thanks for the feedback on the previous episode. For this recording I have used a proper microphone (Rode Podcaster) instead of the built-in microphone of my MacBook; sound quality should be better. I still have to look into other formats than Quicktime; suggestions for useful formats are welcome. An iTunes feed is in the planning as well. There was a request for a start and end tune; apart from the requirement that such a tune should be copyright free, music is not my strong point; suggestions are welcome.]
Continue reading "The Big Scheme of Things: Episode 2 -- Templates"

October 10, 2009 13:53

October 05, 2009

Eelco Visser

Integration of Data Validation and User Interface Concerns in a DSL for Web Applications

One of the tricky aspects of getting a web application right, is data validation. That is, checking that inputs to forms is correct in some sense that is relevant for the application. For example, the input for a year field should be an integer that represents a year that sensible in the context of the application. Data validation actually goes beyond checking validity of single input fields in a form and includes invariants over the data model that should be maintained, and assertions that should hold at some point in processing input data.

The problem of data validation is two-fold. First, we'd like a declarative way to specify validation rules. Second, the checking of these rules should be integrated in the rest of the application. In particular, injecting error messages in the UI. In a paper that Danny Groenewegen will present this afternoon at the Software Language Engineering (SLE 2009) we describe how we have extended WebDSL with declarative data validation rules that are checked automatically and for which error messages are injected in the UI.

Danny M. Groenewegen, Eelco Visser. Integration of Data Validation and User Interface Concerns in a DSL for Web Applications. In Mark G. J. van den Brand, Jeff Gray, editors, Software Language Engineering, Second International Conference, SLE 2009, Denver, USA, October, 2009. Revised Selected Papers. Lecture Notes in Computer Science, Springer, 2009. [researchr]

Abstract: Data validation rules constitute the constraints that data input and processing must adhere to in addition to the structural constraints imposed by a data model. Web modeling tools do not address data validation concerns explicitly, hampering full code generation and model expressivity. Web application frameworks do not offer a consistent interface for data validation. In this paper, we present a solution for the integration of declarative data validation rules with user interface models in the domain of web applications, unifying syntax, mechanisms for error handling, and semantics of validation checks, and covering value well-formedness, data invariants, input assertions, and action assertions. We have implemented the approach in WebDSL, a domain-specific language for the defi- nition of web applications.

October 05, 2009 09:34

October 04, 2009

Eelco Visser

Quine In Stratego

Tonight I was reminiscing with Tijs van der Storm about conferences of the past, in particular the day that I wrote a Quine in Stratego at OOPSLA 2004. I reported about that in my pre-blog and at the Stratego/XT site, but not in this blog. Since it was a fun example, I thought it deserved a place on my official blog. Here goes:

After GPCE/OOPSLA in Vancouver Tijs van der Storm challenged me to write a Stratego program that prints its own source. So I set to work, with the following result. I'll make it educational and reconstruct the design process.

The basic idea

The core of a self printing program is that it should contain its own source and duplicate that. The following program implements this idea, but with 'prefix' and 'suffix' instead of the
----------------------------------
module quine 
imports lib
strategies
  main =
    !["prefix", "suffix"]
    ; \ [x, y] -> [x, x, y, y] \ 
    ; concat-strings
    ; (stdout, [])
    ;  0
-----------------------------------
Compiling and running this program produces:
prefixprefixsuffixsuffix

Quoting the source

Next we replace 'prefix' and 'suffix' by the prefix and suffix of the program with respect to the list. Note that we don't bother with the layout of the quoted fragment. Note that the backslashes of the anonymous rewrite rule need to be escaped in the string.
------------------------------------------------------------------------------------------
module quine 
imports lib
strategies
  main =
    !["module quine imports lib strategies main = ![", 
      "]; \\ [x, y] -> [x, x, y, y] \\; concat-strings; (stdout, []);  0"]
    ; \ [x, y] -> [x, x, y, y] \ 
    ; concat-strings
    ; (stdout, [])
    ;  0
------------------------------------------------------------------------------------------
The output of this program is
------------------------------------------------------------------
module quine imports lib strategies main = ![module quine 
imports lib strategies main = ![]; \ [x, y] -> [x, x, y, y] \; 
concat-strings; (stdout, []);  0]; \ [x, y] -> 
[x, x, y, y] \; concat-strings; (stdout, []);  0
------------------------------------------------------------------
which is starting to look good, but not quite there, since it doesn't compile.

Getting the quotes right

What we have to do now is introduce quotes in the printed program, such that the pieces of code in the list are actually parsed as strings.
------------------------------------------------------------------------------------------
module quine 
imports lib
strategies
  main =
    !["module quine imports lib strategies main = ![\"", 
      "\"]; \\ [x, y] -> [x, x, y, y] \\; concat-strings; (stdout, []);  0"]
    ; \ [x, y] -> [x, x, "\",\"", y, y] \ 
    ; concat-strings
    ; (stdout, [])
    ;  0
------------------------------------------------------------------------------------------
The output of this program is
------------------------------------------------------------------
module quine imports lib strategies main = !["module quine 
imports lib strategies main = !["",""]; \ [x, y] -> [x, x, y, y] \; 
concat-strings; (stdout, []);  0"]; \ [x, y] -> 
[x, x, y, y] \; concat-strings; (stdout, []);  0
------------------------------------------------------------------
which still does not compile because of the doublequotes embedded in the string.

Getting the quotes more right

We need to escape the embedded strings. This can be achieved easily by using the escape strategy from the library, which escapes doublequotes and backslashes.
------------------------------------------------------------------------------------------
module quine 
imports lib
strategies
  main =
    !["module quine imports lib strategies main = ![\"", 
      "\"]; \\ [x, y] -> [x, x, \"\\\",\\\"\", y, y] \\; concat-strings; (stdout, []);  0"]
    ; \ [x, y] -> [x, x, "\",\"", y, y] \ 
    ; concat-strings
    ; (stdout, [])
    ;  0
------------------------------------------------------------------------------------------
This produces (without the newlines)
-----------------------------------------------------------------------------
module quine imports lib strategies main = !["module quine imports 
lib strategies main = ![\"","\"]; \\ [x, y] -> [x, x,
 \"\\\",\\\"\", y, y] \\; concat-strings; (stdout
, []);  0"]; \ [x, y] -> [x, x, "\",\"", y, y] \; 
concat-strings; (stdout, []);  0
-----------------------------------------------------------------------------

Bootstrapping

Now the last program is not exactly the same as its predecessor, but if we compile and run it, it produces its own source literally.

October 04, 2009 21:18

October 03, 2009

Eelco Visser

Abstractions for Mobile Applications (Postdoc Position)

(Update: the position has been filled)

Yesterday, I had an interesting phone discussion with researchers from T-Mobile, which strengthened me in my plan to make the next MoDSE case study about abstractions for mobile applications, and to make that the focus for the open postdoc position in the project.

The Software Engineering Research Group of Delft University of Technology has a vacancy for a postdoc position in the “Delft University of Technology has a vacancy for a postdoc position in the “Delft University of Technology has a vacancy for a postdoc position in the “Delft University of Technology has a vacancy for a postdoc position in the “Model-Driven Software Evolution (MoDSE)” project funded by the Jacquard Software Engineering Research program of NWO.

Job description: The candidate will lead the next MoDSE case study in domain-specific language engineering into the domain of mobile applications. The case study will consist of finding high-level, platform independent abstractions for mobile applications and creating a collection of domain-specific languages covering the domain, building on the experience and tools from the WebDSL case study.

Requirements: The candidate should have a PhD degree in Computer Science with a strong background in software engineering, specifically software language engineering, model-driven engineering, and/or programming languages. Ideally the candidate has experience with language design and implementation.

Contract: 2 years, 8 months Salary: Maximum of €3755 per month gross depending on past experience. TU Delft offers an attractive benefits package, including a flexible work week, free high-speed Internet access from home, and the option of assembling a customized compensation and benefits package (the 'IKA'). Salary and benefits are in accordance with the Collective Labour Agreement for Dutch Universities.

How to apply: Send applications consisting of an application letter and a detailed CV in PDF by email to Eelco Visser

October 03, 2009 01:06

September 21, 2009

Eelco Visser

The Big Scheme of Things: Episode 1

WebDSL is a domain-specific language for building web applications. WebDSL avoids the boilerplate that is common in other web technologies by providing tuned domain-specific notations for the concerns of web development. Furthermore, the consistency of these concerns is verified at compile-time. WebDSL is available from webdsl.org. There you can also find instructions for installing the compiler and the other components that you need to run WebDSL applications. In this series of screencasts I will illustrate the use of WebDSL, by building a web application from scratch. The application we are going to build is called 'The Big Scheme of Things', a to-do list application. In this first episode I keep it really simple and create a single page app that allows us to add and remove tasks. First I show how to get an even simpler "Hello web!" app running.

Download: 512x384, 768x576, 1024x768

(This is as much an experiment in screencasting, as it is a WebDSL tutorial. Feedback is welcome.)
Continue reading "The Big Scheme of Things: Episode 1 "

September 21, 2009 08:14

September 09, 2009

Eelco Visser

PIL: A Platform Independent Language for Retargetable DSLs

zef hemel WebDSL is not the first domain-specific language I have designed and implemented. With each language the question is which platform to develop the language for. That is, which language should the compiler generate or should the interpreter be written in and which libraries to use with that language as run-time system. After the first generation of the ASF+SDF MetaEnvironment that was written in LeLisp, we decided in the 1990's that the second generation should target C as a fairly portable and ubiquitous 'assembly language'. Thus, the prototype implementation of the SGLR parser for SDF was written in C. For my next project, the transformation language Stratego, I also chose C as platform with the ATerm library as run-time system providing term representation and garbage collection. When I started work on WebDSL in 2007, Java and the JVM provided the best platform for web application development. Thus, Java is the target language generated from WebDSL applications. (The set frameworks used to create the run-time system is subject to change.)

SDF, Stratego, and WebDSL transcend the platforms they are implemented on. Each of the languages provides abstractions that are independent of the implementation platform and could easily be used on other platforms as well. However, creating back-ends for all sorts of platforms is a tedious and laborious undertaking, and then maintaining the consistency of these back-ends is worse. This problem is not unique to languages that I come up with, but is rather a universal problem for domain-specific languages.

To address this problem as part of our work on a general approach to the development and evolution of domain-specific languages, we are developing PIL, a platform independent language that abstracts from the differences between object-oriented languages. Back-ends that translate PIL to languages such as Java and Python are much smaller than the back-end of the WebDSL compiler, which encapsulates a lot of domain-specific implementation knowledge about the domain of web applications. With PIL we hope that we can not only develop domain-specific platform-independent abstractions, but also provide portable implementations for a variety of platforms. A paper about PIL has been accepted for the second conference on Software Language Engineering:

Zef Hemel, Eelco Visser. PIL: A Platform Independent Language for Retargetable DSLs. In Mark G. J. van den Brand, Jeff Gray, editors, Software Language Engineering, Second International Conference, SLE 2009, Denver, USA, October, 2009. Lecture Notes in Computer Science, Springer, 2009. (to appear) [researchr]

Abstract: Intermediate languages are used in compiler construction to simplify retargeting compilers to multiple machine architectures. In the implementation of domain-specific languages (DSLs), compilers typically generate high-level source code, rather than low-level machine instructions. DSL compilers target a software platform, i.e. a programming language with a set of libraries, deployable on one or more operating systems. DSLs enable targeting multiple software platforms if its abstractions are platform independent. While transformations from DSL to each targeted platform are often conceptually very similar, there is little reuse between transformations due to syntactic and API differences of the target platforms, making supporting multiple platforms expensive. In this paper, we discuss the design and implementation of PIL, a Platform Independent Language, an intermediate language providing a layer of abstraction between DSL and target platform code, abstracting from syntactic and API differences between platforms, thereby removing the need for platform-specific transformations. We discuss the use of PIL in an implemementation of WebDSL, a DSL for building web applications.

September 09, 2009 18:25