Planet Stratego
You have a term and it don't look good, who're you gonna call?
May 08, 2008
-------------------------------------------------------------------------
Dear author,
We are pleased to inform you that your paper entitled
Building Program Optimizers with Rewriting [Strategies]
has been accepted for presentation at ICFP'98. In a next message, you
will receive reviews from the Program Committee that we hope you can
use to improve the final draft, which is due on July 14th. We will
also be sending you an ACM Copyright Release form, which must be
signed and returned by the same deadline.
Congratulations, and thank you for your submittal to ICFP'98.
Paul Hudak
Christian Queinnec
co-Chairs, ICFP'98 Program Committee
-------------------------------------------------------------------------
The text of an email from June 21, 1998 that announced the acceptance at ICFP'98 of the first paper on
Stratego written with Zino Benaissa and
Andrew Tolmach.
The implementation of the language for the paper marked the first version of the language and compiler.
The idea of traversal strategies had been explored before, embedded in ASF+SDF.
Just before the conference in September I managed to bootstrap the compiler.
The language did not have a name in that paper yet.
And the reviews were not very enthusiastic.
May 08, 2008 18:54
April 04, 2008
The paper "Declarative Access Control for WebDSL: Combining Language Integration and Separation of Concerns" by Danny Groenewegen and Eelco Visser has been accepted for presentation at the International Conference on Web Engineering (ICWE'08), which will be held in July 2008 in Yorktown Heights, New York. [pdf]
I'm especially proud of this acceptance as it is (1) based on the Master's thesis work of Danny Groenewegen,
and (2) the first paper about WebDSL to be accepted in the web engineering research community. (And also the first attempt; the other two papers featuring WebDSL appear in transformation venues.)
Abstract:
In this paper, we present the extension of WebDSL, a domain-specific
language for web application development, with abstractions for
declarative definition of access control. The extension supports the
definition of a wide range of access control policies concisely and
transparently as a separate concern. In addition to regulating the
access to pages and actions, access control rules are used to infer
navigation options not accessible to the current user, preventing the
presentation of inaccessible links. The extension is an illustration
of a general approach to the design of domain-specific languages for
different technical domains to support separation of concerns in
application development, while preserving linguistic integration. This
approach is realized by means of a transformational semantics that
weaves separately defined aspects into an integrated implementation.
April 04, 2008 20:27
March 19, 2008
The paper "Code Generation by Model Transformation" by Zef Hemel, Lennart Kats, and Eelco Visser was accepted for presentation at the International Conference on Model Transformation (ICMT'08).

Abstract: The realization of model-driven software development requires
effective techniques for implementing code generators. In this paper,
we present a case study of code generation by model transformation
with Stratego, a high-level transformation language based on the
paradigm of rewrite rules with programmable strategies that integrates
model-to-model, model-to-code, and code-to-code transformations. The
use of concrete object syntax guarantees syntactic correctness of code
patterns, and supports the subsequent transformation of generated
code. The composability of strategies supports two dimensions of
transformation modularity. Vertical modularity is achieved by
designing a generator as a pipeline of model-to-model transformations
that gradually transforms a high-level input model to an
implementation. Horizontal modularity is achieved by supporting the
definition of plugins which implement all aspects of a language
feature. We discuss the application of these techniques in the
implementation of WebDSL, a domain-specific language for dynamic web
applications with a rich data model.
March 19, 2008 11:14
March 18, 2008
Today we had a follow-up on last week's discussion about an attribute grammar extension of Stratego. By now, Nicolas Pierron has created a proper extension of Stratego with attribute equations and made a translation to basic Stratego in combination with the Transformers run-time extension for attribute evaluation support. Next up is the port of the copy rules generator that makes writing attribute equations much less verbose. In the meantime Lennart Kats is working on a JastAdd-style implementation and Tony Sloane on an Eli-style (static scheduling) implementation. With these implementations in place we will be able to do some proper exploration of the combination of attribute evaluation and rewriting (strategies). I can't wait to make an implementation of the WebDSL typechecker using the attribute extension. To be continued.
March 18, 2008 11:21
March 14, 2008
I finally figured out how to add proper navigation history support to Spoofax today. This one has been bugging me for quite some time. I remember spending far too much time diving through the documentation with the hopes of figuring out how this should be done properly. No luck.
Today I had a flash of inspiration, so I dug into the JDT code base. That code seemed to solve the same problem in a very complicated way, so I didn't want to copy their approach outright. Stymied, I started tracing exactly what happens with the navigation history when positions are placed into it. After a bit of fiddling around, I figured out that when I move the cursor, I should mark the position both before and after the cursor/focus moves to get the behaviour of JDT (which I tried to emulate). Until now, I had only ever tried saving the editor location state either before I changed it, or afterwards. I also tried all kinds of alternative calls on the EditorPart hierarchy in vain. I now use ITextEditor.setHighlightRange() which appears to do the job, provided I call markInNavigationHistory() "properly".
Anyway, the lesson is simple: if you call AbstractTextEditor.markInNavigationHistory(), remember to do it twice -- once before you change the editor/focus and once afterwards.
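To make the rule concrete, here is a minimal sketch in plain Java. It does not use the Eclipse API: ToyEditor, jumpTo and the integer offsets are hypothetical stand-ins, and the toy markInNavigationHistory() only mimics the idea of recording the current location. Marking both before and after the move leaves the origin as well as the destination in the history.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a text editor; this is NOT the Eclipse API.
class ToyEditor {
    private int caret = 0;
    private final List<Integer> history = new ArrayList<>();

    // Plays the role of markInNavigationHistory(): record the current location.
    void markInNavigationHistory() {
        // Skip consecutive duplicates, so marking twice at one spot is harmless.
        if (history.isEmpty() || history.get(history.size() - 1) != caret) {
            history.add(caret);
        }
    }

    // Jump the way the post recommends: mark, move, mark again.
    void jumpTo(int offset) {
        markInNavigationHistory();   // remember where we came from
        caret = offset;              // move the cursor/focus
        markInNavigationHistory();   // remember where we landed
    }

    // Defensive copy of the recorded locations, oldest first.
    List<Integer> history() {
        return new ArrayList<>(history);
    }
}

class NavigationDemo {
    public static void main(String[] args) {
        ToyEditor editor = new ToyEditor();
        editor.jumpTo(120);
        editor.jumpTo(40);
        System.out.println(editor.history()); // [0, 120, 40]
    }
}
```

With only the first mark, the final destination would be missing from the list; with only the second, the starting offset 0 would never be recorded.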
March 14, 2008 16:46
March 13, 2008
It's official: I'm the bootstrapper. My hacking life in the last few weeks has hardly been anything but bootstrapping. I've already said a few things about the Stratego compiler hacking. Since it takes ~3-4 hours for a full build of the Stratego compiler in the Delft buildfarm, I've had a couple of other projects to dive into in parallel. One of these has been the porting of Eclipse IMP from Eclipse 3.2 to 3.3.
In short, IMP is an IDE generator based on Eclipse. It provides a set of plugins and wizards that make the development of programming language environments (a lot) easier. The basic workflow when building an IDE for your favourite language with IMP is: (1) provide a grammar defined using the LPG grammar language, (2) use the IMP-provided wizards inside Eclipse to generate things like syntax highlighting support, outline support, code folding support, templates, text hovers, etc., then (3) fill in the skeletons provided by the generator. My personal view (subject to change without warning) of the generated code is that it's a guide to which parts of the Eclipse framework you need to extend in order to provide a given piece of functionality. Sort of a little helpful gnome pointing you in the right direction. In some cases, the generated code will actually do all you want, but more often than not, you will want to go beyond it.
That was the backgrounder on IMP. A major drawback of the current IMP releases is that they will only work on 3.2. Oh, and, of course, that IMP requires IMP to build IMP. Getting this beast ported to 3.3 wasn't as straightforward as I'd hoped. It took a few iterations. The first was getting it to build properly without any problems on my plain 3.2 installation. That took me several days. All kinds of subtle bugs surfaced, presumably because I have a different set of development habits than the IMPers.
Once those were patched and fixed upstream, I managed to bootstrap my first version on 3.2. An ensuing battle with race conditions in the startup code of various plug-ins followed. I hate static initializers, but apparently not everybody does. In a multi-plugin architecture where the order of plugin loading is not guaranteed, I cannot see how you can safely assume the order of static initializers across plugins, but those questions are not for me to ponder. I ripped them out, and replaced them with lazy initializers as far as possible, and that worked wonders. With that hurdle out of the way, it was all downhill: a couple of internal JFace and JDT classes had changed locations and APIs between 3.2 and 3.3, but it was quick enough to rewrite the offending code (another reason why depending on internal APIs is a bitch, though I realize that the features in question could not have been provided without doing so).
It's a huge disappointment to realize that my patches are only a couple of hundred lines. I felt like I had to rewrite the world, at places... Anyway, here's hoping for its inclusion in one of the pending releases. I've updated our sdf2imp tool to use the 3.3-based IMP, so we're already seeing a return on my investment:)
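One common way to get lazy initialization in Java is the initialization-on-demand holder idiom, sketched below. This is an illustration only and not the actual IMP patch: ParserService and its CREATED counter are made-up names. The point is that nothing is constructed when the class is loaded, so no plug-in has to rely on another plug-in's static initializer having already run.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical plug-in service; all names here are illustrative.
class ParserService {
    // Counts constructions, just to make the laziness observable.
    static final AtomicInteger CREATED = new AtomicInteger();

    private ParserService() {
        CREATED.incrementAndGet();
    }

    // The nested class is initialized on the first call to get(), not when
    // ParserService itself is loaded, and the JVM runs that initialization
    // exactly once even when several threads race to call get().
    private static final class Holder {
        static final ParserService INSTANCE = new ParserService();
    }

    static ParserService get() {
        return Holder.INSTANCE;
    }
}

class LazyInitDemo {
    public static void main(String[] args) {
        System.out.println(ParserService.CREATED.get()); // 0: nothing built yet
        ParserService first = ParserService.get();
        ParserService second = ParserService.get();
        System.out.println(first == second);             // true: one shared instance
        System.out.println(ParserService.CREATED.get()); // 1
    }
}
```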
March 13, 2008 16:33
March 11, 2008
This morning I gave my very first lecture on attribute grammars, featuring Knuth's binary numbers AG (ripped from the jastadd.org site), and my own implementation of WebDSL data models in JastAdd. In the slides I explored the implementation model of JastAdd by including the code it generates. While I had not added any JastAdd code since last week, the presentation further improved my understanding of what is going on in JastAdd. At the same time, I'm realizing I do not fully understand the design space we are considering, and the distinction between notational and essential differences between the various approaches.
In the afternoon,
Nicolas Pierron, a student from the Transformers group at Epita who is doing an internship in Delft, showed the code of his implementation of WebDSL typechecking in the Transformers SDF attribute grammar extension, which is implemented by generating Stratego code with some native code hacks. The hacks basically implement thunks of deferred attribute evaluations for nodes in the tree with pointers to dependent attribute value thunks. Attributes are then evaluated on demand, and thus no static scheduling is done. This approach is essentially the same as that of the encoding of attribute grammars in a lazy functional language such as Haskell (a hobby of my former colleagues in Utrecht). If we would just add heap-bound closures to Stratego, implementing this pattern would become that much easier.
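The thunk pattern described above can be sketched in a few lines of Java. This is a toy under stated assumptions, not the Transformers implementation: Thunk, Node and the size attribute are invented for illustration. Each node holds a deferred computation that refers to the thunks of the nodes it depends on, and a value is computed (and cached) only when somebody asks for it, so no evaluation order has to be scheduled up front.

```java
import java.util.function.Supplier;

// Memoizing thunk: the computation runs on the first get() only.
final class Thunk<T> implements Supplier<T> {
    private Supplier<T> compute;
    private T value;

    Thunk(Supplier<T> compute) {
        this.compute = compute;
    }

    @Override
    public T get() {
        if (compute != null) {
            value = compute.get();
            compute = null; // drop the closure once it has been evaluated
        }
        return value;
    }
}

// Binary tree node with one synthesized attribute: the number of nodes
// in the subtree rooted here.
final class Node {
    static int evaluations = 0; // how many attribute equations actually ran

    final Node left, right;
    final Thunk<Integer> size;

    Node(Node left, Node right) {
        this.left = left;
        this.right = right;
        // The equation is only stored here; it refers to the children's thunks.
        this.size = new Thunk<>(() -> {
            evaluations++;
            int l = (left == null) ? 0 : left.size.get();
            int r = (right == null) ? 0 : right.size.get();
            return 1 + l + r;
        });
    }
}

class ThunkDemo {
    public static void main(String[] args) {
        Node root = new Node(new Node(null, null), new Node(null, null));
        System.out.println(root.size.get());   // forces the three thunks on demand
        System.out.println(root.size.get());   // cached, nothing is re-evaluated
        System.out.println(Node.evaluations);  // number of equations that ran
    }
}
```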
The Transformers AG code operates on ATerms without sharing. Again, I'm not sure about the implications of this design choice. Are the Stratego libraries at all usable without maximal sharing? I guess I should ask Nicolas tomorrow.
To be continued.
March 11, 2008 21:41
A limitation of my previous stack tracing patches was that io-wrap and io-stream-wrap did not properly report traces on failure. The reason for this is easy to spot if we look at how the error is handled (this is where execution flow ends up when you call io-wrap):
  option-wrap(opts, usage, about, announce, s) =
    parse-options(opts, usage, about)
    ; announce
    ; (s; report-success + report-failure)

  report-failure =
    report-run-time
    ; <fprintnl> (stderr(), [ <whoami> (), ": rewriting failed"])
    ; <exit> 1
As you can imagine, even though the program now happily prints a stack trace when the main strategy exits with a failure, it will not be printed when exit is called.
I've introduced a couple of stack introspection functions for dealing with this: stacktrace-get-current-frame-name returns the name of the current frame, stacktrace-get-all-frame-names returns a list of all frame names, and stacktrace-get-current-frame-index returns an integer that holds the current depth of the stack. These are actually implemented by primitives in the Stratego Standard Library (SSL).
A caveat of these strategies is that calling them will of course alter the stack. Even in the wonderful world of computing, we're not entirely free of Heisenbergian effects, apparently. However, there's a simple workaround: call the primitives directly, since this bypasses the way the compiler registers the stack frames.
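The effect is easy to reproduce in a toy model. The Java sketch below is an analogy, not the Stratego runtime: FrameStack, call and primFrameNames are hypothetical names. Every instrumented call pushes its name on entry and pops it on exit, so an introspection routine that is itself an instrumented call shows up in its own answer, while reading the frame list directly does not.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

// Toy model of compiler-registered stack frames; not the Stratego runtime.
final class FrameStack {
    private static final Deque<String> FRAMES = new ArrayDeque<>();

    // What instrumented code does around every call: push, run, pop.
    static <T> T call(String name, Supplier<T> body) {
        FRAMES.push(name);
        try {
            return body.get();
        } finally {
            FRAMES.pop();
        }
    }

    // "Primitive": reads the frames without registering a frame of its own.
    static List<String> primFrameNames() {
        List<String> names = new ArrayList<>(FRAMES); // innermost first
        Collections.reverse(names);                   // outermost first
        return names;
    }

    // Wrapper that is an ordinary instrumented call, so it appears in its own result.
    static List<String> frameNames() {
        return call("stacktrace_get_all_frame_names", FrameStack::primFrameNames);
    }
}

class FrameDemo {
    public static void main(String[] args) {
        List<String> direct = FrameStack.call("main",
                () -> FrameStack.call("foo", FrameStack::primFrameNames));
        List<String> wrapped = FrameStack.call("main",
                () -> FrameStack.call("foo", FrameStack::frameNames));
        System.out.println(direct);   // [main, foo]
        System.out.println(wrapped);  // [main, foo, stacktrace_get_all_frame_names]
    }
}
```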
With this trick in hand, I rewrote the two above strategies to include proper stack tracing for io-wrap:
  option-wrap(opts, usage, about, announce, s) =
    parse-options(opts, usage, about)
    ; announce
    ; (s; report-success + prim("SSL_stacktrace_get_all_frame_names") ; report-failure)

  report-failure =
    ?stacktrace
    ; report-run-time
    ; <fprintnl> (stderr(), [ <whoami> (), ": rewriting failed, trace:"])
    ; <reverse ; map(<fprintnl> (stderr(), ["\t", <id>]))> stacktrace
    ; <exit> 1
Applying the modified io-wrap on the following sample program
main = io-wrap(my-wrap(foo))
my-wrap(s) = s
foo = debug(!"foo") ; bar
bar = debug(!"bar") ; fap ; zap
fap = debug(!"fap") ; id
zap = debug(!"zap") ; debug ; fail
gives
./prog: rewriting failed, trace:
main_0_0
io_wrap_1_0
option_wrap_5_0
lifted144
input_1_0
lifted145
output_1_0
lifted0
my_wrap_1_0
foo_0_0
bar_0_0
zap_0_0
Due to the compiler lifting inner strategies into freshly named, top-level strategies, the trace will contain some lifted* entries. Also, should you call strategies or rules which are compiled with older versions of the compiler, there will be "dark spots" in your trace. It won't be truncated -- only the frames due to the old library will be hidden.
March 11, 2008 13:05
March 10, 2008
Prompted by my visit to EPITA, I hacked together some very basic support for stack traces in Stratego that might come in handy when a Stratego program fails.
Here's a simple Stratego program, called prog (which, if you look at it closely, will always fail):
main = foo
foo = bar
bar = fap ; zap
fap = id
zap = fail
On the latest and greatest version of the compiler (build 17522 and later), you will get the following trace when this program is executed:
prog: rewriting failed, trace:
main_0_0
foo_0_0
bar_0_0
zap_0_0
There are a number of caveats with the tracing that I will try to get rid of, and, when there are only very hard problems left, explain myself out of, in a couple of future posts.
March 10, 2008 16:49
February 28, 2008
Earlier this week I was visiting the Programming Tools Group of
Oege de Moor in Oxford.
The visit was associated with the refactoring project in which I am an official collaborator.
The project aims at studying the high-level description of refactorings.
The starting point is to combine the strengths of program analysis with attribute grammars
and program transformation with (strategic) rewriting based approaches.
They are experimenting with the JastAdd-based Java compiler.
For me the project is interesting as we are considering integrating attribute grammars in Stratego.
To start exploring this topic I've started writing an implementation of WebDSL in JastAdd together
with Torbjörn Ekman,
the main developer of JastAdd, who is a postdoc in Oxford.
We got to implement a basic version of the data model sub-language complete with modules
and code generation (using the StringTemplate library of Terence Parr).
I'm planning to
finish the implementation of the data model and use that as the basis for my lecture on JastAdd
in the program transformation course. (And then hand the implementation to the students who
can then add the implementation of (a subset of) the UI language.)
We had further discussions about the relation between JastAdd and Stratego; although quite
different at first sight, there are interesting similarities (that I hadn't realised before). More to
follow about that in the future.
February 28, 2008 17:00
February 27, 2008
I visited Akim Demaille and his posse at EPITA today, and apparently there still is such a thing as a free lunch (although, in my excitement over the good food, I kinda promised to help out with fixing some Stratego issues they are experiencing, so it was not entirely without entanglements).
I got to sit in on one of the bi-weekly status updates for the LRDE. The room numbered a little under 30 people, including students and faculty. They were kind enough to hold the meeting in English so that I could follow it. I found it surprising and very encouraging to have everybody report their progress (and, in a very few instances, lack thereof) in front of the entire lab. I've been missing this in many of the institutions I've been working at. It certainly increases the level of team feeling, and also makes it easier to uncover opportunities for collaboration between the various groups. For example, they all shared a lot of common infrastructure, including setups for newsgroups, a build farm, svn repos, etc.
I met two of the guys from the "previous" Transformers generation, Florian and Maxime. Florian was putting the finishing touches on a visualization tool for ambiguities in Transformers' attributed parse trees. It looked pretty sweet. Maxime was hacking a translator from a DSL for their Olena image processing library.
I also got to meet the new generation of Transformer students. I expect that I'll interact a lot more with them in the coming months, as they come to grips with Stratego.
February 27, 2008 17:55
February 21, 2008
My proposal for a tutorial on 'WebDSL: A Case Study in Domain-Specific Language Engineering' has been accepted by the organizers of Code Generation 2008. This conference emerged from the codegeneration.net site, which collects information about code generation techniques and tools.
As opposed to the conferences I usually visit, this one attracts quite a crowd from industry, I understand.
Last year's event was quite a success, according to attendees I talked to, so I'm looking forward to the event in general,
and the opportunity to present Stratego/XT, SDF, and WebDSL to industry, in particular. Now to figure out how to squeeze that into 75 min;)
February 21, 2008 21:27
February 13, 2008
The NWO/EZ Jacquard Software Engineering Program has granted the project
Pull Deployment of Services
for an amount of 368K Euro which should pay for a PhD student (4 years) and a postdoc (3 years).
In the project for which I am principal investigator, we collaborate
with Merijn de Jonge from Philips Research and the buildfarm project at TU Delft in which software
deployment expert Eelco Dolstra is a postdoc. Here's the text from the proposal summary:
Hospitals are complex organizations, requiring the
coordination of specialists and support staff operating
complex medical equipment, involving large data sets, to take
care of the health of large numbers of patients.
The information technology infrastructure of hospitals is
heterogeneous and may consist of thousands of electronic
devices, ranging from workstations to medical equipment such
as MRI scanners. These devices are connected by wired and
wireless networks with complex topologies with different
security and privacy policies applicable to different nodes.
Software deployment in such a heterogeneous environment is
inherently difficult.
In order to make health-care professionals more effective and
deployment and maintenance more tractable, the hospital
information technology infrastructure is changing from a
device-oriented to a service-oriented environment, in which
the access to services is decoupled from the physical access
to particular devices.
In this project, we propose a pull model for service deployment
in which the components comprising a service are distributed
over nodes in the network, depending on the network topology,
properties of the application, and quality of service
requirements.
The goal of this project is to expand the state-of-the-art in
software deployment to support pull deployment of services.
In order to realize this goal we will conduct research in (1)
modeling of services and network architectures, (2) technology
for distributed deployment, and (3) tools for testing
implementations of distributed services.
We will build on our previous research in software deployment
(Nix) and model-based software development (Stratego/XT).
The project will be conducted in close collaboration with
Philips as industrial partner and will consist of a series of
experiments building prototype systems which implement service
distribution scenarios of increasing complexity.
February 13, 2008 10:37
February 10, 2008

I'm going to FOSDEM again this year. A bunch of old friends will be coming, so the opportunity is too good to pass up. Also, since I'll be in Paris at the time around FOSDEM, travel is both fast and reasonably cheap. (Three cheers for high speed trains.)
If you're interested in meeting me there, don't hesitate to fire off an e-mail. There's no Gentoo room this year, so I'll be hanging around elsewhere. I'm bound to drop by the Free Java devroom, for sure:) Another gang I'm anxious to meet again are the Nix people.
February 10, 2008 13:27
February 05, 2008
Today I started teaching a new master's course on Program Transformation & Generation at Delft University. The course studies principles, techniques, and applications of program transformation and generation. Using WebDSL as case study, several paradigms for implementing domain-specific languages will be studied, including term rewriting (Stratego), attribute grammars (Eli, JastAdd), and graph transformation. This is a departure from earlier courses I taught at Utrecht University about the same subject. There I would spend a full quarter teaching on just Stratego/XT, which I felt was necessary to prepare master's students for a master project in this area. With the current state of documentation of Stratego/XT, it appears that such an in-depth course is no longer necessary. At least, that is the experience with several (PhD) students who recently started developing Stratego applications successfully without any prior training.
February 05, 2008 20:30
February 04, 2008
The paper "Generating Editors for Embedded Languages. Integrating SGLR into IMP" by Lennart Kats, Karl Trygve Kalleberg, and Eelco Visser has been accepted by Language Descriptions, Tools, and Applications (LDTA'08) to be held in Budapest, Hungary in April 2008 as part of ETAPS'08. The paper reports on the successful integration of the SGLR parser in the IMP framework for building language-specific Eclipse plugins. Through this integration the capability of SDF/SGLR to support language embeddings is extended to the IDE. This project is a first step towards generation of full-fledged IDEs from SDF/Stratego language definitions. From the abstract:
Integrated Development Environments (IDEs) increase productivity by
providing a rich user interface and rapid feedback for a specific
language. Creating an editor for a specific language is not a trivial
undertaking, and is a cumbersome task even when working with an
extensible framework such as Eclipse. The IMP framework relieves the
IDE developer from a significant portion of the required work by
providing various abstractions for this. For embedded
languages, such as embedded regular expressions, SQL queries, or code
generation templates, its LALR parser generator falls short, however.
Scannerless parsing with SGLR enables concise, modular definition of
such languages. In this paper, we present an integration of SGLR into
IMP, demonstrating that a scannerless parser can be successfully
integrated into an IDE. Given an SDF syntax definition, the
sdf2imp tool automatically generates an editor plugin based
on the IMP API, complete with syntax checking, syntax highlighting,
outline view, and code folding. Using declarative domain-specific
languages, these services can be customized, and using the IMP
metatooling framework it can be extended with other features.
February 04, 2008 14:15
January 28, 2008
Martin recently got his PhD. It's very well deserved. I've seen first hand how serious and focused he's been for the last 4+ years.
Inspired by his didactical skills, I decided to rearrange my own dissertation page so that the individual chapters of my dissertation are easily downloadable.
Since I don't expect anybody to have either the time or the inclination to read the entire thesis from start to finish, Martin's idea of making it available as a split download makes a lot of sense.
Having done this, I got inspired to continue with spring (winter?) cleaning on a lot of other pieces of my PhD work.
I've set up an Ant Ivy repository for Spoofax. This means that you are now able to check out the various Spoofax subprojects from the source code repository and expect each subproject to compile separately, since all its dependencies will be fetched from my Ivy repo. Some of the subprojects require Eclipse. For those, you must run a script, fetch.sh, which will pick out the necessary jars from your Eclipse installation. It would be best to have this repo hosted along with the rest of Stratego/XT, since it's definitely part of the Stratego/XT umbrella, but the new infrastructure in Delft is still being set up, I've been told.
Trying my hand as a webmonkey, I've decided to upload new Spoofax pages with a revamped design.
With those things out of the way, I'm now working on a reflection API for Stratego/J so that we may easily instantiate Java objects and call methods on them from Stratego scripts. This is needed for another project I'm cooking. However, I keep running into the lack of a fully interactive Stratego interpreter on the JVM, and that's a very itchy spot just now...;)
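For readers unfamiliar with what such a bridge has to do on the Java side, here is a minimal sketch using only the standard java.lang.reflect API. It is not the Stratego/J reflection API, which is still being written: ReflectiveCall and newAndInvoke are hypothetical names, and the sketch assumes a public no-argument constructor and a public method.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

// Hypothetical helper; shows the standard-library calls a bridge could build on.
final class ReflectiveCall {
    // Instantiate className through its public no-arg constructor, then
    // invoke a public method selected by name and parameter types.
    static Object newAndInvoke(String className, String methodName,
                               Class<?>[] paramTypes, Object... args)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        Constructor<?> ctor = cls.getConstructor();
        Object target = ctor.newInstance();
        Method method = cls.getMethod(methodName, paramTypes);
        return method.invoke(target, args);
    }
}

class ReflectionDemo {
    public static void main(String[] args) throws ReflectiveOperationException {
        // Equivalent to: new StringBuilder().append("term")
        Object sb = ReflectiveCall.newAndInvoke(
                "java.lang.StringBuilder", "append",
                new Class<?>[] { String.class }, "term");
        System.out.println(sb); // prints "term"
    }
}
```

A scripting front end would additionally have to map its own values (terms, in the Stratego case) to Java argument objects and back, which this sketch leaves out.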
January 28, 2008 20:23
January 20, 2008
It has been awfully quiet here, I'm sorry about that. There are a few reasons for that. The first one is that I assembled my PhD thesis from my publications. This took quite some time and energy, but the result is great! My dissertation Exercises Free Syntax is available online. If you are interested in having a dead-tree version, just let me know!
I will defend my thesis tomorrow, January 21 (see the Dutch announcement). It's weird to realize that tomorrow is the culmination of 4 years of working intensely!
For the library I created an English abstract. To give you an idea what the thesis is about, let me quote it here:
In modern software development the use of multiple software languages
to constitute a single application is ubiquitous. Despite the
omnipresent use of combinations of languages, the principles and
techniques for using languages together are ad-hoc, unfriendly to
programmers, and result in a poor level of integration. We work
towards a principled and generic solution to language extension by
studying the applicability of modular syntax definition, scannerless
parsing, generalized parsing algorithms, and program transformations.
We describe MetaBorg, a method for providing concrete syntax for
domain abstractions to application programmers. Since object-oriented
languages are designed for extensibility and reuse, the language
constructs are often sufficient for expressing domain abstractions at
the semantic level. However, they do not provide the right
abstractions at the syntactic level. The MetaBorg method consists of
embedding domain-specific languages in a general purpose host language
and assimilating the embedded domain code into the surrounding host
code. Instead of extending the implementation of the host language,
the assimilation phase implements domain abstractions in terms of
existing APIs leaving the host language undisturbed.
We present a solution to injection vulnerabilities. Software written
in one language often needs to construct sentences in another
language, such as SQL queries, XML output, or shell command
invocations. This is almost always done using unhygienic string
manipulation. A client can then supply specially crafted input that
causes the constructed sentence to be interpreted in an unintended
way, leading to an injection attack. We describe a more natural style
of programming that yields code that is impervious to injections by
construction. Our approach embeds the grammars of the guest languages
into that of the host language and automatically generates code that
maps the embedded language to constructs in the host language that
reconstruct the embedded sentences, adding escaping functions where
appropriate.
We study AspectJ as a typical example of a language conglomerate,
i.e. a language composed of a number of separate languages with
different syntactic styles. We show that the combination of the
lexical syntax leads to considerable complexity in the lexical states
to be processed. We show how scannerless parsing elegantly addresses
this. We present the design of a modular, extensible, and formal
definition of the lexical and context-free aspects of the AspectJ
syntax. We introduce grammar mixins, which allow the declarative
definition of keyword policies and combination of extensions.
We introduce separate compilation of grammars to enable deployment of
languages as plugins to a compiler. Current extensible compilers focus
on source-level extensibility, which requires users to compile the
compiler with a specific configuration of extensions. A compound
parser needs to be generated for every combination. We introduce an
algorithm for parse table composition to support separate compilation
of grammars to parse table components. Parse table components can be
composed (linked) efficiently at runtime, i.e. just before
parsing. For realistic language combination scenarios involving
grammars for real languages, our parse table composition algorithm is
an order of magnitude faster than computation of the parse table for
the combined grammars, making online language composition feasible.
Also, they asked me for a Dutch, non-technical summary for news websites. Translated into English:
We present a collection of methods and techniques for combining
programming languages. Our methods make it possible, for example, to
use, within a general-purpose programming language, a sublanguage that
better fits the domain of a particular part of an application. This
allows a programmer to implement an aspect of the software in a
clearer and more compact way.
Based on the same techniques, we present a method that protects
programmers against the mistakes that cause the most common security
problem, the so-called injection attack. By programming in a slightly
different way, the programmer has the guarantee that the software is
not vulnerable to such attacks. In contrast to previously proposed
solutions, our method gives absolute guarantees, is simpler for the
programmer, and can be used in all cases where injection attacks can
occur (for example, it is not specific to the SQL language).
Finally, our techniques make it possible to define the syntax of some
programming languages more clearly and more formally. Some modern
programming languages are really a fusion of different sublanguages
(so-called language agglomerates). For such languages it was until now
unclear how the syntax could be formulated precisely, which is
necessary for standardization and compatibility.
January 20, 2008 12:08
December 24, 2007
With my comp.sci PhD finished, printed and published, I'm now back to being a full-time medical student. It's really rewarding and fun -- being an introvert geek, I've learned a lot about how I relate to other people by being shoved into a room with a patient who expects me to talk to him/her about the most intimate details of his/her situation. Alas, being a student doesn't pay at all.
I've still some money left from earlier jobs, and I live quite comfortably, but prudence (and interest) requires me to look for a job this summer as well. If all pending exams go well, I'll get my temporary license at the end of the spring semester, which means I can apply for work as a hospital doctor during the summer. There's no denying that this would be quite a lot of fun, but I'm not all that hopeful -- the competition to get hospital jobs seems fierce, and I've not been a star performer when it comes to medicine, I'm sad to say (but hopefully things will pick up now that I can focus on it).
For these reasons, I'll probably also be looking around for comp.sci jobs as a backup because I'm fairly good at it, it pays well, and it's usually a lot of fun. Also, it's easier to get jobs abroad, even in countries where you don't speak the native language fluently :)
December 24, 2007 01:42
December 23, 2007
Since my last blog post I have been pretty busy teaching a course on programming languages and developing a web application for the webdsl.org site. It is now finally online. We presented the first release of WebDSL last Thursday in the MoDSE Colloquium, using a presentation embedded in the webdsl.org site. As should be the case, the database crashed midway through the presentation (probably due to a corruption of the filesystem of the virtual machine we were using), but after a short break and some frantic hacking we got the presentation back on track. By now the site is available outside the TU Delft firewall. The site includes a wiki (with page history), blogs, forums, news items, and an issue tracker. Proper user management should avoid the user registration spam disaster that we experienced with previous (T)wikis. The aim is to evolve the application into a full-blown software project management and community site that should be usable by other projects as well.
For starters, I am now working on migrating the Stratego/XT and Program Transformation wikis to (clones of) the webdsl.org application. While usable and useful, the application can be improved in numerous ways.
December 23, 2007 11:38
December 08, 2007
It's been rather quiet on the northern front for quite some time. I've been mostly busy with diagnosing old ladies with chest pain of late, and trying to make heads or tails of the horrible electronic health record system at the hospital. Sheesh.
Anyway, today I found time to do some compiler hacking. It feels great, as always! I resurrected the strc-java project -- a Java backend for the Stratego compiler. After a couple of hours of fiddling around, I now have an extremely rudimentary runtime up and running, and the compiler can compile simple build expressions properly.
Given the simple strategy
main = !Foo(1,2)
the following Java code is produced:
public static class main_0_0 extends Strategy
{
  public final static main_0_0 instance = new main_0_0();
  public ATerm apply(ATerm term)
  {
    try
    {
      {
        ATerm[] b_0 = new ATerm[2];
        {
          ATerm c_0 = atermFactory.makeInt(1);
          b_0[0] = c_0;
        }
        {
          ATerm d_0 = atermFactory.makeInt(2);
          b_0[1] = d_0;
        }
        ATerm a_0 = atermFactory.makeAppl(atermFactory.makeAFun("Foo", 2, false), b_0);
        term = a_0;
      }
    }
    catch(Failure f)
    {
      return null;
    }
    return term;
  }
}
There are a number of unnecessary blocks in the above code fragment, but that's an artifact of the way I wrote the Java code templates. I'll see if I can't get rid of them eventually.
I've spent some time hacking about in order to get closures working without too much overhead. I think the current scheme will work, but will require a bit of sophistication and context-awareness in the code generator.
You can see the scheme in the example above. Every strategy is compiled to its own class, with an apply method. The signature for this method is not fixed. Rather, the number of strategy and term arguments may vary. The last argument is always the current term. Every class has a singleton instance, called instance. This is how we get the pointer. All context information that's required will have to be passed in, through the argument list.
There are in principle two possible schemes for passing in arguments. The first is to do as Stratego/J (the Stratego interpreter for Java): use two arrays, e.g. ATerm apply(Strategy[] svars, ATerm[] tvars, ATerm currentTerm). This costs two calls to new (in the general case) for every strategy invocation. Not very appealing.
The other possibility is to sequence the strategy and term arguments in the argument list, e.g.:
ATerm apply(Strategy s0, Strategy s1, ATerm t0, ATerm t1, ATerm currentTerm)
The problem here is that the arity of s.apply() is not fixed. We really have:
ATerm apply(Strategy<x0,y0> s0, Strategy<x1,y1> s1, ATerm t0, ATerm t1, ATerm currentTerm)
where x and y are the strategy and term arities, respectively. If we were generating C++ code, we could just use integers here. In Java, we'll have to insert real types. I'm tempted to use enums, and manually define the types N0 through N31. Nobody will ever invent a strategy with more than 32 strategy or term arguments, right?
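To make the sequenced-argument scheme a bit more concrete, here is a minimal sketch. It is not the strc-java design: instead of type-level arities it simply declares one interface per arity pair, which is a cruder alternative. Term is a stand-in for ATerm, and the names Strategy00, Strategy10, Fail00, BuildFoo00, Try10 and ClosureSketch are all made up for illustration. As in the generated code above, null signals failure and every class has a singleton instance.

```java
// Stand-in for ATerm so the sketch compiles without the ATerm library (assumption).
final class Term {
    final String label;
    Term(String label) { this.label = label; }
}

// Arity (0,0): no strategy or term arguments, only the current term.
interface Strategy00 {
    Term apply(Term current); // null signals failure
}

// Arity (1,0): one strategy argument, followed by the current term.
interface Strategy10 {
    Term apply(Strategy00 s0, Term current);
}

// A strategy that always fails.
final class Fail00 implements Strategy00 {
    static final Fail00 instance = new Fail00();
    public Term apply(Term current) { return null; }
}

// A build of a constant term, in the spirit of !Foo.
final class BuildFoo00 implements Strategy00 {
    static final BuildFoo00 instance = new BuildFoo00();
    public Term apply(Term current) { return new Term("Foo"); }
}

// try(s) = s <+ id: apply s0, and fall back to the current term if it fails.
final class Try10 implements Strategy10 {
    static final Try10 instance = new Try10();
    public Term apply(Strategy00 s0, Term current) {
        Term result = s0.apply(current);
        return result != null ? result : current;
    }
}

final class ClosureSketch {
    // Runs try(s) on a term with the given label and returns the resulting label.
    static String run(Strategy00 s, String input) {
        return Try10.instance.apply(s, new Term(input)).label;
    }
}
```

No arrays are allocated per call, which is the point of sequencing the arguments. The price is one interface per arity pair, so the type-level encoding with N0 through N31 would still be needed to keep that set manageable.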
I'll keep mulling this one over a bit. Feel free to drop me a line if you see better solutions.
December 08, 2007 22:38
October 23, 2007
... and less time to do it!
Since I started my full-time job I noticed that it is a lot harder to stay up to date with all the latest (online) developments. During the writing of my thesis I had a considerable amount of time which I could spend on reading blogs and news-items, and following mailing-lists. When you spend most of your day actually working you learn to prioritize which things you want to read :)
One of the things I did manage to read, albeit a bit late, is this thread on the PHP.internals list written by Wietse Venema. It is a follow-up on this thread, in which a first proposal was made to integrate a Perl-style taint mode into the core of PHP. The results posted in the follow-up thread look promising. It is also interesting to see that he has gone from a black-and-white taint mode to a more leveled approach. Currently the proposal only contains a subset of the levels available in PHP-Sat, but I think that the most fundamental ones are definitely there.
Even though Wietse is developing a prototype I think he will have a hard time getting this taint-mode into the actual core of PHP. Within both threads the general opinion seems to be that the idea is nice, but the developers of PHP seem to think of many situations in which it could fail. I hope to see more results of this idea soon!
Going over some other interesting threads in the internals list I found a reference to a tool called PHPLint. (Isn't it funny to see that there are all sorts of initiatives popping up that try to make PHP more secure/stricter?) I haven't had time to take a better look at this tool, but a first glance definitely showed potential. I'll try to examine this tool more thoroughly at the end of this week.
While there is less time to read on-line material, there is more time to read off-line stuff. Since I am using public transportation to get to work I have an extra hour a day to read actual books and publications. One of the books I have read in the past few weeks is a printed version of "Producing Open Source Software" (ProducingOss) written by Karl Fogel. My conclusion: absolutely worth reading!
ProducingOss contains all sorts of tips, hints and best practices. Even if you are not involved in an open source project it is still useful to read. Almost everything in the book can also be applied to closed-source projects. Furthermore, it contains many pointers to other interesting literature. One of these pointers led me to The Cathedral and the Bazaar, my current read-while-traveling-to-and-from-work book.
I intend to use several things from ProducingOss within PHP-Sat. I will just have to think about how I can fit the project in my current schedule, but it will definitely be fitted in.
October 23, 2007 20:47
September 01, 2007
You probably already heard about it, but support for PHP version 4 is partly dropped as of 31-12-2007. And after 08-08-2008 the support ends completely.
Luckily, the PHP documentation team provides you with a set of migration guides. Going through these guides can take some time, but it enables you to upgrade your code easily to be PHP5 compatible.
To make your life even easier, the PHP-tools project is extended with the tool test-migration. This tool performs some checks that are described by the migration guide to detect whether the code can be run under version 5 of PHP. These checks include:
- Is there a function definition with the same name as a function newly defined in PHP5?
- Is an object created without the class being defined first?
- Where are the functions strrpos, strripos and ip2long used?
- Is there any place in which there is reflection within PHP that uses changed behavior?
The first two checks are rather easy to understand: PHP5 will simply halt execution with an error when these issues are detected at runtime. Therefore, the warnings that are generated for these kinds of patterns are shown with a 'serious'-level.
On the other hand, the last two checks do not find constructions that can halt execution. They detect places in which certain constructions are used. These constructions were already available within PHP version 4, but their behavior changed in version 5. To make them easy to find, they are flagged by a 'minor'-warning. More details about the changed behavior can be found here.
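To give an idea of what the third check amounts to, here is a rough sketch in Java. It is not how test-migration is implemented: the real tool works on the parsed PHP code, while this sketch just scans the source text line by line, so it will also flag occurrences in comments and strings. The class name, method name and message format are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MigrationCheckSketch {
    // Calls to the three functions named in the check list above.
    private static final Pattern CHANGED =
        Pattern.compile("\\b(strrpos|strripos|ip2long)\\s*\\(");

    // Returns one 'minor' warning per call found, with a 1-based line number.
    static List<String> minorWarnings(String phpSource) {
        List<String> warnings = new ArrayList<>();
        String[] lines = phpSource.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher m = CHANGED.matcher(lines[i]);
            while (m.find()) {
                warnings.add("minor: line " + (i + 1) + ": "
                    + m.group(1) + " changed behavior in PHP5");
            }
        }
        return warnings;
    }
}
```

The word boundary in the pattern keeps user-defined functions such as my_strrpos from being reported.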
The first version of this tool is quite basic and performs only a few checks. If you would like to see more checks included, don't hesitate to drop me an email or put them in the comments.
September 01, 2007 17:33
July 13, 2007
Finally! It's over! Never again! The defense mostly followed the specified procedure. I first had about 45 minutes to give a presentation of the results of the dissertation, then the first opponent, Neil Jones, gave a 15 minute summary putting my work into a larger context.
After his summary, he proceeded to ask several high-level questions about various parts of the dissertation. One question I liked a lot was (paraphrased): "are the axiom-based Java testing techniques you propose in your case study applicable to Stratego and would you actually use them?". All the tools and prototypes discussed in the thesis are written in Stratego, and are applied to Java, C and a toy language called TIL. However, few of the tools are actually available for Stratego itself. This is the classical story of the cobbler's children's shoes... I certainly think it would be worthwhile to do the work necessary to make some of the tools available to Stratego as well.
Peter Mosses followed with a series of detailed questions. Clearly, Peter had read the text and figures very carefully, because some of his questions were about rather subtle issues and ambiguities in my work. There were also a few (fortunately minor) mistakes that made some of the figures more difficult to comprehend than necessary. He also nailed me on a very embarrassing definite-instead-of-indefinite article mistake. Normally, these things do not matter very much, but in this particular sentence it sort of reversed one of my main arguments in the dissertation. Whoops;)
After I'd answered their questions as best as I could, they retired to discuss whether my performance was good enough. This is mostly a formality in the current tradition, so I can't say I was very worried at that point. Once they came back, the dean proclaimed my successful completion of the degree, and we all rushed off for some (sadly delayed) champagne and cake.
I even wore a suit, and here's a picture to prove it:

Thanks to Uwe Wolter, who was the local member of the committee and therefore the grand orchestrator of all the formalities, the formal parts went smoothly. After the defense, the stressful part of the day started: I had to collect all people's menu choices for the evening, send my family and friends shopping for the evening's party, clean the apartment and of course smear huge swaths of marzipan cake all over my suit. Thanks to very good help from my brother, his girlfriend, Håvard (my roommate) and my mother, and Tilde, we managed to get ready just in time to arrive ten minutes late for the scheduled dinner.
Magne and Peter:

Neil and Uwe:

Tilde:

Since Eelco took the pictures, he's not in any of them. Fortunately, Tilde has a few pictures of Eelco, and of the other people present. I'm still waiting for those and will upload a few once I get them (and get some green lights from the people depicted).
After dinner, we all (committee, advisors, friends and family) drove back to my apartment and Tilde whipped up drinks for all. She even tricked one into me;) I was very pleased to see that my office neighbour and student advisor, Ida Holen, found the time to show up. Also, I had friends flying in from Oslo (okay, Holmestrand) and Trondheim for the event, namely Karl Thomas, Karina and Leif Olav. Thanks guys! Hope the drinks and food were worth it;) The usual Bergen posse showed up as well, including Stig, Fay, Knute, Glenn, Espen, Tommy x 2 and Paul Simon (if I forgot somebody, ping me).
July 13, 2007 03:23
July 01, 2007
Within the last two weeks I noticed that PHP-Front and PHP-Sat are being discovered by people who are looking for PHP-specific solutions. It is not that we are flooded with requests, but I am still happy with every question :)
The first question was sent to the psat-dev-mailinglist and was about the cyclomatic complexity of PHP code. I replied that it would not get into PHP-Sat because it is not a bug-pattern. However, it would be a nice tool for the PHP-Tools project. I made a similar tool for Java for an assignment in the past, so it is probably just a matter of renaming the strategies to use the PHP-Front API. Unfortunately, I didn't get an answer about what the report should look like. If you have any ideas please let me know in the comments, or in the issue.
A second question was about the grammar of PHP-Front, or actually the license of this grammar. The people behind TXL have derived a PHP-grammar for TXL from the SDF-grammar in PHP-Front. Since our license does not state anything about derived work without common source, we were asked for our permission to distribute this new grammar. Naturally, this permission was given very quickly and we were also allowed to take a peek at the source. I must say that I find it interesting, but I currently do not have time to look into it all. I imagine that the definitions of the grammars and TXL itself are similar to how things work in Stratego, but I have to look into that.
The last question in this series is about defined functions. Finding out which functions are defined in a project is easy when classes are ignored; a simple grep on the project will do. When a project also includes classes it becomes trickier to get all functions defined outside of a class. The question was whether PHP-Front could help with this issue, and the answer is of course yes!
Within the reflection part of the library the list of defined functions and classes is already available. This makes it possible to write a tool to show all defined functions in just a few lines of code. Since it was also a nice tool for the PHP-Tools project I added a tool for this last Friday.
Another issue that was brought up by the last e-mail is the issue of our implementation language. Since Stratego is relatively unknown, the project has a steep learning curve. On the other hand, if I had chosen a different implementation language it would have taken me way longer to implement the current features. And besides, this piece of code is not that hard to understand, right?
defined-functions-main =
  include-files-complex
  ; get-php-environment
  ; get-functions
  ; if ?[]
    then !"No functions defined."
    else map(transform-to-message)
         ; lines
    end
July 01, 2007 16:51
June 28, 2007
Yesterday, Karl Trygve Kalleberg defended his PhD thesis at Bergen University.
The thesis treats the subject of abstractions for program transformation, with the aim of making available transformation abstractions independent of the source language being transformed. The thesis introduces a number of extensions to the Stratego transformation language and reports on several case studies. I understand the thesis will be available soon.
(More photos)
June 28, 2007 20:01
June 24, 2007
Now that the propagation of safety-types seems to go smoothly it was time to dive into another subject: accessing the location of terms. In this case, the location of a term is defined as the location of the text in the original file that was parsed to that term. Since the strategy to annotate the AST with position-info is available in the standard libraries nowadays, it should be easy to access these locations and finally solve PSAT-91, right? Let's find out!
The first thing I did was to add a separate module to handle the low-level stuff of getting the location annotations. This module contains several getter-strategies that can retrieve, for example, the number of the start-line. The location info is captured in six different numbers: start-line, end-line, start-column, end-column, offset and length. A getter-strategy is available for all of them. Furthermore, the name of the file in which the term is defined can be retrieved.
Although these getter-strategies are useful, they are not meant to be called directly. I figured that the most common use of these functions would be reporting the values in some kind of (formatted) message. In order to capture this kind of behavior the strategy format-location-string(|message) is defined. This strategy takes a message with holes in the form of [STARTLINE] as parameter and fills these holes with values from the current term. A rather useful strategy if I say so myself.
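For readers who don't speak Stratego, the hole-filling idea can be sketched in Java roughly as follows. This only illustrates the behavior described above, it is not the actual format-location-string implementation. Only the [STARTLINE] hole is taken from the description; the other hole names, the LocationFormat class and its helper are guesses made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LocationFormat {
    // Bundles the six location numbers under assumed hole names.
    static Map<String, Integer> location(int startLine, int endLine,
                                         int startColumn, int endColumn,
                                         int offset, int length) {
        Map<String, Integer> loc = new LinkedHashMap<>();
        loc.put("STARTLINE", startLine);
        loc.put("ENDLINE", endLine);
        loc.put("STARTCOLUMN", startColumn);
        loc.put("ENDCOLUMN", endColumn);
        loc.put("OFFSET", offset);
        loc.put("LENGTH", length);
        return loc;
    }

    // Replaces every [NAME] hole that has a known value; unknown holes are left as-is.
    static String format(String message, Map<String, Integer> location) {
        String result = message;
        for (Map.Entry<String, Integer> e : location.entrySet()) {
            result = result.replace("[" + e.getKey() + "]",
                                    String.valueOf(e.getValue()));
        }
        return result;
    }
}
```

For instance, formatting the message "line [STARTLINE], column [STARTCOLUMN]" for a term that starts at line 12, column 4 gives "line 12, column 4".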
To practice with this new piece of functionality I have added an extra option to the tool input-vector of the php-tools-project. This option allows the user to choose between the normal list, or the same list with line-numbers printed for each access. More information about this option and how to add an option yourself can be found here.
After this was done I moved to php-sat to make the output more concise. It was actually pretty easy to implement. The algorithm is nothing more than get-terms-with-annotations, make-nice-output. I actually spent more time on creating a test-setup for calling php-sat through a shell-script than on generating the more concise format. The only problem was that the adding of position-info everywhere interfered with the dynamic-rules. A few well-placed rm-annotations were needed to fix this. Please let me know if you like the new output, or whether something should be added.
The next application of the location info is the tracking of where tainted data enters an application. When a function is called with a parameter $foo which is tainted, it would be nice to show when it was tainted. I think this is not too difficult to add, but bugs always seem to lurk in 'I-think-it-is-easy-to-add'-features.
A last remark about locations is a small problem without an actual solution. Eventually php-sat must support function-calls. The algorithm to analyze function-calls is not complicated, but how can bug-patterns within a function be reported? A message before each call to this function? Within the file in which the function is defined? And what about cases in which one call is flagged and the other one isn't? And can we also handle object-creation in the same way? I haven't figured out how to handle this, so if you have any ideas please let me know.
June 24, 2007 19:37
There it is. The first version of the 'Domain-Specific Language Engineering' paper. It is still somewhat rough around the edges and can use more meta-level reflection. Therefore, this is 'Mark I'. I expect at least another version, and maybe two before the final one. Comments on any aspect of the work would be greatly appreciated. Here is a quote from the introduction:
In recent years there has been an increasing momentum (some
call it hype) for approaches with names as domain-specific
languages, model-driven architecture, software factories,
language workbenches, and intentional programming. While there
are differences between these approaches (mostly of a
technological nature?), the common goal is to achieve a
higher-level of abstraction in software development by
abstracting from low-level boilerplate code. (Making
domain-specific languages the approach of my choice,
I'll use its terminology from now on.) The idea of
domain-specific languages has been around for a long time, but
what seems to be new in the current wave, is the requirement
to use DSL design and implementation as a standard tool in the
software development process. The challenge then is to
develop a systematic method for designing new domain-specific
languages.
This tutorial describes an experiment in DSL design and
implementation. The experiment is simply to take a new domain
(web applications), to develop a DSL (set of DSLs) for this
domain, and observe the process to extract ingredients for a
standard process. The target of the experiment are web
applications with a rich domain model that can serve as
content management system editable via the browser, but also
allow querying and aggregation based on the structure of the
data. The tutorial takes one particular combination of
technologies. The DSL will be a textual language. The
generator targets Java with a certain collection of frameworks
for implementation of web applications. The DSL is implemented
using Stratego/XT, SDF, and Nix.
Eelco Visser.
Domain-Specific Language Engineering. A Case Study in Agile DSL Development (Mark I). Technical Report TUD-SERG-2007-017, Software Engineering Research Group, Delft University of Technology, June 2007. To appear in the proceedings of the Summer School on Generative and Transformational Techniques in Software Engineering (GTTSE'07). (pdf)
June 24, 2007 09:38
June 21, 2007
In February I had promised 'to report about my journey through webland and towards the GTTSE tutorial on this blog', but have failed miserably. I find I'm not a good blogger. Not that I have nothing to say. During the process of designing and implementing WebDSL I have thought about dozens of things I could blog about, but either didn't have the time, or didn't think it would be interesting for anyone but myself.
I regret that; now that I have something that works it would have been interesting to see a log of the process. Not all is lost though. I am finalizing a paper for the GTTSE'07 summer school in which I explain the design process.
Here's a quote from the paper:
The boilertemplate smell is characterized by similar target
coding patterns used in different templates, only large chunks
of target code (a complete page type) considered as a reusable
programming pattern, and limited expressivity, since adding a
slightly different pattern (type of page) already requires
extending the generator.
High time for some generator refactoring. The refactoring
we're going to use here is called 'find an intermediate
language' also known as 'scrap your
boilertemplate'. In order to gain expressivity we
need to better cover the variability in the application
domain. While implementing the domain model DSL, we've
explored the capabilities of the target platform, so by now we
have a better idea how to implement variations on the CRUD
theme by combining the basics of JSF and Seam in different
ways. What we now need is a language that sits in between the
high-level domain modeling language and the low-level details
of JSF/Seam and allows us to provide more variability to
application developers while still maintaining an advantage
over direct programming.
In preparation for my GTTSE presentation I have given a two-part lecture at the MoDSE Colloquium entitled Domain-Specific Language Engineering (part 1, part 2).
At the first MoDSE workshop yesterday I gave a talk with the title A Sweet WebDSL, focusing on the role of desugarings in a DSL for web applications.
June 21, 2007 09:36
June 08, 2007
"Software written in one language often needs to construct sentences in another language, such as
SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic
string manipulation, the concatenation of constants and client-supplied strings. A client can then supply
specially crafted input that causes the constructed sentence to be interpreted in an unintended way,
leading to an injection attack. We describe a more natural style of programming that yields code that
is impervious to injections by construction. Our approach embeds the grammars of the guest languages
(e.g., SQL) into that of the host language (e.g., Java) and automatically generates code that maps the
embedded language to constructs in the host language that reconstruct the embedded sentences, adding
escaping functions where appropriate. This approach is generic, meaning that it can be applied with
relative ease to any combination of host and guest languages."
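To illustrate the problem the abstract describes, here is a small hand-written Java sketch. It is not the approach of the paper, which generates the escaping code from the embedded grammar. The sketch only contrasts unhygienic concatenation with a manually inserted escaping function. The table and column names are invented, and doubling single quotes is standard SQL string-literal escaping that may not be sufficient for every database dialect.

```java
final class InjectionSketch {
    // Unhygienic: client input is pasted straight into the SQL sentence.
    static String unhygienic(String name) {
        return "SELECT * FROM users WHERE name = '" + name + "'";
    }

    // Escapes a value for use inside a single-quoted SQL string literal
    // by doubling every single quote (assumption: standard SQL quoting rules).
    static String escapeSqlString(String value) {
        return value.replace("'", "''");
    }

    // Same query, but the input can no longer terminate the string literal.
    static String escaped(String name) {
        return "SELECT * FROM users WHERE name = '" + escapeSqlString(name) + "'";
    }
}
```

With the input x' OR '1'='1 the first method yields a query whose WHERE clause contains an extra OR condition, while the second keeps the whole input inside one string literal. The point of the paper is that the programmer should not have to remember to call the escaping function by hand.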
The paper Preventing Injection Attacks with Syntax Embeddings, A Host and Guest Language Independent Approach by Martin Bravenboer, Eelco Dolstra, and Eelco Visser has been accepted at GPCE'07. (pdf)
June 08, 2007 09:28