3/14/2012

Should I declare defeat on the research topic of API migration?

Of course, I won't, but perhaps I should! Then I could turn to lower-hanging fruit in research, which I would first need to spot, which I can't, though, because I am a bit obsessed with API migration (and admittedly some other stuff such as megamodeling). Sigh!

It was around 2004 that I became interested in API migration, and I have talked about it here and there ever since. Perhaps I am thinking that talking about a difficult problem of interest helps in discovering the solution to the problem, or at least a sensible path to take. Wishful thinking so far!

In theory, the objective of API migration made a lot of sense while I was on the XML team at Microsoft because there are obviously way too many XML APIs. In practice, nothing happened on this front because I didn't understand automated API migration well enough back then. Add to this that API migration is potentially risky for both the API provider and the API migrator. So you need to mash up a rocket scientist and a top politician to succeed. I am not there yet.

Back in academia, it took until 2009 before we had a useful and publishable effort on API migration (see the SLE 2009 paper); another one followed just a year later (see the ICSM 2010 paper). I kept on working with Thiago in 2010-2012, but our efforts on language support for wrapper- and transformation-based migration hit something of a brick wall. At least for now, we are taking a rest. We have submitted another API migration paper; it is about an advanced technique for automated testing in wrapper development. This research is also backed up by additional wrapper studies.

So we haven't failed, not by any means, but we are depressingly just at the stage of wrapper designers and engineers: we understand how to design wrappers (using patterns, for example), how to test wrappers (on the grounds of sophisticated test-data generation and contract assertions), what API differences to expect, how to spot them, and how to respond to them. We would like to be at the stage of language-based API migrators.

What am I supposed to do when a research effort hasn't made the progress that I expected years back when I was too naive? Rather than bailing out, I am going to do two things: a) I am going to compile a talk that deeply analyses what I have learned and what I think could/should be done; b) I am going to compile a funding application so that focused research efforts can target the interesting topic of API migration.

As to the talk, I am looking forward to visiting Suraj C. Kothari at Iowa State University in Ames next week, and here is the plan for this talk. (The trip to Ames is a trip during the trip because I am going to Ames during a trip to Omaha. From a recursion-theoretic point of view, I am obviously interested in carrying out a trip during the trip during the trip. This is certainly a good exercise in trying to understand the difference between left- and right-associativity.)

Regards,
Ralf

Title: API migration is a hard problem!

Slides: [.pdf]

Abstract: API migration refers to software migration in the sense of software reengineering: the objective is to eliminate an application's dependencies on a given API and make it depend instead on another API. Hence, we may speak of original API versus replacement API. In principle, migration can be achieved by a wrapping approach (such that the original API is re-implemented in terms of the replacement API so that the original implementation becomes obsolete and the application itself does not need to be changed) or by a transformation approach (such that the code of the application is rewritten so that the references to the original API are replaced by references to the replacement API). A degenerate case of API migration would be API upgrade or downgrade, where the two APIs are essentially versions of each other with an effective relationship between the versions such that the wrapper or the transformation for migration can be derived from a suitably recorded, inferred, or specified relationship. The synthesis of a transformation or a wrapper is considerably more involved when the APIs at hand do not relate in such an "obvious" manner, i.e., when they have been developed more or less independently. The two APIs still serve the same domain (e.g., GUI or XML programming), but they differ in terms of features, protocols, contracts, type hierarchy, and other aspects. In this talk, I provide insight into such differences and explain existing, often primitive (laborious) migration techniques, which are mostly focused on wrapping. I use a number of case studies for empirical substantiation. I conclude with an outlook in terms of the challenges ahead with indications as to the techniques and methods to be used or developed. Program analysis must provide the heavy lifting to make progress on the hard problem of API migration.
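
To make the wrapping approach a bit more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: OriginalQueue is a hypothetical original API whose assumed contract is that dequeue returns None on an empty queue, and collections.deque from the standard library merely plays the role of the replacement API (which raises IndexError in that situation). The wrapper re-implements the original interface on top of the replacement and bridges the contract difference, so that the application does not change.

```python
from collections import deque


class OriginalQueue:
    """Hypothetical original API, re-implemented as a wrapper around
    collections.deque, which plays the role of the replacement API."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        # Straightforward mapping: enqueue corresponds to deque.append.
        self._items.append(item)

    def dequeue(self):
        # Assumed contract of the original API: return None on an empty queue.
        # The replacement API raises IndexError instead, so the wrapper adapts.
        try:
            return self._items.popleft()
        except IndexError:
            return None

    def size(self):
        return len(self._items)


def application():
    """Unchanged application code, written against the original API."""
    q = OriginalQueue()
    q.enqueue("a")
    q.enqueue("b")
    return [q.dequeue(), q.dequeue(), q.dequeue()]


print(application())
```

A transformation approach would instead rewrite application so that it calls deque directly, and the rewriting would then have to account for the differing behavior on an empty queue at every call site.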

Acknowledgement: This is joint work with (in alphabetical order) Thiago Tonelli Bartolomei (University of Waterloo, Canada), Krzysztof Czarnecki (University of Waterloo, Canada), and Tijs van der Storm (CWI, Amsterdam, The Netherlands). I also acknowledge joint work within the Software Languages Team on the related subject of API (usage) analysis; special thanks are due to Ekaterina Pek.

Resources:



3/13/2012

More than you ever wanted to know about grammar-based testing

Preamble: Ever since 1999 +/- 100 years, I have been working (sporadically, intensively) on grammar-based testing. The latest result was our SLE'11 paper on grammar comparison (joint work with Bernd Fischer and Vadim Zaytsev). I have previously tried to compile a comprehensive slide deck on grammar-based testing, also covered on this blog, but that attempt was relatively unambitious. With the new SLE'11 results at hand and with the firm goal of pushing grammar-based testing more into CS education (in the context of both formal language theory and software language engineering), I have now developed an uber-comprehensive slide deck with awesome illustrations for the kids. If you are reading this post ahead of the lecture and you are still planning to attend, then you are well advised to bring brains and coffee. You may also bring a body bag, in case you pass out or worse. As it happens, this is "too much stuff" for a regular talk, lecture, or any reasonable format for that matter. I will run a first "user study" on this slide deck in a class on formal language theory in Omaha this Thursday, thanks to Victor Winter's trust in the survivability of this stuff (or why would he share his class with me otherwise?). As a last resort and an exercise in adaptive talking, I am just going to skip major parts based on the (missing) background of my audience. To summarize, if I get hit by a bus today, then all the grammar-based testing stuff is documented for mankind. (That's what Victor said.)

Title of the lecture: Quite an introduction to grammar-based testing

Slides of the lecture: [.pdf]

Elevator pitch for the lecture: Grammars are everywhere; resistance is futile. (More verbosely: If it feels like a grammar (after due consideration and subject to a Bachelor+ degree in CS), then it's probably just one. Just because some grammars mask themselves as object models, schemas, ducks, and friends, you should not move over to the dark side.) Seriously, non-grammars are cool, but life is short, so we need to focus. (I am sort of focusing on grammars and I am not even @grammarware.) Now, even grammars and grammar-based software components have problems, and testing may come to the rescue. Perhaps you think you know what's coming, but you don't have a clue.

Abstract of the lecture: Depending on where you draw the line, grammar-based testing was born, proper, in 1972 with Purdom's article on sentence generation for testing parsers. Now, computer scientists were really obsessed with parsers and compilers in the last millennium, and much work followed in the seventies, eighties, and early nineties. Burgess' survey on the automated generation of test cases for compilers summarized this huge area in 1994. Why would you want to test a compiler? It could suffer from regressions along evolution; it could differ from another compiler that serves as reference; it could fail to comply with the language specification (perhaps even the grammar in there); it could break when being stressed; it could simply miss some important case. Non-automated testing really does not suffice in these cases. You cannot possibly (certainly not systematically) test a grammar-based software component other than by generating test data (in fact, test cases) automatically, unless the underlying grammar is trivial. Grammar-based testing suddenly becomes super-important when much software turns out to be grammar-based (other than parsers and compilers): virtual machines, de-/serialization frameworks, reverse and re-engineering tools, embedded DSL interpreters, APIs, and what have you. Such promotion of grammar-based testing to the horizon of software engineering was perhaps first pronounced by Sirer and Bershad's paper on using grammars for software testing in 1999. Grammar-based testing is not straightforward, by any means, in several dimensions. For instance, coverage criteria for test-data generation must be convincing in terms of "covering all the important cases" and "scaling for non-trivial grammars". Also, all the forms of grammars in practice are "impure" more often than not; think of semantic constraints represented in different ways.
Related to the matter of semantics, any automated test-data generation approach relies on an automatic oracle, and getting such an oracle is never easy. This lecture is going to present a certain view on grammar-based testing, which is heavily influenced by the speaker's research and studies. In addition to the speaker's fundamental admiration of grammars and grammar-based software, the reason for such obsession with grammar-based testing is that this domain is so exciting in terms of combining formal language theory, (automated) software engineering, and declarative programming. This lecture is an attempt to convey important techniques and interesting challenges in grammar-based testing.
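
For a first taste of generating test data from a grammar, here is a minimal sketch in Python. It is not Purdom's algorithm and not any technique from the slide deck; it is plain exhaustive enumeration of all sentences whose derivation trees stay within a given height. The toy grammar GRAMMAR and the function name sentences are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives;
# any symbol that is not a key of the dictionary counts as a terminal.
GRAMMAR = {
    "E": [["T"], ["T", "+", "E"]],
    "T": [["n"], ["(", "E", ")"]],
}


def sentences(symbol, depth):
    """Return all sentences (tuples of terminals) derivable from symbol
    by derivation trees of height at most depth."""
    if symbol not in GRAMMAR:
        return {(symbol,)}  # a terminal derives just itself
    if depth == 0:
        return set()  # no budget left to expand a nonterminal
    result = set()
    for alternative in GRAMMAR[symbol]:
        # Enumerate the sentences of every symbol in the alternative ...
        parts = [sentences(s, depth - 1) for s in alternative]
        # ... and concatenate all combinations of them.
        for combination in product(*parts):
            result.add(tuple(t for part in combination for t in part))
    return result


# Print the generated test data, shortest sentences first.
for sentence in sorted(sentences("E", 4), key=lambda s: (len(s), s)):
    print(" ".join(sentence))
```

The number of sentences explodes quickly with the depth bound, which is one way to see why coverage criteria that scale to non-trivial grammars matter.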

Bio of the speaker: As earlier this week. (Nothing much has happened very recently.)

Acknowledgement: The presented work was carried out over several years in collaboration with (in alphabetical order) Bernd Fischer (University of Southampton, UK), Jörg Harm (akquinet AG, Hamburg, Germany), Wolfram Schulte (MSR, Redmond, WA, USA), Chris Verhoef (Vrije Universiteit, Amsterdam, NL), and Vadim Zaytsev (CWI, Amsterdam, NL).

Related papers by the speaker (and collaborators):

Related patent:

Have fun!

Ralf

3/08/2012

Technical space travel for developers, researchers, and educators

The inevitable has happened.
I have committed myself to giving the first major talk on 101companies (not counting the AOSD 2011 tutorial, which described an early view on the universe).
This outing talk happens to be at the CS Department at the University of Nebraska at Omaha, as I will be visiting Victor Winter for the next two weeks.

Speaker:
Ralf Lämmel (University of Koblenz-Landau)

Acknowledgement:
Joint work with Jean-Marie Favre, Thomas Schmorleiz, and Andrei Varanovich.

Title:
Technical space travel for developers, researchers, and educators

Abstract:
A technical space is a technology and community context in computer science and information technology. For example, the technical space of XMLware deals with data representation in XML, data modeling with XML Schema, and data processing with XQuery, XSLT, DOM, and LINQ to XML. Likewise, the technical space of tableware deals with data representation in a relational database, data modeling according to the relational model or the ER model, and data processing with SQL and friends. There are various other, not necessarily orthogonal technical spaces: Javaware, grammarware, objectware, lambdaware, serviceware, etc. How can we easily travel between spaces such that software products may involve multiple spaces? How can we deal reasonably with the plethora of technologies and languages in computer science and information technology? How can we profoundly experience the universe in a scientifically and educationally relevant manner? We approach these questions in the emerging 101companies project for space-traveling developers, researchers, and educators on the grounds of a wiki, a source-code repository, and an ontology.

Slides: [.pdf]

2/28/2012

More of a discussion on web privacy

I had the pleasure of giving a talk today on web privacy and P3P at Ecole des Mines de Nantes in the ASCOLA research team by kind invitation of Mario Südholt. The hidden agenda was to promote our empirical research on P3P, but we also agreed upfront to attempt a more general discussion of web privacy. So you will find little empirical stuff in the early parts of the slide deck.

Title: More of a discussion on web privacy

Abstract: The presentation begins with observations about the current state of web privacy. The presentation continues to set up some challenges for web privacy to be addressed in practice, subject to contributions by CS research. The technical core of the presentation is a language engineer's approach to understanding W3C's P3P language for privacy policies of web-based systems. Discussion during and after the talk is very much appreciated.

Acknowledgement: This is joint work with Ekaterina Pek, ADAPT Team, University of Koblenz-Landau

Links:

2/05/2012

MegaL goes Nantes

The Software Languages Team in Koblenz, with potent support by visiting scientist Jean-Marie Favre, is getting increasingly excited and knowledgeable about megamodels for software technologies and software products. MegaL is the megamodeling language under development. During upcoming research visits, I expect to present MegaL: its rationale, some applications, and ongoing research. The first presentation of this kind is to take place in Nantes in the AtlanMod team. The talk announcement follows.

Title: A megamodel of the ATL model transformation language and toolkit

Abstract: According to http://www.eclipse.org/atl/, "ATL (ATL Transformation Language) is a model transformation language and toolkit. In the field of Model-Driven Engineering (MDE), ATL provides ways to produce a set of target models from a set of source models." We would like to deeply understand the linguistic architecture of ATL in terms of all the involved software languages, metamodels, technologies, and relationships between all of them. To this end, we leverage a suitable form of megamodeling, as it is supported by the (mega)modeling language MegaL. In this manner, we discover some shortcomings of common, informal explanations of ATL and opportunities for highly systematic discussion of ATL.

Acknowledgements: Joint work with Jean-Marie Favre, Martin Leinberger, Thomas Schmorleiz, and Andrei Varanovich.

Links:
  • A related paper on megamodeling: [.html]
  • Slide deck of the talk: [.pdf]

Bio: Ralf Lämmel has been Professor of Computer Science at the Department of Computer Science at the University of Koblenz-Landau since July 2007. In the past, he held positions at Microsoft Corp., Free University of Amsterdam, CWI (Dutch Center for Mathematics and Computer Science), and the University of Rostock, Germany. Ralf Lämmel's speciality is "software language engineering", but he is generally interested in themes that combine software engineering and programming languages. His research and teaching interests include program transformation, software re-engineering, grammar-based methods as well as model-driven and model-based methods. Ralf Lämmel is a committed member of the research community; he is one of the founding fathers of the international summer school series on Generative and Transformational Techniques in Software Engineering (GTTSE) as well as the international conference on Software Language Engineering (SLE).

12/08/2011

A Christmas poem

(C) Professor Fish, aka Ralf Lämmel


Silent night! Ominous night!

All is asleep; alone keeps watch

only the Bachelor student,

who never naps,

who only runs.

So many exams, oh dear!

So many exams, oh dear!


Silent night! Ominous night!

It will be an all-nighter once again.

Dearest student, oh how there laughs

wickedly from your face

the aversion to Bologna's judgment.

So many regulations, oh dear!

So many regulations, oh dear!


Silent night! Ominous night!

What on earth have they made of university studies?

It is not Bologna alone.

Other trends fall in line here.

Thinking of Germany in the night

has robbed the professor of his sleep, too.


Noisy land! Ominous land!

O Marx, give me parents

who challenge their children instead of tormenting teachers.

O Lenin, give me first-semester students

who want to learn instead of playing video games.

Everything used to be better, oh dear!

Everything used to be better, oh dear!


Silent night! Ominous night!

If it is to be Christmas now,

then death to the ducks, and let the wine be mulled.

The wish list is here already, too.

Merkel & Co. will sort it out.

Merkel & Co. will sort it out.


12/07/2011

A riddle regarding type safety

Of course, I have explained to the students in my language theory class what type safety means (remember: progress + preservation) and we have studied this notion time and again for some of the TAPL languages. However, I don't feel like the notion has really sunk in universally. Some students consider all this semantics and calculus stuff obscure, and there is a certain critical mass of notation beyond which concepts are not understood anymore (by some if not many of the Bachelor-level students in my class).

Last night, I came up with a nice "semantics riddle" for type safety:

What would be a super-trivial language with a type system and an SOS semantics such that type safety is violated?


I gave the students 2 minutes to think about the question.

I emphasized several times that the language should be super-trivial.

I also asked them for a general strategy for mis-designing a language to be type-unsafe.

One idea that popped up was to support some notion of cast.

We agreed that this would not be trivial, certainly not super-trivial.


Here is my reference solution:

Syntax of expressions

e ::= v | z

(Hence, there are expressions (forms) x, y and z.)

Values (normal forms)

v ::= x | y

Types

t ::= a | b

(a and b are the possible types.)

Small-step relation

z -> x

(Yes, we only have one axiom.)

Typing rules

x : a

y : b

z : b

(Yes, we have only axioms here.)

Demonstration of lack of type safety

The term z is the culprit.

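z : b but z -> x and x : a

(Hence, preservation fails: the step changes the type from b to a. Progress holds, though, because z can step and x and y are values.)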
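
The riddle is small enough to be checked mechanically. Here is a minimal sketch in Python that encodes the language as plain dictionaries and tests progress and preservation for every expression. The encoding and all names (EXPRESSIONS, VALUES, STEP, TYPE_OF, and the two functions) are assumptions made for illustration; they are not part of the reference solution.

```python
# Hypothetical encoding of the riddle's language; all names are illustrative.
EXPRESSIONS = ["x", "y", "z"]             # e ::= v | z
VALUES = {"x", "y"}                       # v ::= x | y
STEP = {"z": "x"}                         # small-step relation: z -> x
TYPE_OF = {"x": "a", "y": "b", "z": "b"}  # typing axioms


def progress(e):
    """A well-typed expression is a value or can take a step."""
    return e in VALUES or e in STEP


def preservation(e):
    """If e : t and e -> e', then e' : t."""
    if e not in STEP:
        return True  # vacuously true: e does not step
    return TYPE_OF[STEP[e]] == TYPE_OF[e]


progress_violations = [e for e in EXPRESSIONS if not progress(e)]
preservation_violations = [e for e in EXPRESSIONS if not preservation(e)]

print("progress violations:", progress_violations)
print("preservation violations:", preservation_violations)
```

The check reports no progress violations and exactly one preservation violation, namely z. Changing the typing axiom for z to z : a would repair type safety in this encoding.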


Regards,
Ralf