12/31/2010

The underappreciated banana and its buddy monoid

Note: This is the blog post that goes with the Channel9 lecture "Going Bananas". You can find all material (slides, code) linked from the web site for my Channel9 series; see the section on "Going Bananas". I will add a link to the C9 page with the video as soon as it becomes available.

I am really OK. Still, I do see bananas (folds) everywhere. This tells me that data processing is quite regular. For instance, MapReduce computations for parallel data processing are essentially folds that extract some intermediate data from the records of a voluminous input---subject to subsequent, possibly distributed and staged reduction. Also, in program transformation and analysis, many data processing steps are compositional, and thereby suggest themselves as being implemented through "large" bananas. Further, in XML processing, many operations are busy with folding over XML trees. There are bananas for everyone---not just for lists. If we just acknowledged this omnipresence of fold, reasoning about our programs could be drastically simplified.

Many bananas team up with monoids. Again, this is great because it tells me that the combination of intermediate results along recursive computations is typically quite regular, too. For instance, MapReduce computations for parallel data processing essentially use monoids to aggregate results and to build up indexes. (The monoidal properties enable parallel composition insofar as we can take any segment of input and process it on a separate machine, and such per-machine results can be combined, in fact reduced, later. We also see easily when we need more than just a monoid. For instance, we may need commutativity in order to be even more flexible with parallel processing schedules.) Also, in program analysis, many data processing steps recurse into substructures and combine intermediate results in a monoidal fashion. It seems that some people mainly think of monoids as numbers and perhaps lists, with the usual suspects for binary operation and identity. If you have never used a monoid for pairs or maps, then you should try it as soon as you can.
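
Here is a hedged little sketch of what monoids for pairs and maps buy you (the snippet and its function names are made up, not from the lecture's code base): one fold computes a word count and a total length at once via the Monoid instance for pairs, and a histogram is built by combining maps.

import Data.Monoid (Sum(..), mconcat)
import qualified Data.Map as Map

-- Word count and total length in one go, using the pair and Sum monoids.
summarize :: [String] -> (Sum Int, Sum Int)
summarize = mconcat . map (\w -> (Sum 1, Sum (length w)))

-- A word histogram: maps combined pointwise with (+).
histogram :: [String] -> Map.Map String Int
histogram = foldr (Map.unionWith (+) . flip Map.singleton 1) Map.empty

Because mconcat reduces with an associative operation and a unit, the input list can be segmented arbitrarily, the segments can be summarized on different machines, and the per-segment results can be combined later---which is exactly the MapReduce-style flexibility mentioned above.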

One may end up obfuscating programs by under-appreciating the fundamental concepts of bananas and monoids for recursion schemes and algebraic computations. For instance, suppose you design a domain-specific language for parallel data processing, and you suggest certain language concepts for data analysis. Wouldn't it be helpful to point out that the key underlying concept is essentially the one of a list homomorphism, which has been heavily studied from the point of view of data parallelism? This is not a contrived example. Sawzall is a DSL just like that, and it took me a few hours to see that Sawzall's aggregators boil down to a programming scheme for list homomorphisms with interesting monoids. Likewise, the visitor pattern in OO programming is a bit of a mess, and so it may help to see something as simple and powerful as a large banana in order to consider switching paradigms or to develop a simpler view on visitors.

I have compiled a lecture with accompanying code base, to appear on Channel9, which discusses a number of use cases for bananas and monoids. I begin with simple uses of foldr meant to show the generality and expressiveness of this great operator; this also includes a quick discussion of those type classes that can be used for folds on container types other than Haskell's concrete list type. (See code module Foldr.hs for this part.) Then I switch to parallel data processing à la MapReduce, and have a lot of fun with map, foldr, monoids, and friends. (See code module MapReduce.hs for this part.) Then I switch from lists to heterogeneous types, as they are used to model problem domains in programming, e.g., datatypes for ASTs used in interpretation, software transformation, and analysis. On such AST types, I discuss large bananas, i.e., bananas that possibly handle many constructors, and the "Scrap Your Boilerplate" style of generic functional programming, which involves yet other bananas. (See code modules Large.hs and SYB.hs for these two parts.) I had to be selective in the interest of staying close to a 1h lecture. For instance, folds in the sense of Church encodings, functorial-style generic programming, and graph traversals remain unmentioned. I also fail to spend significant time on reasoning about the programs at hand, but I give some further reading pointers as usual. Enjoy, and please provide feedback.
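
For a first taste of the generality of foldr that the lecture starts with, here is a hedged two-function sketch (not taken from Foldr.hs; the function names are made up): map and filter are just folds with suitable replacements for cons and nil.

mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr f = foldr (\x ys -> f x : ys) []

filterViaFoldr :: (a -> Bool) -> [a] -> [a]
filterViaFoldr p = foldr (\x ys -> if p x then x : ys else ys) []

For instance, mapViaFoldr (+1) [1,2,3] gives [2,3,4].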

Regards
Ralf

12/21/2010

A brutalized Haskell programmer

In working on a special Xmas lecture for my 1st semester course, I was going through Fritz Ruehr's "The Evolution of a Haskell Programmer" only to notice that there is no imperatively faked version that uses references. This omission could be resolved with the following code:


module Control.Pascal where

import Data.IORef

-- A Pascal-style while loop over an IORef: repeat the body as long
-- as the predicate holds for the current content of the reference.
while :: IORef a -> (a -> Bool) -> IO () -> IO ()
while ref pred body = while'
  where
    while' = do
      v <- readIORef ref
      if pred v
        then body >> while'
        else return ()

{- ******************************* -}

import Data.IORef
import Control.Pascal

-- Imperative factorial: r accumulates the product, i counts down from n.
factorial n
  = do
      r <- newIORef 1
      i <- newIORef n
      while i (>0) (
        readIORef i >>=
        modifyIORef r . (*) >>
        modifyIORef i ((+) (-1))
        )
      readIORef r


For instance:


Prelude> factorial 5
120

12/13/2010

Going bananas

This week I will be visiting Klaus Ostermann and his team. I am going to be a guest lecturer in Klaus' programming languages lecture. I am going to misuse this opportunity for a rehearsal of my upcoming Channel9 lecture on bananas. I ate 2 bananas over the last 42 hours in order to get into the right mood.

The talk announcement follows.

Speaker: Ralf Lämmel, Software Languages Team, Universität Koblenz-Landau

Title: Going bananas

Slides: [.pdf]

Abstract:

Banana is functional programming slang for "fold", say, an application of the catamorphic recursion scheme---most widely known from higher-order list processing in the tradition of the Bird-Meertens Formalism and the Squiggol community.

In this talk, I will present various kinds of folds, and thereby show the omnipresence, versatility, and expressive power of folds in different areas of programming. This presentation will integrate work by others and my own in a balanced manner, while aiming at broad coverage of bananas and good accessibility of a subject that is sometimes unjustly considered difficult, if not esoteric. Fold is one of the few Swiss army knives of functional programming. Even if you have to code in Java, knowledge of fold will help you to better understand what you are doing with collections, functors, visitors, and the like.

The presentation is structured as follows. First, the basics of higher-order list processing are recalled. Second, folds and friends are examined from the more abstract point of view of iteration over general (abstract) containers as opposed to concrete lists. Third, folds are put to work to serve parallel data processing based on list homomorphisms, and Google's MapReduce and Sawzall approaches are revisited in this context. Fourth, folds are generalized beyond lists, in fact, beyond containers---so that data of arbitrary algebraic datatypes can be processed in catamorphic fashion, thereby entering the area of generic programming. Finally, the "Scrap Your Boilerplate" style of generic programming is presented as a logical continuation of previous work on folds and friends.

Bio: [.html]

11/15/2010

Grammar-based Testing Revisited

I am about to leave for Southampton (which is where the Titanic started from) for a few days to visit Dr. Bernd Fischer, who also calls himself Fish. (Another fish, that is!) The idea is to give a talk and to interact on the topic of grammar-based testing. In fact, it is no secret that Bernd is an expert on code generation, and so I am keen to intensify collaboration on that topic as well.

The talk announcement follows.

Speaker: Ralf Lämmel, Software Languages Team, Universität Koblenz-Landau

Title: Grammar-based Testing Revisited

Slides: [.pdf]

Abstract: Testing grammar-based artifacts has been researched for decades. A classical and enduring motivation for such testing is that grammar-based functionality (such as an interpreter or a compiler) may be formally specified (e.g., by means of an attribute grammar) so that the specification lends itself to test-data generation. The test data, in turn, can be used to test a proper implementation for compliance with the specification. Yet other motivations arise, for example, in the context of DSL design, and, more generally, software language engineering.

Despite all such longstanding and recently renewed interest in grammar-based testing, there is no best practice; there are no standard tools. In this talk, I analyze the current, fragmented, insufficiently automated, predominantly ad-hoc approach. To this end, I survey testing approaches and use cases. Finally, I start inferring requirements for a more general approach to grammar-based testing.

This is a good time for revisiting grammar-based testing because some categories of industrial users (e.g., compiler writers, who finally start to use parser generators in products) and several major research communities (such as those working on model-driven engineering and software language engineering) face the need for this sort of testing urgently. Also, there are less obviously related communities who touch upon grammar-based testing needs: think of the grammar-like structures in theorem proving, hardware design, and software-product lines.

Bio: Ralf Lämmel is Professor of Computer Science at University of Koblenz-Landau. In his career, he also served at Microsoft Corp., Free University of Amsterdam, Dutch Center for Mathematics and Computer Science (CWI), and University of Rostock. Ralf Lämmel's speciality is "software language engineering", but he is generally interested in themes that combine software engineering and programming languages. Ralf Lämmel is one of the founding fathers of the international summer school series GTTSE (Generative and Transformational Techniques in Software Engineering) and of the SLE conference (Software Language Engineering).

11/03/2010

Statistical analysis of the parking situation on a university campus

If a candidate and a co-supervisor, say from the Inst. CV of FB4 (Computer Science), can be found, then I shall indeed be happy, in the end, to supervise this Bachelor's topic, which was otherwise invented for illustrative or frustration-relieving purposes.

Bachelor's thesis topic

Topic: Statistical analysis of the parking situation on a university campus

Motivation: For one or another such parking lot or campus, there is empirical evidence that there are, on the one hand, too few parking spaces or, on the other hand, too little responsible behavior on the part of the parking lot users. In preparation of this topic, its author conducted, for example, a count on the staff section of the parking lot at the Koblenz campus of the Universität Koblenz-Landau on 3 November 2010 at 8:45. At that time, all but 3 parking spaces were occupied, and these remaining spaces could not be taken by ordinary vehicles because of the reduction of space caused by other people's improper parking. Nearly 1/3 of all vehicles significantly touched or crossed the dividing lines. These are alarming indications.

Description: The thesis shall collect and analyze suitable data---in part also using digital imaging and image analysis (e.g., for license plate recognition). The following research questions are proposed. a) What is the degree of occupancy of the parking areas? b) How much capacity is wasted through misbehavior? c) Are there parking lot users who stand out through statistically significant misbehavior? The candidate is welcome to contribute further questions.

Regards,
PF

10/11/2010

Towards a megamodel for programming and language technologies

Summary

We have started a major effort on megamodeling programming and language technologies.
At GPCE 2010 and SLE 2010, we have revealed this effort.
Contributions are more than welcome.
Our next milestone is an actual megamodel that covers the software corpus "101companies".


Links

  • The GPCE/SLE megamodeling tutorial: description [.pdf]; slides [.pdf].
  • The GPCE 2010 keynote: website: [.html]; slides: [.pdf].
  • The 101companies website at the SourceForge project "developers": [.html].

Acknowledgements

  • All of this is joint work with Jean-Marie Favre and Dragan Gasevic.
  • Thomas Schmorleiz is the incredible developer who helps on 101companies.
  • Wolfgang Lohmann has done a great job on the empirical parts of the keynote.


Background

I have been developing an advanced programming class over the last 3 years. A defining characteristic of the class is that it is supposed to cover a broad range of programming and language technologies---perhaps a much broader range than in many other programming courses elsewhere.

In spring 2010, Jean-Marie Favre, Dragan Gasevic, and I had an informal workshop in Malaga, where we developed the idea of teaching software languages (or the broader theme of programming and language technologies) in a manner that heavily relies on a) examples; b) comparison; c) megamodeling.

The tutorial and the keynote reveal this idea in more detail.

Your 101companies contributions are welcome.

Regards,
PF

9/14/2010

Dealing comfortably with the confusion of tongues

That's the title of my invited talk at CHOOSE Forum 2010.

See the abstract below.

I just realized that I am going to meet some esteemed colleagues there: Jean Bezivin, Jean-Marie Favre, Uwe Zdun.


Abstract: Yahweh brought the confusion of tongues upon the people in the plain of Shinar because he did not like their efforts of building the Tower of Babel with its top in the heavens. In IT, we brought the confusion of tongues upon ourselves with the continuous aggregation of ideas for programming (languages), modeling (languages), domain concepts and domain-specific languages, design patterns, APIs, and so on. All this diversity makes a nontrivial software system look like the Tower of Babel—in terms of the involved complexity. That is, a nontrivial software system involves multiple technical spaces, technologies, bridges between them, and harmful amounts of software asbestos. How can IT personnel acquire the relevant skills more effectively? How would we want to prepare an undergraduate in CS for such toughness? These are the questions that we address in this talk. We put to work a combo of technology modeling, metamodeling, and megamodeling—plus a suite of well-designed examples in any lingo that we could think of. (Contrary to this abstract, the talk is going to be very technical; target audience: software engineers.)

Acknowledgment: This is joint work with Dragan Gasevic and Jean-Marie Favre. Further, I acknowledge contributions by some of my students—most notably Thomas Schmorleiz.

9/13/2010

(Mega)modeling Software Language Artifacts


GPCE/SLE 2010 tutorial

Jean-Marie Favre (OneTree Technologies, Luxembourg)
Dragan Gašević (Athabasca University, Canada)
Ralf Lämmel (University of Koblenz, Germany)

Description

Modern software is typically made of heterogeneous sets of software artifacts including, for instance, databases, programs, transformations, grammars, models and metamodels, compilers, interpreters, formats, ontologies, frameworks, APIs, schemas, configuration files, makefiles, etc. In practice, particular languages, tools, implementations, and standards are used, such as SQL DDL, Saxon, XSLT, Java, Hibernate, XSD, OWL, DOM, Antlr, UML, XMI, Ecore, Awk, and so on. In the absence of a conceptual framework it is difficult to understand the relationships between these software artifacts, if any. The goal of this tutorial is to provide such a framework, showing that the similarities and relationships between techniques can be modeled at a high level of abstraction, and, even more importantly, that recurring patterns occur in such models. Some of these patterns, for instance those involving "bridges" between technologies, would be really difficult to grasp without a proper conceptualization. Without such a conceptualization, software engineers and researchers usually find it hard to understand the intricacies of technologies that are outside their area of expertise, and it is more than likely that they do not realize the analogies that exist between heterogeneous technologies. This tutorial aims to unveil these recurring patterns and to show participants coming from different horizons how to model the technologies they design or work with in a uniform way and how to situate them within the overall software language landscape.

In the first part of the tutorial, the notions of software languages and technical spaces are briefly presented with a special emphasis on their unifying character. Then fundamental relations such as RepresentationOf and ElementOf are introduced, forming the basis of a (mega)modeling framework. Recurrent patterns based on these relations are then presented, allowing us to describe, for instance, the "conformance" relation between, let's say, a program and a grammar, an XML file and an XSD schema, or a UML model and its metamodel. More complex patterns such as bridges between technologies (e.g., XML <==> Relational, OO <==> XML, etc.) are defined following the same approach. Though this notion of bridges seems easy to grasp informally at first sight, it often leads to a rather large and complex set of technologies that are hard to understand and compare without an appropriate framework.

In the second part of the tutorial, the use of the (mega)modeling framework is illustrated through its application in three different technical spaces: Grammarware, Modelware, and Ontologyware. Concrete examples of various degrees of complexity are provided in each case, again with an emphasis on similarities between technical spaces. The hope of this approach is that someone with some knowledge of one technical space (let's say grammarware) should be able to significantly improve his or her comprehension of another space (let's say ontologyware) by virtue of analogy. It is our belief that the (mega)modeling approach, by raising the level of abstraction and focusing on essential software language concepts, makes it possible both to better understand complex structures involving many heterogeneous software artifacts and to better apprehend new technologies coming from other spaces.

One of the objectives of the tutorial is to show that bridges can be successfully established between heterogeneous technical spaces such as Modelware, Ontologyware, and Grammarware, and in particular to go beyond traditional divisions of fields of expertise. Since this tutorial is directly inscribed in a "community engineering" perspective, we believe that having three presenters coming from different horizons is the best way to insist on the integrative aspect of the mega-modeling approach.

Presenters

  • Jean-Marie Favre is a software anthropologist and a software language archeologist. He is principal scientist at One Tree Technologies. He has published numerous papers and coedited a book (in French), Beyond MDA: Model Driven Engineering. He has given tutorials and keynotes at more than a dozen international events and summer schools and has organized various national and international events. His research interests include software language engineering, software linguistics, software evolution and reverse engineering, model-driven engineering, and research 2.0.
  • Dragan Gašević is a Canada Research Chair in Semantic Technologies and an Associate Professor in the School of Computing and Information Systems at Athabasca University. His research interests include semantic technologies, software language engineering, technology-enhanced learning, and service-oriented architectures. He has (co-)authored numerous research papers and is the lead author of the book "Model Driven Engineering and Ontology Development." He has given tutorials at many well-known conferences such as WWW, ISWC, and CAiSE.
  • Ralf Lämmel is Professor of Computer Science at University of Koblenz-Landau. In his career, he also served at Microsoft Corp., Free University of Amsterdam, Dutch Center for Mathematics and Computer Science (CWI), and University of Rostock. Ralf Lämmel is generally interested in the combination of software engineering and programming languages. Together with the other tutorial speakers and further researchers, he is one of the founding fathers of the SLE conference. He is also one of the founding fathers of the summer school series GTTSE (Generative and Transformational Techniques in Software Engineering).

Content

The tutorial will be divided into several parts, first introducing the key issues related to modeling heterogeneous software language artifacts, then presenting the conceptual framework for mega-modeling, and finally giving examples of its application in different technical spaces.


Provisional outline

  • Introduction and objectives (5 min)
  • Technical Spaces and Software Languages (5 min)
  • Megamodeling fundamentals (10 min)
  • Application to Grammarware (10 min)
  • Application to Ontologyware (10 min)
  • Application to Modelware (10 min)
  • Synthesis and conclusion (10 min)

Relevance to GPCE/SLE attendees

Because of its emphasis on software language artifacts, this tutorial should be of interest to SLE attendees. Since compilers, transformations, and generators are used extensively in generative programming, the tutorial should also attract GPCE attendees. We believe that providing a conceptual and unified approach to mega-modeling and showing its application across various technical spaces could improve cross-fertilization between communities.

9/12/2010

The essence of "The essence of functional programming"


If "code is the essence", then this blog post is concerned with the essence of "The essence of functional programming" --- because the post matches a code distribution that is derived from P. Wadler's paper of just that title.

The code distribution has two branches: "origin" --- where it indeed stays close to the original code of the paper --- and "overhaul" --- where it uses Haskell's monad/monad transformer library to show some of the examples in modern style. This code was initially prepared for a Channel9 lecture about monads. The code is available through the SourceForge project "developers"; browse the code here; see the slide deck for the Channel9 lecture here.

Here is a more complete reference to Wadler's paper:

P. Wadler: The essence of functional programming
POPL 1992

Henceforth, we refer to this paper as "the paper".
Recovery and modularization effort due to Ralf Lämmel.


Disclaimer

The code distribution has not been approved, in any form, by the original author (P. Wadler). The author of the code distribution (R. Lämmel) takes all responsibility for any imprecision or misconception of this recovery and elaboration effort. Comments are welcome, as usual. Also, the particular way of distributing (modularizing) the code by means of heavy cpp usage is not meant to suggest any useful style of functional programming. It is merely used as a convenience for organizing the code such that code reuse is explicitly captured.


The paper's code vs. this code distribution

The branch "origin" is meant to stay close to the code in the paper. In particular, Haskell's monad (transformer) library is used only in branch "overhaul". We identify the following deviations (in branch "origin"):
  • Most importantly, the original paper did not organize the code in any way that would explicitly represent code reuse. In contrast, the code distribution leverages cpp to this end. This approach may be viewed as an ad-hoc cpp-based product-line-like approach.
  • We use the type class Show for "showing" values and for "getting out of the monads".
  • We added a few samples; they are labeled with code comments.
  • Minor forms of name mangling and refactoring are not listed here.
  • We renamed a few operators to better match the current Haskell library (a tiny illustration follows the list):
unit... -> return
bind... -> (>>=)
error... -> fail
zero... -> mzero
plus... -> mplus
out... -> tell


How to understand and run the code?

One should probably have modest knowledge of monads. For instance, one may read the paper. The easiest approach towards understanding the code distribution is to study the final interpreters in subdirectory "cache". (These interpreters were assembled from pieces, but this may be less important initially.) With two exceptions, these are all complete Haskell 98 programs (modulo TypeSynonymInstances and FlexibleInstances for the sake of Show) with a main function with test cases corresponding to the samples in the paper. The interpreters correspond to scenarios in the paper, and there is a corresponding comment at the beginning of each file to identify relevant sections in the paper. A complete list of all interpreters follows below.

If you want to better understand the structure of the interpreters and their relationships, you may want to have a look at the primary sources that are preprocessed via cpp. Each interpreter is assembled through cpp #includes; see the files in subdirectory "templates". A more profound description of our ad-hoc cpp-based product-line-like approach follows below.


List of interpreters in the order of occurrence in the paper
  • Baseline.hs: This is a non-monadic, CBV interpreter. This is the (non-monadic) baseline for all experiments. That is, we will turn this interpreter into monadic style, instantiate the monad parameter, perform data extensions and selective code replacements. We will also vary CBV vs. CBN, and we will use CPS eventually.
  • CBV.hs: This is the scheme of a monadic-style CBV interpreter. This file is incomplete as it lacks a concrete monad. (Don't run this file.) Some of the following interpreters complete this scheme. Such completion typically involves data extension and selective code replacement.
  • Identity.hs: This is CBV.hs completed by the identity monad.
  • Errors.hs: This is an interpreter with error messages. To this end, we instantiate the monad parameter of our monadic-style CBV interpreter with the error monad, and we apply selective code replacement so that error messages are actually thrown by the interpreter. (A tiny, independent sketch of this kind of interpreter follows the list.)
  • Positions.hs: This is an elaboration of Errors.hs so that position information is maintained and accordingly used in the error messages.
  • State.hs: This is an interpreter with a reduction count for function applications including additions. To this end, we instantiate the monad parameter with the state monad, where the state models the reduction count. We also need to apply selective code replacement so that function applications are actually counted. We also perform a data extension to provide a language construct for reading the reduction count within the interpreted language.
  • Output.hs: This is an interpreter with output. To this end, we instantiate the monad parameter with the writer monad, so that output is aggregated as a string. We also perform a data extension to provide a language construct for producing output within the interpreted language.
  • Nondeterminism.hs: This is an interpreter with non-deterministic choice. To this end, we instantiate the monad parameter with the list monad, so that nondeterminism is modeled with a multiplicity of result values. We also perform a data extension to provide language constructs for the failing computation (that produces the empty list of results) and for nondeterministic choice, indeed.
  • Backwards.hs: This is a variation on State.hs with backwards propagation of state.
  • CBN.hs: This is the scheme of a monadic-style CBN interpreter. This file is incomplete as it lacks a concrete monad. (Don't run this file.) Some of the following interpreters complete this scheme.
  • StateCBN.hs: This is a variation on State.hs with CBN.
  • NondeterminismCBN.hs: This is a variation on Nondeterminism.hs with CBN.
  • CPS.hs: This is a non-monadic, CBV and CPS interpreter.
  • Callcc.hs: This is a monadic-style interpreter (CBV) that uses the continuation monad to provide "call with current continuation" (callcc). We perform a data extension to provide callcc as a language construct within the interpreted language.
  • ErrorsCPS.hs: This is a variation on Errors.hs which leverages the continuation monad and uses the Answer type to plug errors into the interpreter.
  • StateCPS.hs: This is a variation on State.hs which leverages the continuation monad and uses the Answer type to plug state into the interpreter.
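
To give a first impression of the monadic-style CBV interpreter instantiated with an error monad (as in Errors.hs), here is a minimal, independent sketch; it is not code from the distribution, and the names Term, Value, Env, M, and interp only loosely echo the paper:

data Term  = Var String | Con Int | Add Term Term
data Value = Num Int deriving Show
type Env   = [(String, Value)]
type M a   = Either String a   -- the "error monad" of this sketch

interp :: Term -> Env -> M Value
interp (Var x)   e = maybe (Left ("unbound variable " ++ x)) Right (lookup x e)
interp (Con i)   _ = return (Num i)
interp (Add t u) e = do Num i <- interp t e
                        Num j <- interp u e
                        return (Num (i + j))

For instance, interp (Add (Con 1) (Var "x")) [] yields Left "unbound variable x"; the actual Errors.hs differs in its details.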

Code organization of the distribution

Directories:

  • cache: ready-to-run interpreters
  • templates: cpp templates for interpreters
  • monads: monads used in the interpreters
  • types: syntactic and semantic domains
  • functions: functions making up the interpreters
  • terms: sample terms for the interpreters
  • baselines: interpreter outputs for regression testing

Some types and functions require variation for the different monads and also in terms of the choice CBV vs. CBN. The subdirectory cba (read as call-by-"any") hosts definitions that are shared by CBV, CBN, and CPS. (Here, we should note that the CPS experiments use CBV (not CBN).) Within directories types and functions, the subdirectories cbv and cbn host definitions specific to CBV or CBN respectively. Monad-specific variations are hosted in yet deeper subdirectories with the (one-letter) name of the corresponding monad; see, for example, functions/cbv/E.

Comments and questions and contributions welcome.

Ralf

9/10/2010

Bulk mailing for your conferences not appreciated

Dear Dr. Nagib Callaos,

For some years now, my mailbox has been flooded with CFPs for your many conferences such as WMSCI 2011 (for which you are General Chair etc.). My email address is rlaemmel@gmail.com. Please remove me from your bulk mailing system. (I have tried your instructions for unsubscribing in the past, obviously without success.)

Keep up the good work.

Thanks,
Ralf Lämmel

PS: It is not easy to set up a mail filter that addresses this problem reliably over the years. Or does anyone have an idea as to how to stretch gmail's filter mechanism? I can think of a rule that checks whether a) we face a CFP or some other form of conference ad, and b) certain keywords (names) occur on the website of the conference. Now, gmail, please provide us with that expressiveness :-)

8/24/2010

A bunch of interpreters using CPP and Haskell

Introduction

This text is about the code distribution that you find here (@ SourceForge project developers).

Released: 24 Aug 2010
Last updated: 31 Aug 2010

We walk through the evolution of a simple interpreter. This exercise is meant to explore functional programming in the context of language interpreters. We start from a trivial language which can generate natural numbers. Step by step, we add some language constructs that make us deal with partiality of interpretation, type tests, a binding mechanism, and different forms of recursion.

None of these evolution steps are particularly advanced. The contribution of this code distribution lies more in the pedagogical style of making simple, well-understood extensions, and reflecting on the functional programming issues that pop up as well as the code-level organization and reusability that is achievable. (For the actual reflection, I recommend the upcoming Channel9 lecture on language interpretation that will use some part of the code base.)

The interpreters are programmed in a naive denotational style. We do not use monads for modularization until somewhat late in the exercise. Also, we do not make any effort to add language constructs in a type-safe, modular way (cf. the Expression Problem). Instead, we use CPP to incrementally extend the interpreter; we also subdivide the interpreters into chunks of code so that we can reuse and replace chunks (files) selectively along the evolution. Arguably, we use low-level techniques for setting up a kind of product line for simple interpreters.


Code organization

Each interpreter is subdivided into code units as follows:
  • Imports: any imports needed to use Haskell's library.
  • Syntax: the algebraic datatype for the abstract syntax.
  • Value: the designated domain for the types of results.
  • Domains: other types used by the interpreter.
  • Interpreter: the semantic function (the interpreter).
  • Algebra: composition functions for semantic meanings.
  • Library: reusable, auxiliary functions for the interpreter.
  • Locals: not so reusable auxiliary functions.
  • Test: any code needed for testing the interpreter.
  • Main: the main function for testing the interpreter.

Running the stuff

The code is available from SourceForge.

Contributions are welcome.

All interpreters are readily cached in subdirectory Haskell/src/cache.

You can rebuild and test all this stuff very easily if you have:
- ghc(i) (tested with version 6.10.4)
- gcc cpp (tested with version 4.2.1)
- make (tested with GNU Make 3.81)
Go to subdirectory "Haskell/src" and type in "make".

CPP is used to glue together the interpreters from the code units.


Description of the versions of the interpreter

  • 0-ConstAdd: This is an evaluator for Const/Add expressions. (If you have seen the C9 lecture on the Expression Problem, this option only serves as a reminder that we have already dealt with interpreters back then.)
  • 1-ZeroSucc: Let's pick a trivial starting point for interpretation. This language has a constant operation, Zero, and the successor operation, Succ. This is the basis for Peano-like natural numbers. Semantically, we represent natural numbers through Haskell's Ints. Hence, this tiny interpreter allows us to build Haskell Ints from Peano's Zero and Succ. That's nothing short of impressive. :-)
  • 2-ZeroSuccPred: So far, our interpreter was a total function (assuming non-bottom arguments). We add a predecessor construct to the abstract syntax. Since our (intended) semantic domain is natural numbers, our interpreter function becomes partial in this manner. That is, we need to rule out that we compute the predecessor of zero. We need to rewrite existing equations and use "partial function application" all over the place.
  • 3-NB: We add a few more primitives to the interpreter: operations for Booleans. In fact, we arrive at the NB language of Pierce's TAPL (except that we leave out False and True (because they can be expressed already (as you may want to show))). This step of evolution adds something new insofar as we start to consider different result types. Hence, we need to perform partial projections on results if some operation of the interpreter requires a certain type. (For instance, the successor operation is only applicable to natural numbers.) In principle, NB also brings up typing, but we do not get into issues of static semantics/typing here.
  • 4-Lambda: We add the lambda calculus to NB. Adopting common terminology, our composed beast is best called an applicative lambda calculus. We are ready for functional programming now. (We are Turing-complete, too, for whatever that matters.) For instance, we can define recursive arithmetic operations. To this end, we leverage the CBV fixed-point operator, which is representable as a lambda expression.
  • 5-Letrec: We add a recursive let construct to the applicative lambda calculus. This is a minor extension in terms of interpreter design. That is, it is a straightforward data extension---we do not leave the ground of our current semantic model. Conceptually, this is an interesting extension though, because we can now define proper, nested, recursive bindings. Our Turing-complete language gets closer to a proper functional programming language. In fact, this is as much expressiveness as we get. The two remaining versions of the interpreter only vary the style of its definition.
  • 6-Monads: The signature of the interpreter function involves partiality (cf. Maybe Value) and environment passing (Env -> ...). The actual interpreter involves considerable plumbing to deal with partiality and environment passing. This makes it hard to focus on the core meaning of the interpreter's equations. We migrate to monadic style to raise the level of abstraction, and to make parts of the interpreter more oblivious to partiality and environment passing, where possible. We compose the required monad through monad transformation---using a Maybe and a Reader transformer on top of the identity monad. (Arguably, one may object to the use of the reader monad for environment passing. The resulting semantics of lambda abstraction shows the difficulty of using the monad.)
  • 7-Bananas: Finally, we separate the recursion of the interpreter from the actual algebra of meanings of constructs. In this manner, we clarify the intention of compositionality for such a denotational-style interpreter. Also, this step carries the generalization of folding that is established for lists over to domain-specific algebraic datatypes such as abstract syntax types. (A tiny, independent sketch of this idea follows right below.)
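
Here is a minimal, hedged sketch of the "banana" idea behind step 7 (not code from the distribution; Expr, foldExpr, and eval are made-up names): the recursion over the syntax is factored out as a fold, and the interpreter is just an algebra passed to that fold.

data Expr = Const Int | Add Expr Expr

-- The fold ("banana") for Expr: one function argument per constructor.
foldExpr :: (Int -> a) -> (a -> a -> a) -> Expr -> a
foldExpr con add = go
  where
    go (Const i) = con i
    go (Add l r) = add (go l) (go r)

-- The interpreter as an algebra: Const means the number itself, Add means (+).
eval :: Expr -> Int
eval = foldExpr id (+)

For instance, eval (Add (Const 1) (Const 2)) gives 3; a pretty printer is just another algebra passed to the same foldExpr.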

Comments and questions and contributions welcome.

Ralf

PS: This is not rocket science, but it may be pretty useful in teaching interpretation and executable semantics and functional programming, no?

8/06/2010

Lecture series on advanced (functional) programming concepts

[

Added 10 Aug 2010:

Here is the link to the first lecture's video: "The Expression Problem".

]

If you are a programming nerd or something of a similar species, you may know of C9 Lectures at http://channel9.msdn.com/tags/C9+Lectures. Several of the lectures are given by Erik Meijer who has turned many chapters of Graham Hutton's book "Programming in Haskell" into videos. In my recent, recorded lecture on "Programming Paradigms and Formal Semantics", I also incorporated Graham's text, but you should really watch Erik's lectures, if you haven't---they are more fun. I am inspired by his ability to present relatively simple content in such a crisp, insightful, and enjoyable manner.

In the aforementioned lecture, I covered all kinds of "crazy stuff" though, and in the end, I had to deliver body bags to my Bachelor students---in the form of a radically simple examination. Thanks a lot btw to @grammarware for helping with this matter. Thanks a lot to my students who coped with this deadly mix of process algebra, semantics, Haskell, constraints, type systems, and what have you. Some of the topics from the lecture may be of interest though to advanced programming nerds (and similar species), and this is where this post fits in.

Based on an ECOOP 2010 tutorial proposal, which was never implemented, Charles Torre (@C9, this is what he looks like) and Erik Meijer suggested in January 2010 that I could do a few C9 Lectures on advanced (functional) programming concepts. So the idea would be to dive deeper into subjects that were introduced in Graham's chapters, and to generally include some advanced subjects, possibly also related to the tension between mainstream programming and Haskell programming. Well, this sounded cool!

It took me some time to come up with a concept that is hopefully useful for both---the community, say developers or CS students who want to dive deeper into programming, and me who wants to aggregate and improve material that helps with my general occupation in academic and industrial research and education.

Perhaps I have figured out an option.


What this bunch of lectures will not be like:

  • A detailed Haskell programming tutorial.
  • An otherwise technology-centric discussion.
  • A totally Haskell-centric lecture series.
  • A never-ending lesson in Greek (Math).
  • An otherwise academic discussion.
  • A pure functional programming story.
  • An OO bashing series.


Here is what I will try to cover:

  • Advanced topics in functional/OO programming.
  • Properties of programs such as modularity or purity.
  • Composition of programs from function combinators.
  • Multi-paradigm programming (functions, objects, predicates).
  • Software Language Engineering (incl. language processing).

The first few lectures
  • Let's begin the series with the Expression Problem because it connects OOP and FP ingeniously; it helps in understanding pattern matching, dispatch, function application, and encapsulation in a combined fashion that complements established means of introducing those subjects in isolation. In this lecture, I don't present solutions to the problem---because they are too hard and too idiosyncratic. I rather present non-solutions because they are insightful and make you ask for more (I hope). See the slide deck here.
  • Let's continue with a lecture on Haskell's type classes. Thereby, we cover one of Haskell's supernatural powers. Again, I am not describing type classes in the manner of a language manual; I rather demonstrate a few of the mechanisms that are supported by type classes; I will also discuss the underlying principles a bit. For instance, type classes allow us to solve the expression problem like a charm, and also to address several sophistications thereof (see the small sketch after this list). It will take a while until C# and Java are on par with type classes. See the slide deck here.
  • I will discuss language interpretation because it is such a fundamental subject in computer science and programming that also allows us to deeply understand some properties and styles of functional programming. Unfortunately, language interpretation is often obfuscated by an academic style of presentation---as if it were something purely mathematical. So the challenge for me will be to get many advanced Java and .NET programmers to say in the end that language interpretation is a "must-know".
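
To hint at the type-class trick alluded to above, here is a hypothetical little sketch (the names Lit, Add, Eval, and Pretty are made up and do not come from the lecture material): each syntactic construct is its own datatype, operations are type classes, and hence both new constructs and new operations can be added without touching existing code.

data Lit     = Lit Int
data Add l r = Add l r

class Eval e where
  eval :: e -> Int

instance Eval Lit where
  eval (Lit i) = i

instance (Eval l, Eval r) => Eval (Add l r) where
  eval (Add l r) = eval l + eval r

-- A new operation, added later, without touching the code above.
class Pretty e where
  pretty :: e -> String

instance Pretty Lit where
  pretty (Lit i) = show i

instance (Pretty l, Pretty r) => Pretty (Add l r) where
  pretty (Add l r) = "(" ++ pretty l ++ " + " ++ pretty r ++ ")"

For instance, eval (Add (Lit 1) (Lit 2)) gives 3, and pretty (Add (Lit 1) (Lit 2)) gives "(1 + 2)". Whether and how this scales to a full solution of the expression problem is left to the lecture.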

More keywords:
  • Effects (monads)
  • Continuations
  • Algebraic structures (e.g., monoids)
  • Parallelism (e.g., MapReduce)
  • Genericity (e.g., "Scrap your boilerplate")

For the rest of the lectures, please see C9 or follow-up posts on my blog. I have started to upload decks and code for lectures to the open-source project https://sourceforge.net/projects/developers/. You can browse the SVN repository and download files here. If everything goes as planned, then the first video will go live at C9 somewhere next week.

I am really excited to work on a lecture series like this because I feel that it allows me to focus on non-trivial, conceptual stuff. This is a welcome break from the typical Bachelor-level lecture where I am supposed to go slowly so that "no student is left behind". In the present lecture series, I enjoy the privilege of diving deep into concepts, and I assume that folks will figure out many details for themselves. I inject riddles into the lectures here and there; some of them are incredibly hard, but you will figure them out, I hope, and you will gain understanding no matter what---even if the riddles turn out to be too hard (perhaps infeasible) in the end. Enjoy, and please let me know how I can improve the concept of the lecture series within the implicit and explicit bounds that I described above.

Thanks,
Ralf

6/11/2010

Software adaptation in Koblenz in Sep --- Summer School

Please spread the message.
Looking forward to the 1st ADAPT Summer School.

Regards,
PF

CALL FOR PARTICIPATION : ADAPT 2010

1st ADAPT Summer School
September 26 - October 2, 2010, Koblenz, Germany

Software adaptation is a topic that crosscuts much of software engineering, programming languages, information systems, formal methods, and other major areas of computer science. The CS department at the University of Koblenz-Landau engages in a research focus on software adaptation, and thereby connects scientists and teams in classical areas of computer science as much as application areas such as web-based systems, SOA, mobile systems, and robotics.

ADAPT 2010 is the first edition in a series of summer schools on software adaptation. The program of ADAPT 2010 appeals to PhD students and possibly specialized Master's students in computer science.

3/31/2010

A solution to the Halting problem

[ This is not about this Halting problem. ]

The Halting problem

The problem is how to halt the execution of Ali Hussain Sibat. I cite CNN as of 31 March 2010: "A Lebanese man charged with sorcery and sentenced to death in Saudi Arabia is scheduled to be beheaded on Friday [PF: 2 April], the man's lawyer said Wednesday [PF: 31 March]." Amnesty International reported about this case on 8 Dec 2009 [Article from the Amnesty International Website]. I became aware of the case through CNN's article on 19 Mar 2010. Update: CNN reported on Thursday, 1 April 2010 that Sibat "won't face beheading Friday [PF: 2 April], his lawyer said Thursday [PF: 1 April]".


The solution to the Halting problem

I am pretty sure that Sibat's lawyer, his family, Amnesty International, the Lebanese government, and many other individuals and organizations are working frantically to stop the beheading. Here is an online petition that I ask you to sign if you agree with me that sorcery does not qualify for the death penalty. It takes a lot of public outcry to solve this Halting problem. This is what I would call a solution: a) the beheading does not take place, and b) the public outcry, also with the contribution of the informed and outspoken online society, made a difference.

Update: I have meanwhile (11pm CEST, 1 April 2010) contacted the Saudi Embassy in Germany and brought the petition to their attention. I also asked them to pass on the information to the Saudi Ministry of Justice, whose email address / fax number is not available on the web. I will send an update once the petition reaches 1,000 signatures and then again at 10,000 signatures.

Update: Join this Facebook group, which fights for the same cause. This group is growing rapidly, and it is a great information centre on Sibat's case compared to my humble blog post. The group endorses the petition.


Disclaimer

Please note that the petition and this post do not attempt to make any general statement about Saudi Arabia or Lebanon, their juridical systems or governments (Saudi Arabia's, in particular), about Sharia, Islam, or you name it. Instead, the petition and this post are concerned with the specific case of Ali Hussain Sibat and with the charge and death penalty in this case in particular.

A personal note: I condemn with my whole authority the charge and the punishment in question on the grounds of information provided by Amnesty International and CNN reporters modulo personal judgement.


Privacy issues

You are only required to share your email address with PetitionsOnline. Their privacy rules are such that they do not share these email addresses with anyone for whatever reason. PetitionsOnline only uses the email addresses for some forms of validation. If you decide to share your email address with me (the petition's initiator), I will not share your email address with anyone for whatever reason. The primary purpose of the petition is to be online and to allow us to refer the target of the petition to it.


Thanks to everyone who has signed already.

Thank you
Ralf Lämmel

3/28/2010

Proprietary hardware jokes

Suppose you own a 2007 model of the then-cutting-edge, high-end laptop of a given brand A with, say, 4 GB of RAM and a 2.4 GHz Intel Core 2 Duo. Further suppose you want to purchase the current display offering of the same brand (available since Oct 2008); wouldn't you expect to be able to connect both expensive pieces of hardware without yet further investment, say, without purchasing a converter that's not even available from brand A? Of course, I should get a 2010 model of an A-branded laptop, and everything will work fine until, I guess, 2011, when they replace the Mini DisplayPort by, say, a new "Nano DVI port".

Suppose you are very happy with the display, the computing power, and the browser speed of your lovely smartphone, again from the same brand A. Occasionally, you would like to write some longer emails, edit online documents, or ride on a wave, for which it would make awfully good sense to have an external keyboard. Technically, it is trivial to enable this through Bluetooth, unless brand A decides to cripple the Bluetooth driver specifically, in which case you can always buy the new iPad-like device of brand A, for which a keyboard is available (unless you are living in Europe, where you can't purchase that device yet).

PF

2/25/2010

Our corpus is your corpus.

[ Here is our P3P corpus, be it your corpus as well. ]

Giving away your corpus (in empirical language analysis or elsewhere) is perhaps not a firmly established practice yet, but it's nothing original or strange either. It makes a lot of sense! Stop limiting the impact of your research efforts! Stop wasting the time of your community members! Sharing corpora is one of the many good ideas of Research 2.0: see SSE'10 (and friend events), eScience @ Microsoft, R2oSE, ...


Computer Science vs. Science

When you do academic CS research in programming- or software development-related contexts, then the culture of validation is these days such that you are often expected to provide online access to your program, application, library, tool, or what have you as an implementation or illustration. There are various open-source repositories that are used to this end---as a backend (a storage facility)---but all sorts of author-hosted download locations are also widely used. In basic terms, if you write a paper, you include a URL. (There is one exception: if your work leverages Haskell, you can usually include the complete source code right in your paper so that one gets convenient access through copy and paste. Sorry for the silly joke.) Metadata-wise, common practice is nowhere near perfect, but it's perfect compared to what follows.

When you do empirical analysis in CS, which results in some statements or data about software/IT artifacts, then the culture of validation is essentially the one of science. In particular, reproducibility is the crucial requirement. You describe the methodology of your analysis in a detailed manner. So you define your hypotheses, your input, your techniques for measurement, your results (which you also interpret), your threats to validity, what have you. Downloads aren't integral to science. What would you want to download anyway?


Message of this post?!

I suggest that various artifacts of an empirical analysis in CS in general, and in empirical language analysis in particular, qualify for a valuable download. In this post, I want to call out the corpora (as in corpora of source projects, buildable projects, built projects, runnable projects, run projects, demos, etc.).


Beyond reproducibility in CS

What's indeed not yet commonplace (if done at all) is that the corpora underlying empirical analyses are made conveniently accessible. Consider for example Baxter et al.'s paper on structural properties of Java software, or Cranor et al.'s paper on P3P deployment. These are two seminal papers in their fields. I would lose a night of sleep over each of the two corpora.

Wouldn't it be helpful for researchers if such corpora were made available for one-click download, including useful metadata and potentially even tooling? Let's suppose such convenient access became a best practice. First, reproducibility would be improved. Second, derived research would be simplified. Third, incentives for collaboration would be added.

I contend that convenient access adds little pain for the original author, but adds huge value for the scientific community. Why should we need to execute the description of some corpus from some paper, if doing so requires substantial work for us, while the corpus could easily be shared by the primary author? Why should we work hard to "reproduce the corpus" if a little help from the original authors would make reproducibility (of the corpus and perhaps most of the research work) a charm?


Naysayers -- get lost

I can think of many reasons why 'convenient access' is not getting off the ground. Here are a few obvious ones:
  • "It's extra work, even if it is little extra work." This problem can be solved if incentives are created. For instance, publications on empirical analysis with 'convenient access' to the corpus could be rated higher than those w/o. Also, just like tool papers in many venues, there could be corpus papers.
  • "There is sufficient, inconvenient access available already." At least, for one of the two examples above, I fully understand how I could go about gathering the corpus myself, but I have not executed this plan, even though I could really use this corpus in some ongoing research activity. It's just too much work for me. I am effectively hampered in benefitting from the authors' research beyond their immediate results.
  • "Provision of convenient access is too difficult." Think of a corpus of Java programs. Suddenly, an access provider gets into the business of configuration management. After all, convenience would imply that the corpus builds and runs out of the box. I think the short-term answer is that access to the corpus w/o extra "out-of-the box" magic is still more convenient than no access. The long-term-answer is that we may need a notion of remote access to corpora, where I can give you access to my corpus in my environment, through appropriate, web-based or service-oriented interfaces.
  • "Convenient access gives a head start to the competition." I refuse to believe that this is really too relevant in academic practice. For instance, I am sure that the research groups behind the above-mentioned papers have no "corpus monopoly" in mind. I have not done much work on empirical analysis, but I have experience with papers that "give away details", and I must say that those papers which give away the most typically coincide with those which have the highest impact in all possible ways.
  • "There is copyright & Co. in the way." Yes, it is. This is a serious problem, and we better focus on solving the problem shortly, if we want to get anywhere with science and (IT) society in this age. This post will just explode if I tried to comment on that issue over here. There are many good ideas around on this issue, and we all understand that some amount of sharing works even now in this very imperfect world as we have it. If you are pro-Research 2.0, don't get bogged down by this red herring.
Well, I can think of quite a number of other reasons, but I reckon that all the usual suspects have been named, and everything else can perhaps be delegated to some discussion on this blog or elsewhere.

Regards,
Ralf Lämmel

PS: CS has reached an age where empirical research is becoming viral and vital. I am grateful for talking to Jean-Marie occasionally, with his lucid vision of Research 2.0 and linguistics for software languages---two topics that are strongly connected. Empirical analysis of software languages has got to be an integral part of software language linguistics. Specialized software-engineering conferences like SLE, ICPC, and MSR, or even big ones like ICSE or OOPSLA, have been including empirical research for a while now.

2/24/2010

An ambitious course on programming paradigms, semantics, and types

We have completed this course.
I hope others find the design or some of the material helpful.
See more about reuse below.

Coverage
- Parsing and interpretation in Prolog
- Basics of small-step and big-step semantics
- Basics of untyped and typed lambda calculi
- Introduction to Haskell
- Basics of denotational semantics
- Denotational semantics in Haskell
- Basics of static program analysis
- Static program analysis in Haskell
- OO programming in Haskell
- The Expression Problem
- Basics of Constraint-Logic Programming
- Basics of Process Algebra (CCS)
- ... a few more specialized lectures

Characteristics
- English as the course language
- Slides, videos, exercises available online publicly
- 42 hours (45mins each) of lectures over 4 months
- 12 programming-oriented, interactive labs
- Transparent scheme for midterm and final exam
- Heavy reuse of material from other courses
- Use of Twitter for notification and aggregation

Experiences
This course is my only chance to tell many students at my department about semantics, type systems, bits of interesting functional programming, declarative language processors, and sophisticated declarative methods such as process algebra or constraint-logic programming. Luckily, there was a similar module in our curriculum, but I admit that I had to hijack this module a bit. (That is, the module is designed for the last semester of the Bachelor's, whereas the actual course is probably meant for Master's students.) The students have complained in a friendly manner that the course is very heavy, but in return, I have designed the exam to be feasible for Bachelor's students. So everyone is happy: me, because I could cover loads of material and topics; the students, because they receive unusual help in preparation for the exam. As to the content, students are relatively happy with the techniques that we studied---in particular because we are running the course in such a pragmatic mode, where we basically leverage Prolog and Haskell throughout---for every bit of type systems and semantics and programming concepts. In this manner, the CS students can improve their programming skills in paradigms and domains beyond those they were used to before. I also plead guilty by admitting that there was no way of selling the formal details of many topics. (For instance, an attempt to do soundness proofs almost triggered a riot.) As to the use of Twitter, it seems that not enough students were keen on it, so we had to continue using a newsgroup in parallel.

But again, I am confident that the design of such a course with broad coverage, low formal depth, but operational depth based on declarative programming and a transparent exam is an effective way of exposing Bachelor students (possibly some of them future Master students) to critically important foundations of computer science and modern programming techniques.

The future
I will mature the course next year. I will be happy if others use the design of the course to some extent, or any specific material. Please note that more than half of the material is directly based on other people's work. I probably used 3 resources heavily, and about another 7 for some bits. Each non-original slide has credits on it in the header area. If you want to reuse anything, please let me know.

Acknowledgement
Vadim Zaytsev did an excellent job in running the lab and helping me to prepare the exam at the level of quality that we had in mind. It is good to have such a committed (emerging) declarative programmer next to you for a course like this! As to the reused material, detailed credits are online and on the slides, but I want to emphasize a few sources because of the degree of reuse and my admiration of their work: Hanne Riis Nielson and Flemming Nielson (for their book "Semantics with Applications" with slides); Graham Hutton (for his wonderful introduction to Haskell); Jaakko Järvi (for the slides of his mainly TAPL-based programming course).

Regards,
PF

2/21/2010

Empirical Language Analysis

We have been trying to understand the language P3P. Ok, it's pretty simple to say what it is: a small (domain-specific, non-executable) language for privacy policies; it's used by web sites and checked by user agents (potentially as part of browsers). Arguably, understanding is limited if you look at online samples and the syntax definition alone.

So we figured we had to
understand usage of the language
in order to understand the language.

Here is the link to the paper:
Joint work with Ekaterina Pek

Keywords:
  • Software Language Engineering
  • Domain-Specific Languages
  • Empirical Analysis
  • Policy Languages
  • P3P
Abstract: P3P is the policy language with which web sites declare the intended use of data that is collected about users of the site. We have systematically collected P3P-based privacy policies from web sites listed in the Google directory, and analysed the resulting corpus with regard to metrics and cloning of policies, adherence to constraints, coverage of the P3P language, and a number of additional properties of language usage. This effort helps understanding the de-facto usage of the non-executable, domain-specific language P3P. Some elements of our methodology may be similarly useful for other software languages.

Regards,
PF

1/19/2010

Constructive art on software languages

If you (feel like you) need to explain a piece of art, then something sucks. But I am not saying that Haskell's Tower of Babel is true art. Perhaps it could be called constructive art (or nerdy art). Anyway, I am going to explain to you how I constructed this picture. See here for the original twitpic with the tower; it is also included right below for your convenience.

[Picture: Haskell's Tower of Babel, a pyramid of LANGUAGE pragma lines taken from a Haskell project]
This is how you can reproduce the tower:
  • Use something like Mac OS's Keynote to actually assemble the picture.
  • Go through a nifty Haskell project and gather all of its LANGUAGE pragmas, i.e., the extensions beyond Haskell 98 (see the sketch further below).
  • Use one line per pragma and format everything with a proportional font as shown.
  • Sort the extensions (the lines) by the visual length. Results depend on the font chosen.
  • Starting at the bottom, adjust font size per line to get a smooth slide from top to bottom.
In my instance, there was some fuzzy effect due to the proportional font used and the tension between the upper-case LANGUAGE keyword and the camel-cased name of the extension in each line. That is, while I was generally trying to decrease the font size as I went from bottom to top, I actually had to increase it every now and then if I wanted that smooth slide from top to bottom.
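For the gathering step, a few lines of Haskell will do. Here is a minimal sketch (not the script I actually used; it assumes that each pragma sits on its own line starting with {-# LANGUAGE, and it approximates "visual length" by character count): it collects the LANGUAGE pragmas from the .hs files named on the command line and prints one pragma line per extension, shortest first.

-- Collect LANGUAGE pragmas from the .hs files given on the command line
-- and print one "{-# LANGUAGE ... #-}" line per extension, shortest first.
import Data.Char (isAlphaNum)
import Data.List (isPrefixOf, nub, sortBy)
import Data.Ord (comparing)
import System.Environment (getArgs)

-- Extract the extension names from the pragma lines of one module.
extensions :: String -> [String]
extensions src =
  [ e
  | l <- lines src
  , "{-# LANGUAGE" `isPrefixOf` l
  , e <- words (map decomma (drop (length "{-# LANGUAGE") l))
  , all isAlphaNum e ]
  where decomma c = if c == ',' then ' ' else c

main :: IO ()
main = do
  files <- getArgs
  srcs  <- mapM readFile files
  let exts = nub (concatMap extensions srcs)
  mapM_ (\e -> putStrLn ("{-# LANGUAGE " ++ e ++ " #-}"))
        (sortBy (comparing length) exts)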

Background for non-Haskell nerds: In contemporary Haskell, programmers tend to express their dependence on non-standardized features by using detailed pragmas in their modules. For instance, if someone places the pragma for "FunctionalDependencies" in a module, then this means that class declarations are allowed to restrict multi-parameter type classes to model functions on types as opposed to general relations.
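To make that concrete, here is a tiny illustration of my own (it is not part of the tower; the Collection class is made up for this purpose): with the pragmas shown, the class may declare that the collection type determines the element type.

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances #-}

-- The dependency "c -> e" declares that the collection type c determines
-- the element type e, so the class acts as a function on types rather
-- than as a general relation.
class Collection c e | c -> e where
  empty  :: c
  insert :: e -> c -> c

-- Lists are collections; the list type [a] determines the element type a.
instance Collection [a] a where
  empty  = []
  insert = (:)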

At some point, it just occurred to me how organized we Haskell programmers are and how much expressivity we have at our disposal. From a more historical point of view, we may just run into the kind of problem that is symbolized by the Tower of Babel, but honestly, I don't feel like I am suffering from too much expressivity and too-sexy types.

As a data point, the project from which I extracted the above language pragmas is a tutorial on type-level programming (joint work with Oleg Kiselyov). The first two lectures of this tutorial have been recorded and posted. I have presented that material for folks with pretty much no Haskell background. Several of the lectures (not those currently available online) require advanced background in typing and programming languages (also measured by the number of language pragmas shown), but the plan is ultimately to get this tutorial to the point where everyone with a modest interest in language design and type systems can fully appreciate all that expressivity and sexy typing.
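For a small taste of the style of programming I mean, here is a fragment of my own (it is not taken from the tutorial): addition of Peano numerals is carried out entirely by the type checker, with a multi-parameter class plus functional dependency serving as a type-level function.

{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}

-- Type-level Peano numerals: these types have no values.
data Zero
data Succ n

-- Add a b c holds if a + b equals c; the dependency makes c a function of a and b.
class Add a b c | a b -> c
instance Add Zero b b
instance Add a b c => Add (Succ a) b (Succ c)

-- A value-level witness so that we can ask GHCi for the result type.
add :: Add a b c => a -> b -> c
add = undefined

-- ghci> :type add (undefined :: Succ Zero) (undefined :: Succ (Succ Zero))
-- ... :: Succ (Succ (Succ Zero))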

PF

1/03/2010

Palindrome dates

@lizettegagne re-tweeted as follows earlier today (or yesterday, +/-): "RT @mashable: Happy Palindrome Day everyone - 01022010! Last one was 10022001, prior to that 08311380!! Wow! They don't come around often." I hacked up the following Prolog code to get some certainty. A simple modification reveals that the European (DDMMYYYY) ordering or the YYYYMMDD ordering would lead to entirely different results. Try it out!

PF


?- main.
08311380
10022001
01022010
true.

% A palindrome reads the same forwards and backwards.
palindrom(S) :- reverse(S,S).

% Generate all palindrome dates in MMDDYYYY format; leap years are ignored.
special(MDYS) :-
    year(_,YS),
    month(M,MS),
    day(M,_,DS),
    append(MS,DS,MDS),
    append(MDS,YS,MDYS),
    palindrom(MDYS).

% Here is an over-approximation of dates.
year(X,S) :- ranges(4,1380,2010,X,S).
month(X,S) :- ranges(2,1,12,X,S).
day(1,X,S) :- ranges(2,1,31,X,S).
day(2,X,S) :- ranges(2,1,29,X,S).
day(3,X,S) :- ranges(2,1,31,X,S).
day(4,X,S) :- ranges(2,1,30,X,S).
day(5,X,S) :- ranges(2,1,31,X,S).
day(6,X,S) :- ranges(2,1,30,X,S).
day(7,X,S) :- ranges(2,1,31,X,S).
day(8,X,S) :- ranges(2,1,31,X,S).
day(9,X,S) :- ranges(2,1,30,X,S).
day(10,X,S) :- ranges(2,1,31,X,S).
day(11,X,S) :- ranges(2,1,30,X,S).
day(12,X,S) :- ranges(2,1,31,X,S).

% Enumerate all values X in the range Min..Max, together with their strings S2 padded to length L.
ranges(L,Min,Max,X,S2) :-
    ( Min =< Max, X = Min, name(Min,S1), pad(L,S1,S2)
    ; Min < Max, Min1 is Min + 1, ranges(L,Min1,Max,X,S2)
    ).

% Pad a string with leading '0' characters up to the length given.
pad(L,S1,S2) :-
    length(S1,L1),
    ( L1 > L, abort
    ; L1 = L, S2 = S1
    ; L1 < L, pad(L,[0'0|S1],S2)
    ).

% Print all solutions, one per line.
% (Named print_date/1 to avoid clashing with the built-in print/1.)
print_date(S) :- atom_codes(A,S), write(A), nl.
main :- findall(S,special(S),Ss), maplist(print_date,Ss).