An illustrative online partial evaluator
Let's understand the basic mechanics of a (too?) simple online partial evaluator. (I am grateful to William Cook for encouraging the following Haskell exercise, and for sharing with me some stimulating slides and notes on the subject.) I recommend John Hatcliff's excellent and detailed course material on "Foundations of Partial Evaluation and Program Specialization"---available online. For those in a hurry, try Wikipedia. Anyway, partial evaluation, macro systems, and multi-stage programming is a lot of fun. Here is also a paper (draft) by William Cook that I can warmly recommend.
Consider the following syntax of a simple, pure, first-order, functional programming language:
Thus, a program consists of a main expression and bindings for functions (i.e., a "function environment" FEnv is a list of "function definitions" FDef each of them being of term form (s,(ss,e)) with s as the function name, ss the formal parameters and e the body). Evaluation of an expression is supposed to return an integer (or to diverge or to fail in reply to a dynamic type error). The functions are supposed to be of first-order: functions are neither passed as arguments, nor returned as results.
Here are some function definitions---including those for factorial and exponentiation:
Let's first develop a straightforward interpreter.
We expect interpretation to behave as follows:
Here is the complete interpreter:
It must be emphasized that this interpreter is in no way special, unusual, or targeted at partial evaluation. One gets this interpreter naturally when using denotational semantics (i.e., compositional functional semantics) as the guiding principle. Now the key "challenge" of our effort here is to modify the interpreter so that it effectively serves as a partial evaluator. That is, the partial evaluator should produce the same result as the interpreter when faced with a ground main expression, but it should compute a residual expression otherwise. Let's think of the residual expression as the "rest of the program" that needs to be evaluated once the free variables are bound to values. For instance:
Thus, the function "exp" is applied to a specific exponent 3, but the base remains a variable "x", and the result of partial evaluation essentially represents the code to compute "x" to the power 3: "x*x*x*1".
Our design of a partial evaluator starts from the insight that the type of the top-level function must be changed so that expressions are returned instead of integers. (A more general design would also return any number of bindings for specialized functions that were encountered along partial evaluation.) Please note that integers are trivially embedded into expressions through the constant form of expressions. Thus:
In the chosen style of partial evaluator design, we also propagate the type change into the definition of variable environments. (This leads to a simple development, but implies some issues as we will discuss towards the end of the blog post.) Thus, variables may be bound to (free) variables:
The cases for constant expressions differ trivially only as follows:
Variables are to be treated in a special manner. The interpreter can only deal with ground main expressions, and---subject to other reasonable assumptions about well-typedness of programs---variables would always be bound throughout interpretation. In contrast, the partial evaluator faithfully deals with free variables: a free variable is partially evaluated to itself.
Compound expression forms are also affected non-trivially but systematically. The general idea is that an expression form may be evaluated if some or all operands can be, in turn, partially evaluated to values, or the expression form may need to be re-generated in the result otherwise. Consider the following cases for binary expressions:
Thus, in the interpreter, subexpressions are recursively evaluated, and the corresponding operator is applied to the intermediate results. Subexpressions are also recursively evaluated in the partial evaluator, but the intermediate results are to be checked this time such that the operator is applied for the case of value results (constants). Otherwise, a binary expression is generated which incorporates partially evaluated operands as opposed to the original operand expressions.
The same idea is applied to the IsZero construct:
The pleasant status of IfZero is that it is enough to partially evaluate the condition to a value in order to be able to eliminate the conditional form.
Perhaps surprisingly, the cases for function application can be identical (modulo renaming):
In both cases, we look up the function from the function environment, we bind (partially) evaluated actual parameters to the formal parameter variables, and we proceed with the (partial) evaluation of the function's body relative to the local environment.
Exercise: Is there any issue of variable capture with the partial evaluator at hand? Can you demonstrate a problem? How would you generally rule out variable capture? Further, can you think of situations where the partial evaluator clones too much code or where it even loops, even though a non-cloning and non-looping partial evaluator would be desirable and feasible, perhaps subject to some form of memoization?
The source code of this blog post is available in the repository for the Software Language Processing Suite; see here. You would also find a not so simple partial evaluator discussed in the next blog post.
Regards,
PF
Consider the following syntax of a simple, pure, first-order, functional programming language:
type Prog = (FEnv,Expr)
type FEnv = [FDef]
type FDef = (String,([String],Expr))
data Expr
= Const Int
| Var String
| Binary Op Expr Expr
| IfZero Expr Expr Expr
| Apply String [Expr]
deriving (Show)
data Op = Plus | Times
deriving (Show)
Thus, a program consists of a main expression and bindings for functions (i.e., a "function environment" FEnv is a list of "function definitions" FDef each of them being of term form (s,(ss,e)) with s as the function name, ss the formal parameters and e the body). Evaluation of an expression is supposed to return an integer (or to diverge or to fail in reply to a dynamic type error). The functions are supposed to be of first-order: functions are neither passed as arguments, nor returned as results.
Here are some function definitions---including those for factorial and exponentiation:
lib :: FEnv
lib
= [ ("neg", (["x"], Binary Times (Var "x") (Const (-1)))),
("add", (["x", "y"], Binary Plus (Var "x") (Var "y"))),
("minus", (["x", "y"], Binary Plus (Var "x") (Apply "neg" [Var "y"]))),
("times", (["x", "y"], Binary Times (Var "x") (Var "y"))),
("inc", (["x"], Binary Plus (Var "x") (Const 1))),
("dec", (["x"], Binary Plus (Var "x") (Const (-1)))),
("fac",
( ["x"]
, IfZero (Var "x")
(Const 1)
(Apply "times" [Var "x", Apply "fac" [Apply "dec" [Var "x"]]]))),
("exp",
( ["x","n"]
, IfZero (Var "n")
(Const 1)
(Apply "times" [Var "x", Apply "exp" [Var "x", Apply "dec" [Var "n"]]])))]
Let's first develop a straightforward interpreter.
We expect interpretation to behave as follows:
> eval (lib, Apply "fac" [Const 5])
120
> eval (lib, Apply "exp" [Const 2, Const 3])
8
Here is the complete interpreter:
type VEnv = [(String,Int)]
eval :: Prog -> Int
eval (fe,m) = eval' m []
where
eval' :: Expr -> VEnv -> Int
eval' (Const i) ve = i
eval' (Var x) ve = fromJust (lookup x ve)
eval' (Binary op e1 e2) ve = f v1 v2
where
f = op2f op
v1 = eval' e1 ve
v2 = eval' e2 ve
eval' (IfZero e1 e2 e3) ve = if (v1==0) then v2 else v3
where
v1 = eval' e1 ve
v2 = eval' e2 ve
v3 = eval' e3 ve
eval' (Apply n es) ve = eval' e ve'
where
(ns,e) = fromJust (lookup n fe)
vs = map (\e -> eval' e ve) es
ve' = zip ns vs
op2f :: Op -> (Int -> Int -> Int)
op2f Plus = (+)
op2f Times = (*)
It must be emphasized that this interpreter is in no way special, unusual, or targeted at partial evaluation. One gets this interpreter naturally when using denotational semantics (i.e., compositional functional semantics) as the guiding principle. Now the key "challenge" of our effort here is to modify the interpreter so that it effectively serves as a partial evaluator. That is, the partial evaluator should produce the same result as the interpreter when faced with a ground main expression, but it should compute a residual expression otherwise. Let's think of the residual expression as the "rest of the program" that needs to be evaluated once the free variables are bound to values. For instance:
> peval (lib, Apply "exp" [Var "x", Const 3])
Binary Times (Var "x") (Binary Times (Var "x") (Binary Times (Var "x") (Const 1)))
Thus, the function "exp" is applied to a specific exponent 3, but the base remains a variable "x", and the result of partial evaluation essentially represents the code to compute "x" to the power 3: "x*x*x*1".
Our design of a partial evaluator starts from the insight that the type of the top-level function must be changed so that expressions are returned instead of integers. (A more general design would also return any number of bindings for specialized functions that were encountered along partial evaluation.) Please note that integers are trivially embedded into expressions through the constant form of expressions. Thus:
-- Interpreter
eval :: Prog -> Int
eval (fe,m) = eval' m []
where
...
-- Partial evaluator
peval :: Prog -> Expr
peval (fe,m) = peval' m []
where
...
In the chosen style of partial evaluator design, we also propagate the type change into the definition of variable environments. (This leads to a simple development, but implies some issues as we will discuss towards the end of the blog post.) Thus, variables may be bound to (free) variables:
-- Interpreter
type VEnv = [(String,Int)]
-- Partial evaluator
type VEnv = [(String,Expr)]
The cases for constant expressions differ trivially only as follows:
-- Interpreter
eval' :: Expr -> VEnv -> Int
eval' (Const i) ve = i
-- Partial evaluator
peval' :: Expr -> VEnv -> Expr
peval' (Const i) ve = Const i
Variables are to be treated in a special manner. The interpreter can only deal with ground main expressions, and---subject to other reasonable assumptions about well-typedness of programs---variables would always be bound throughout interpretation. In contrast, the partial evaluator faithfully deals with free variables: a free variable is partially evaluated to itself.
-- Interpreter
eval' (Var x) ve = fromJust (lookup x ve)
-- Partial evaluator
peval' (Var x) ve
= case lookup x ve of
Just e -> e
Nothing -> Var x
Compound expression forms are also affected non-trivially but systematically. The general idea is that an expression form may be evaluated if some or all operands can be, in turn, partially evaluated to values, or the expression form may need to be re-generated in the result otherwise. Consider the following cases for binary expressions:
-- Interpreter
eval' (Binary op e1 e2) ve = f v1 v2
where
f = op2f op
v1 = eval' e1 ve
v2 = eval' e2 ve
-- Partial evaluator
peval' (Binary op e1 e2) ve
= case (e1', e2') of
(Const v1, Const v2) -> Const (f v1 v2)
_ -> Binary op e1' e2'
where
f = op2f op
e1' = peval' e1 ve
e2' = peval' e2 ve
Thus, in the interpreter, subexpressions are recursively evaluated, and the corresponding operator is applied to the intermediate results. Subexpressions are also recursively evaluated in the partial evaluator, but the intermediate results are to be checked this time such that the operator is applied for the case of value results (constants). Otherwise, a binary expression is generated which incorporates partially evaluated operands as opposed to the original operand expressions.
The same idea is applied to the IsZero construct:
-- Interpreter
eval' (IfZero e1 e2 e3) ve = if (v1==0) then v2 else v3
where
v1 = eval' e1 ve
v2 = eval' e2 ve
v3 = eval' e3 ve
-- Partial evaluator
peval' (IfZero e1 e2 e3) ve
= case e1' of
(Const v1) -> if (v1==0) then e2' else e3'
_ -> IfZero e1' e2' e3'
where
e1' = peval' e1 ve
e2' = peval' e2 ve
e3' = peval' e3 ve
The pleasant status of IfZero is that it is enough to partially evaluate the condition to a value in order to be able to eliminate the conditional form.
Perhaps surprisingly, the cases for function application can be identical (modulo renaming):
-- Interpreter
eval' (Apply n es) ve = eval' e ve'
where
(ns,e) = fromJust (lookup n fe)
vs = map (\e -> eval' e ve) es
ve' = zip ns vs
-- Partial evaluator
peval' (Apply n es) ve = peval' e ve'
where
(ns,e) = fromJust (lookup n fe)
es' = map (\e -> peval' e ve) es
ve' = zip ns es'
In both cases, we look up the function from the function environment, we bind (partially) evaluated actual parameters to the formal parameter variables, and we proceed with the (partial) evaluation of the function's body relative to the local environment.
Exercise: Is there any issue of variable capture with the partial evaluator at hand? Can you demonstrate a problem? How would you generally rule out variable capture? Further, can you think of situations where the partial evaluator clones too much code or where it even loops, even though a non-cloning and non-looping partial evaluator would be desirable and feasible, perhaps subject to some form of memoization?
The source code of this blog post is available in the repository for the Software Language Processing Suite; see here. You would also find a not so simple partial evaluator discussed in the next blog post.
Regards,
PF
I talked with Ralf about this. While this is a good start at a partial evaluator, which emphasizes the similarities with normal evaluation, there are some serious problems with this approach. I recommend against putting expressions into the environment. Also, the "call" expression should create a new specialized version of the function, and this function should be memoized. Otherwise the partial evaluator will not terminate, even though the original program would terminate. I look forward to the next version that Ralf posts.
ReplyDeleteThanks. William. The more complicated partial evaluator is in the linked repository. I agree with all the identified issues, and thanks for pointing them out to me and others. I am not sure that I want to write another blog post about the advanced version. Well, let me see. :-)
ReplyDelete