Lambda tutorial

These web pages provide a practical introduction to lambda reduction, with a few pointers to more esoteric issues. I'm a linguist, and I have linguists in mind for my audience, so linguistic issues will be emphasized (e.g., a discussion of the interaction of lambda with other binding operators such as "exists" and "forall").

Browser issues: If your browser can handle JavaScript 1.2, and if you have JavaScript turned on, you will be able to evaluate examples by clicking on a button, and to make up your own examples to try; otherwise, the examples will just sit there. For instance, try clicking on the "Reduce" button:

You should see the string "(it works)" appear in the second window. This is a live program written in JavaScript---you can type any expression into the input box and try reducing it (you need to spell out "lambda" in the input, the program will convert it to "λ" in the output). Character problems: As of May 2003, Internet Explorer doesn't seem to be able to print the forall or the exists symbols in the first paragraph above (you may see an empty box). But Opera and Mozilla (both with free versions) seem to do ok. Finally, Mozilla can't deal with the lambda symbol ("λ") in JavaScript output, so for now all the lambdas are spelled out when they are in examples.

Goal: Church's λ-calculus provides a convenient way of representing meanings, whether meanings of programs or of expressions in English or some other natural language. As a result, it is ubiquitous in computer science, logic, and formal approaches to the semantics of natural language. The λ-calculus consists of two things: a formal language and an associated notion of REDUCTION (roughly equivalent to "computation"). In the context of the lambda calculus, reduction is specifically called λ-reduction. What these pages will attempt to do is teach you how to perform λ-reduction accurately and confidently.

Other sources: I recommend Hankin's Lambda Calculi : A Guide for Computer Scientists (Oxford) for a readable survey, and Barendregt's The Lambda Calculus (Elsevier) for a more complete reference work. Barendregt and Barendsen's shorter introduction to the lambda calculus is also excellent, and accessible electronically for free (if the citeseer link ceases to work, I've cached a copy here; note for some mysterious reason pages 2, 3 and 4 are blank). Selinger has an excellent set of lecture notes covering many logical and computational aspects of the lambda calculus here. For a more linguistic perspective, chapter 2 of Carpenter's Type-Logical Semantics (MIT Press) presents a λ-calculus within a framework for describing natural language. The following is too fun not to mention: David Keenan, To dissect a mockingbird: a graphical notation for the lambda calculus with animated reduction .

Acknowledgements: Saber Ben Salem made a number of helpful suggestions.

Contents of this document:

Binding: free versus bound

λ is a binding operator, just like backwards E ("∃") or upside down A ("∀"). Consequently, it always binds something (a variable), taking scope over some expression that (usually) contains occurrences of the bound variable. More practically, λ always occurs in the following configuration:

(λ var body)

Here, "var" is the variable bound by the lambda operator, and "body" is the constituent that the lambda operator has scope over.

For instance:

(λ x (* (+ 3 x) (- x 4)))

Here "x" is the variable (the one immediately after the binding operator), and the body is "(* (+ 3 x) (- x 4))". The body contains two occurrences of "x", and both are bound by the λ operator; any variable token that is not bound is free.
(λ x (* (+ y x) (- x z)))

In this form, the tokens of "y" and of "z" are both free.

Freedom is always relative to a particular expression. If the preceding expression is embedded within a larger one, the free variables can come to be bound:

(∃ y (λ x (* (+ y x) (- x z))))

In this case, the "y" has been bound, and only the "z" remains free.


λ-reduction

The key notion of the λ-calculus is that it is possible to arrive at a logically equivalent expression by means of a process called λ-reduction. In the usual case, λ-reduction is actually a combination of three distinct reduction operations, each of which is discussed below. The key operation, the one that does the heavy lifting, is called β-reduction, and that is operation we will discuss first.

By the way, some people say "λ-conversion" instead of λ-reduction; others reserve "λ-conversion" to refer specifically to a single step in a series of reductions. This tutorial is trying not to be overly pedantic, so for present purposes I don't care.


β-reduction (the heavy lifting)

Nothing happens until a λ-binding form occurs in construction with an argument, thus:

((λ var body) argument)

Once a λ-based binding form occurs with an argument like this, it is possible to reduce the expression to a simpler form by means of β-reduction (sometimes with the help of α-reduction and η-reduction).

The main idea of β-reduction is to replace every free occurrence of the variable "var" in "body" with "argument". For instance, in the form below, both occurrences of "var" in the body are free.


Consequently, after β-reduction, both occurrences get replaced with "argument", and the result is significantly simpler than the original expression.


α-reduction (alphabetic variants)

Unfortunately, applying β-reduction indiscriminately can cause trouble when the body contains binding operators. Consider:


Following the rules of β-reduction, we replace "x" with "z" for a result of "(λ y1 (z y1))", and this result is correct. Note that in the result the λ-operator binds exactly one variable in the body, and the other variable remains free.

But consider what happens in this closely parallel but slightly different situation:

((λ x (λ y (x y))) y) β-reduction: substitute "y" in for "x" in the body "(λ y (x y))" ==>
(λ y (y y))
Wrong result!

The only difference here from the first example is that the argument is "y" and not "z". But now, blindly following the rules of β-reduction would produce the result "(λ y (y y))", in which the λ-operator binds two variables, not just one, which is not correct. The argument variable "y" is said to have been `captured' by the inner λ-operator; another commonly-used expression for this kind of situation is `variable collision'.

The solution is to make use of alphabetic variants. Roughly, two expressions are "alphabetic variants" if they are identical except perhaps for the choice of variable symbols. For instance, the following expressions are alphabetic variants of one another:

(λ x (λ y (x (+ y x))))
(λ z (λ y (z (+ y z))))

To create an alphabetic variant for an expression of the form "(λ var body)", simply replace each free occurrence of "var" in the expression with "new", where "new" is a variable symbol not occurring anywhere in "body". (Since expressions have finite length, as long as there is an infinite supply of variable symbols, it will always be possible to find a suitable variable to serve the role of "new".) This transformation is called α-reduction. The crucial property of the reduced form is that each λ operator binds the same number of variables in the same positions within its body.

Some examples of alphabetic variants:

1. ((λ x x) x) α-reduction on "(λ x x)", substituting "y1" for "x" ==>
((λ y1 y1) x)
2. ((λ x x)(λ x (x x))) α-reduction on "(λ x x)", substituting "y1" for "x" ==>
((λ y1 y1)(λ x (x x)))
3. (λ x (λ x (x x))) α-reduction on "(λ x (λ x (x x)))", substituting "y1" for "x" ==>
(λ y1 (λ x (x x)))

The third example may look surprising; the key fact is that α-reduction specifically targets only free occurrences of the variable in question (free relative to the λ body). Since the second λ binds the last two occurrences of "x", performing α-reduction on the larger form won't touch them.

Now, back to the original problem. The way to deal with "((λ x (λ y (x y))) y)", then, is to first take an alphabetic variant. Because alphabetic variants are guaranteed to be logically equivalent, we can substitute variants in place of the original sub expressions. If we do this cleverly, the "y" and the "z"'s don't interact in the pernicious manner shown above.

1. ((λ x (λ y (x y))) y) α-reduction on "(λ y (x y))", substituting "y1" for "y" ==>
2. ((λ x (λ y1 (x y1))) y) β-reduction, substituting "y" for "x" ==>
3. (λ y1 (y y1))
The result contains exactly one free variable, which is correct. (Note that "y" and "y1" count as entirely distinct variables, as different as "y" is from "x".)

Equivalent steps are taken automatically by the λ-reduction program:


It usually takes some practice to know when it is necessary to use an alphabetic variant. The safest strategy is to automatically apply α-reduction to every binding operator before each application of β-reduction. See if you can work out the right result for this form before clicking on the "Reduce" button; you will need three applications of β-reduction and at least one application of α-reduction:


The program uses the safe strategy, blindly performing α-reduction whenever a binding operator occurs inside a body. Experienced human λ-reducers, however, typically apply α-reduction only when it is absolutely necessary to avoid variable collision (it was necessary at least once in the example immediately above). For beginners, though, the best rule is: when in doubt, perform α-reduction!

(Warning:) I have modified the characterization of β-reduction and of α-reduction somewhat in the interests of simplicity. The simplified versions are perfectly sound, and provide the full expressive power of the λ-calculus. If you're interested in the traditional elaboration, consult either the Hankin book or Berendregt book mentioned above.


η-reduction)

η-reduction says that an expression of the form "(λ x (P x))" is guaranteed to be equivalent to "P" alone, where "P" is any expression (in which x does not occur free). This equivalence is obvious enough, but α-reduction and β-reduction alone do not guarantee it. However, few practical applications of λ-reduction bother to implement η-reduction. In particular, the programs given on this page do not:


Clicking on "Reduce" results in no change.


Examples and practice

Some examples (try to guess the result before clicking):






Looking at these reductions, it is easy to see how λ-reduction gets computational work done, since it is capable of copying, inserting, or discarding the information contained in the argument.

Some more linguistically-related practice. Once again, try writing down the anticipated result before clicking "Reduce":

















Now for something a little more difficult. Find an expression that you can use to replace "X" so that the result is "(John sleeps)" (you will have to type it into the input window in place of "X" to try it):



Obviously, one solution is "John". But another is "(λ P (John P))"--try it! (That is, delete "X" and type in "(lambda P (John P))", then click "Reduce".)

Here's a harder problem: If the input is "((D man) sleep)", and the result is "(∃ x ((man x) & (sleep x)))", what is "D"? (Hint: the answer is an approximation of the meaning of the determiner "some", as in "Some man sleeps". The symbol "∃" is supposed to show up as a backwards E, and is usually read "there exists"; in any case, the program expects you to type it in to the window as the symbol "exists".)


Order of combination versus order of reduction; convergence

It takes some practice to match up λ's with the correct argument. Does the following form evaluate to "(2 3)" or "(3 2)"?


The rule of thumb is that the outermost λ goes with the innermost argument. However, it is necessary to keep careful track of the nesting of parentheses. The following form looks very much like the last one, but the order of arguments is reversed:

Note that order of arguments depends entirely on the structure of the parentheses, and not on the order in which the β-reductions are carried out.

In fact, when a form contains more than one λ that can be reduced, it does not matter which one is reduced first, the result will be the same. (This is known as the Church-Rosser property, or, informally, as the diamond property---see Hankin or Berendregt for lengthy discussion.) For instance, for the last example, we could either apply β-reduction to the innermost λ first, thus:

((λ x ((λ y (x y)) 2)) 3) β-reduction on "((λ y (x y)) 2)", substituting "2" for "y" in the body "(x y)" ==>
((λ x (x 2)) 3) β-reduction, substituting "3" for "x" in the body "(x 2)" ==>
(3 2)
Or we could begin with the outermost λ:
((λ x ((λ y (x y)) 2)) 3) β-reduction on "((λ x ((λ y (x y)) 2)) 3)", substituting "3" for "x" in the body "((λ y (x y)) 2)" ==>
((λ y (3 y)) 2) β-reduction, substituting "2" for "y" in the body "(3 y)" ==>
(3 2)
In other words, no matter what order the λ's are reduced, we always converge on the same simplified result.

That is, when the reduction converges at all---some expressions can't be simplified (e.g., "((λ x (x x))(λ x (x x)))", which reduces to itself, or more pathological examples, such as this (DON'T CLICK ON "REDUCE"!):


For this expression, each so-called reducution makes the expression longer, ad infinitum (try it, on paper!). If you try to evaluate that expression in the browser program, you may crash your browser; My browser enters an infinite loop, and I have to kill the browser.

Various alternative notational conventions

I have been writing predicates and their arguments inside parentheses, thus: "(pred arg)". Often in logic and linguistics, such an expression would be written "pred(arg)", for instance, "f(x)" (as in "f(x) = y"). I have chosen the first, more LISP-like notation because it significantly simplifies the design of the program for automating λ-reduction.

Often, instead of "(λ x (λ y (x y)))", people write "λ x y . x y". For some reason, these people hate to draw parentheses, and they hate to draw λ's. The dot shows where the list of bound variables stops and the (innermost) body begins. Because the scope of the λ operator(s) is not explicitly marked, sloppy use of this abbreviatory convention often leads to expressions that are ambiguous; this in turn often causes beginners tremendous frustration.


The semantics of the λ-operator

The complexities of λ-reduction arise from trying to provide a set of syntactic transformations that are guaranteed to preserve meaning. That is, two expressions that are related by some combination of α, β, and η reduction are guaranteed to mean the same thing. In fact, although λ-reduction is sound, it does not capture the full meaning of the λ-operator.

A simpler, more direct, and more complete way to understand λ is to consider its meaning.

The following rule completely characterizes the meaning of the λ-operator. First, let "[*]^g" mean "the denotation of `*' with respect to the assignment function g", where an assignment function is a function mapping variable symbols onto values, and g[u/v] is that assignment function exactly like g except that it maps the variable v onto the value u.

The semantics of lambda.

The paraphrase, then, is as follows: the meaning of a λ-form that binds the variable v when applied to an argument a is the meaning of the body b evaluated with respect to an assignment function that sets the value of v equal to the denotation of a.

Scheme reference code

You're welcome to view the JavaScript program that runs on this page (you can find it here), which constitutes a kind of definition of λ-reduction. It may be helpful, however, to have a more compact reference implementation, written in R5RS Scheme:

(define (reduce f)                                                          ; 1
  ((lambda (value) (if (equal? value f) f (reduce value)))                  ; 2
   (let r ((f f) (g ()))                                                    ; 3
     (cond ((not (pair? f))                                                 ; 4
	    (if (null? g) f (if (eq? f (car g)) (cadr g) (r f (caddr g))))) ; 5
	   ((and (pair? (car f)) (= 2 (length f)) (eq? 'lambda (caar f)))   ; 6
	    (r (caddar f) (list (cadar f) (r (cadr f) g) g)))               ; 7
	   ((and (not (null? g)) (= 3 (length f)) (eq? 'lambda (car f)))    ; 8
	    (cons 'lambda (r (cdr f) (list (cadr f) (delay (cadr f)) g))))  ; 9
	   (else (map (lambda (x) (r x g)) f))))))                          ;10
If you have a Scheme interpreter, you can call the function like this:
    (reduce '(((lambda x (lambda y (x y))) 2) 3))
    ;Value: (2 3)

    (reduce '((lambda x (lambda y (x y))) 2))
    ;Value: (lambda #[promise 2] (2 #[promise 2]))
Comments: f is the form to be evaluated, and g is the local assignment function; g has the structure (variable value g2), where g2 contains the rest of the assignments. The named let function r executes one pass through a form. The arguments to r are a form f, and an assignment function g. Line 2: continue to process the form until there are no more conversions left. Line 4 (substitution): If f is atomic [or if it is a promise], check to see if matches any variable in g and if so replace it with the new value. Line 6 (beta reduction): if f has the form ((lambda variable body) argument), it is a lambda form being applied to an argument, so perform lambda conversion. Remember to evaluate the argument too! Line 8 (alpha reduction): if f has the form (lambda variable body), replace the variable and its free occurences in the body with a unique object to prevent accidental variable collision. [In this implementation a unique object is constructed by building a promise. Note that the identity of the original variable can be recovered if you ever care by forcing the promise.] Line 10: recurse down the subparts of f.

Combinatory Logic