Implementing the Lambda Calculus

This article is part-02.1 of the series A Quick Taste of Programming Languages in my blog.

The series is tend to discuss some basic concepts in PL theory and their implementation from a beginner’s perspective. I use Scala code for demonstrations since I’m currently learning it.

The full list of code in this article can be found here. It’s released under the MIT license.

Building a world, with empty hands #

For me, the most enjoyable part in learning programming languages is to witness how a huge and complex system can be built from simple rules. Here we are going to reproduce this magical process.

Calculation process can be described in many different ways, but for now we will focus on the lambda calculus, a computational model introduced by Alonzo Church in the 1930s. Although the lambda calculus is originally invented for formalizing the concept of effective computability, i.e., determining which problems can be solved, it has now became the cornerstone of all functional programming languages.

As we will see, with several basic terms and a single evaluation (reduction) rule, the lambda calculus is capable of representing nearly all features and elements in modern programming languages. In fact, radical functional programming languages like Haskell insist to be pure, which means that they can be fully reduced to lambda calculus.

In this article, we will implement (yet another) lambda calculus interpreter. You can regard the lambda calculus itself as a tiny but universal programming language, which is is much more powerful than languages we defined in the introduction chapter.

Definition #

The wizard’s trick is surprisingly simple: lambda calculus is all about defining functions and specifying how functions are applied.

Terms in lambda calculus can be defined in context-free grammar recursively as follows: $$ \begin{align*} expression\ &:=\ name |\ function |\ application \newline function\ &:=\ λ\ name.expression \newline application\ &:=\ expression \ expression \end{align*} $$ We can easily define them in Scala:

abstract class lambda_Expr 
case class lambda_Name(n: String) extends lambda_Expr
case class lambda_Function(head: String, body: lambda_Expr) extends lambda_Expr
case class lambda_Application(e1: lambda_Expr, e2: lambda_Expr) extends lambda_Expr

Everything in lambda calculus is an expression. A “name” (also called a “variable”), is a symbol used for specifying elements when a function is applied. It doesn’t have further meaning. So don’t regard it as a “container” holding some value.

Here is a simple function: $$ λx.x $$ The name “x” after the “λ” specifies the argument of this function (all functions in lambda calculus only take single argument) and is called the “head” of the function. The expression followed the dot, which is also an “x” in this case is the “body” of the function.

Applying a function to an expression is called an “application” in lambda calculus. Here is an example: $$ (λx.x)(λy.y) $$ We pass an expression $(λy.y)$ as argument to the function $(λx.x)$. You may already notice that the function $(λx.x)$ is an identity function that returns exactly whatever you pass to it. So the result of this application is: $$ (λx.x)(λy.y)=(λy.y) $$ For now our examples are extremely simple, but something noteworthy have already shown up:

Functions in lambda calculus are “pure”, which means that given the same inputs, a function will always return the same outputs, regardless of the context. This is not true to imperative languages like C/C++.
Functions can be passed as arguments and returned as results in lambda calculus, which is also not true to languages like C.
Though we haven’t formally introduced the way to evaluate lambda expressions, you can find that in lambda calculus, transformations of expression only happen at function applications.

All programs in lambda calculus are single expressions. An expression can be constructed in a arbitrarily complex way. But as long as it is legal, a result is always guaranteed.

Evaluation #

To evaluate (or reduce) a lambda expression, all we have to do is repeating the process of function application until we reach a point, where no more applicable function can be found. The central work of function application is, in the function body, substituting all names bound by the function head with the actual argument passing to the function. In order to implement the substitution method, we need to figure the binding relationship of names first.

Binding relationship #

Consider the following example: $$ λx.(xy) $$ The function body is an application $(xy)$. For this function, we say the name $x$ is a “bound variable” bound by $λx$, and $y$ is a “free variable” since it is not bound by any $λ$.

Notice that binding relationship is a relative concept. For the application $(xy)$, both $x$ and $y$ are free. For the function $λx.λy.(xy)$, both $x$ and $y$ are bound. We can’t determine a name is bound or free without a giving lambda expression.

Here is another example: $$ λx.(x(λx.x)) $$ The function body is an application $(x(λx.x))$. Here, we say the first $x$ is bound by the first $λ$, and the second $x$ is bound by the second $λ$ but not the first one.

Formally speaking, a $name$ is free in an given expression if one of the following three cases holds:

$name$ is free in $name$.
$name$ is free in $λ\ name_1.exp$ if the identifier $name \neq name_1$ and $name$ is free in $exp$.
$name$ is free in $E_1E_2$ if $name$ is free in $E_1$ or if it is free in $E_2$.

These rules can be directly translated into Scala code:

def isFree(v: String, e: lambda_Expr): Boolean = e match {
  case lambda_Name(_) => true
  case lambda_Function(head, body) => (v != head) && isFree(v, body)
  case lambda_Application(e1, e2) => isFree(v, e1) || isFree(v, e2)
}

A $name$ is bound in an given expression if one of two cases holds:

$name$ is bound in $λ\ name_1.exp$ if the identifier $name = name_1$ or if $name$ is bound in $exp$.
$name$ is bound in $E_1E_2$ if $name$ is bound in $E_1$ or if it is bound in $E_2$.

Implement it in Scala:

def isBound(v: String, e: lambda_Expr): Boolean = e match {
  case lambda_Name(_) => false
  case lambda_Function(head, body) => (v == head) || isBound(v, body)
  case lambda_Application(e1, e2) => isBound(v, e1) || isBound(v, e2)
}

Notice that a name can be both bound and free in the same expression at the same time. For example, $x$ is bound and also free in the application $(x (λx.x))$.

Alpha equivalence #

If two functions always return the same results when they are fed same inputs, we say these two functions are equivalent.

In lambda calculus, it’s obvious that changing the head and variables bound by the head to a different free variable in the body will not influence the function’s behavior. Here are some examples of equivalent functions (we use “$\equiv$” to denote equivalent relation): $$ (λx.x) \equiv (λy.y) \equiv (λt.t) \newline λx.(x(λx.x)) \equiv λa.(a(λx.x)) \equiv λa.(a(λb.b)) $$ Such equivalence between lambda terms is called alpha equivalence.

Beta reduction #

The evaluation rule of function application is to substitute the input expression for all instances of variables bound by the function head. When the substitution step is done, the function’s head is eliminated, leaving the function body as the result. This process is called beta reduction.

There are some examples of beta reduction (we use “$=$” to denote reduction step): $$ (λx.x)y = y \newline (λx.(xx))y = (yy) \newline (λx.(x(λx.x)))y = (y(λx.x)) $$ With that, we can implement an evaluator for natural semantics of the lambda calculus, which means the input expression doesn’t contains any free variables:

def closed_eval(e: lambda_Expr): lambda_Expr = e match {
  case lambda_Name(_) => throw new RuntimeException("input term isn't closed.")
  case lambda_Function(_, _) => e
  case lambda_Application(e1, e2) => {
    val func = closed_eval(e1) match // any better way to avoid using asInstanceOf[lambda_Function]?
      case lambda_Function(head, body) => lambda_Function(head, body)
    val arg = closed_eval(e2)
    closed_eval(substitution(func.head, arg, func.body))
  }
}

The substitution function can be implemented like:

def substitution(from: String, to: lambda_Expr, in: lambda_Expr): lambda_Expr = {
  if isFree(from, in) then {
    in match {
      case lambda_Name(n) => if n == from then to else in
      case lambda_Function(head, body) => lambda_Function(head, substitution(from, to, body))
      case lambda_Application(e1, e2) => lambda_Application(substitution(from, to, e1), substitution(from, to, e2))
    }
  }
  else in
}

Above evaluator is quite limited since it doesn’t evaluate function body. For example, it cannot reduce inputs like $λx.((λy.y)x)$, which could have been further reduced to $λx.x$.

In order to evaluate function body, the evaluator must be able to handle inputs with free variables. This introduces a subtle problem. Consider the following example: $$ (λx.(λy.xy))y $$ To reduce this expression, simply substitute the $x$ inside the function body with $y$ will produce the result $λy.yy$, which is incorrect because the input $y$ shouldn’t be bound by $λy$. The correct way to tackle this problem is using alpha equivalence to rename variables inside the function which have the same name with input free variables before substitution: $$ (λx.(λy.xy))y = (λx.(λt.xt))y = λt.yt $$ Here is the implementation of the rename() method, which takes the function and the argument expression as input, and rename the function if needed:

def rename(e1: lambda_Expr, e2: lambda_Expr): lambda_Expr = {
  def rec(e: lambda_Expr, name: String): lambda_Expr = e match {
    case lambda_Name(_) => e
    case lambda_Function(head, body) => 
      if head == name then rec(alpha_equiv(lambda_Function(head, body)), name) else lambda_Function(head, rec(body, name))
    case lambda_Application(e1, e2) => lambda_Application(rec(e1, name), rec(e2, name))
  }
  def iter(tmp: lambda_Expr, fl: List[String]): lambda_Expr = {
    if fl.isEmpty then tmp
    else iter(rec(tmp, fl.head), fl.tail)
  }
  iter(e1, freeList(e2))
}

The alpha_equiv() method takes a function and change its binding name to a unique new name:

def alpha_equiv(f: lambda_Function): lambda_Function = {
  val unique: String = {
    val nl = nameList(f)
    def squote(s: String) = s + "'"
    def find_unique(n: String): String = if nl.contains(n) then find_unique(squote(n)) else n
    find_unique(squote(f.head))
  }
  def rec(e: lambda_Expr): lambda_Expr = {
    if isFree(f.head, e) then e match {
      case lambda_Name(n) => if n == f.head then lambda_Name(unique) else e
      case lambda_Function(head, body) => lambda_Function(head, rec(body))
      case lambda_Application(e1, e2) => lambda_Application(rec(e1), rec(e2))
    }
    else e
  }
  lambda_Function(unique, rec(f.body))
}

The freeList() method and the nameList() method returns all free variables and all variables in an giving expression respectively. Their implementation can be found in the code list.

A complete beta reduction method can then be implemented:

def beta_reduc(e: lambda_Expr): lambda_Expr = e match {
  case lambda_Name(_) => e
  case lambda_Function(_, _) => e
  case lambda_Application(e1, e2) => {
    val func = rename(e1, e2) match 
      case lambda_Function(head, body) => lambda_Function(head, body)
    substitution(func.head, e2, func.body)
  }
}

Evaluation order #

We need a driver to let the beta_reduc() method reduce the input expression repeatedly until we reach a stuck point, where no more reduction can be made. There are different strategies to evaluate lambda expressions. One is called “call-by-value”, which means to evaluate the input argument before it is passed to a function. The other is called “call-by-name”, which means to pass the argument without evaluating it.

We will leave the discussion of the pros and cons of these strategies to future articles. For now, the only thing we should know is the evaluation order doesn’t effect the resulting value of a lambda expression.

This is a call-by-value driver to fully reduce the input expression:

def cbv_driver(e: lambda_Expr): lambda_Expr = e match {
  case lambda_Name(_) => e
  case lambda_Function(head, body) => if isStucked(body) then e else lambda_Function(head, cbv_driver(body))
  case lambda_Application(e1, e2) => {
    val sub1 = if isStucked(e1) then e1 else cbv_driver(e1)
    val sub2 = if isStucked(e2) then e2 else cbv_driver(e2)
    val tmp = lambda_Application(sub1, sub2)
    if isStucked(tmp) then tmp
      else cbv_driver(beta_reduc(tmp))
  }
}

The isStucked() method determines if a given expression can be further reduced:

def isStucked(e: lambda_Expr): Boolean = e match {
  case lambda_Name(_) => true
  case lambda_Function(_, body) => isStucked(body)
  case lambda_Application(e1, e2) => (e1 match {
    case lambda_Name(_) => true
    case lambda_Function(_, _) => false
    case lambda_Application(_, _) => isStucked(e1) 
  }) && isStucked(e2)
}

With this driver, you are already able to evaluate any lambda expressions you want (as long as they’re terminated). Feel free to play with it if you wish!

Example (with sbt console):

scala> val identity = lambda_Function("x", lambda_Name("x"))
scala> val expr1 = lambda_Application(lambda_Function("y", lambda_Application(lambda_Function("y", lambda_Name("y")), lambda_Function("x", lambda_Application(lambda_Name("x"), lambda_Name("y"))))), identity)
scala> cbv_driver(expr1)
val res: lambda_Expr = (lambda x.(x(lambda x.x)))

Summary #

In this article, we have implemented a lambda calculus interpreter (evaluator), but we haven’t reveal the full power of the lambda calculus yet. For now we can only play with lambda expressions which seem not very meaningful. In the next post, we will explore how to implement primitive elements such as natural numbers, conditional test and recursion feature with lambda terms we already have.