Code Execution: Eval, Apply

In writing LParen, a Lisp interpreter, the single Nolan-esque mind-bending idea for me has been implementing code execution. I can write a function, which itself is just a data structure, and execute it in the interpreter as a general computation. Characters, symbols, and numbers which previously used to only mean something now do something.

Take these ASCII characters which claim to compute the fibonacci number sequence. It’s only by galactic chance that when these characters are passed into the LParen interpreter that a symbolic object fib is created which lets us compute those numbers.

(define fib 
  (lambda (n)
    (cond ((= n 0) 0)
          ((= n 1) 1)
          (true (+ (fib (- n 1))
                   (fib (- n 2)))))))                   
>> (fib 10)
55

Most Lisps implement code execute through an abstraction known as evaluate and apply. Every byte of code passed into the interpreter runs through eval. When eval encounters a call to a user-defined function (fib above) then it calls apply.

Ancient inscription discovered in a Delphic Temple dated 438 BCE

Ancient inscription discovered in a Delphic Temple dated 438 BCE

This post describes how I implemented eval and apply for LParen. We also look at lexical environments in which functions execute.

This is Part IV in a series of blogs on building a programming language. I’m reading Structure and Interpretation of Computer Programs and making the language based on code samples in the book. Read the first three posts:

Link to this section The Language

LParen is written in F#. The language is defined from this single type Atom.

type Atom =
    | Integer of int
    | Symbol of string
    | Boolean of bool
    | List of Atom list
    | Lambda of Lambda
and
    Lambda = {
        Parameters: Atom list
        Body: Atom
        Environment: Environment
    }
and 
    Environment = {
        Symbols: Dictionary<string, Atom>
        Parent: Environment option
    }

Link to this section Eval

eval is a recursive function which evaluates an Atom in an environment to produce a new Atom. Put formally, it has the type signature:

type Eval = Environment -> Atom -> Atom

eval is recursively applied to every Atom in the source file. I’ve implemented eval as a function containing a single F# match expression. This expression acts as a type-safe dispatcher to ensure all Atoms can be evaluated.

let rec eval: Eval = fun (environment: Environment) (exp: Atom) ->

    match exp with
    | Integer x -> Atom.Integer x
    | Boolean x -> Atom.Boolean x
    | List [Symbol "define"; firstArg; secondArg] -> define firstArg secondArg environment eval
    | List [Symbol "quote"; atom] -> quote atom
    | List [Symbol "lambda"; parameters; body] -> lambda parameters body environment
    | List [Symbol "="; firstArg; secondArg] -> atomEquality firstArg secondArg
    ...

Primitives such as integers and booleans evaluate to themselves. The language predefines a number of functions and Scheme special forms such as define, quote, lambda, etc. These are specifically pattern matched within eval. These branches must also return an Atom.

= is one of the predefined functions – there’s an explicit branch for it in eval, and the logic is implemented in F#. When the evaluator encounters the = function it does an equality comparison and returns an Atom.Boolean.

let atomEquality (x: Atom) (y:Atom) =
    match (x, y) with
    | (Integer x, Integer y) -> if x = y then Atom.Boolean true else Atom.Boolean false
    | _ -> failwith "= only supports integers"

That’s very well, but it’s not general computation. We want the language to be able to define its own functions such as the fibonacci generator.

(define fib 
  (lambda (n)
    (cond ((= n 0) 0)
          ((= n 1) 1)
          (true (+ (fib (- n 1))
                   (fib (- n 2)))))))

Like all code that’s executed by the interpreter, the above block runs through eval. Because functions are just data structures there’s no magic going on.

define associates a symbol with a value.

let define (symbol: Atom) (value: Atom) (environment: Environment) (eval: Eval): Atom =
        
    let symbolName =
        match symbol with
        | Symbol s -> s
        | _ -> failwith $"define expects the symbol name to be an Atom.Symbol. Instead got value: {firstArg}"
        
    let value = secondArg   // 100
    environment.Symbols[symbolName] <- eval value environment
    symbol

In this case the value is a lambda. When the lambda block is executed it returns an Atom.Lambda. Functions are just data structures with a list of parameters, a body of type Atom, and an environment to be executed in.

let lambda (parameters: Atom) (body: Atom) (environment: Environment) =
    
    let parsedParameters =
        match parameters with
        | List atoms -> atoms
        | _ -> failwith $"Expected lambda parameters to be a list of symbols"
    
    Atom.Lambda {
        Parameters = parsedParameters
        Body = body
        Environment = environment
    }

So far I’ve demonstrated how the evaluator works on primitive values and the set of predefined functions. Symbols such as =, define, and lambda are hard-coded into the interpreter. Executing the fib definition block updated the environment with a symbol fib associated with a lambda value. When we evaluate code (fib 10) the evaluator must know how to call the function. It does this using apply.

Link to this section Apply

Functions are called through the special form apply.

>> (apply fib 10)
55

Calling functions is so common in Lisp that there’s a convention that unquoted lists get treated as function calls.

>> (fib 10)
55

In practice, the above call runs through the same execution codepath as apply. Like =, define, and lambda, apply is a dispatched within the evaluator’s match statement.

let rec eval: Eval = fun (environment: Environment) (exp: Atom) ->

    match exp with
    | Integer x -> Atom.Integer x
    | Boolean x -> Atom.Boolean x
    | List [Symbol "define"; firstArg; secondArg] -> define firstArg secondArg environment eval
    | List [Symbol "quote"; atom] -> quote atom
    | List [Symbol "lambda"; parameters; body] -> lambda parameters body environment
    | List [Symbol "="; firstArg; secondArg] -> atomEquality firstArg secondArg
    | List x when x.Head = Symbol "apply" -> applyForm x.Tail eval environment // explicit (apply fib 10)
    ...
    | List x -> applyForm x eval environment // implicit (fib 10)
    ...

Apply has a touch of alchemy to it.

let applyForm (parameters: Atom list) (eval: Eval) (environment: Environment): Atom =
    
    let symbolOfCallable = parameters.Head
    let argsOfCallable = parameters.Tail
    
    // find the lambda we predefined lambda
    let callable =
        match find symbolOfCallable environment with
        | Some symbol -> symbol
        | None -> failwith $"Could not locate callable symbol {symbolOfCallable}"
    
    // evaluate the arguments being passed into the function call, these may themselves be lambda calls    
    let evaluatedArgs =
        argsOfCallable
        |> List.map (fun atom -> eval environment atom)
    
    // when executing a lambda we create its own environment (execution context) so that
    // symbols local to the lambda don't pollute anything higher in the stack.
    // we assign the keys of the lambda's parameter symbols with the evaluatedArgs passed
    // in the function call. we match these by order then call the function itself
    let lambdaEnvironment = createEnvironmentWithParent environment
    
    match callable with
    | Lambda lambda ->
        
        List.zip lambda.Parameters evaluatedArgs
        |> List.iter (fun (parameter, value) ->
            let parameterName =
                match parameter with      // add
                | Symbol s -> s
                | _ -> failwith $"Lambda parameter Symbols are expected to be Atom.Symbol. Instead got: {parameter} = {value}"
            lambdaEnvironment.Symbols[parameterName] <- value)
        
        eval lambdaEnvironment lambda.Body
    | _ -> failwith $"Symbol {symbolOfCallable} is not callable"

The first few statements make sure the symbol is a callable Lambda, evaluate the arguments that are being passed to it, and setup the execution environment with appropriately named symbols. Finally the body of the Lambda itself is run through eval. That data structure, the source code of the lambda, is being brought to life. It feels like it shouldn’t work. But since eval can evaluate:

  1. Primitive values - Integers, booleans, etc.
  2. The pre-defined set of special forms and functions - =, define, lambda, …
  3. User-defined functions

And the set of those 3 are exactly what makes up fib or any other user-defined function. It works.

Link to this section Lexical Environments

Environments bind symbols to values. (define x 100) writes to the environment and (+ x 10) reads from it.

type Environment = {
    Symbols: Dictionary<string, Atom>
    Parent: Environment option
}

When I first implemented the language there was a single global environment. This meant that a symbol could only have one value at any point in time. When I added function execution things got a bit funny. Recursive functions did not work.

(define fib 
  (lambda (n)
    (cond ((= n 0) 0)
          ((= n 1) 1)
          (true (+ (fib (- n 1))
                   (fib (- n 2)))))))

We read that computation and see there is only one variable symbol – n. It’s not intuitive that there are actually many n’s, each holding different values at different levels of the recursion.

Today, nearly all programming languages use lexical scoping. This binds a symbol to the code block it was defined in. Implementing this was straight forward. When applyForm is executing it creates a new environment in which to bind symbols. It also contains a pointer to the parent environment so that it’s possible to access symbols defined at a higher level.

As an experiment I thought about how to execute recursive code if there was only a single global environment. The interpreter would need to dynamically re-write the function, turning n into a unique n-at-recursion-level-{x} symbol. That turns it into some automatic memoizer, a technique commonly used today in recursive algorithms.

Link to this section Conclusion

In 2022 I read George Dyson’s Darwin Among the Machines and Turing’s Cathedral. Both tell the brilliant history of computing, in both theory and practice. On the mathematical theory side we learn the histories and insights folks such as Leibniz, Church, Godel, Turing made; and in implementation the work lead by John von Neumann and Julian Bigelow. These books inspired me to write my own computation engine which took the form of LParen – a Scheme-like language interpreter. This project predictably expanded by knowledge on the theory of computation, but unexpectedly that of abstractions. Recursion has been a common theme across this blog series, and it strikes once again with abstractions. In writing a programming language you must decide on the abstractions to provide the user so that they can implement their own abstractions in the language. This project has made me think in ways I just don’t come across in my day job.


Related Posts