A Simple Computation Engine in F#

I’m in one of those reflective periods where I’m catching up on books I should have read earlier; Thomas Pynchon, Neal Stephenson, and the like. In Software Engineering one of those books is Structure and Interpretation of Computer Programs (SICP). SICP is known for teaching computing from basic principles using the programming language Scheme, (a (type of (lisp))). I began programming using Javascript, PHP, and Java. The first Lisp I encountered was Clojure, and I found the language and paradigm to be radically different from anything I’d ever written. I wrote some hobby projects in Clojure, thought that I had reached that mythical enlightenment every programmer is supposed to have with Lisp, and then switched back to Python. In retrospect I see I had fooled myself into thinking I grokked Lisp. All I had done was port some Python programs into Clojure, preserving the Pythonic paradigm and even code layout. Now I’m coming back to Lisp because I think there’s more to learn.

I’m interested in the theory of computation – what can be solved by computers? How can we represent computations? and how should we think about it? Neal Stephenson’s Baroque Cycle is a historical fiction set in the decades surrounding the year 1680. We see members of the Royal Society working towards developing a Philosophical Language so that all concepts and ideas can be expressed concisely – finally man will no longer be misunderstood! We’re still waiting for that one. In this Philosophical Language there would be only one way to communicate an idea. Representing ideas in a single way is something we strive to do in Software Engineering. In large code bases we often see multiple implementations of the same abstraction, and use terms such as conciseness and correctness to evaluate the quality of the code we write. The Baroque Cycle also features Leibniz working on his theory of computation and continuing a Twitter-feud with Newton.

Structure and Interpretation of Computer Programs opens with:

We are about to study the idea of a computational process. Computational processes are abstract beings that inhabit computers. As they evolve, processes manipulate other abstract things called data. The evolution of a process is directed by a pattern of rules called a program. People create programs to direct processes.

Programmers are alchemists, lines of code are our metals, abstractions are our enchantments, and all we produce are dull and complex systems.

Basic Computation

To develop my understanding of Lisp and Computation I’m writing a programming language which can run the code samples in SICP. It’s intentionally not a Scheme interpreter, as it’s still the 1690s and the Scheme Standard hasn’t been written yet. Instead, I’ll use the concepts and code samples discussed in the book to determine how the language operates. In this way the language will evolve bottom up.

We start by writing a basic calculator, a computer program that can execute the predefined computations of addition and subtraction.

Pages 5 and 6 of SICP introduce the first code. The table below shows the values we want to be able to evaluate.

Input Output
486 486
(+ 137 439) 486
(+ (+ 3 5) (- 10 6 0)) 12

The main syntax of Lisp are S-Expressions. These are paranthesised expressions such as (+ (+ 3 5) (- 10 6)). All computations and data of the language are written in S-Expressions, giving the language a notion of simplicity. This isn’t like Rust which requires a cheatsheet of the language tattooed on your arm.

The entire language is represented by this Atom data type.

type Atom =
    | Integer of int
    | Symbol of string
    | List of Atom list

Parsing

To parse the language we’ll use the parser combinator library FParsec. We start by defining small parsers for our primitive expressions, then compose these into more complex parsers to build the full programming language.

Where Integer represents the primitive numbers, Symbol is either + or -, and List is the S-Expression made up of many Atom. Note List is a recursive definition so that S-Expressions may be embedded within each other.

Parser combinators make defining the logic for parsers very simple to understand.

let parseSymbol =
    (pstring "+" >>% Atom.Symbol "+") <|>
    (pstring "-" >>% Atom.Symbol "-")
    <?> "Could not parse symbol"

let parseInteger = pint32 |>> Atom.Integer

let parseList = 
    skipChar '(' >>. spaces >>. 
    (attempt (sepBy atomValue spaces1) <|> sepEndBy1 atomValue spaces1)
    .>> spaces .>> skipChar ')' 
    |>> Atom.List

let atomValue = choice [ 
    parseSymbol
    parseInteger
    parseList
]

Running the full language parsers over inputs we get the following values:

Input Parser Output
486 Integer 486
(+ 137 439) List [Symbol "+"; Integer 137; Integer 439]
(+ (+ 3 5) (- 10 6 0)) List [Symbol "+"; List [Symbol "+"; Integer 3; Integer 5]; List [Symbol "-"; Integer 10; Integer 6; Integer 0]]

Evaluation

The language gets parsed into a nested tree. S-Expressions may contain other S-Expressions which may contain other S-Expressions &c .

List [Symbol "+"; 
        List [Symbol "+"; Integer 3; Integer 5]; 
        List [Symbol "-"; Integer 10; Integer 6]
     ]

For simplicity and aesthetics, we’ll write our evaluator as a recursive function. The evaluator is applied to Atoms one at a time, and also returns an Atom.

Of note below is the List branch of Line 6. In this simple calculation language all S-Expressions are computations. When eval encounters an S-Expression it expects the first item in the list to be the operator (Symbol "+" or Symbol "-"), and the remaining items to be the operands which are to be added or subtracted. These operands aren’t necessarily going to be Atom.Integer, they may be an Atom.List! In which case we need to evaluate that sub expression before continuing with the mathematical operation, hence the recursive eval applied to every operand on Line 14.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
let rec eval(exp: Atom): Atom =

    match exp with
    | Integer x -> Atom.Integer x
    | Symbol x -> Atom.Symbol x
    | List x ->
        let operator =
            match x.Head with
            | Symbol "+" -> (+)
            | Symbol "-" -> (-)
            | _ -> failwith $"Operator {x.Head} unsupported"
            
        x.Tail
        |> List.map eval
        |> List.reduce operator

Recall that this calculator language is constructed by the Atom datatype.

type Atom =
    | Integer of int
    | Symbol of string
    | List of Atom list

F# as the host language knows how to add and subtract numbers, but it doesn’t know how to add and subtract Atoms. In the eval snippet above it will call F#’s built in + and - functions. We must define the computation logic for these operations.

let inline (+) (x: Atom) (y:Atom) =
    match (x, y) with
    | (Integer x, Integer y) -> Atom.Integer (x + y)
    | _ -> failwith "Add only supports Integers"

let inline (-) (x: Atom) (y:Atom) =
    match (x, y) with
    | (Integer x, Integer y) -> Atom.Integer (x - y)
    | _ -> failwith "Subtract only supports Integers"

Assemble

We have a parser which turns a string of characters into an expression tree. We have an evaluator which applies computations to a valid expression tree. Let’s unite them into a computer program we can execute.

while true do
    printf ">> " 
    let input = Console.ReadLine()
        
    parser input
    |> Result.map eval
    |> Result.map (fun s -> printfn $"{s}")
    |> Result.mapError failwith
    |> ignore        

Does it work? This wouldn’t be a particularly good blog if it didn’t.

logan@anon scripts % dotnet fsi 01-ints-and-symbols.fsx
>> 486
Integer 486
>> (+ 137 439)
Integer 576
>> (+ (+ 3 5) (- 10 6 0))
Integer 12

Full source code and an executable script is available here.

Close Paren

In those ~70 lines of source code we’ve defined a programming language that can execute arbitrary addition and subtraction routines. We can express simple math operations, but we can’t express our own abstractions. In Implementing General Computation we explore the idea of abstraction and implement functions.


Related Posts