F# Web Application on AWS Lambda

I recently built a small F# web app which runs on AWS Lambda. In this post I look at some of the libraries, tooling, and tips which can help you build on the same platform. I used Giraffe (an F# layer on top of ASP.NET Core) as the web framework and DynamoDB as the database. AWS API Gateway receives the web request and hands it off to the Lambda for processing.

The Application

A local bar publishes its beer tap list as XML on its website. I have some ideas for using this data, but the XML file is quite large because it contains an entry for every beer the bar has ever had on tap. Rather than have my application hit this file directly many times a day, I thought it better to scrape it daily and serve the result over my own API. I built a small web app running on Lambda to do this.

The app has a single RESTful endpoint /tap:

  • POST Scrapes the bar’s website and stores the updated tap list in the database.
  • GET Returns a list of what is being poured from the database.

AWS Lambda Dotnet Templates

AWS produces a couple of dotnet extensions which make developing Lambda applications much easier.

dotnet new -i Amazon.Lambda.Templates::*

This installs over a dozen Lambda and Serverless application templates. The Lambda templates react to a Lambda event such as a file change in an S3 bucket or a message published to an SQS queue. These look good, but aren’t much help in building a web-facing API. The Serverless templates are closer to a typical web application served from a Lambda. There’s a template for plain ASP.NET Core applications, and also one for an F# Giraffe application. That is the one I used, creating a new project with the following:

dotnet new serverless.Giraffe

Out of the box this gives you a simple hello world application hosted on AWS Lambda. More helpfully, it includes a default deployment configuration for the AWS Serverless Application Model (SAM, think CloudFormation for serverless apps). Combined with the AWS Lambda Tools dotnet extension (dotnet tool install -g Amazon.Lambda.Tools), your project can be deployed with a single command.

dotnet lambda deploy-serverless

Running that automatically creates a CloudFormation stack (including the Lambda, API Gateway, and associated configuration) and prints the URL where you can access your application. In just 5 minutes you’ve got a hello world Giraffe website hosted on Lambda.

Types

Here I use a Type Provider to automatically define the type of the data in the tap list XML file. The Beer and TapList types reflect what I’ll be storing in DynamoDB. The [<HashKey>] attribute is from the DynamoDB package and marks the hash (partition) key of the table.

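open System
open FSharp.Data
open FSharp.AWS.DynamoDB

// Types for the XML are inferred from a sample file at compile time.
type BottleList = XmlProvider<"resources/taplist.xml">

type Beer = {
    Name: string
    Volume: string
    Price: string
    ABV: string
    Country: string
    Description: string
}

type TapList = {
    [<HashKey>]
    AddedOn: DateTimeOffset
    TTL: int64 // expiry time in seconds since epoch
    Beer: Beer[]
}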

DynamoDB

I’m using the excellent FSharp.AWS.DynamoDB project to interact with the database. Creating and connecting to a table is as easy as

let client = new AmazonDynamoDBClient()
let table = TableContext.Create<TapList>(client, tableName = "levelh", createIfNotExists = true)

I use the Dynamo TTL (time to live) feature in order to automatically delete old data from the database. The TapList entity defines an attribute TTL which stores a timestamp (seconds since epoch). I update the tap list every 24 hours (more on this later), so every time I create a new TapList entity I set the TTL value to be 48 hours in the future. DynamoDB looks at values in this column and will automatically delete any rows which reach their time to live limit. I enabled this feature manually through the AWS Console. Items deleted through the time to live feature don’t use any of the table write capacity.
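The TTL arithmetic can be sketched on its own. This is a minimal example using only the .NET base library; the function names ttlFor and isExpired are hypothetical and not part of the project. Since expired items are removed by a background process some time after the timestamp passes, a check like isExpired could be used to ignore items that are past their TTL but still present.

```fsharp
open System

/// Epoch-seconds timestamp `hours` after `addedOn`, the format the TTL attribute stores.
let ttlFor (hours: float) (addedOn: DateTimeOffset) : int64 =
    addedOn.AddHours(hours).ToUnixTimeSeconds()

/// True once the TTL timestamp has been reached.
let isExpired (now: DateTimeOffset) (ttl: int64) : bool =
    now.ToUnixTimeSeconds() >= ttl

// Example: an item added at a fixed instant, expiring 48 hours later.
let addedOn = DateTimeOffset(2020, 1, 1, 0, 0, 0, TimeSpan.Zero)
let ttl = ttlFor 48.0 addedOn

printfn "Expired after 24h: %b" (isExpired (addedOn.AddHours 24.0) ttl)
printfn "Expired after 48h: %b" (isExpired (addedOn.AddHours 48.0) ttl)
```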

I couldn’t get my head around the FSharp.AWS.DynamoDB query expression interface, so I set up a rather naive way to retrieve the latest data. Instead of fetching the single most recent item from the table, I fetch all items and sort them client side. DynamoDB 101 is to never use the Scan operation, but since my table will only ever contain a few items (older ones are automatically deleted through TTL) it doesn’t really matter.

Below is the entire HTTP handler for the GET /tap endpoint.

let latestTapListHandler: HttpHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->

        // Scan the whole table and pick the newest entry client side.
        let latestTapList =
            table.Scan()
            |> Array.toList
            |> List.sortByDescending (fun x -> x.AddedOn)
            |> List.tryHead

        match latestTapList with
        | Some tapList -> json tapList next ctx
        | None -> json (obj ()) next ctx // empty object when the table has no items
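The "pick the newest item" step can be isolated as a pure function, which makes it easy to test without DynamoDB. The sketch below is an assumption-laden illustration: TapListStub is a hypothetical stand-in for the TapList record, and tryLatest is not a function from the project. It uses Array.maxBy rather than a full sort, since only the newest element is needed.

```fsharp
open System

// Hypothetical, simplified stand-in for the TapList record.
type TapListStub = { AddedOn: DateTimeOffset; Beer: string[] }

/// Returns the most recently added item, or None when there are no items.
let tryLatest (items: TapListStub[]) : TapListStub option =
    if Array.isEmpty items then None
    else items |> Array.maxBy (fun x -> x.AddedOn) |> Some

// Example data standing in for the result of a table scan.
let day n = DateTimeOffset(2020, 1, n, 0, 0, 0, TimeSpan.Zero)
let scanned =
    [| { AddedOn = day 1; Beer = [| "Pilsner" |] }
       { AddedOn = day 3; Beer = [| "Stout" |] }
       { AddedOn = day 2; Beer = [| "IPA" |] } |]

match tryLatest scanned with
| Some latest -> printfn "Latest list added on %O" latest.AddedOn
| None -> printfn "No tap lists stored"
```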

Parsing XML

The code to update the tap list in DynamoDB is just as succinct. Getting the latest copy of the tap list takes just one line. In development I replaced this function with one that returns a local copy of the file.

let getTapData () =
    Http.RequestString "https://www.hashigozake.co.nz/taplist.xml"

Fetching and parsing the tap list into a strongly typed structure is just 2 lines of code! We then select only the beers that are on tap, convert them to our Beer type using xmlToBeer, and save them in Dynamo.

let updateTapListHandler: HttpHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->

        // Fetch on every request; a plain value would be evaluated only once per Lambda container.
        let parsedTapList = BottleList.Parse(getTapData ())

        let pouring =
            parsedTapList.Beers.Products
            |> Array.filter (fun x -> x.Name.String.IsSome) // get rid of any empty elements
            |> Array.filter (fun x -> x.Pouring.String = Some "Now") // skips entries with no Pouring value
            |> Array.map xmlToBeer

        let now = DateTimeOffset.Now
        let tapList = {
            Beer = pouring
            AddedOn = now
            TTL = now.AddHours(48.0).ToUnixTimeSeconds() // expire 48 hours from now
        }

        table.PutItem tapList |> ignore
        text "Beer list updated" next ctx
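The filtering step works on optional values, because the type provider cannot guarantee that every element is present. A minimal sketch of that step on plain records follows; RawProduct and onTapNow are hypothetical names, and the real element types are generated by the type provider from the sample XML. Comparing the whole option against Some "Now" means an entry without a Pouring value is simply filtered out, whereas reading .Value on a missing option would throw.

```fsharp
// Hypothetical stand-in for one product entry from the XML.
type RawProduct = { Name: string option; Pouring: string option }

/// Keeps named products whose Pouring value is exactly "Now".
let onTapNow (products: RawProduct[]) : RawProduct[] =
    products
    |> Array.filter (fun p -> p.Name.IsSome && p.Pouring = Some "Now")

// Example input: one beer on tap, one coming later, one empty element, one without Pouring.
let products =
    [| { Name = Some "Stout"; Pouring = Some "Now" }
       { Name = Some "IPA"; Pouring = Some "Next" }
       { Name = None; Pouring = Some "Now" }
       { Name = Some "Pilsner"; Pouring = None } |]

onTapNow products |> Array.iter (fun p -> printfn "%A" p.Name)
```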

URL Routing

Configuring routing in Giraffe is very elegant.

let webApp:HttpHandler =
    choose [
        route "/tap" >=> choose [
            GET >=> latestTapListHandler
            POST >=> updateTapListHandler
        ]

        setStatusCode 404 >=> text "Not Found" ]
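To see why this composes so neatly, it can help to look at a toy model of the combinators. The sketch below is not Giraffe's implementation: real Giraffe handlers are asynchronous and operate on an ASP.NET Core HttpContext, while this simplified version (with a hypothetical Ctx record) treats a handler as a function that either returns an updated context or None to let the next handler try.

```fsharp
// Toy model only; Ctx and these definitions are simplified assumptions, not Giraffe's types.
type Ctx = { Method: string; Path: string; Status: int; Body: string }
type Handler = Ctx -> Ctx option

/// Runs g only if f accepted the request.
let (>=>) (f: Handler) (g: Handler) : Handler = fun ctx -> Option.bind g (f ctx)

/// Tries each handler in order and returns the first result that is not None.
let choose (handlers: Handler list) : Handler =
    fun ctx -> List.tryPick (fun h -> h ctx) handlers

let route path : Handler = fun ctx -> if ctx.Path = path then Some ctx else None
let httpMethod m : Handler = fun ctx -> if ctx.Method = m then Some ctx else None
let GET = httpMethod "GET"
let POST = httpMethod "POST"
let text body : Handler = fun ctx -> Some { ctx with Body = body }
let setStatusCode code : Handler = fun ctx -> Some { ctx with Status = code }

// Same shape as the routing table above, with placeholder handlers.
let webApp : Handler =
    choose [
        route "/tap" >=> choose [
            GET >=> text "tap list"
            POST >=> text "Beer list updated"
        ]

        setStatusCode 404 >=> text "Not Found" ]

let request m p = { Method = m; Path = p; Status = 200; Body = "" }
printfn "%A" (webApp (request "POST" "/tap"))
```

In this model an unsupported method on /tap falls through the inner choose, so the outer choose moves on to the 404 handler.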

Calling the Lambda Periodically

With the application deployed I can POST to /tap and it will update the tap list. To automate this I set up a CloudWatch Event which runs periodically, in this case every 24 hours. CloudWatch Events typically pass the context of the event to the Lambda they invoke, for example a JSON blob representing a new file added to an S3 bucket or a message published on an SQS queue. We don’t really want to invoke our Lambda as such, since it’s running an HTTP server. We just want to do a periodic POST to our /tap endpoint. Because our Lambda sits behind an API Gateway, we need to pass the same JSON structure that API Gateway would pass to the Lambda if it received a POST to /tap. So I set up a scheduled CloudWatch Event, invoking the Lambda with the static JSON below.

{
  "body": "",
  "resource": "tap",
  "path": "/tap",
  "httpMethod": "POST",
  "isBase64Encoded": true,
  "queryStringParameters": {},
  "stageVariables": {},
  "requestContext": {
    "accountId": "123456789012",
    "resourceId": "123456",
    "stage": "Prod",
    "requestId": "c6af9ac6-7b61-11e6-9a41-93e8deadbeef",
    "requestTime": "09/Apr/2015:12:34:56 +0000",
    "requestTimeEpoch": 1428582896000,
    "identity": {
      "cognitoIdentityPoolId": null,
      "accountId": null,
      "cognitoIdentityId": null,
      "caller": null,
      "accessKey": null,
      "sourceIp": "127.0.0.1",
      "cognitoAuthenticationType": null,
      "cognitoAuthenticationProvider": null,
      "userArn": null,
      "userAgent": "Custom User Agent String",
      "user": null
    },
    "path": "/tap",
    "resourcePath": "tap",
    "httpMethod": "POST",
    "apiId": "FOO",
    "protocol": "HTTP/1.1"
  }
}

This JSON blob can also be used as a test event in the Lambda console to manually invoke specific endpoints.
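When testing several endpoints it can be handy to generate these events rather than edit them by hand. The following is a rough sketch using System.Text.Json; the function name proxyEvent is hypothetical, and it emits only a small subset of the fields shown above. Whether a reduced event is enough for a given Lambda is an assumption to verify, so keep the requestContext block if your function relies on it.

```fsharp
open System.Collections.Generic
open System.Text.Json

/// Builds a reduced API Gateway proxy-style event as a JSON string.
/// Assumption: only a subset of the fields from the full event is included.
let proxyEvent (httpMethod: string) (path: string) : string =
    let payload = Dictionary<string, obj>()
    payload.["body"] <- box ""
    payload.["path"] <- box path
    payload.["httpMethod"] <- box httpMethod
    payload.["isBase64Encoded"] <- box false // the empty body is not base64 encoded
    JsonSerializer.Serialize(payload)

// Example: one event per endpoint of the app.
printfn "%s" (proxyEvent "POST" "/tap")
printfn "%s" (proxyEvent "GET" "/tap")
```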

Summary

That’s all it takes. The dotnet templating tools set up the AWS stack and a hello world F# Giraffe application. Add a few type definitions and ~30 lines of logic for the HTTP handlers and you’ve got some powerful functionality. Deployment is a single command and there are no servers or software to maintain.

The project source can be viewed on GitHub.

