I recently built a small F# web app which runs on AWS Lambda. In this post I look at some of the libraries, tooling, and tips which can help you build on the same platform. I used Giraffe (an F# layer on top of ASP.NET Core) as the web framework and DynamoDB as the database. AWS API Gateway receives the web request and hands it off to the Lambda for processing.
The Application
A local bar publishes their beer tap list in XML format on their website. I have some ideas for using this data, but the XML file is quite large as it contains an entry for every beer they’ve ever had on tap. Rather than have my application hit this file directly numerous times a day, I thought it better to scrape it daily and serve it over my own API. I built a small web app running on Lambda to do this.
The app has a single RESTful endpoint, /tap:
POST: scrapes the bar’s website and stores the updated tap list in the database.
GET: returns a list of what is currently being poured, read from the database.
AWS Lambda Dotnet Templates
AWS provides a couple of dotnet extensions which make developing Lambda applications much easier.
dotnet new -i Amazon.Lambda.Templates::*
This installs over a dozen Lambda and Serverless application templates. The Lambda templates react to a Lambda event, such as a file change in an S3 bucket or a message published to an SQS queue. These look good, but aren’t much help in building a web-facing API. The Serverless templates are closer to a typical web application served from a Lambda. There’s a template for plain ASP.NET Core applications, and also one for an F# Giraffe application, which is what I used. I created a new project with the following:
dotnet new serverless.Giraffe
Out of the box this gives you a simple hello world application hosted on AWS Lambda. More helpfully, it includes a default deployment configuration for the AWS Serverless Application Model (SAM, think CloudFormation for serverless apps). Combined with the AWS Lambda Tools dotnet extension (dotnet tool install -g Amazon.Lambda.Tools), your project can be deployed with a single command.
dotnet lambda deploy-serverless
Running that creates a CloudFormation stack (including the Lambda, API Gateway, and associated configuration) and prints a URL to the command line where you can access your application. In just five minutes you’ve got a hello world Giraffe website hosted on Lambda.
Types
Here I use a Type Provider to automatically define the type of the data in the tap list XML file. The Beer and TapList types reflect what I’ll be storing in DynamoDB. The [<HashKey>] annotation comes from the DynamoDB package and specifies the primary key of the table.
type BottleList = XmlProvider<"resources/taplist.xml">

type Beer = {
    Name: string
    Volume: string
    Price: string
    ABV: string
    Country: string
    Description: string
}

type TapList = {
    [<HashKey>]
    AddedOn: DateTimeOffset
    TTL: int64
    Beer: Beer[]
}
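With the provider in place, parsing a document gives strongly typed access to its contents. A minimal sketch, using the Beers.Products and Name.String members that appear in the update handler further down:

// Quick sketch of using the provided type; the member names come from the
// handler code later in this post and depend on the shape of the sample XML.
let sample = BottleList.Parse(System.IO.File.ReadAllText "resources/taplist.xml")
sample.Beers.Products
|> Array.choose (fun p -> p.Name.String) // keep only products with a name
|> Array.iter (printfn "%s")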
DynamoDB
I’m using the excellent FSharp.AWS.DynamoDB project to interact with the database. Creating and connecting to a table is as easy as
let client = new AmazonDynamoDBClient()
let table = TableContext.Create<TapList>(client, tableName = "levelh", createIfNotExists = true)
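Once the TableContext exists, reading and writing records is just as direct. A small sketch using the same two operations the handlers below rely on (PutItem and Scan):

// Write a record and read everything back; PutItem and Scan are the only
// operations this app needs.
let save (tapList: TapList) = table.PutItem tapList |> ignore
let loadAll () = table.Scan() |> Array.toList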
I use the DynamoDB TTL (time to live) feature to automatically delete old data from the database. The TapList entity defines a TTL attribute which stores a timestamp (seconds since the epoch). I update the tap list every 24 hours (more on this later), so every time I create a new TapList entity I set the TTL value to 48 hours in the future. DynamoDB watches this attribute and automatically deletes any rows that pass their time-to-live limit. I enabled the feature manually through the AWS Console. Items deleted through TTL don’t consume any of the table’s write capacity.
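For reference, the TTL attribute is nothing special, just the expiry time expressed as epoch seconds; the update handler below computes it like this:

// Expiry 48 hours from now, as seconds since the Unix epoch.
let ttl = DateTimeOffset.Now.AddHours(48.0).ToUnixTimeSeconds()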
I couldn’t get my head around the FSharp.AWS.DynamoDB query expression interface, so I set up a rather naive way to retrieve the latest data. Instead of fetching just the most recent row from the table, I fetch all rows and sort client side. DynamoDB 101 is to never use the Scan operation, but since my table will only ever contain a few rows (older ones are automatically deleted through TTL) it doesn’t really matter.
Below is the entire HttpHandler for the GET /tap endpoint.
let latestTapListHandler: HttpHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->
        let latestTapList =
            table.Scan()
            |> Array.toList
            |> List.sortByDescending (fun x -> x.AddedOn)
            |> List.tryHead
        match latestTapList with
        | Some tapList -> json tapList next ctx
        | None -> json (obj()) next ctx
Parsing XML
The code to update the tap list in DynamoDB is just as succinct. Getting the latest copy of the tap list is just one line. In development I replaced this function with one that returns a local copy of the file.
let getTapData =
    Http.RequestString "https://www.hashigozake.co.nz/taplist.xml"
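In development that swap can be as simple as reading the saved sample file the type provider already references, something like:

// Development stand-in: read the locally saved copy instead of hitting
// the bar's website on every run.
let getTapData =
    System.IO.File.ReadAllText "resources/taplist.xml"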
Fetching and parsing the tap list into a strongly typed structure takes just two lines of code! We then select only the beers that are on tap, convert them to our Beer type using xmlToBeer, and save them in Dynamo.
let updateTapListHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->
        let parsedTapList = BottleList.Parse(getTapData)
        let pouring =
            parsedTapList.Beers.Products
            |> Array.filter (fun x -> x.Name.String.IsSome) // get rid of any empty elements
            |> Array.filter (fun x -> x.Pouring.String.Value = "Now")
            |> Array.map xmlToBeer
        let tapList = {
            Beer = pouring
            AddedOn = DateTimeOffset.Now
            TTL = DateTimeOffset.Now.AddHours(48.0).ToUnixTimeSeconds()
        }
        table.PutItem tapList |> ignore
        text "Beer list updated" next ctx
URL Routing
Configuring routing in Giraffe is very elegant.
let webApp: HttpHandler =
    choose [
        route "/tap" >=> choose [
            GET  >=> latestTapListHandler
            POST >=> updateTapListHandler
        ]
        setStatusCode 404 >=> text "Not Found"
    ]
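The serverless.Giraffe template already generates the hosting plumbing that plugs webApp into ASP.NET Core (plus a Lambda entry point built on Amazon.Lambda.AspNetCoreServer), so there’s nothing to write here. Roughly, the Giraffe side of that wiring amounts to a sketch like this:

// A sketch of the standard Giraffe wiring; the Startup generated by the
// template differs in detail but does the same job.
open Giraffe
open Microsoft.AspNetCore.Builder
open Microsoft.Extensions.DependencyInjection

type Startup() =
    member _.ConfigureServices (services: IServiceCollection) =
        services.AddGiraffe() |> ignore
    member _.Configure (app: IApplicationBuilder) =
        app.UseGiraffe webApp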
Calling the Lambda Periodically
With the application deployed I can POST to /tap and it will update the tap list. To automate this I set up a CloudWatch Event which runs periodically, in this case every 24 hours. CloudWatch Events typically pass the context of the event to the Lambda they invoke, for example a JSON blob representing a new file added to an S3 bucket or a message published to an SQS queue. We don’t really want to invoke our Lambda as such - it’s running an HTTP server - we just want to make a periodic POST to our /tap endpoint. Because our Lambda sits behind an API Gateway, we need to pass the same JSON structure that API Gateway would pass to the Lambda if it received a POST to /tap. So I set up a scheduled CloudWatch Event that invokes the Lambda with the static JSON below.
{
  "body": "",
  "resource": "tap",
  "path": "/tap",
  "httpMethod": "POST",
  "isBase64Encoded": true,
  "queryStringParameters": {},
  "stageVariables": {},
  "requestContext": {
    "accountId": "123456789012",
    "resourceId": "123456",
    "stage": "Prod",
    "requestId": "c6af9ac6-7b61-11e6-9a41-93e8deadbeef",
    "requestTime": "09/Apr/2015:12:34:56 +0000",
    "requestTimeEpoch": 1428582896000,
    "identity": {
      "cognitoIdentityPoolId": null,
      "accountId": null,
      "cognitoIdentityId": null,
      "caller": null,
      "accessKey": null,
      "sourceIp": "127.0.0.1",
      "cognitoAuthenticationType": null,
      "cognitoAuthenticationProvider": null,
      "userArn": null,
      "userAgent": "Custom User Agent String",
      "user": null
    },
    "path": "/tap",
    "resourcePath": "tap",
    "httpMethod": "POST",
    "apiId": "FOO",
    "protocol": "HTTP/1.1"
  }
}
This JSON blob can also be used with the Lambda console’s test feature to manually invoke specific endpoints.
Summary
That’s all it takes. The dotnet templating tools set up the AWS stack and a hello world F# Giraffe application. Add a few type definitions and ~30 lines of logic for the HTTP handlers and you’ve got some powerful functionality. Deployment is a single command and there are no servers or software to maintain.
The project source can be viewed on GitHub.