F# Web Application on AWS Lambda

July 2019 · 6 minute read · tech fsharp linux dotnet aws dynamo

I recently built a small F# web app which runs on AWS Lambda. In this post I look at some of the libraries, tooling, and tips which can help you build on the same platform. I used Giraffe (an F# layer on top of ASP.NET Core) as the web framework and DynamoDB as the database. AWS API Gateway receives the web request and hands it off to the Lambda for processing.

The Application

A local bar publishes their beer tap list in XML format on their website. I have some ideas for using this data but found the XML file is quite large, as it contains an entry for every beer they’ve ever had on tap. Rather than have my application hit this XML file directly numerous times a day, I thought it better to scrape it daily and serve it over my own API. I built a small web app running on Lambda to do this.

The app has a single RESTful endpoint /tap:

  • POST Scrapes the bar’s website and stores the updated tap list in the database.
  • GET Returns a list of what is being poured from the database.

AWS Lambda Dotnet Templates

AWS produces a couple of dotnet extensions which make developing Lambda applications much easier.

dotnet new -i Amazon.Lambda.Templates::*

Installs over a dozen Lambda and Serverless application templates. The Lambda templates react to a Lambda event such as a file change in an S3 bucket or a message published to an SQS queue. These look good, but aren’t much help in building a web-facing API. The Serverless templates are closer to a typical web application served over a Lambda. There’s a template for plain ASP.NET Core applications, as well as one for an F# Giraffe application. The latter is what I used, creating a new project with:

dotnet new serverless.Giraffe

Out of the box this gives you a simple hello world application hosted on AWS Lambda. More helpfully it includes a default deployment configuration for the AWS Serverless Application Model (SAM, think Cloudformation for serverless apps). Combined with the AWS Lambda Tools dotnet extension (dotnet tool install -g Amazon.Lambda.Tools) your project can be deployed with a simple command.

dotnet lambda deploy-serverless

Running that automatically creates a Cloudformation stack (including the Lambda, API Gateway, and associated configuration), and writes a URL to the command line where you can access your application. In just 5 minutes you’ve got a hello world Giraffe website hosted on Lambda.

Types

Here I use a Type Provider to automatically derive a type for the data in the tap list XML file. The Beer and TapList types reflect what I’ll be storing in DynamoDB. The [<HashKey>] attribute comes from the FSharp.AWS.DynamoDB package and specifies the primary key of the table.

open System
open FSharp.Data
open FSharp.AWS.DynamoDB

type BottleList = XmlProvider<"resources/taplist.xml">

type Beer = {
	Name: string
	Volume: string
	Price: string
	ABV: string
	Country: string
	Description: string
}

type TapList = {
	[<HashKey>]
	AddedOn: DateTimeOffset
	TTL: int64
	Beer: Beer[]
}

DynamoDB

I’m using the excellent FSharp.AWS.DynamoDB project to interact with the database. Creating and connecting to a table is as easy as

open Amazon.DynamoDBv2

let client = new AmazonDynamoDBClient()
let table = TableContext.Create<TapList>(client, tableName = "levelh", createIfNotExists = true)

I use the DynamoDB TTL (time to live) feature to automatically delete old data from the database. The TapList entity defines a TTL attribute which stores a timestamp (seconds since epoch). I update the tap list every 24 hours (more on this later), so every time I create a new TapList entity I set the TTL value to 48 hours in the future. DynamoDB watches this attribute and automatically deletes any items which pass their time to live. I enabled this feature manually through the AWS Console. Items deleted through the time to live feature don’t use any of the table’s write capacity.
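
I flipped that switch by hand, but for reference the same thing can be done from code against the underlying AWS SDK client; a rough sketch (reusing the table and attribute names from above) might look like this:

open Amazon.DynamoDBv2
open Amazon.DynamoDBv2.Model

// Tell DynamoDB which attribute holds the expiry timestamp and turn the feature on.
let enableTtl (client: AmazonDynamoDBClient) =
    let spec = TimeToLiveSpecification(AttributeName = "TTL", Enabled = true)
    let request = UpdateTimeToLiveRequest(TableName = "levelh", TimeToLiveSpecification = spec)
    client.UpdateTimeToLiveAsync(request)
    |> Async.AwaitTask
    |> Async.RunSynchronously
    |> ignore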

I couldn’t get my head around the FSharp.AWS.DynamoDB query expression interface so I set up a rather naive way to retrieve the latest data. Instead of fetching the most recent single row from the table I fetch all rows and then sort client side. DynamoDB 101 is to never use the Scan operation, but since my table will only ever contain a few rows (old ones are automatically deleted through TTL) it doesn’t really matter.

Below is the entire Http Handler for the GET /tap endpoint.

let latestTapListHandler: HttpHandler =
	fun (next: HttpFunc) (ctx: HttpContext) ->

		let latestTapList =
			table.Scan()
				|> Array.toList
				|> List.sortByDescending (fun x -> x.AddedOn)
				|> List.tryHead

		match latestTapList with
			| Some tapList -> json tapList next ctx
			| None -> json (obj()) next ctx

Parsing XML

The code to update the tap list in DynamoDB is just as succinct. Getting the latest copy of the tap list is a single line. In development I replaced this function with one that returns a local copy of the file.

let getTapData =
	Http.RequestString "https://www.hashigozake.co.nz/taplist.xml"
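
The development stand-in mentioned above can be as simple as reading the checked-in sample file, reusing the same path the type provider points at:

let getTapData =
	System.IO.File.ReadAllText "resources/taplist.xml"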

Fetching and parsing the tap list into a strongly typed structure is just two lines of code! We then select only the beers that are on tap, convert them to our Beer type using xmlToBeer, and save them in Dynamo.

let updateTapListHandler =
	fun (next : HttpFunc) (ctx : HttpContext) ->

		let parsedTapList = BottleList.Parse(getTapData)

		let pouring =
			parsedTapList.Beers.Products
			|> Array.filter (fun x -> x.Name.String.IsSome) // get rid of any empty elements
			|> Array.filter (fun x -> x.Pouring.String.Value = "Now")
			|> Array.map xmlToBeer

		let tapList = {
			Beer = pouring;
			AddedOn = DateTimeOffset.Now;
			TTL = DateTimeOffset.Now.AddHours(48.0).ToUnixTimeSeconds()
		}

		table.PutItem tapList |> ignore
		text "Beer list updated" next ctx

URL Routing

Configuring routing in Giraffe is very elegant.

let webApp:HttpHandler =
    choose [
        route "/tap" >=> choose [
            GET >=> latestTapListHandler
            POST >=> updateTapListHandler
        ]

        setStatusCode 404 >=> text "Not Found" ]

Calling the Lambda Periodically

With the application deployed I can do a POST to /tap and it will update the tap list. To automate this I set up a Cloudwatch Event which runs periodically, in this case every 24 hours. Cloudwatch Events typically pass the context of the event to the Lambda they invoke, for example a JSON blob representing a new file added to an S3 bucket or a message published to an SQS queue. We don’t really want to invoke our Lambda as such - it’s running an HTTP server. We just want to do a periodic POST to our /tap endpoint. Because our Lambda sits behind an API Gateway we need to pass the same JSON structure that API Gateway would pass to the Lambda if it received a POST to /tap. So I set up a scheduled Cloudwatch event, invoking the Lambda with the static JSON below.

{
  "body": "",
  "resource": "tap",
  "path": "/tap",
  "httpMethod": "GET",
  "isBase64Encoded": true,
  "queryStringParameters": {},
  "stageVariables": {},
  "requestContext": {
    "accountId": "123456789012",
    "resourceId": "123456",
    "stage": "Prod",
    "requestId": "c6af9ac6-7b61-11e6-9a41-93e8deadbeef",
    "requestTime": "09/Apr/2015:12:34:56 +0000",
    "requestTimeEpoch": 1428582896000,
    "identity": {
      "cognitoIdentityPoolId": null,
      "accountId": null,
      "cognitoIdentityId": null,
      "caller": null,
      "accessKey": null,
      "sourceIp": "127.0.0.1",
      "cognitoAuthenticationType": null,
      "cognitoAuthenticationProvider": null,
      "userArn": null,
      "userAgent": "Custom User Agent String",
      "user": null
    },
    "path": "/tap",
    "resourcePath": "tap",
    "httpMethod": "GET",
    "apiId": "FOO",
    "protocol": "HTTP/1.1"
  }
}

This JSON blob can also be used with the Lambda console’s test feature to manually invoke specific endpoints.

Summary

That’s all it takes. The dotnet templating tools set up the AWS stack and a hello world F# Giraffe application. Add a few type definitions and ~30 lines of logic for the Http Handlers and you’ve got some powerful functionality. Deployment is a single command and there are no servers or software to maintain.

The project source can be viewed on Github.


AWS Developer Associate Exam 2019

June 2019 · 17 minute read · tech aws cloud

I recently got the AWS Developer Associate certification. Here are some tips and a copy of my study notes.

Read the AWS Exam Guide. It’s a 3 pager by AWS describing exactly what is in the test.

Purchase the Whizlabs Practice Exams. A few people had recommended these to me and I found the questions a good way to study. All questions have a detailed explanation of the correct answer and reasons why the other options weren’t correct. If you start the exam in ‘practice mode’ you can review the explanation to the question immediately after you have answered it rather than waiting until the end of the exam. You can re-take these any number of times.

I did a couple of the Whizlabs practice exams and quickly identified a couple of areas I was lacking in - AWS Security and the AWS code deployment CI/CD services. I studied these by reading AWS Whitepapers, AWS Product FAQs, and the AWS product overview pages. The aws.training website also has a bunch of good learning resources.

I recommend doing one Whizlabs practice exam per day and then studying your weak spots. Repeat this until you’re confident! I’ve copied my notes below. I think they’re all relevant for content covered in the exam but there are definitely some gaps.


Misc

An AWS region has at least 2 availability zones (AZs).

  • High Availability - Application runs across 2 or more AZs. eg. uses globally replicated services.
  • Fault Tolerant - Application can self heal when failure occurs. eg. auto scaling group brings up a new instance to replace a dead one.
  • Auto Scaling - Launch and terminate EC2 instances/containers based on a scaling plan.
    • Manual Scaling, Scheduled Scaling, Dynamic Scaling (responds to Cloudwatch events).

Shared Responsibility Model - AWS is responsible for security of the cloud. Customers are responsible for security in the cloud. eg.

  • Customers: Platform, identity, data, OS updates, network configuration, encryption, …
  • AWS: Physical security, data centres, hypervisor updates, implementation of managed services, …

AWS Root account should not be used for logging in or making changes. Instead create a new user in the AWS console, create an administrator group with all privileges, and put the user in the administrator group. Use this user.

  • Login to the management console with username, password, and MFA.
  • CLI and SDKs should use access keys.

Credentials and access keys are stored unencrypted in ~/.aws/credentials so don’t store any root credentials there. The credentials to use are selected in the following order.

  1. Embedded in code
  2. Environment variables
  3. ~/.aws/credentials
  4. IAM role assigned to EC2 instance

Amazon Resource Name (ARN) - Unique Identifier for every resource on AWS.

  • eg arn:aws:dynamodb:us-west-2:3897429234:table/accounts
  • Only one wildcard (*) can be used in an ARN.

AWS errors

  • 400 - Client error, bad request.
  • 500 - Server error, something to do with AWS. Retry after exponential backoff.

HTTP headers are prefixed with x-amz-.

AWS Envelope Encryption - Data is encrypted with a plaintext data key, and that data key is itself encrypted under a master key.

Elastic Load Balancing

  • One of Application Load Balancer, Network Load Balancer, Classic Load Balancer.
  • Distributes traffic across multiple EC2 instances/containers/lambdas across AZs.
  • Can do healthchecks on instances and stop routing traffic to those which fail.

Developing on AWS

Cloud 9

  • Browser based IDE. Integrates well with Lambda and other AWS services.

X-Ray

  • Used to identify errors, bugs, and performance bottlenecks in running applications.
  • Visualise Lambda pipeline workflow.
  • Trace requests through lambda functions and other AWS services such as DynamoDB.

Management Tools

Cloud Watch

  • Collects and tracks AWS resource metrics. Users can specify custom metrics to collect.
  • Set alarms to respond to events.
    • eg. Monitoring load on EC2 instances. Set a Cloudwatch alarm if an instance sustains high load for over 5 minutes, which triggers an auto scaling policy to spin up a new EC2 instance.
    • Use high resolution custom alarms for events which occur more frequently than every 30 seconds and need to be acted upon.

Cloud Trail

  • Logs all API calls against your AWS account.
    • This includes all actions through the AWS Management Console, SDKs, and API.
    • Captures who, when, what was requested, the response including the before state (eg. instance running), and the current state (eg. instance stopping).
  • Logs can be streamed to an S3 bucket, otherwise visible in the AWS Console for 90 days.

Identity Access Management (IAM)

  • Define permissions to control which AWS resources users can access.

    PERMANENT                          TEMPORARY

    +------+          +-------+        +------+
    | User +--------> | Group |        | Role |
    +--+---+          +---+---+        +--+---+
       |                  |               ^
       |                  |               |
       |            +-----+------+        |
       +------------+ IAM Policy +--------+
                    +------------+

  • Create IAM policies which grant groups or roles the privileges required in order to fulfill their role.

    • Roles are the preferred way of granting authorization to perform a task.
    • Roles can be used to temporarily give an existing IAM user privileges. This user could be from a different AWS account.
    • Only one role can be assumed at a time.
  • Within IAM there are Identity based and Resource based permissions. Use both together for security.

    • Identity based permissions - Applied to a user. eg. User Joe can read items in this S3 bucket.
    • Resource based permissions - Applied to a resource. eg. This S3 bucket can be read by user Scott.
  • IAM Policies are either Managed or Inline.

    • Inline Policy - Applied to a single resource, cannot be shared.
      • Can be copied between resources but changes to the policy of one will not apply to others.
    • Managed Policy - Standalone identity-based policy which can be shared between the resources it’s assigned to.
      • AWS Managed Policy - Very high level. eg. Can read from S3.
      • Customer Managed Policy - Can be more specific. eg. at the bucket or object level.
      • Managed policies are re-usable, have central change management, and support versioning + rollback.

IAM Policy Evaluation

+----------------------------------+
| Evaluate all applicable policies |
+----------------+-----------------+
                 |
                 v
        +----------------+  Yes  +------+
        | Explicit Deny? +-----> | Deny |
        +-------+--------+       +------+
                | No
                v
        +-----------------+  Yes  +-------+
        | Explicit Allow? +-----> | Allow |
        +--------+--------+       +-------+
                 | No
                 v
             +------+
             | Deny |
             +------+

  • By default all requests are denied.
  • An explicit allow overrides this default.
  • An explicit deny overrides any allow.
  • Execution order of policies does not matter.

Example IAM Policy

This example shows how you might create a policy that allows Read and Write access to objects in a specific S3 bucket. This policy grants the permissions necessary to complete this action from the AWS API or AWS CLI only.

The s3:*Object action uses a wildcard as part of the action name. The AllObjectActions statement allows the GetObject, DeleteObject, PutObject, and any other Amazon S3 action that ends with the word “Object”.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListObjectsInBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::bucket-name"]
        },
        {
            "Sid": "AllObjectActions",
            "Effect": "Allow",
            "Action": "s3:*Object",
            "Resource": ["arn:aws:s3:::bucket-name/*"]
        }
    ]
}

Storage, S3

Buckets are created inside regions, not AZs.

  • Objects are automatically replicated across 3 AZs within the region.
  • Bucket names must be globally unique.
  • Buckets must be empty before they can be deleted.
  • Buckets can be encrypted with:
    • S3 managed keys (SSE-S3)
    • KMS managed keys (SSE-KMS)
    • Customer provided keys (SSE-C)

S3 ACLs can be applied to buckets and individual objects.

  • These are not managed through IAM.
  • Can apply IAM-like policies directly to buckets.

Best practices:

  • Avoid unnecessary requests by setting metadata before uploading objects.
  • Cache bucket and key names within your app.
  • If serving files over HTTP use the Cloudfront CDN to lower latency and reduce costs.
  • When uploading large files use the MultipartUpload API.

Data Storage Services

All are managed services.

  • Redshift - Data warehouse. Query with SQL. Integrates with BI packages.
  • RDS - Relational databases - MySQL, PostgreSQL, SQLServer, Oracle, MariaDB, Aurora.
    • Aurora - AWS developed SQL database compatible with MySQL and Postgres clients.
  • Neptune - Graph database.
  • DynamoDB - NoSQL database.

DynamoDB

  • Low latency NoSQL database.
  • Uses IAM for permissions and access.
  • Backup, restore, point in time recovery.
  • Tables have items. Items have attributes.
  • Tables have one partition key and optionally one sort key.
    • Types of Primary Key:
      • Partition PK - Single unordered attribute.
      • Partition and Sort PK - Made up of two attributes. Dynamo builds an unordered index on the partition key and a sorted index on the sort key.
        • Each item is uniquely identifiable by the combination of both keys.
    • Data is stored in partitions based on the Partition Key. Partitions are automatically replicated across many AZs in a region.
    • Partition Keys should have high entropy to ensure data is distributed across partitions. Use GUIDs where possible.
      • This is important to avoid hot partitions.
  • DynamoDB maintains multiple copies of the data for durability. All copies are typically consistent within a couple of seconds of a write.
    • An eventually consistent read may return slightly stale data if the read operation is performed immediately after a write.
  • To ensure high throughput and low latency responses we must specify read and write capacity when we create a table (see the worked example after this list).
    • Read Capacity Unit (RCU) - Number of strongly consistent reads per second of items up to 4kb in size.
    • Eventually consistent reads use half the provisioned read capacity.
    • Write Capacity Unit (WCU) - Number of 1kb writes per second.
    • Throughput is divided evenly among partitions, which is why it’s important for data to be distributed across partitions.
  • Streams - Asynchronous stream where changes to a table are published. Can use to trigger lambdas.
  • Global Tables - Replica tables all around the world. Changes are streamed to the replica tables. Writes and reads are eventually consistent.
    • Concurrent writes - The last update always wins, but the discarded write is still published on the stream.
    • Strongly consistent reads require using a replica in the same region where the client is running.
  • Encryption must be enabled when the table is created. It cannot be enabled when the table contains data.
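
A worked capacity example (item sizes are illustrative): reading 10 items of 6kb per second with strongly consistent reads needs 10 × 2 = 20 RCUs, because each 6kb item rounds up to two 4kb read units; eventually consistent reads halve that to 10 RCUs. Writing 10 items of 1.5kb per second needs 10 × 2 = 20 WCUs, because each item rounds up to two 1kb write units.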

Developing with DynamoDB

  • Required: Table name, primary key (partition key), throughput capacity.
  • Optional: Global and local secondary indexes. Stream specification.
  • API: Get, Put, Update, Delete Item.
    • Must be done using the full primary key.
    • Scan - reads the entire table, almost never used.
    • Query - uses the primary and sort keys.
  • Limit data returned.
    • Pagination limits the size of data returned. eg 1MB.
    • Limit specifies the number of items returned. eg 500.

Lambda

Serverless - No infrastructure or OS to maintain. Charged for the time your function is executing, not idle time.

  • Functions are stateless so must persist data in an external store. eg S3, Dynamo, RDS, … .
  • Lambdas can be invoked by many events in AWS. eg. Dynamo or S3 data change, security policy change, … .
  • All code must be compiled with dependencies into a single zip.
    • Minimize dependencies for faster startup time of function execution.
  • Assign the lambda an IAM role to assume so that it can connect to other AWS resources (eg S3, Dynamo, …).
  • Execution Models:
    • Push Model - The lambda is invoked by the event source.
    • Must update IAM access policy so that the service of the event source can call the lambda.
    • eg. Object added to S3 bucket, invokes a lambda.
    • Pull Model - The function is invoked by the Lambda service itself.
    • Must update IAM policy on the lambda so it can access the data of the service.
    • eg. Configure lambda to poll DynamoDB for a particular type of event in the published stream.
      • The polling part is handled by the Lambda orchestration service, we just have to configure it.
  • Function execution
    • Default is 3 seconds, configurable up to 15 minutes. Charged for at minimum 100ms of execution.
    • Can schedule execution with cron expressions.
  • Versions are immutable code + configuration. Each version/alias gets its own ARN.
  • Charged for number of requests * execution duration * memory allocated for the lambda (see the worked example below).
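
A worked billing example (the numbers are illustrative): 1,000,000 invocations a month, each running for 200ms with 512MB allocated, comes to 1,000,000 × 0.2s × 0.5GB = 100,000 GB-seconds of compute, charged alongside a small per-request fee. At the time of writing, duration is billed in 100ms increments.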

Best practices:

  • Use environment variables for passing in secrets to the lambda.
  • Avoid recursively calling the same lambda.

API Gateway

  • Proxying security appliance for your API. Inspects + validates requests. Passes them onto EC2/Lambda/ECS backend.
  • DDOS protection, authn, authz, throttling, and metering of API usage.
  • Can transform and validate incoming requests and responses.
    • Transforming data between Client and API Gateway is called the Method Request/Response.
    • Transforming data between API Gateway and the Backend is called the Integration Request/Response.
  • Can cache responses in API gateway to reduce load on backend.
  • Can generate Swagger and GraphQL clients based on API spec.

Best practices:

  • If all requests come in from a handful of regions then set up regional API endpoints.
  • Use HTTP 500 codes for error handling.

Serverless Application Model (SAM)

  • Framework for defining template configurations of lambdas and other AWS services.
  • Gradual code deployment (see the example after this list):
    • CanaryXPercentYMinutes - X% of traffic shifted in the first interval. Remaining traffic shifted after Y minutes.
    • LinearXPercentEveryYMinute - X% of traffic added linearly every Y minutes.
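
For example, Canary10Percent5Minutes shifts 10% of traffic to the new version immediately and the remaining 90% after 5 minutes, while Linear10PercentEvery10Minutes shifts an additional 10% every 10 minutes until all traffic is on the new version.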

Simple Queue Service (SQS)

Use queues to achieve loose coupling between application components and asynchronous processing of messages.

Queue types:

  • Standard Queue - Ordering not guaranteed. Messages may be duplicated. High throughput.
  • FIFO (First in first out) - Guaranteed order. Exactly-once delivery. Limited throughput.

 

  • Durability of messages is achieved by distributing them across many servers in the queue.
  • Message body + attributes = 256kb max size.
  • Visibility Timeout. When consumers request a message from the queue, the message gets hidden on the queue. This stops other clients from pulling it. When the timeout expires (and the consumer didn’t consume/delete the message), the message is made visible again so other clients can pull it.
    • Default 30 seconds, max 12hrs.

Reading from the queue

  • Short polling - Samples a number of servers for messages.
    • As it does not sample all servers some messages may be missed.
    • Returns immediately.
  • Long polling - All servers are queried for messages.
    • If none are available then the call stays open until a message arrives in the queue or the call times out.
    • Long polling == Fewer requests == Lower Cost
  • Dead Letter Queue - Queue of messages that were not able to be processed. Useful for debugging.
    • After a message meets the limit of processing attempts it is put in the DLQ.
    • Generally indicates something wrong with the message body.
    • Failed lambdas also have a concept of a DLQ.

Queues can be shared across AWS accounts.

  • Access can be controlled with permissions and policies.
  • Must be in the same region.

Encryption (SSE) - encrypts the message body, but not the queue metadata or message metadata.

  • ~35% performance penalty.
  • Data keys are used by SQS to encrypt message bodies; the master key never leaves KMS.

Visibility Timeout - Period of time a message is invisible to other consumers after it has been received.

Simple Notification Service (SNS)

  • One to many service
    • Messages are published to a topic, subscribers listen on a topic.
    • Message size up to 256kb
    • Subscribers can be email, HTTP, SMS, SQS, mobile clients
    • Message delivery not guaranteed. Order not guaranteed. Messages can’t be deleted after publishing.
  • API: CreateTopic, DeleteTopic, Subscribe, Publish

Amazon MQ

  • Managed Apache ActiveMQ. Direct access to the ActiveMQ console.
  • Compatible with standard message queue protocols. JMS, AMQP, MQTT, Websocket, NMS, …

AWS Step Functions

Step functions define a state machine for lambda pipelines.

  • If we have >5 lambdas making up an application pipeline then we will run into wanting to retry tasks, execute tasks in parallel, or choose which task to execute based on the input data.
  • Step functions handle state and handle errors for complex lambda pipelines. Control logic:
    • Task - single unit of work
    • Choice - branching logic
    • Parallel - fork and join execution across parallel branches
    • Wait - Delay for a specified time
    • Fail - Stops an execution and marks it as failure
    • Succeed - Stops an execution and marks it as success
    • Pass - Passes its input to its output
  • Triggerable by many AWS services including Cloudwatch.

                 +-------+
                 | Start |
                 +---+---+
                     |
                     v
             +----------------+
             | Wait X Seconds | <------------+
             +-------+--------+              |
                     |                       |
                     v                       |
             +----------------+              |
             | Get job status |              |
             +-------+--------+              |
                     |                       |
                     v                       |
              +---------------+              |
              | Job Complete? +--------------+
              ++-------------++
               |             |
               v             v
      +------------+   +----------------------+
      | Job Failed |   | Get Final Job Status |
      +------------+   +-----------+----------+
                                   |
                                   v
                                +-----+
                                | End |
                                +-----+

Elasticache

  • Caching improves speed by reducing latency and lowering load on the database.
  • Elasticache offers Memcached or Redis backends. All of the benefits of Elasticache Redis are just normal Redis out-of-the-box features.
    • Redis - Multiple AZ, read replicas, sharded clusters, advanced data structures
    • Memcached - Single AZ, multithreaded
  • An Elasticache cluster is made up of many nodes.
  • Clients connect to an endpoint, an address which points to a cluster.
  • Data in a cluster will be automatically spread across nodes. This means if a single node fails then you likely won’t lose all of your data.

Replication Group - Collection of clusters. One primary (read/write), up to 5 read replicas.

Methods for managing data

  • Lazy Loading - Only requested data is cached (see the sketch after this list).
    • The application checks the cache first. If the data is missing then the application queries the database for it and writes the result into the cache.
  • Write Through - All data is written to the cache. Cache maintains a full copy of the data.
    • When the application writes to the database it also writes to the cache. The application only reads from the cache.
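
A minimal lazy-loading sketch in F# (the cache and database functions here are placeholder parameters, not a specific AWS or Elasticache API):

let getCached (tryGetFromCache: string -> string option)
              (queryDatabase: string -> string)
              (writeToCache: string -> string -> unit)
              (key: string) =
    match tryGetFromCache key with
    | Some value -> value               // cache hit, the database is never touched
    | None ->
        let value = queryDatabase key   // cache miss, fall through to the database
        writeToCache key value          // populate the cache for subsequent reads
        value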

Containers

Container = Runtime + Dependencies + Code

Elastic Container Registry (ECR) - Fully managed container registry. Alternative to Dockerhub or self hosted.

Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS) - Deploy, schedule, auto scale, and manage containerised apps. Auto scaling spins up a new EC2 instance for ECS or EKS to deploy on.

  • ECS is an orchestration tool which can deploy containers to run on EC2 or Fargate.
  • EKS is an orchestration tool which can deploy containers to run on EC2.
  • Scaling performed by Cloudwatch events to the ECS cluster.
  • EKS can manage and deploy containers across multiple AZs and vendors (hybrid).

Security

AWS Certificate Manager (ACM) - Issues public and private TLS certificates. Handles auto renewal of certs.

Secrets Manager - Rotate, manage, and retrieve credentials and keys.

  • Grant access to specific secrets using IAM roles.
    • eg. this EC2 instance can request these secrets
  • Use Secrets Manager over Parameter Store.
  • Cannot download private keys, only public.

Security Token Service (STS) - Provides trusted users with temporary security credentials.

  • Can assign IAM policies to control privileges.
  • Users get given access keys and a session token. Lasts between 15 min up to 36 hrs (configurable).
  • Common pattern: ‘we’ authenticate the user internally, then issue them an STS token so they can interact with AWS services.

Cognito

Authentication and authorisation management using public OpenID Connect login providers (Google, Facebook, …) or SAML.

  • Use SAML to authenticate against AD or an external auth directory.
  • User pools for social sign on (external identities), manage user profiles.
  • “Cognito offers mobile identity management and data synchronisation across devices”
  • Supports MFA

Deploying Applications

Code Star - Project management, JIRA-ish. Integrates with the AWS services below.

Code Pipeline - Fully managed CI/CD pipeline. Integrates with 3rd party tools such as Github, or with the AWS services listed below.

Pipeline Stage               AWS Service
Source (version control)     Code Commit
Build                        Code Build
Test
Deploy                       Code Deploy
Monitor                      X-Ray, Cloudwatch

If a stage in the pipeline fails then the entire process will stop.

Gradual Deployment Strategies - simultaneously serve traffic to both environments. Only works when we don’t have a versioned 3rd party dependency or database.

  • Blue/Green. Blue == existing prod env. Green == parallel env running the new version.
    • Use Route53, a load balancer, or auto scaling groups with launch configurations to switch between the 2 deployments.
  • A/B - Sends a small percentage of traffic to the new environment. Gradually increase traffic to B.

Elastic Beanstalk

  • Provisions and manages the infrastructure.
    • Environment - Types and tiers of machines for web services and workers.
    • Code - Managed code versions stored in S3. Can switch (deploy) between versions.
    • Configuration - Configure individual services used by Beanstalk. eg. install additional packages or change the configuration of running services.
  • Configuration files should be YAML or JSON formatted, have a .config suffix and be placed in the directory .ebextensions/.
    • If a large amount of configuration is required then you can use a custom AMI.

Beanstalk requires two IAM roles:

  • Service Role - Permission for beanstalk to create and manage resources in your account. eg EC2, auto scaling groups, databases.
  • Instance Role - AWS permissions for the instance itself.
    • To allow users to manage those resources, they assume the same instance role.

Cloud Formation

  • JSON templates get run by Cloudformation to create stacks.
    • If an error occurs while building the stack then the stack is rolled back and destroyed.
  • Templates can contain:
    • Parameters - Values passed in when the stack is being built. eg connection strings and secrets.
    • Mappings - Maps or switch statements. eg select the correct AMI id for region, or machine size.
    • Outputs - Return URLs, IPs, ids, names, … when calling cfn-describe-stacks.

Domain Modeling Made Functional

May 2019 · 7 minute read · tech fsharp linux dotnet book

Domain Modeling Made Functional by Scott Wlaschin is a book which guides you through the design and implementation of an e-commerce ordering system. It’s a real world application with non-trivial business requirements. The project is implemented in F#, but any other language with a powerful type system which allows you to write in a functional paradigm could be used. Scott is the author of the popular website fsharpforfunandprofit.com and also a highly regarded speaker in the F# community.

Domain Modeling Made Functional Cover

The book assumes some knowledge of software design (that you’ve written some sort of software before), but no knowledge of domain driven design or F#. This was ideal for me as I’m an F# hobbyist and know very little about domain driven design. One of my struggles in trying to learn functional programming has been finding resources on actually implementing a real world application. Tutorials describing monads or how to parse a simple calculator grammar don’t show you how to deal with the sorts of problems you face when writing business applications, specifically:

  • A lot of IO, with different systems (file system, databases, remote APIs) which have different latencies and uptimes.
  • Using OOP libraries (ASP.NET Core, database abstraction libraries, …) in a functional manner.
  • Handling errors through multiple layers of abstraction.
  • Ever changing business requirements, and how to structure the application to handle this.

There are great ‘success story’ conference talks about the latest bank that was written in Haskell, but unfortunately they never go into actionable detail. If you took the advice of a lot of blog posts and tutorials on the internet about writing functional applications you would know it’s a simple 3 step process

  1. Gather requirements
  2. Write the application functionally
  3. Done

These tutorials tell you everything about step #3, and nothing about #1 and #2. Domain Modeling Made Functional teaches you how to write your application using a functional architecture. It’s split into 3 sections which largely mimic the development process: gathering requirements, designing the architecture, and implementing the code. The book concludes with making some non-trivial modifications to the application, simulating a change in business requirements. I cover the 3 sections below.

I. Understanding the Domain

The first 55 pages of the book don’t contain any F# code so you can skip them. No! No! No! Gathering requirements, and then modelling the domain, are a critical part of any software project. Scott describes a domain specific language for describing requirements and organising them into domains. In other words, a set of language, diagrams, and questions that developers and the domain experts can use to get a shared understanding of what is being built, a ubiquitous language.

Good thing I have experience gathering software requirements and already know how to do it. Again, No! No! No! Scott emphasises the point that requirements should be gathered and organised in a domain driven manner. I learned an enormous amount in this book, but I think this section was most valuable to me. It challenged the principles and knowledge I already thought I had around requirements gathering and software design. It demonstrated that if you develop an understanding of user requirements and domains in the context of domain driven development then a functional architecture almost naturally appears out of nowhere. In contrast, if you were to develop an understanding of user requirements and domains in the traditional OOP model (database driven design or class driven design), then trying to retrofit a functional architecture or functional language on top of this will be complicated, clumsy, and difficult to understand. I suspect for most developers the unconscious default is to gather requirements in an OOP manner.

II. Modelling the Domain

The section opens with a brief primer on some of the best features of the F# language: the type system, algebraic data types, composing types, and the Option and Result types. This quickly brings anyone not familiar with F# up to speed. It then dives into modelling the order taking system using the F# type system, and clearly shows how the type system can be used to almost perfectly model the domain (check out Scott’s article on making illegal states unrepresentable).

Below is an example from the book of capturing business rules in the F# type system. The UnitQuantity of a product is the number of copies of the product that is being ordered. This could be modelled as an int, but we don’t really want to allow the user to order between -2,147,483,648 and 2,147,483,647 items of the product. In the example of the book we want this value to be between 1 and 1000. So how do we model this in the code?

We start by declaring our own type with a private constructor

type UnitQuantity = private UnitQuantity of int

The private constructor prevents us from creating a UnitQuantity value directly, so we can’t accidentally do

let x = UnitQuantity -500

We’ll define our own module with a function called create. We put our validation logic in there, and it becomes the only way to make new values of type UnitQuantity

module UnitQuantity = 

    let create qty =
        if qty < 1 then
            Error "UnitQuantity can not be negative"
        else if qty > 1000 then
            Error "UnitQuantity can not be more than 1000"
        else
            Ok (UnitQuantity qty)

    let value (UnitQuantity qty) = qty

We can then see the function in action

match UnitQuantity.create -1000 with
| Error msg -> printfn "Failure: %A" msg
| Ok qty    -> printfn "Success. Value is %A" (UnitQuantity.value qty)
 
match UnitQuantity.create 50 with
| Error msg -> printfn "Failure: %A" msg
| Ok qty    -> printfn "Success. Value is %A" (UnitQuantity.value qty)

This example of primitive wrapping is really trivial and something you’ve likely seen before. The book starts with this, but also describes how to achieve safety with much more complex types, such as those which span multiple domains or are aggregates. I found the real value of this book is in how it achieves the latter; Scott explains the details and process really well, and it’s a pattern you can pull out when writing your own applications.

But it’s the real world. If this were used in a website or public API then in reality our UnitQuantity can only ever be a JSON integer. Using the bounded context approach we have our ‘unsafe outside world’ of JSON filled with primitives like int which contain untrusted values. Our application has an explicit serialisation step which converts between the untrusted public DTOs (data transfer objects) and our safely typed internal representation. Any errors at this stage are appropriately returned to the user. In the ‘safe inside world’, types such as UnitQuantity, along with F# algebraic data types and units of measure, are used throughout our business logic. Details on how to write this are covered in Chapter 11.
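
As a rough sketch of that boundary (the DTO type, field names, and conversion function below are mine for illustration, not code from the book), the conversion from untrusted primitives into the domain funnels through UnitQuantity.create so any validation error surfaces as a Result:

type OrderLineDto = { ProductCode: string; Quantity: int }    // untrusted, primitive-only shape from JSON

let toDomain (dto: OrderLineDto) =
    UnitQuantity.create dto.Quantity                          // validation happens in exactly one place
    |> Result.map (fun qty -> dto.ProductCode, qty)           // only valid quantities reach the inside world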

III. Implementing the Model

The last section of the book describes how the application is implemented using a functional pipeline. But not just any pipeline. It’s a pipeline that describes application workflows which handle

  • Dependency injection
  • Async error handling
  • Application events
  • DTO conversion
  • Data persistence (database)

In other words, everything that is needed in a real world application. All code from the book is available on Github. It’s well commented and can be read and understood without having read the book. I think that it’s an incredible testament to F# as a language and to Scott as an author, and it’s a perfect architecture on which to base your own application.

There are also chapters on computation expressions for more complex error handling, techniques on serialising complex F# algebraic data types to JSON, and how to best model persistence in a database.

Conclusion

This book describes how you can architect and build an application in a functional programming language using domain driven principles. Scott writes with clarity and describes the problems or gotchas that can occur before explaining how to solve them. The diagrams in the book illustrate the pipelines and transformations that need to occur. The F# code printed in the book was clear to understand and wasn’t overwhelming. We’ve all seen Java text books where application code is printed over several pages.

I learned a huge amount from Domain Modeling Made Functional and will keep referring back to it. Techniques for modelling requirements, architecture of functional applications, tricks with the F# type system to gain application safety, error handling in pipelines, and use of computation expressions to name just a few.

If you’re interested in what I’ve written about and would like to learn more, then I recommend watching Scott’s talk Domain Modeling Made Functional, checking out his website fsharpforfunandprofit.com, and then buying the book.


The code with UnitQuantity can be played around with here.