I recently got the AWS Developer Associate certification. Here are some tips and a copy of my study notes.
Read the AWS Exam Guide. It’s a 3 pager by AWS describing exactly what is in the test.
Purchase the Whizlabs Practice Exams. A few people had recommended these to me and I found the questions a good way to study. All questions have a detailed explanation of the correct answer and reasons why the other options weren’t correct. If you start the exam in ‘practice mode’ you can review the explanation to the question immediately after you have answered it rather than waiting until the end of the exam. You can re-take these any number of times.
I did a couple of the Whizlabs practice exams and quickly identified a couple of areas I was lacking in - AWS Security and the AWS code deployment CI/CD services. I studied these by reading AWS Whitepapers, AWS Product FAQs, and the AWS product overview pages. The aws.training website also has a bunch of good learning resources.
I recommend doing one Whizlabs practice exam per day and then studying your weak spots. Repeat this until you’re confident! I’ve copied my notes below. I think they’re all relevant for content covered in the exam but there are definitely some gaps.
An AWS region has at least 2 availability zones (AZs).
- High Availability - Application runs across 2 or more AZs. eg. uses globally replicated services.
- Fault Tolerant - Application can self heal when failure occurs. eg. auto scaling group brings up a new instance to replace a dead one.
- Auto Scaling - Launch and terminate EC2 instances/containers based on a scaling plan.
- Manual Scaling, Scheduled Scaling, Dynamic Scaling (responds to Cloudwatch events).
Shared Responsibility Model - AWS is responsible for security of the cloud. Customers are responsible for security in the cloud. eg.
- Customers: Platform, identity, data, OS updates, network configuration, encryption, …
- AWS: Physical security, data centres, hypervisor updates, implementation of managed services, …
AWS Root account should not be used for logging in or making changes. Instead create a new user in the AWS console, create an administrator group with all privileges, and put the user in the administrator group. Use this user.
- Login to the management console with username, password, and MFA.
- CLI and SDKs should use access keys.
Credentials and access keys are stored unencrypted in the file ~/.aws/credentials so don’t store any root credentials there. Credentials to use are selected in the following order.
- Embedded in code
- Environment variables
- IAM role assigned to EC2 instance
Amazon Resource Name (ARN) - Unique Identifier for every resource on AWS.
- Only one wildcard (*) can be used in an ARN.
- 400 - Client error, bad request.
- 500 - Server error, something to do with AWS. Retry after exponential backoff.
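The retry advice above can be sketched in a few lines of Python. This is only an illustration, not how the AWS SDKs implement it (they already retry for you): `ServerError`, `backoff_delays`, and `call_with_backoff` are names I made up, and the base delay and cap are arbitrary. Client errors (400s) are not retried because resending the same bad request won't help.

```python
import random
import time


class ServerError(Exception):
    """Stands in for an HTTP 5xx response (hypothetical, for illustration)."""


def backoff_delays(max_retries, base=0.1, cap=5.0):
    """Upper bound of the wait before each retry: base * 2**attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(operation, max_retries=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call operation(); on ServerError wait with jittered backoff, then retry."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except ServerError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            # Jitter: sleep a random time up to the exponential ceiling so
            # many clients do not all retry at the same moment.
            sleep(random.uniform(0, delays[attempt]))
```

Passing `sleep` in as a parameter is just a convenience so the function can be exercised without actually waiting.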
HTTP headers and prefixed with
AWS Envelope Encryption - Data is encrypted by a plain text data key which is further encrypted using the plain text master key.
Elastic Load Balancing
- One of Application Load Balancer, Network Load Balancer, Classic Load Balancer.
- Distributes traffic across multiple EC2 instances/containers/lambdas across AZs.
- Can do healthchecks on instances and stop routing traffic to those which fail.
Developing on AWS
- Browser based IDE. Integrates well with Lambda and other AWS services.
- Used to identify errors, bugs, and performance bottlenecks in running applications.
- Visualise Lambda pipeline workflow.
- Trace requests through lambda functions and other AWS services such as DynamoDB.
- Collects and tracks AWS resource metrics. Users can specify custom metrics to collect.
- Set alarms to respond to events.
- eg. Monitoring load on EC2 instances. Set a CloudWatch alarm if an instance has a high load sustained over 5 minutes. The alarm triggers an auto scaling policy to spin up a new EC2 instance.
- Use high resolution custom alarms for measuring events which occur more frequently than 30 seconds and need to be acted upon.
- Logs all API calls against your AWS account.
- This includes all actions through the AWS Management Console, SDKs, and API.
- Captures who, when, what was requested, the response including the before state (eg. instance running), and the current state (eg. instance stopping).
- Logs can be streamed to an S3 bucket, otherwise visible in the AWS Console for 90 days.
Identity Access Management (IAM)
- Define permissions to control which AWS resources users can access.
+------+         +-------+       +------+
| User +-------> | Group |       | Role |
+--+---+         +---+---+       +------+
   |                 |              ^
   |                 |              |
   |                 |              |
   |           +-----+------+       |
   +-----------+ IAM Policy +-------+
- Create IAM policies which grant groups or roles the privileges required in order to fulfill their role.
- Roles are the preferred way of granting authorization to perform a task.
- Roles can be used to temporarily give an existing IAM user privileges. This user could be from a different AWS account.
- Only one role can be assumed at a time.
- Within IAM there are Identity based and Resource based permissions. Use both together for security.
- Identity based permissions - Applied to a user. eg. User Joe can read items in this S3 bucket.
- Resource based permissions - Applied to a resource. eg. This S3 bucket can be read by user Scott.
- IAM Policies are either Managed or Inline.
- Inline Policy - Embedded in a single user, group, or role, cannot be shared.
- Can be copied between identities but changes to the policy of one will not apply to others.
- Managed Policy - Standalone identity-based policy which can be shared between the users, groups, and roles it’s attached to.
- AWS Managed Policy - Very high level. eg. Can read from S3.
- Customer Managed Policy - Can be more specific. eg. at the bucket or object level.
- Managed policies are re-usable, have central change management, and support versioning + rollback.
IAM Policy Evaluation
| Evaluate all applicable policies |
                |
                v
| Explicit Deny?  | -- yes --> | Deny  |
                | no
                v
| Explicit Allow? | -- yes --> | Allow |
                | no
                v
| Deny |
- By default all requests are denied.
- An explicit allow overrides this default.
- An explicit deny overrides any allow.
- Execution order of policies does not matter.
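The evaluation rules above can be written down as a small Python function. This is a simplified sketch, not the real IAM engine: the `evaluate` function and the cut-down statement format (one action pattern and one resource pattern per statement) are my own assumptions, and real evaluation also involves things like conditions and other policy types.

```python
from fnmatch import fnmatchcase


def evaluate(statements, action, resource):
    """Return "Allow" or "Deny" for a request against simplified statements.

    Each statement is a dict with "Effect", "Action" and "Resource" keys,
    where Action and Resource are single wildcard patterns.
    """
    decision = "Deny"  # default: every request starts out denied
    for stmt in statements:
        matches = (fnmatchcase(action, stmt["Action"])
                   and fnmatchcase(resource, stmt["Resource"]))
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return "Deny"  # an explicit deny overrides any allow
        decision = "Allow"  # an explicit allow overrides the default deny
    return decision
```

Because an explicit deny returns immediately and an allow only sets a flag, the result is the same whatever order the statements are listed in.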
Example IAM Policy
This example shows how you might create a policy that allows Read and Write access to objects in a specific S3 bucket. This policy grants the permissions necessary to complete this action from the AWS API or AWS CLI only.
The s3:*Object action uses a wildcard as part of the action name. The AllObjectActions statement allows the GetObject, DeleteObject, PutObject, and any other Amazon S3 action that ends with the word “Object”.
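The policy document itself is not reproduced in these notes, so here is a rough reconstruction of just the statement described above, built as a Python dict. The bucket name `example-bucket` is a placeholder, the real policy may well contain more statements, and `matching_actions` is a helper I added only to show what the wildcard covers.

```python
import json
from fnmatch import fnmatchcase

BUCKET = "example-bucket"  # hypothetical bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllObjectActions",
            "Effect": "Allow",
            "Action": "s3:*Object",  # wildcard inside the action name
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        }
    ],
}

# JSON form, as it would be pasted into the policy editor.
policy_json = json.dumps(policy, indent=2)


def matching_actions(pattern, candidates):
    """Return the candidate action names that the wildcard pattern covers."""
    return [name for name in candidates if fnmatchcase(name, pattern)]
```

Calling `matching_actions("s3:*Object", [...])` with a list of action names shows that only names ending in "Object" are covered, so something like `s3:ListBucket` would need its own statement.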
Buckets are created inside regions, not AZs.
- Objects are automatically replicated across at least 3 AZs within the region.
- Bucket names must be globally unique.
- Buckets must be empty before they can be deleted.
- Buckets can be encrypted with:
- S3 managed keys (SSE-S3)
- KMS managed keys (SSE-KMS)
- Customer provided keys (SSE-C)
S3 ACLs can be applied to buckets and individual objects.
- These are not managed through IAM.
- Can apply IAM-like policies directly to buckets.
- Avoid unnecessary requests by setting metadata before uploading objects.
- Cache bucket and key names within your app.
- If serving files over HTTP use the CloudFront CDN to lower latency and reduce costs.
- When uploading large files use the
Data Storage Services
All are managed services.
- Redshift - Data warehouse. Query with SQL. Integrates with BI packages.
- RDS - Relational databases - MySQL, PostgreSQL, SQL Server, Oracle, MariaDB, Aurora.
- Aurora - AWS developed SQL database compatible with MySQL and Postgres clients.
- Neptune - Graph database.
- DynamoDB - NoSQL database.
- Low latency NoSQL database.
- Uses IAM for permissions and access.
- Backup, restore, point in time recovery.
- Tables have items. Items have attributes.
- Tables must have a partition key and can optionally have one sort key.
- Types of Primary Key
- Partition PK - Single unordered attribute.
- Partition and Sort PK - Made up of two attributes. Dynamo builds an unordered index on the partition key and sorted index on the sort key.
- Each item is uniquely identifiable by a combination of both keys.
- Data is stored in partitions based on Partition Key. Partitions are automatically replicated across many AZs in a region.
- Partition Keys should have high entropy to ensure data is distributed across partitions. Use GUIDs where possible.
- This is important to avoid hot partitions.
- DynamoDB maintains multiple copies of the data for durability. All copies are typically consistent within a couple of seconds after a write.
- An eventually consistent read may return slightly stale data if the read operation is performed immediately after a write.
- To ensure high throughput and low latency responses we must specify read and write capacity when we create a table.
- Read Capacity Unit (RCU) - One strongly consistent read per second of an item up to 4 KB in size.
- Eventually consistent reads use half the provisioned read capacity.
- Write Capacity Unit (WCU) - One write per second of an item up to 1 KB in size.
- Throughput is divided evenly among partitions, this is why it’s important for data to be distributed across partitions.
- Streams - Asynchronous stream where changes to a table are published. Can use to trigger lambdas.
- Global Tables - Replica tables all around the world. Changes are streamed to replica tables. Writes and reads are eventually consistent.
- Concurrent writes - Last update always wins. But the forgotten write will be published on the stream.
- Strongly consistent reads require using a replica in the same region where the client is running.
- Encryption must be enabled when the table is created. It cannot be enabled when the table contains data.
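The read and write capacity rules in the notes above turn into a bit of arithmetic that is worth practising. Here is a rough Python sketch of how I work it out, assuming item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes). The function names are my own.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed: one strongly consistent read per second per 4 KB of item."""
    units_per_read = math.ceil(item_size_bytes / 4096)  # round up to 4 KB blocks
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed: one write per second per 1 KB of item."""
    units_per_write = math.ceil(item_size_bytes / 1024)  # round up to 1 KB blocks
    return units_per_write * writes_per_second
```

For example, 10 strongly consistent reads per second of 6 KB items needs 2 units per read, so 20 RCUs, or 10 RCUs if eventually consistent reads are acceptable.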
Developing with DynamoDB
- Required: Table name, primary key (partition key), throughput capacity.
- Optional: Global and local secondary indexes. Stream specification.
- API: Get, Put, Update, Delete Item.
- Must be done using the full primary key.
- Scan - reads the entire table, almost never used.
- Query - uses the partition and sort keys.
- Limit data returned.
- Pagination limits the size of data returned. eg 1MB.
- Limit specifies the number of items returned. eg 500.
Serverless - No infrastructure or OS to maintain. Charged for the time your function is executing, not idle time.
- Functions are stateless so must persist data in an external store. eg S3, Dynamo, RDS, … .
- Lambdas can be invoked by many events in AWS. eg. Dynamo or S3 data change, security policy change, … .
- All code must be compiled with dependencies into a single zip.
- Minimize dependencies for faster startup time of function execution.
- Assign the lambda an IAM role to assume so that it can connect to other AWS resources (eg S3, Dynamo, …).
- Execution Models:
- Push Model - The lambda is invoked by the event source.
- Must update IAM access policy so that the service of the event source can call the lambda.
- eg. Object added to S3 bucket, invokes a lambda.
- Pull Model - The function is invoked by the Lambda service itself.
- Must update IAM policy on the lambda so it can access the data of the service.
- eg. Configure lambda to poll DynamoDB for a particular type of event in the published stream.
- The polling part is handled by the Lambda orchestration service, we just have to configure it.
- Function execution
- Default timeout is 3 seconds, configurable up to 15 minutes. Charged for at minimum 100ms of execution.
- Can schedule execution with cron expressions.
- Versions are immutable code + configuration. Each version/alias gets its own ARN.
- Charged for request numbers * execution duration * memory allocated for the lambda.
- Use environment variables for passing in secrets to the lambda.
- Avoid recursively calling the same lambda.
- Proxying security appliance for your API. Inspects + validates requests. Passes them onto EC2/Lambda/ECS backend.
- DDoS protection, authn, authz, throttle, meter API usage.
- Can transform and validate incoming requests and responses.
- Transforming data between Client and API Gateway is called the Method Request/Response.
- Transforming data between API Gateway and the Backend is called the Integration Request/Response.
- Can cache responses in API gateway to reduce load on backend.
- Can generate Swagger and GraphQL clients based on API spec.
- If all requests come in from a handful of regions then setup regional API endpoints.
- Use HTTP 500 codes for error handling.
Serverless Application Model (SAM)
- Framework for defining template configurations of lambdas and other AWS services.
- Gradual code deployment:
- CanaryXPercentYMinutes - X% of traffic shifted in the first interval. Remaining traffic shifted after Y minutes.
- LinearXPercentEveryYMinute - X% of traffic added linearly every Y minutes.
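To make the two gradual deployment types concrete, here is a small Python sketch of how much traffic the new version would receive over time. It is only a model of the descriptions above: the function names are made up, and I am assuming the first linear increment is applied at the start of the deployment.

```python
def canary_traffic(percent, minutes, elapsed_minutes):
    """CanaryXPercentYMinutes: X% at first, everything once Y minutes pass."""
    return 100 if elapsed_minutes >= minutes else percent


def linear_traffic(percent, every_minutes, elapsed_minutes):
    """LinearXPercentEveryYMinute: add X% every Y minutes until 100%."""
    # Assumption: the first X% is shifted at minute 0, then X% more per interval.
    steps = elapsed_minutes // every_minutes + 1
    return min(100, steps * percent)
```

With X=10 and Y=5, the canary type holds at 10% for five minutes and then jumps to 100%, while a linear type with X=10 and Y=1 climbs 10, 20, 30, ... until it reaches 100%.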
Simple Queue Service (SQS)
Use queues to achieve loose coupling between application components and asynchronous processing of messages.
- Standard Queue - Ordering not guaranteed. Messages may be duplicated. High throughput.
- FIFO (First in first out) - Guaranteed order. Deliver exactly once. Limited throughput.
- Durability of messages is achieved by distributing them across many servers in the queue.
- Message body + attributes = 256 KB max size.
- Visibility Timeout. When consumers request a message from the queue, the message gets hidden on the queue. This stops other clients from pulling it. When the timeout expires (and the consumer didn’t consume/delete the message), the message is made visible again so other clients can pull it.
- Default 30 seconds, max 12hrs.
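The visibility timeout is easier to follow with a toy model. The class below is an in-memory stand-in I wrote for illustration, not the SQS API: `ToyQueue` and its methods are hypothetical, and the current time is passed in by hand so the behaviour is easy to step through.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not SQS)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time it becomes visible]
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0]  # visible straight away
        return msg_id

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message from other consumers until the timeout expires.
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Consumer finished processing: remove the message for good."""
        self._messages.pop(msg_id, None)
```

A consumer that receives a message at time 0 hides it until time 30. If it never calls `delete`, a second `receive` at time 30 hands the same message out again.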
Reading from the queue
- Short polling - Samples a number of servers for messages.
- As it does not sample all servers some messages may be missed.
- Returns immediately.
- Long polling - All servers are queried for messages.
- If none are available then the call stays open until a message arrives in the queue or the call times out.
- Long polling == Fewer requests == Lower Cost
- Dead Letter Queue - Queue of messages that were not able to be processed. Useful for debugging.
- After a message meets the limit of processing attempts it is put in the DLQ.
- Generally indicates something wrong with the message body.
- Failed lambdas also have a concept of a DLQ.
Queues can be shared across AWS accounts.
- Access can be controlled with permissions and policies.
- Must be in the same region.
Encryption - encrypts the message body, not the message metadata or attributes.
- ~35% performance penalty.
- Master key never leaves KMS. Message (data) keys are generated by KMS and used by SQS to encrypt messages.
Visibility Timeout - Period of time a message is invisible
Simple Notification Service (SNS)
- One to many service
- Messages are published to a topic, subscribers listen on a topic.
- Message size up to 256 KB
- Subscribers can be email, HTTP, SMS, SQS, mobile clients
- Message delivery not guaranteed. Order not guaranteed. Messages can’t be deleted after publishing.
- API: CreateTopic, DeleteTopic, Subscribe, Publish
- Managed Apache ActiveMQ. Direct access to the ActiveMQ console.
- Compatible with standard message queue protocols. JMS, AMQP, MQTT, Websocket, NMS, …
AWS Step Functions
Step functions define a state machine for lambda pipelines.
- If we have >5 lambdas making up an application pipeline then we will likely want to retry tasks, execute tasks in parallel, or choose which task to execute based on the input data.
- Step functions handle state and handle errors for complex lambda pipelines. Control logic:
- Task - single unit of work
- Choice - branching logic
- Parallel - fork and join data across tasks
- Wait - Delay for a specified time
- Fail - Stops an execution and marks it as failure
- Succeed - Stops an execution and marks it as success
- Pass - Passes its input to its output
- Triggerable by many AWS services including Cloudwatch.
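Example state machine (job status poller):
| Start | -> | Wait X Seconds | -> | Get job status | -> | Job Complete? |
| Job Complete? | -- not yet --> | Wait X Seconds |
| Job Complete? | -- failed --> | Job Failed |
| Job Complete? | -- succeeded --> | Get Final Job Status | -> | End |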
- Caching improves speed by reducing latency and lowering load on the database.
- ElastiCache offers Memcached or Redis backends. All of the benefits of ElastiCache Redis are just normal Redis out of the box features.
- Redis - Multiple AZ, read replicas, sharded clusters, advanced data structures
- Memcached - Single AZ, multithreaded
- An Elasticache cluster is made up of many nodes.
- Clients connect to an endpoint, an address which points to a cluster.
- Data in a cluster will be automatically spread across nodes. This means if a single node fails then you likely won’t lose all of your data.
Replication Group - Collection of clusters. One primary (read/write), up to 5 read replicas.
Methods for managing data
- Lazy Loading - Only requested data is cached.
- Application checks cache. If data is missing from cache then the application queries the database for the data. The application then writes that data into the cache.
- Write Through - All data is written to the cache. Cache maintains a full copy of the data.
- When the application writes to the database it also writes to the cache. The application only reads from the cache.
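The two caching strategies above can be sketched in Python with plain dicts standing in for the cache and the database. The class names and the `db_reads` counter are my own, purely for illustration. A real setup would talk to ElastiCache and a database instead.

```python
class LazyLoadingCache:
    """Only data that has been requested ends up in the cache."""

    def __init__(self, database):
        self.database = database  # dict standing in for the database
        self.cache = {}
        self.db_reads = 0  # counts cache misses that hit the database

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit
        self.db_reads += 1  # cache miss: query the database
        value = self.database[key]
        self.cache[key] = value  # then write the value into the cache
        return value


class WriteThroughCache:
    """Every write goes to the database and the cache; reads use the cache."""

    def __init__(self, database):
        self.database = database
        self.cache = dict(database)  # cache starts as a full copy of the data

    def put(self, key, value):
        self.database[key] = value
        self.cache[key] = value

    def get(self, key):
        return self.cache[key]
```

With lazy loading the first `get` for a key costs a database read and later ones are served from the cache. With write through the cache is never missing data, at the price of writing everything twice.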
Container = Runtime + Dependencies + Code
Elastic Container Registry (ECR) - Fully managed container registry. Alternative to Dockerhub or self hosted.
Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS) - Deploy, schedule, auto scale, and manage containerised apps. Auto scaling spins up a new EC2 instance for ECS or EKS to deploy on.
- ECS is an orchestration tool which can deploy containers to run on EC2 or Fargate.
- EKS is an orchestration tool which can deploy containers to run on EC2.
- Scaling performed by Cloudwatch events to the ECS cluster.
- EKS can manage and deploy containers across multiple AZs and vendors (hybrid).
AWS Certificate Manager (ACM) - Issues public and private TLS certificates. Handles auto renewal of certs.
Secrets Manager - Rotate, manage, and retrieve credentials and keys.
- Grant access to specific secrets using IAM roles.
- eg. this EC2 instance can request these secrets
- Use Secrets Manager over Parameter Store.
- Cannot download private keys, only public.
Security Token Service (STS) - Provides trusted users with temporary security credentials.
- Can assign IAM policies to control privileges.
- Users get given access keys and a session token. Lasts between 15 min up to 36 hrs (configurable).
- Common pattern: ‘we’ authenticate the user internally, then issue them an STS token so they can interact with AWS services.
Authentication and authorisation management using public OpenID Connect login providers (Google, Facebook, …) or SAML.
- Use SAML to authenticate against AD or an external auth directory.
- User pools for social sign on (external identities), manage user profiles.
- “Cognito offers mobile identity management and data synchronisation across devices”
- Supports MFA
CodeStar - Project management, JIRA-ish. Integrates with the AWS services below.
CodePipeline - Fully managed CI/CD pipeline.
- Integrates with 3rd party tools such as Github, or the AWS services listed below
| Pipeline Stage | AWS Service |
|---|---|
| Source (version control) | CodeCommit |
If a stage in the pipeline fails then the entire process will stop.
Gradual Deployment Strategies - simultaneously serve traffic to both environments. Only works when we don’t have a versioned 3rd party dependency or database.
- Blue/Green. Blue == existing prod env. Green == parallel env running the new version.
- Use Route53, a load balancer, or auto scaling groups with launch configurations to switch between the 2 deployments.
- A/B - Sends a small percentage of traffic to the new environment. Gradually increase traffic to B.
- Provisions and manages the infrastructure.
- Environment - Types and tiers of machines for web services and workers.
- Code - Managed code versions stored in S3. Can switch (deploy) between versions.
- Configuration - Configure individual services used by Beanstalk. eg. install additional packages or change the configuration of running services.
- Configuration files should be YAML or JSON formatted, have a .config suffix and be placed in the directory
- If a large amount of configuration is required then you can use a custom
Beanstalk requires two IAM roles:
- Service Role - Permission for beanstalk to create and manage resources in your account. eg EC2, auto scaling groups, databases.
- Instance Role - AWS permissions for the instance itself.
- To let users manage the environment, they would assume the same instance role.
- JSON or YAML templates get run by CloudFormation to create stacks.
- If an error occurs while building the stack then the stack is rolled back and destroyed.
- Templates can contain:
- Parameters - Values passed in when the stack is being built. eg connection strings and secrets.
- Mappings - Maps or switch statements. eg select the correct AMI id for region, or machine size.
- Outputs - Return URLs, IPs, ids, names, … when calling