AWS AppSync is a fully managed service that allows developers to deploy GraphQL APIs on the AWS platform. Common use cases include:
- Real-time collaboration apps
- Real-time chat applications
- Real-time IoT dashboards
- Unified microservices access (access and combine data from multiple services)
- Unified data access (retrieve or modify data from multiple data sources)
- Offline application sync (synchronise data between mobile/web applications and cloud)
Amazon Machine Image
Amazon Machine Images provide the information required to launch an instance. They are used to create virtual machines within EC2.
AWS Systems Manager Parameter Store
AWS Systems Manager Parameter Store allows for the secure storage of configuration data and secrets. Passwords, database connection strings, licence codes and parameter values can all be stored in the Parameter Store.
AWS Secrets Manager
AWS Secrets Manager allows for the secure storage of secrets and keys. The service allows for the easy rotation, management and retrieval of keys such as database credentials, API Keys and other secrets.
This means that developers no longer have to hard code sensitive information into their application.
The difference between AWS Secrets Manager and AWS Systems Manager Parameter Store is that Secrets Manager was designed specifically for sensitive information such as secret API keys and database credentials. Values stored in Secrets Manager are encrypted by default, and the service allows for easy rotation and management of those secrets.
AWS Systems Manager Parameter Store is designed to store a much wider range of information such as configuration files, license keys etc. The information can be stored encrypted or unencrypted.
AWS KMS (Key Management Service)
AWS Key Management Service allows for the easy creation and management of cryptographic keys.
AWS KMS is the service that encrypts other information in AWS services. For example the secrets stored in AWS Secrets Manager are encrypted using the keys in AWS KMS.
This is beneficial as it allows for greater compliance, granular access permissions, and auditing of the keys.
Amazon RDS (Relational Database Service)
Amazon RDS allows for the creation, management and operation of relational databases in AWS. All common database engines are supported, such as PostgreSQL, MySQL, Oracle Database and SQL Server.
Amazon DynamoDB is a NoSQL database native to AWS.
The WebSocket API in API Gateway is bidirectional. This means that clients can send messages to a service and the service can send messages back without the client having to explicitly make requests; with WebSockets, servers can push messages to clients.
WebSockets are frequently used in applications such as chat applications, collaboration platforms, multiplayer games and financial trading platforms.
Amazon Aurora DB
Aurora DB is a relational database built by Amazon specifically for the cloud. It is compatible with MySQL and PostgreSQL.
It is up to 5 times faster than a MySQL database and is fully managed by AWS.
How to choose an AWS Region?
- Compliance – Data governance, regulations and legal requirements may mean that data has to be situated within a particular region and never leave that region.
- Proximity – It is best to choose regions closest to the customers to reduce latency.
- Available services – New services and features are not always available in every region.
- Pricing – The price of services can vary region to region.
What are AWS Availability Zones?
Each AWS Region has its own set of availability zones.
An availability zone is one or more discrete data centres with its own power, networking and connectivity.
This allows each Region to have a set of redundancies in the case that there are outages in one or more of the availability zones.
Each availability zone is connected with high bandwidth, ultra-low latency networking. Together these availability zones form a Region.
Anything that ends in a letter is an availability zone (AZ) e.g. eu-west-2a
What are Edge Locations (Points of Presence)?
AWS has hundreds of edge locations, which allow data to be served to customers all around the world at the lowest latency.
IAM and AWS CLI
The root account is created by default and it should not be used or shared. The permissions are too high for general usage.
Users are people within the organisation and can be grouped.
Groups only contain other Users and not other groups.
Users don’t have to belong to a group, and users can belong to multiple groups.
The reason that we create user groups is that it is easier to apply permissions to a group than to each user individually.
Users or Groups can be assigned JSON documents called policies.
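For example, a minimal policy granting read-only access to S3 might look like the following sketch (the statement ID, actions and resource are illustrative values, not taken from these notes):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ReadOnly",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": "*"
    }
  ]
}
```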
Policies define the permissions of the users.
The least privilege principle should be applied – only give users the permissions they require. Nothing more.
Policies are JSON documents that determine what actions a user, role, or member of a user group can perform, on which AWS resources, and under what conditions.
There are two types of policies:
- Managed policies – created and maintained by AWS
- Customer managed policies – manually created and managed by you
IAM allows users to set strong passwords to protect their accounts and, in addition, to use multi-factor authentication (MFA).
Password policies can be chosen by the Account administrators.
MFA is secure as it relies on the user knowing the password but also owning a security device (e.g. their phone with an authenticator app on it).
The MFA device options in AWS are:
- Virtual MFA device e.g. Google Authenticator app or the Authy App. (Supports multiple tokens on a single device)
- Universal 2nd Factor (U2F) Security Key. This is a physical device. Supports multiple root and IAM users using a single security key
- Hardware Key Fob MFA device (device that generates codes on screen)
- Hardware Key Fob MFA device for AWS GovCloud (USA). Device that generates code specifically for the AWS GovCloud.
Users can access AWS in three different ways:
- AWS Management Console (Password and MFA protected)
- AWS Command Line Interface CLI (Access Key protected)
- AWS Software Development Kit SDK (Access Key protected)
Access keys are generated through the AWS Console and users can generate their own keys.
AWS CloudShell is a CLI shell in the cloud that can be run from the browser.
The credentials that CloudShell uses are those of the user currently logged into the console session you are running CloudShell in.
There are some AWS services that need to perform actions on our behalf, but they need permissions to perform these actions.
These permissions can be assigned with IAM Roles. Roles are the same as Users except their intended use is by services rather than by humans.
Users for humans / people.
Roles for services / not people.
For example, an EC2 Instance might want to access another resource in AWS, such as DynamoDB. In order to do this, it needs to be assigned a role with the correct permissions so that the EC2 Instance can access these resources.
IAM Security Tools
IAM Credentials Report (account-level) – this generates a report that lists all of your account’s users and the status of their various credentials.
IAM Access Advisor (user-level) – shows the service permissions granted to a user and when those services were last accessed. This information can be used to revise policies and helps adhere to the Least Privilege Principle.
Budgets can be set up in AWS to help control spend and alert you when you have exceeded, or are about to exceed, the budgets you have set.
Thresholds can be set when Actual costs are met and also when Forecasted costs are met.
EC2 = Elastic Compute Cloud.
EC2 instances can be configured with different options:
- Operating systems such as Linux, Windows or Mac OS
- Compute power (CPU)
- Random-access memory (RAM)
- Storage space (network attached or hardware storage)
- Network card (speed of the card)
- Firewall rules
- Bootstrap script (EC2 User Data)
  - Runs commands when a machine starts.
  - It is a script that runs once, when the machine first starts, and can perform tasks such as applying updates, installing software etc.
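A typical User Data bootstrap script for an Amazon Linux instance might look like the following sketch (the installed package and page content are illustrative assumptions):

```bash
#!/bin/bash
# EC2 User Data: runs once, as root, on the first boot of the instance
yum update -y                  # apply updates
yum install -y httpd           # install the Apache web server
systemctl enable --now httpd   # start it now and on every boot
echo "Hello from $(hostname -f)" > /var/www/html/index.html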
EC2 Instance Types
- General Purpose (T) – Good balance between compute, memory and networking
- Compute optimized (C) – Batch processing, high performance computing (HPC), machine learning, gaming servers
- Memory optimized (R and X) – High-performance in-memory databases e.g. Redis / ElastiCache, real-time processing of big unstructured data
- Storage optimized (I, D, H) – Storage-intensive tasks e.g. sequential reads and writes, high-frequency online transactions
Security Groups provide network security in AWS. They control traffic into and out of the EC2 instances.
Security Groups only contain allow rules.
Security Group rules can reference IP ranges or other Security Groups.
Security Groups act as firewalls on the EC2 instances that regulate the access to ports, authorised IP ranges, control of inbound network and outbound network.
Security Groups can be attached to multiple instances, are locked down to a region/VPC combination, and exist outside the EC2 instance. This means the Security Group isn't running on the EC2 instance itself, so if some traffic is blocked the EC2 instance never sees it.
Security Groups are region scoped.
Best practice is to maintain one separate security group just for SSH access.
If your application is not accessible due to a timeout, then it is most likely a Security Group issue.
If your application gives a connection refused error, then the Security Group was not the issue; it is likely that the application has not launched or that there is an error with the application.
By default all inbound network traffic is blocked and all outbound traffic is allowed.
Security Groups can also authorize traffic to and from other Security Groups. This is useful when an EC2 instance needs to allow incoming traffic from another EC2 instance: a rule could simply allow inbound traffic from the other instance's IP address.
But the problem with that implementation is that the public IP addresses of EC2 instances change every time an instance is stopped and restarted, meaning that the Security Group would have to be updated with the new IP address every time.
So instead, the rule references the Security Group assigned to the sending EC2 instance. Even if that instance's IP address changes, we don't have to update anything, because any instance in the referenced Security Group is allowed to send traffic to our original instance.
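The reasoning above can be sketched in a few lines of Python (a simplified model, not how AWS evaluates rules internally; all names and IPs are made up):

```python
def inbound_allowed(rule_sources, source_ip, source_sg=None):
    """Simplified inbound-rule check: traffic is allowed if the sender's
    IP or the sender's Security Group matches one of the rule sources."""
    return source_ip in rule_sources or source_sg in rule_sources

# A rule pinned to an IP breaks when the sender's public IP changes on stop/start:
print(inbound_allowed({"3.8.1.1/32"}, "3.8.1.1/32"))                  # allowed
print(inbound_allowed({"3.8.1.1/32"}, "3.8.9.9/32"))                  # blocked - rule is stale
# A rule referencing the sender's Security Group keeps working regardless of IP:
print(inbound_allowed({"sg-app"}, "3.8.9.9/32", source_sg="sg-app"))  # allowed
```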
22 – SSH (Secure Shell – Linux and macOS)
21 – FTP (uploading files)
22 – SFTP (uploading files using SSH)
80 – HTTP (access unsecured websites)
443 – HTTPS (access secured websites)
3389 – RDP (Remote Desktop Protocol – log into a Windows instance)
SSH into EC2 Instance
To SSH into an EC2 instance run the following command:

```
ssh -i <<PRIVATE-KEY.PEM>> ec2-user@<<EC2-PUBLIC-IP-ADDRESS>>
```

If you receive an error like:

```
Permissions 0644 for 'PRIVATE-KEY.PEM' are too open.
```

then the permissions on the private key file need to be tightened before you will be able to SSH into the EC2 instance. This can be done by running the command:

```
chmod 0400 PRIVATE-KEY.PEM
```
EC2 Instance Connect
EC2 Instance Connect allows you to SSH into an EC2 instance using a web browser.
It uses temporary SSH keys to access the EC2 instance.
It comes with the AWS CLI pre-installed.
However, `aws configure` should never be run on an instance accessed through EC2 Instance Connect, as that would store your personal access keys on the instance, and then anyone with access to the instance could run commands under your credentials. Instead, IAM Roles should be used.
Roles can be attached to the EC2 instance so that it has the permissions to access other AWS services.
EC2 Instances Purchasing Options
- On-Demand Instances – best for short, uninterrupted workloads with predictable pricing. Pay for what you use. Highest cost but no upfront payment is required.
  - Linux instances – billed per second after the first minute
  - All other OSs – billed per hour
- Reserved Instances – minimum 1 (or 3) year commitment but big price discounts compared to On-Demand
  - Reserved Instances – long workloads
  - Convertible Reserved Instances – long workloads with flexible instance types
  - Scheduled Reserved Instances – reserved for a recurring time window, e.g. every Friday 9 am – 12 pm
- Spot Instances – short workloads, cheap, can lose instances (less reliable). Very large price discounts (up to 90%)
  - Instances can be lost at any point if the amount you are willing to pay for the instance is less than the current spot price
  - Useful for workloads that are resilient to failure, e.g. batch jobs, image processing, data analysis, distributed workloads
  - Not suitable for critical applications or databases
- Dedicated Hosts – book an entire physical server and control the instance placement
  - Useful if you have compliance requirements; can reduce costs by allowing you to use your own server-bound software licenses
  - Minimum 3-year contract and more expensive
  - BYOL (Bring Your Own License) if you have a complicated licensing model
  - Strong regulatory or compliance needs
- Dedicated Instances – instances running on hardware that is dedicated to you, but which may share that hardware with other instances in the same account. There is also no control over instance placement (hardware can move after a stop/start)
EC2 Instance Storage
An Elastic Block Store (EBS) Volume is a network drive you can attach to instances while they run.
It allows data to persist even after instance termination. Volumes can be remounted to an EC2 instance and all the previous data will be available.
EBS Volumes can only be mounted to one instance at a time and they are bound to specific availability zones.
EBS Volumes are network drives that can be detached from an EC2 instance and attached to another one quickly. This is beneficial in case of failure of an EC2 instance.
EBS Delete on Termination attribute
This controls the EBS behaviour when an EC2 instance terminates.
By default the root EBS volume is deleted (attribute is enabled).
By default any other attached EBS volume is not deleted (attribute disabled).
A use case for disabling the attribute is if you want to preserve the root volume when an instance is terminated.
EBS Snapshots allow you to backup your EBS volume at a point in time, hence why it is called a snapshot.
Snapshots can be taken when the EBS volume is attached to the EC2 Instance but it is recommended to detach the volume first before taking the snapshot.
Snapshots can be copied across availability zones (AZ) and regions.
AMI – Amazon Machine Image
AMIs are a customisation of an EC2 instance.
When deploying an EC2 instance, you can use the AMIs provided by AWS (public AMIs) or create/customise your own.
Advantages of creating your own AMI
- Creating your own allows you to own your software, configuration, operating system, monitoring etc.
- Faster boot times, because all of the software and tools you require for your instance are already installed and pre-packaged. They don't need to be installed after the instance is started.
Disadvantages of creating your own AMI
- You have to maintain and manage your own instances. Including patching, security and keeping the operating systems and packages up to date.
AMIs are built for a specific region and can then be copied across to other regions.
EC2 instances can be launched three ways:
- A public AMI – provided by AWS
- Your own AMI – you make and maintain them yourself
- AWS Marketplace AMI – an AMI someone else has made, available to use for free or to purchase
How to create an AMI from an EC2 Instance
- Start an EC2 instance and customise it
- Stop the instance (for data integrity)
- Build an AMI – this will also create EBS snapshots
- Launch an instance from the AMI that was created
EC2 Instance Store
EBS volumes are network drives with good but “limited” performance.
For high-performance hardware disks, use EC2 Instance Store. This is a physical hard drive attached to the physical server.
EC2 Instance Stores have:
- Better I/O performance e.g. throughput
- Very high IOPS
But EC2 Instance Stores are ephemeral, meaning they lose their storage if the instance is stopped. This is why an EC2 Instance Store is not good for long-term storage. If long-term storage is required then EBS Volumes should be used.
Being ephemeral also means there is a risk of data loss if the hardware fails, so it is important that data in the EC2 Instance Store is backed up.
Use cases for EC2 Instance Store:
- Scratch Data
- Temporary Content
EBS Volume Types Overview
EBS Volumes are defined by their size, throughput and IOPS (I/O operations per second).

| Volume type | Description | Use case |
| --- | --- | --- |
| gp2/gp3 | General purpose SSD | General purpose storage that balances price and performance for a variety of workloads |
| io1/io2 | High performance SSD | Mission-critical low-latency or high-throughput workloads |
| st1 | Low cost HDD | Frequently accessed, throughput-intensive workloads |
| sc1 | Lowest cost HDD | Less frequently accessed workloads |

Only gp2/gp3 and io1/io2 can be used as boot volumes.
General purpose SSD
Cost effective, low latency.
The use cases are: System boot volumes, virtual desktops, development and test environments.
It is important to note for the exam that gp3 is the newer version compared to gp2.
In gp3 you can set the IOPS and the throughput independently. In gp2 you cannot do this; they are linked together.
IOPS – 3,000 to 16,000
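As a sketch of the difference, the gp2 baseline rule ties IOPS to volume size: 3 IOPS per GiB, with a floor of 100 and a cap of 16,000 (these constants come from AWS's public documentation, not from the notes above):

```python
def gp2_baseline_iops(size_gib):
    """gp2: IOPS are linked to volume size - 3 IOPS per GiB,
    padded up to a floor of 100 and capped at 16,000."""
    return max(100, min(16_000, 3 * size_gib))

# gp3, by contrast, starts at a flat 3,000 IOPS for any size, and extra
# IOPS/throughput are provisioned independently of the volume size.
print(gp2_baseline_iops(10))     # 100   (floor)
print(gp2_baseline_iops(1000))   # 3000  (3 IOPS/GiB)
print(gp2_baseline_iops(10000))  # 16000 (cap)
```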
Provisioned IOPS (PIOPS) SSD
This is used in critical business applications that require sustained IOPS performance, or applications that need more than 16,000 IOPS.
This is ideal for database workloads where storage performance and consistency is critical.
If this is the case, then the solution is to switch from a gp2/gp3 SSD to an io1/io2 SSD.
If over 32,000 IOPS is required, then you will need EC2 Nitro with io1 or io2.
io2 Block Express can provide sub-millisecond latency.
PIOPS supports EBS multi-attach.
IOPS (io1 / io2) – 16,000 to 32,000, or up to 64,000 if using a Nitro EC2 instance.
IOPS (io2 Block Express) – 16,000 to 256,000
Hard Disk Drives (HDD)
Cannot be a boot volume
st1 (Throughput Optimized HDD) is good for Big Data, Data Warehouses, Log Processing
sc1 (Cold HDD) is good for data that is infrequently accessed and is very low cost
IOPS (st1) – 500 max
IOPS (sc1, Cold HDD) – 250 max
EBS Multi-attach is for the io1/io2 EBS Volume Types.
It allows for the same EBS volume to be attached to multiple EC2 instances in the same availability zone.
Each instance will have full read and write permissions to the volume.
The use case for EBS multi-attach is to achieve higher application availability and for applications that require concurrent write operations.
EFS – Elastic File System
EFS is a managed network file system (NFS) that can be mounted on many EC2 instances across multiple availability zones.
This is the key difference between EBS and EFS – EBS is locked into a single availability zone whereas EFS is available across many availability zones.
Advantages of EFS
- Highly available
- Can be encrypted using KMS
Disadvantages of EFS
- Expensive (pay per use – the more storage you use the more you pay)
- Only compatible with Linux-based AMIs (not Windows)
Use cases for EFS
- Content Management
- Web Serving
- Data Sharing
EFS uses the standard NFSv4.1 protocol.
Access to the EFS is controlled via security groups.
EFS Performance and Storage Classes
- EFS Scale
- 1000s of concurrent NFS clients, 10GB+/s throughput
- Grow to petabyte-scale network file system
- Performance mode (set at EFS creation time)
- General purpose (default) for latency sensitive use cases (web servers, CMS etc.)
- Max I/O – higher latency and throughput, highly parallel (big data, media processing)
- Throughput mode
- Bursting (with 1 TB of storage you get 50 MiB/s, bursting up to 100 MiB/s)
- Provisioned: set your throughput regardless of storage size, e.g. 1 GiB/s for 1 TB of storage, useful if you have a small file system but require a high throughput.
- Storage tiers (lifecycle management feature – move file after N days)
- Standard – for frequently accessed files
- Infrequent access (EFS-IA) – cost to retrieve files, lower price to store
EBS and EFS Comparison
EFS is a network file system that can be mounted across multiple instances.
EBS is a network volume that only needs to be mounted on one instance.
Instance Store allows for maximum IO but is ephemeral.
| | EBS | EFS |
| --- | --- | --- |
| Instances | Can only be attached to one instance at a time | Can be mounted to hundreds of instances |
| Availability Zones (AZ) | Locked into one availability zone at a time | Can be mounted across availability zones |
| Billing | Provision and pay up front for the resources you need | Pay for what you use |
- gp2 – IO increases if the disk size increases
- io1 – can increase IO independently
To migrate an EBS volume across availability zones there are two steps:
- Take a snapshot
- Restore the snapshot to another AZ
EBS backups use IO, so they should not be run while the application is handling a lot of traffic.
By default, the root EBS volume is deleted when the EC2 instance is terminated, but this behaviour can be changed.
EFS can only be used with linux instances (POSIX)
EFS is more expensive than EBS (about 3x)
But there is EFS-IA (infrequent access) for cost savings
Instance store provides the maximum amount of IO on an EC2 instance (very high IOPS, e.g. 210,000) but it is ephemeral, so the data is lost once the instance terminates.
Scalability is the property of an application whereby it can handle greater loads by adapting. There are two kinds: vertical and horizontal.
Vertical scalability means increasing the size of a single machine:
- Increasing the resources of a computer e.g. increasing the CPU and memory capacity.
- Increasing the size of the instance e.g. t2.micro to t2.large.
- Vertical scalability is common for non-distributed systems such as databases (RDS, ElastiCache).
- There is usually a hardware limit to how much something can be vertically scaled.
Horizontal scalability means adding more machines:
- Adding more computers to the system e.g. increasing the number of computers running the application from 1 to 5. This is also known as elasticity.
- Increasing the number of instances.
- Horizontal scaling means having a distributed system.
High availability means running an application or system across at least 2 data centres (Availability Zones).
High availability goes hand in hand with horizontal scaling.
The reason to have a highly available system is so that if one data centre goes down then at least there is another data centre that can handle the requests.
Passive high availability – When high availability is managed for you and built into the service e.g. RDS Multi AZ
Active high availability – Ensuring your application is available through horizontal scaling.
Elastic Load Balancing (ELB)
Load balancers are servers that forward internet traffic to multiple servers (EC2 instances) downstream.
Load balancers allow for the spreading of load across multiple downstream instances while exposing only a single point of access (a DNS name) to the application.
Load balancers automatically perform health checks on downstream instances so they know when to stop sending traffic to an unhealthy instance.
The health check is done on a port and a route (`/health` is a common route). If the response from the health check is not `200 OK`, then the instance is considered unhealthy and no traffic will be sent to it. Health check timings can be configured (e.g. every 5 seconds).
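The unhealthy-marking logic can be sketched as follows (real load balancers use a configurable consecutive-failure threshold; the default value here is an illustrative assumption):

```python
def is_healthy(status_codes, unhealthy_threshold=2):
    """An instance is marked unhealthy once it returns a non-200 response
    to `unhealthy_threshold` consecutive health checks."""
    streak = 0
    for status in status_codes:
        if status == 200:
            streak = 0           # a passing check resets the failure streak
        else:
            streak += 1
            if streak >= unhealthy_threshold:
                return False
    return True

print(is_healthy([200, 200, 200]))       # True
print(is_healthy([200, 500, 503, 200]))  # False - two consecutive failures
```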
Load balancers also provide SSL (HTTPS) to websites and help to enforce stickiness with cookies.
Load balancers also provide the system with high availability across availability zones.
And it also makes it easier to separate public traffic from private traffic.
Load Balancers are regional.
Advantages of AWS EC2 ELB (Elastic Load Balancer)
- Managed by AWS
- AWS guarantees that it will be working
- AWS takes care of upgrades, maintenance and availability
- AWS provides some configuration for the ELB
- Highly integrated with other AWS services speeding up development time.
Disadvantages of AWS EC2 ELB
- More expensive than setting up your own
Types of AWS Load Balancer
ELBs in AWS can be either public or private, and there are three types of managed load balancer in AWS:
- Classic Load Balancer (V1 old generation – 2009)
- HTTP, HTTPS, TCP
- Application Load Balancer (V2 new generation – 2016)
- HTTP, HTTPS, WebSocket
- Network Load Balancer (V2 new generation – 2017)
- TCP, TLS (secure TCP) and UDP
Overall it is recommended to use the newer generation load balancers.
Load Balancer Security Groups
The recommended architecture for allowing traffic to pass from an external source through a load balancer to an EC2 instance is: allow HTTP/HTTPS traffic from any external source to the load balancer, and give the EC2 instance a Security Group that allows traffic only from the Security Group associated with the load balancer.
Load balancers can scale but not instantaneously.
Load Balancer Troubleshooting
- 4xx errors are client induced errors
- 5xx errors are application induced errors
- Load balancer 503 errors mean there is no capacity or there is no registered target for the load balancer to direct traffic to
- If the load balancer can’t connect to your application then check the security groups
Load Balancer Monitoring
- ELB access logs will log all the access requests to the load balancer which is useful for debugging requests
- CloudWatch metrics will give aggregate statistics (e.g. the number of connections)
Load Balancer Comparison
| Load Balancer | Version | Protocols | Health Checks | Hostname |
| --- | --- | --- | --- | --- |
| Classic Load Balancer | V1 (old generation) | TCP (layer 4), HTTP and HTTPS (layer 7) | TCP or HTTP based | Fixed hostname |
| Application Load Balancer | V2 (new generation) | HTTP (layer 7), HTTP/2 and WebSocket, supports HTTP to HTTPS redirects | Performed at the target group level | Fixed hostname |
| Network Load Balancer | V2 (new generation) | Forwards TCP and UDP traffic to instances (layer 4) | | Has a static IP and supports Elastic IP |
Application Load Balancer (ALB)
- Allows for load balancing to multiple HTTP applications across machines (target groups)
- Allows for load balancing to multiple applications on the same machine (e.g. containers)
- Allows for routing to different target groups:
  - Routing based on the path in the URL
  - Routing based on the hostname in the URL
  - Routing based on query strings and headers
- The application servers behind an ALB don't see the IP of the client directly
  - The true IP of the client is inserted into the `X-Forwarded-For` header
  - The same is true for the port (`X-Forwarded-Port`) and protocol (`X-Forwarded-Proto`)
Advantages of ALB’s
- ALBs are the correct choice for microservices and container-based applications.
- They have a port mapping feature to redirect to a dynamic port in ECS.
- ALBs are more powerful than Classic Load Balancers because you only require one ALB to route to multiple applications, whereas you would need a separate Classic Load Balancer for each application in your system.
- ALBs can route to multiple target groups simultaneously
Target Groups of ALBs
- EC2 Instances (HTTP)
- ECS tasks (HTTP)
- Lambda Functions (HTTP request is translated into a JSON event)
- IP addresses (must be private IPs)
Network Load Balancer (NLB)
- Very high performance (good when extreme performance is required – TCP and UDP traffic)
- Can handle millions of requests per second
- Very low latency (around 100 ms, compared to around 400 ms for an ALB)
NLBs have one static IP per AZ and support assigning an Elastic IP. This is useful for whitelisting specific IPs.
NLBs expose a public static IP. This is useful for compliance purposes, as stable firewall rules can be approved by regulators.
Elastic Load Balancer Stickiness
Stickiness is where the same client is always redirected to the same instance behind a load balancer.
Classic Load Balancers and Application Load Balancers can be sticky.
The use case for this may be that a user doesn’t want to lose their session data so they should always be directed to the same instance.
The disadvantage of stickiness is that it may bring imbalance, as the load is no longer evenly distributed across all the backend EC2 instances.
Cross Zone Balancing
Cross-zone load balancing distributes traffic evenly across all registered instances in all availability zones, even if one zone contains more instances than the others.
Each load balancer instance distributes evenly across all registered instances in all availability zones.
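The effect can be sketched with some made-up zone sizes: without cross-zone balancing, traffic is first split evenly between AZs and only then between each AZ's instances; with it, every registered instance gets an equal share.

```python
def traffic_share_per_instance(instances_per_az, cross_zone):
    """Fraction of total traffic each instance in a given AZ receives."""
    if cross_zone:
        # every registered instance gets an equal share, regardless of its AZ
        total = sum(instances_per_az.values())
        return {az: 1 / total for az in instances_per_az}
    # traffic is split evenly between AZs first, then between that AZ's instances
    n_azs = len(instances_per_az)
    return {az: 1 / n_azs / n for az, n in instances_per_az.items()}

azs = {"eu-west-2a": 2, "eu-west-2b": 8}
print(traffic_share_per_instance(azs, cross_zone=True))   # every instance gets 0.1
print(traffic_share_per_instance(azs, cross_zone=False))  # 2a: 0.25 each, 2b: 0.0625 each
```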
| ELB | Cross Zone | Inter-AZ data transfer |
| --- | --- | --- |
| Application Load Balancer (ALB) | Always on (can't be disabled) | No charges |
| Network Load Balancer (NLB) | Disabled by default | Charges for data transfer |
| Classic Load Balancer (CLB) | Console: enabled by default. CLI/API: disabled by default | No charges |
SSL & TLS
An SSL (Secure Socket Layer) certificate encrypts the traffic in-flight between a client and server (or load balancer).
TLS (Transport Layer Security) is the newer more secure version of SSL.
Load balancers in AWS use X.509 certificates to provide the TLS encryption.
These certificates can be created and managed in ACM (AWS Certificate Manager), but you can also upload your own certificates.
Server Name Indication (SNI)
SNI solves the problem of loading multiple SSL certificates onto one web server to serve multiple websites.
SNI is a newer protocol that requires the client to indicate the hostname of the target server in the initial SSL handshake. The server will then find the correct certificate or return the default one.
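Conceptually, certificate selection under SNI works like this sketch (the hostnames and certificate names are made up):

```python
def select_certificate(certs_by_hostname, default_cert, sni_hostname):
    """During the TLS handshake the client indicates the target hostname (SNI);
    the server picks the matching certificate or falls back to the default."""
    return certs_by_hostname.get(sni_hostname, default_cert)

certs = {"www.example.com": "cert-www", "api.example.com": "cert-api"}
print(select_certificate(certs, "cert-default", "api.example.com"))  # cert-api
print(select_certificate(certs, "cert-default", "other.test"))       # cert-default
```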
SNI only works with the newer generation V2 ALB and NLB and CloudFront.
SNI does not work with CLB.
Elastic Load Balancers – SSL Certificates
| Load Balancer | Number of SSL certificates supported | How it works |
| --- | --- | --- |
| Classic Load Balancer (V1) | Only 1 | To serve multiple hostnames with multiple certificates, use multiple CLBs, each with its own SSL certificate |
| Application Load Balancer (V2) | Multiple listeners with multiple SSL certificates | Uses Server Name Indication (SNI) to make it work |
| Network Load Balancer (V2) | Multiple listeners with multiple SSL certificates | Uses Server Name Indication (SNI) to make it work |
ELB Connection Draining
Connection draining is the time given to complete "in-flight" requests while an instance is deregistering or unhealthy.
When an instance is deregistering (draining) the ELB will stop sending new requests to the instance.
| Load Balancer | Feature name |
| --- | --- |
| Classic Load Balancer | Connection Draining |
| Application Load Balancer | Target Group: Deregistration Delay |
| Network Load Balancer | Target Group: Deregistration Delay |
Deregistration delay is 300 seconds by default but it can be set to anywhere between 1 and 3600 seconds.
It can also be disabled (set to 0).
Deregistration delay should be set to a low value if your requests are short. For example, if requests usually take only 5 seconds, then the deregistration delay can be set to something like 20 seconds, since you know all in-flight requests should have completed by that point.
Auto Scaling Groups
Auto scaling groups help to achieve the goal of:
- Scaling out (adding EC2 instances) to match an increased load
- Scaling in (removing EC2 instances) to match a decreased load
- Ensuring there is a minimum and maximum number of machines running
- Automatically registering new instances to a load balancer
Auto scaling groups (ASGs) have the following attributes:
- Launch Configuration
  - AMI + instance type
  - EC2 User Data
  - EBS Volumes
  - Security Groups
  - SSH Key Pair
- Min and max size and initial capacity
- Network and subnet information
- Load balancer information
Auto Scaling works with both Application Load Balancers and Network Load Balancers
Auto Scaling alarms
ASGs can be scaled when triggered by CloudWatch alarms.
CloudWatch alarms monitor a metric such as average CPU usage across all ASG instances.
New Auto Scaling Rules make it easier to define better rules:
- Target average CPU usage.
- Number of requests on the ELB per instance
- Average Network In
- Average Network Out
Auto scaling can be configured with custom metrics as well:
- Scale based on a custom metric, such as the number of connected users. This works by:
- Sending custom metric from EC2 to CloudWatch
- Create CloudWatch alarm to react to low/high values
- Use the CloudWatch alarm as the scaling policy for ASG
ASG’s can also be set based on a schedule e.g. if you know there is going to be high traffic at 9am on weekdays.
ASG important notes
ASG’s use Launch Configurations or Launch Templates (newer); to update an ASG, a new launch configuration or launch template must be provided.
IAM Roles attached to an ASG will automatically get assigned to new EC2 instances.
ASG’s are free and you only pay for the underlying resources.
ASG’s have the benefit that if an instance gets terminated for some reason (e.g. it crashes), the ASG will automatically launch a new instance to replace it.
ASG’s can also replace instances that have been marked as unhealthy by a load balancer.
Auto Scaling Groups – Scaling Policies
- Target Tracking Scaling
- Most simple and easy to setup
- e.g. Average ASG CPU should be around 40%
- Simple/Step scaling
- Triggered by CloudWatch alarm
- e.g. when CPU > 70% add two more units.
- e.g. when CPU < 30% remove one unit.
- Scheduled actions
- Anticipate scaling based on known usage patterns
- e.g. increase minimum capacity on 9am weekdays.
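As a sketch (hypothetical function name; thresholds taken from the simple/step scaling example above), a step scaling decision boils down to comparing the alarm metric against the configured steps:

```python
def step_scaling_decision(avg_cpu: float) -> int:
    """Return the change in instance count for a given average CPU metric.

    Thresholds mirror the example step-scaling rules above:
    CPU > 70% -> add two instances; CPU < 30% -> remove one.
    """
    if avg_cpu > 70:
        return +2   # scale out: add two units
    if avg_cpu < 30:
        return -1   # scale in: remove one unit
    return 0        # within bounds: no scaling action
```

In practice the CloudWatch alarm fires and the ASG applies the step adjustment; this function only illustrates the mapping from metric value to capacity change.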
Auto Scaling Groups – Scaling Cooldowns
Cooldown period helps to ensure that the ASG doesn’t launch or terminate additional instances before the previous scaling activity has taken effect.
A scaling specific cooldown period overrides the default period. The default period is 300 seconds so if this is too long, then to reduce costs the period can be decreased to 180 seconds for the scale-in policy.
RDS, Aurora and Elasticache
AWS Relational Database Service (RDS)
RDS is a managed database service provided by AWS that allows you to create databases in the AWS Cloud
RDS supports the following databases:
- PostgreSQL
- MySQL
- MariaDB
- Oracle
- SQL Server
- Aurora (AWS proprietary database)
Advantages of RDS over deploying a DB on EC2
- RDS is a managed service
- Automatic provisioning, updates and patching
- Continuous backups and restore to specific timestamps (Point in time restore)
- Monitoring dashboards
- Read replicas to improve read performance
- Multi AZ setup for disaster recovery
- Maintenance windows for upgrades
- Scaling capabilities (vertical and horizontal)
- Storage backed by EBS (gp2 or io1)
Disadvantages of RDS
- You can’t SSH into the underlying EC2 instance since this is a managed service
Backups are automatically enabled by RDS.
- Daily full backup of the database (during the maintenance window)
- Transaction logs are backed-up by RDS every 5 minutes (this gives the ability to restore to any point in time from the oldest backup to 5 minutes ago)
- By default there is a 7 day retention of backups but this can be increased to 35 days.
Snapshots of the DB can also be manually triggered by the user and retained for as long as needed.
RDS – Storage Auto Scaling
When an RDS DB is about to run out of storage, RDS will automatically detect this and scale the storage, meaning you never need to scale the DB storage manually.
A maximum storage threshold has to be set, and RDS can then be configured to auto scale when:
- Free storage is less than x% of allocated storage
- Low storage lasts at least x minutes
- And x hours have passed since the last modification
This is useful for applications with unpredictable workloads and the autoscaling is supported for all of the RDS database engines.
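Assuming illustrative threshold values (10%, 5 minutes, 6 hours here are placeholders, not AWS defaults – the real values are whatever you configure), the three conditions above can be sketched as a single check:

```python
def should_autoscale_storage(free_gb: float, allocated_gb: float,
                             low_minutes: int, hours_since_modification: int,
                             pct_threshold: float = 10,
                             min_low_minutes: int = 5,
                             min_hours: int = 6) -> bool:
    """All three conditions must hold before storage auto scaling kicks in:
    free storage below a percentage of allocated storage, low for long
    enough, and enough time elapsed since the last storage modification."""
    free_pct = free_gb / allocated_gb * 100
    return (free_pct < pct_threshold
            and low_minutes >= min_low_minutes
            and hours_since_modification >= min_hours)
```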
RDS Read Replicas VS Multi AZ
Read replicas help to scale reads (scalability).
This is useful in the case where one instance of the DB is not able to handle all of the incoming read requests.
Read replicas can be within AZ, Cross AZ or Cross region.
Replication is ASYNC, so reads are eventually consistent – if a read happens too soon after a write, the data may not be there yet.
Replicas can also be promoted to their own standalone DB. Each read replica has its own DNS name, so the application must update its connection strings to leverage the read replicas.
RDS Read Replicas – Use Case
You have an application that has an RDS DB.
You want to run a reporting application simultaneously to the main production application.
Use a read replica for the reporting application. This will mean that the performance of the main production application is completely unaffected.
Read Replicas can only be used with SQL SELECT statements
RDS Read Replicas – Network Cost
Generally with AWS there’s a network cost when data goes from one AZ to another.
However for RDS Read Replicas within the same region, there is no fee for the network cost.
But cross region data transfer will incur a cost.
RDS Multi AZ (Disaster Recovery)
The purpose of Multi AZ is to provide greater availability to the system, so that if one DB goes down, another DB will automatically be available to continue processing requests.
SYNC replication – this provides greater availability for the system.
Every operation performed on the main DB is automatically synced to the standby instance that is in another availability zone.
This provides a failover in case of loss of an AZ, loss of network, or instance or storage failure.
The master DB and all the replicas sit under one DNS name which provides automatic app failover.
No manual intervention is required in the apps.
Not used for scaling, only used for availability.
Read replicas can be setup as Multi AZ for Disaster Recovery (DR).
How to change an RDS from Single AZ to Multi AZ
- Zero downtime operation (no need to stop the DB)
- Simply modify the setting for the DB from Single AZ to Multi AZ which will change the DB to perform SYNC replication.
The way this happens internally is:
- A snapshot is taken of the original DB
- A new DB is restored from the snapshot to the Multi AZ
- Synchronization (SYNC replication) is then established between the two databases
RDS Security – Encryption
There are several types of encryption
- At rest encryption
- Possibility to encrypt the master and read replicas with AWS KMS – AES-256 encryption.
- Encryption has to be defined at launch time.
- NOTE if the master is not encrypted then the read replicas cannot be encrypted.
- Transparent Data Encryption (TDE) available for Oracle and SQL Server
- In-flight encryption
- SSL certificates to encrypt data to RDS in flight.
- Provide SSL options with trust certificate when connecting to database.
- To enforce SSL:
- PostgreSQL has an option that has to be set in the RDS console
- MySQL has options within the DB. Apply a ‘REQUIRE SSL’ statement to all users in the SQL database.
RDS Encryption Operations
- Encrypting RDS backups
- Snapshots of unencrypted RDS databases are unencrypted.
- Snapshots of encrypted RDS databases are also encrypted.
- An encrypted snapshot can be taken from an unencrypted database.
To encrypt an unencrypted RDS database:
- Create a snapshot of the unencrypted database
- Copy the snapshot and enable encryption for the snapshot
- Restore the database from the encrypted snapshot which will produce an encrypted database
- Then migrate the applications to the new database and then delete the old database
RDS Security – Network and IAM
- Network Security
- RDS databases are usually deployed within a private subnet, not in a public one
- RDS security works by leveraging Security Groups, which control which IPs / Security Groups can communicate with RDS
- Access Management
- IAM Policies help control who can manage AWS RDS (through the RDS API) e.g. who can create, delete a database
- Traditional username and password can be used to log in to the database
- IAM-based authentication can be used to log in to MySQL and PostgreSQL
- No password required, just an authentication token obtained through IAM and RDS API calls.
- Auth tokens have a lifetime of 15 minutes
- The benefits of this are that network traffic in/out must be encrypted using SSL, IAM centrally manages the users instead of the DB, and you can leverage IAM roles and EC2 instance profiles for easy integration.
RDS – Amazon Aurora
Aurora is a proprietary technology from AWS.
Postgres and MySQL are both supported as Aurora DB, meaning that your drivers will work as if Aurora was a Postgres or MySQL database.
Aurora is optimised for the cloud.
Aurora storage automatically grows in 10GB increments up to 64TB.
Aurora can have 15 replicas while MySQL only has 5. The replication process is very fast as well.
Failover in Aurora is instantaneous – natively high availability.
Aurora is more expensive than RDS but it is more efficient so could actually work out cheaper when being used at scale.
Aurora Global Database allows you to have cross region replication.
Aurora is highly available and read scalable:
- 6 copies of your data across 3 availability zones.
- 4 copies out of 6 needed for writes
- 3 copies out of 6 needed for reads
- Self healing with peer-to-peer replication (if some data gets corrupted it will correct itself)
- Storage is striped across hundreds of volumes therefore reducing the risk of data loss
Aurora exposes two endpoints to the database, one for writing to the database and one to read from the database. This is regardless of how many replica instances are running under the endpoints.
How does AWS Aurora work?
There is only one Aurora instance that takes writes (master).
If the master breaks then an automated failover will occur in less than 30 seconds.
Along with the master, you can have up to 15 Aurora read replicas and if the master fails then any of these read replicas can be promoted to the master.
The Aurora read replicas support Cross Region Replication
Aurora DB Cluster
Aurora DB has a dedicated writer endpoint and reader endpoint.
The writer endpoint will only ever point to one master.
The reader endpoint is load balanced at the connection level, and the read replicas can be auto scaled depending on load.
Features of Aurora
- Automatic failover
- Backup and recovery
- Isolation and security
- Industry compliance
- Push button scaling
- Automated patching with zero downtime
- Advanced monitoring
- Routine maintenance
- Backtrack: restore data to any point in time without using backups
There are several database configurations but the two main configurations are:
- One writer and multiple readers – general purpose option for most workloads.
- Serverless – simply specify the minimum and maximum amount of resources and Aurora scales automatically. Good for unpredictable or intermittent workloads.
Aurora Security
- Similar to RDS since Aurora uses the same engines as MySQL and PostgreSQL
- Encryption at rest using KMS
- Automated backups, snapshots and replicas are also encrypted
- Encryption in flight using SSL
- Can also authenticate using IAM token (same method as RDS)
- You are responsible for protecting the instance with security groups
- You can’t SSH into the instance.
Elasticache is a managed service by AWS for Redis or Memcached.
Caches are in memory databases with high performance and low latency.
Caches help to reduce load off of databases for read intensive workloads, since the most common data objects are held in the cache where they are more easily accessible.
This also helps to make an application stateless.
And because it is a managed service, AWS takes care of all operations such as OS maintenance, patching, optimizations, setup, configuring, monitoring, failure recovery and backups.
Elasticache Use Cases
Improve performance and reduce latency:
Make application stateless:
The session data of a user can be stored in a cache to make the application stateless. This means that the user won’t have to log in every time they make a request.
ElastiCache – Difference between Redis and Memcached
|Feature||Redis||Memcached|
|Availability zones||Multi-AZ with auto-failover||Multi-node for partitioning of data (sharding)|
|Read replicas||Has read replicas that scale reads and make it highly available||Not highly available. No replication happening.|
|Data durability||Has data durability using append-only file (AOF) persistence||Non-persistent|
|Backup and restore||Has backup and restore features||No backup and restore|
Caching Implementation Considerations
When caching data in a cache, there are certain things that should be considered.
- Is it safe to cache data?
- The data may be out of date
- Is caching effective for that data?
- Yes – data is changing slowly, few keys are frequently needed
- No – data changing rapidly, all key spaces are frequently needed
- Is the data structured well for caching?
- e.g. key-value data is good
Caching Strategy 1: Lazy Loading / Cache-Aside / Lazy Population
- Only requested data is cached
- Node failures are not fatal since data is stored in the database
- Cache miss results in 3 round trips (read cache, read DB, write cache) and therefore a noticeable delay in the request for the user. Read penalty.
- Stale data – data could be updated in the database but outdated in the cache.
Caching Strategy 2: Write Through
Add or update the cache when the database is updated.
- Data in cache is never stale, reads are quick.
- Write penalty – writes are slower as they now require 2 calls, but users generally expect writes to be slower.
- Missing data in the cache until it is added/updated in the DB. Could be combined with Lazy Loading Strategy.
- Cache churn. Every write to the DB will add data to the cache, but there is a high chance that this data will never be read.
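The two strategies can be sketched with plain dictionaries standing in for the cache and the database (the names and data here are illustrative, not an ElastiCache API):

```python
# In-memory stand-ins for a cache (e.g. ElastiCache) and a backing database.
cache: dict = {}
database = {"user:1": "alice"}

def read_lazy(key):
    """Lazy loading (cache-aside): only populate the cache on a miss."""
    if key in cache:
        return cache[key]           # cache hit: fast path
    value = database.get(key)       # cache miss: extra round trip to the DB
    cache[key] = value              # populate the cache for next time
    return value

def write_through(key, value):
    """Write-through: update the DB and the cache together, so cached
    reads are never stale (at the cost of a slower write)."""
    database[key] = value
    cache[key] = value
```

Combining both, as the notes suggest, gives fresh data on writes while still lazily filling the cache for keys that were never written through.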
Cache Evictions and TTL
Cache eviction can occur in three ways:
- Explicitly deleting an item from the cache
- Item is evicted if the cache is full and the item has not been used recently (LRU – least recently used)
- Set a TTL (time to live) e.g. delete after 5 minutes
If many evictions are happening then maybe the cache should be scaled up.
ElastiCache Redis Cluster Modes
Redis Cluster Mode Disabled
- One primary node, up to 5 replicas.
- Asynchronous replication between the primary node and the replicas
- The primary node is used for read/write, the other nodes are read-only.
- One shard – all nodes have all the data.
- This guards against data loss if a node fails.
- Multi AZ is enabled by default for failover.
- Helps to scale the read performance of the ElastiCache cluster.
Redis Cluster Mode Enabled
- Data is partitioned across many shards (helpful to scale writes)
- Each shard has a primary node and up to 5 replica nodes (same as cluster mode disabled)
- Multi-AZ capability for failover
- Up to 500 nodes per cluster, with different options:
- 500 shards with single master
- 250 shards with 1 master and 1 replica
- 83 shards with 1 master and 5 replicas
Route 53 is a managed DNS (Domain Name System).
DNS is a collection of rules and records which helps clients understand how to reach a server through its domain name.
In AWS the most common records are:
- A – maps hostname to IPv4
- AAAA – maps hostname to IPv6
- CNAME – maps hostname to hostname
- Alias – maps hostname to AWS resource
Route 53 can use both public domains that you own as well as private internal domains that can be resolved by instances in your VPC.
DNS Records TTL
DNS Records TTL is a way for browsers / clients to cache the response of a DNS query – this helps to not overload the DNS.
A web browser will cache the request response for the TTL time defined.
24 hours would be considered a high TTL – less traffic to the DNS but also many outdated records being cached in browsers.
60 seconds would be considered a low TTL – more traffic to the DNS but records are outdated for less time and would be easier to change the records.
CNAME vs Alias
CNAME:
- Points a hostname to another hostname e.g. api.mydomain.com -> elb.aws.com
- Only works for non-root domains e.g. api.mydomain.com
Alias:
- Points a hostname to an AWS resource
- Works for both root domains and non-root domains
- Free of charge
- Native health checks built in
Simple Routing Policy
Used when redirecting to a single resource.
You can’t attach health checks to simple routing policy.
Simple Routing Policy can return multiple values to the client, and the client will choose a random value to use. This is called client side load balancing.
Weighted Routing Policy
Controls the percentage % of the requests that will go to a specific endpoint.
Reasons for using weighted routing policy:
- Can test a small percent of the traffic on a new app version.
- Split traffic between regions.
- Can be associated with health checks.
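A weighted pick can be sketched with Python's `random.choices`; the record shape and hostnames here are hypothetical, not the Route 53 API:

```python
import random

def weighted_pick(records):
    """Pick one endpoint, where each record's chance of being chosen is
    its weight divided by the sum of all weights. This mirrors how a
    weighted routing policy splits traffic, e.g. sending 10% of
    requests to a new app version."""
    endpoints = [r["value"] for r in records]
    weights = [r["weight"] for r in records]
    return random.choices(endpoints, weights=weights, k=1)[0]

records = [
    {"value": "blue.example.com", "weight": 90},   # current version
    {"value": "green.example.com", "weight": 10},  # canary version
]
```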
Latency Routing Policy
Redirects to the server that has the least latency to the client.
This is useful when latency is important to the end user.
Latency is evaluated such that users may be directed to different AWS regions, e.g. directing a UK user to the US may be the lowest latency, so that route will be chosen.
Route 53 Health Checks
If an instance has failed a health check then Route 53 will not send traffic to that instance.
By default, a target must pass or fail 3 consecutive checks before it is deemed healthy or unhealthy.
The default health check interval is 30 seconds, this can be reduced to 10 seconds but there will be an increase in cost.
There will be about 15 health checkers to check the endpoint health.
Health checks can be linked to Route 53 DNS queries, meaning that Route 53 can route traffic dynamically depending on whether some target instances are unhealthy.
Failover Routing Policy
Failover routing policy will route traffic to a different route if the primary target becomes unhealthy.
Failover Routing Policy relies on health checks to check the health of the primary target.
Geolocation Routing Policy
This is different to latency routing policy.
It is routing based on a users location. e.g. traffic from the UK should go to this specific IP.
And there should also be a default policy for traffic where we have not specified a route.
Geoproximity Routing Policy
Route traffic to resources based on the geographic location of users and resources. This allows you to shift more traffic to resources based on the defined bias.
Bias values can be altered to change the size of the geographic region. e.g. increase the value to increase the bias and direct more traffic to the resource.
These resources can be:
- AWS resources so the AWS region can be specified
- Or non-AWS resources in which case the longitude and latitude can be specified.
This feature is accessed using the Route 53 Traffic Flow (Advanced)
The use case for this is shifting traffic from one region to another.
Multi Value Routing Policy
Used when routing traffic to multiple resources, and allows you to associate Route 53 health checks with records.
It will return up to 8 healthy records for each multi-value query, even if there are more than 8 records available.
Multi value is not a substitute for an ELB. But it is helpful to provide some client side load balancing.
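The multi value answer behaviour can be sketched as a filter plus a cap (the record shape is hypothetical):

```python
def multi_value_answer(records, max_records=8):
    """Return only healthy records, capped at 8 per query. The client
    then picks one of the returned values itself (client side load
    balancing), which is why this is not a substitute for an ELB."""
    healthy = [r["ip"] for r in records if r["healthy"]]
    return healthy[:max_records]
```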
Virtual Private Cloud VPC
VPC is a private network within AWS in which you deploy your resources.
A VPC is a regional resource.
Subnets allow you to partition your network inside your VPC within availability zones.
A Public Subnet is a subnet within the VPC that is publicly accessible from the internet. It can access and be accessed from the public internet.
A Private Subnet is a subnet not accessible from the internet.
Route Tables define the access to the internet and between subnets.
By default when you create your VPC in AWS, you only start with a public subnet in each AZ. And a VPC per region (default VPC).
Internet Gateways and NAT Gateways
Internet Gateways are what helps the public subnet actually access the internet. Public subnets will have a route to the internet gateway.
Your private subnet may also need to connect to the internet and this is done using NAT Gateways (AWS-managed) or NAT Instances (self-managed).
NAT gateways and instances allow instances in private subnets to access the internet whilst still remaining private.
Network ACL and Security Groups
- NACL (Network Access Control List)
- A firewall that controls traffic to and from the subnet through ALLOW and DENY rules.
- Rules are attached at the subnet level.
- Rules only include IP addresses.
- It is the first mechanism of defence for the subnet.
- Security Groups
- A firewall that controls traffic to and from an ENI (Elastic Network Interface) or an EC2 Instance.
- Security Groups can only have ALLOW rules.
- Rules include IP Addresses and other Security Groups.
In a default VPC the default NACL allows all traffic in and out.
VPC Flow Logs
VPC Flow Logs capture all the IP traffic going into the interfaces.
- VPC Flow Logs
- Subnet Flow Logs
- Elastic Network Interface Flow Logs
This helps to monitor and troubleshoot connectivity issues such as subnet to subnet issues.
If you are having network issues look at the VPC Flow Logs.
It captures all network information from AWS managed interfaces as well.
The Flow Logs can be sent to S3 or CloudWatch Logs.
VPC Peering allows you to connect two VPC’s privately using the AWS network and make them behave as if they are one network.
To connect two VPC’s they must have non-overlapping CIDR (IP Address range).
VPC Peering is not transitive. Each VPC must have its own VPC peering connection to another VPC. e.g. if VPC A and B are peered and VPC A and C are peered, VPC B and C will not be able to connect to each other.
Endpoints allow you to connect to AWS Services using a private network instead of the public WWW network.
By default, AWS services are publicly accessible, so your EC2 instance talks to other AWS services over the public internet. To keep that traffic on the private AWS network, use an endpoint to allow the EC2 instance to talk to the service.
The benefit of endpoints is that they give enhanced security and lower latency to access AWS services.
VPC Endpoint Gateway – S3 and DynamoDB
VPC Endpoint Interface – All other AWS Services.
So VPC Endpoints are useful when you need private access from within the VPC to an AWS Service.
Private connection to an AWS service = use a VPC Endpoint.
VPC Endpoints are just for accessing AWS Services privately from within your VPC.
Site to Site VPN & Direct Connect
Site to Site VPN and Direct connect allow you to connect an AWS VPC to an on premises Data Centre
- Site to Site VPN
- Connect an on-premises VPN to AWS
- The connection is automatically encrypted
- Goes over public internet
- Very quick and easy to setup
- Direct Connect (DX)
- Establish a physical connection between on-premises and AWS
- The connection is private, secure and fast
- Goes over a private network
- Takes at least one month to setup
Site to Site and Direct Connect cannot access VPC Endpoints.
VPC Cheat Sheet
Three Tier Solution Architecture
LAMP Stack on EC2
- Linux – OS for EC2 Instances
- Apache – Web Server that runs on Linux
- MySQL – Database on RDS
- PHP – Application logic running on EC2
Then in addition, some extra features could be added such as:
- Redis / Memcached
- EBS to store local application data and software (EBS drive root)
WordPress on AWS
S3 (Simple Storage Service)
Objects are stored in buckets (like files in a directory).
Buckets must have a globally unique name.
Buckets are defined at the region level.
Bucket naming convention:
- No uppercase
- No underscore
- 3-63 characters long
- Not an IP
- Must start with a lowercase letter or number
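The naming rules above can be sketched as a validator (a rough approximation for illustration, not AWS's official check):

```python
import re

# 3-63 chars total, lowercase letters/digits/dots/hyphens only,
# must start with a lowercase letter or number.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{2,62}$")
# Reject names formatted like an IPv4 address.
IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def is_valid_bucket_name(name: str) -> bool:
    return bool(BUCKET_NAME_RE.match(name)) and not IP_RE.match(name)
```

Uppercase letters and underscores fail because they are simply absent from the allowed character class.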
Objects in an S3 bucket have a key. The key is the full path to the object e.g.
Object keys are composed of a prefix + object name. e.g.
Object values are the content associated with the object key.
- Max object size is 5TB
- If uploading an object of size greater than 5GB then it must be uploaded in multi-part upload
- Objects can have metadata
- Objects can have tags
- Objects can have Version ID’s
Versioning is a setting enabled at the bucket level.
When you upload a new version of a file with the same key, S3 will not overwrite the existing file; instead it will create a new version of the file with a new version ID.
- Files that are not versioned prior to enabling versioning will have version “null”
- Suspending versioning does not delete the previous versions.
When you delete a versioned object it does not delete the files. Instead it adds a delete marker to the object to make it seem as if it has been deleted.
This is useful since these deleted objects can be restored if needed by deleting the delete marker.
Deleting a delete marker or deleting a specific object version is a permanent delete. This will delete the object versions for good.
Objects uploaded to S3 are stored on AWS servers, so you may want to encrypt them if for example you need to adhere to some security standard or want to make sure that no one can access your files.
There are 4 methods of encrypting objects in S3:
- SSE-S3 – encrypts objects using keys handled and managed by AWS
- SSE-KMS – uses AWS Key Management Service to manage encryption keys; users can maintain control of the rotation policy of the keys
- SSE-C – you manage your own encryption keys
- Client side encryption
|SSE – S3 (Server Side Encryption)||Encrypts objects using keys handled and managed by AWS.|
Object is encrypted server side.
|SSE-KMS||Uses AWS Key Management Service to manage encryption keys.|
KMS is advantageous because it provides user control over who has access to the keys and provides an audit trail.
Object is encrypted server side
|SSE-C||Server side encryption.|
Uses data keys fully managed by the customer outside of AWS.
AWS S3 does not store the encryption keys, they are sent as a secret during the request.
HTTPS must be used.
To retrieve the data object, the key must also be provided again.
This can only be done using the CLI and SDK. Can’t be done using the AWS Console.
|Encryption key must be provided in HTTP header with every request.|
|Client Side Encryption||Client encrypts the object before uploading it to S3.|
Amazon S3 Encryption Client library can help with this.
Client must decrypt the data when they retrieve it from S3.
Customer manages the key and encryption lifecycle.
Encryption can be set at both the object level and the bucket level.
Encryption in transit (in-flight) – SSL/TLS
S3 exposes two endpoints:
- HTTP endpoint – not encrypted
- HTTPS endpoint – encryption in flight
HTTPS is the recommended protocol to use and it is mandatory for SSE-C.
- User Based
- IAM Policies – specifying which API calls a specific user can make.
- Resource Based
- Bucket Policies – bucket wide rules from the S3 console. Allows cross account access.
- Object Access Control List (ACL)
- Bucket Access Control List (ACL)
An IAM Principal (user or role) can access an S3 object if the user’s IAM permissions ALLOW it or the resource policy ALLOWS it, AND there is no explicit DENY.
An explicit DENY takes precedence, e.g. if IAM allows a user but the bucket policy explicitly denies them, access is denied.
S3 Bucket Policies
- JSON based policies
- Can be applied to both buckets and objects.
- Actions – the set of API calls to allow or deny.
- Effect – whether to allow or deny access.
- Principal – the account or user that this S3 bucket policy applies to.
Common use cases for S3 bucket policies are:
- Grant public access to the bucket
- Force objects to be encrypted at upload
- Grant access to another account (cross account)
Bucket settings for Block Public Access
This blocks public access to buckets and objects granted through:
- New access control lists (ACLs)
- Any access control lists (ACLs)
- New public bucket or access point policies
- Any public bucket or access point policies
The last setting blocks public and cross account access to buckets and objects through any public bucket or access point policies.
S3 Security Miscellaneous
Networking – Supports VPC endpoints so that for example EC2 instances in a private subnet can access S3.
Logging and Audit – S3 access logs can be stored in another S3 bucket. API calls can be logged in AWS CloudTrail.
- MFA Delete. Can be required in versioned buckets to delete objects.
- Pre-Signed URLs. URLs that are valid only for a limited time.
S3 can host static websites that will be accessible from a WWW URL.
If you get a 403 (Forbidden) error then it is likely due to the bucket policy denying public reads.
A bucket policy will have to be added to allow public GET access.
CORS (Cross Origin Resource Sharing)
An Origin is comprised of three things:
- Scheme (protocol http/https)
- Host (domain)
- Port (443 for HTTPS, 80 for HTTP)
CORS is a browser based security mechanism to allow requests to other origins while visiting the main origin.
Same origin = http://example.com/app1 and http://example.com/app2 (same scheme and host)
Different origins = http://www.example.com and http://other.example.com
The web browser by default will block cross origin requests unless the correct CORS headers are set (Access-Control-Allow-Origin).
If a client does a cross-origin request on an S3 bucket, we need to enable the correct CORS headers.
Specific origins can be specified or all origins can be allowed using *.
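The server-side decision can be sketched as follows (a simplified model for illustration; S3's actual CORS configuration is a set of rules on the bucket, but the matching logic is the same idea):

```python
def cors_allow_origin(request_origin, allowed_origins):
    """Return the value for the Access-Control-Allow-Origin response
    header, or None to let the browser block the cross-origin request."""
    if "*" in allowed_origins:
        return "*"                  # all origins allowed
    if request_origin in allowed_origins:
        return request_origin       # echo back the specific allowed origin
    return None                     # no header -> browser blocks the response
```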
S3 Consistency Model
As of December 2020, S3 has Strong Consistency, meaning that after a successful write of a new object (PUT) or an overwrite or delete of an existing object (PUT/DELETE), subsequent read requests immediately return the latest version of the object (read-after-write consistency).
Subsequent list requests also immediately reflect changes (list consistency).
S3 Upload – Access Denied
Ensure the user has the correct permissions –
CLI, SDK, IAM Roles and Policies
Policies can be simulated in two ways
- Using the policy simulator tool
- Using AWS CLI dry runs
- Useful to check permissions without actually running the commands.
- Some AWS CLI commands can be expensive if they succeed e.g. creating an EC2 instance.
- Add the --dry-run flag to simulate API calls.
AWS CLI STS Decode Errors
When running commands using the AWS CLI, often you will get long error messages that don’t mean much.
These error messages can be decoded using the STS command line:
aws sts decode-authorization-message --encoded-message <<Encoded error message>>
This will decode the message and return the detailed error message.
AWS EC2 Instance Metadata
EC2 Instance Metadata allows instances to learn about themselves without having to use an IAM Role for that purpose.
The URL to access this metadata is http://169.254.169.254/latest/meta-data/.
This IP is an internal AWS IP and will therefore not work from your computer, only from your EC2 instances.
It will allow you to retrieve the IAM Role Name but it will not allow you to retrieve the IAM Policy.
Metadata = information about the EC2 instance.
Userdata = launch script for an EC2 instance.
The metadata commands are useful for automation.
When the CLI is run on an EC2 instance, the CLI uses the metadata service to get temporary credentials using the IAM Role that’s attached.
Using MFA with the CLI or SDK
To use MFA with the CLI, you must create a temporary session.
This is done by running the STS GetSessionToken API call:
aws sts get-session-token --serial-number arn-of-the-mfa-device --token-code code-from-token --duration-seconds 3600
Which will return temporary credentials: an Access Key ID, Secret Access Key, Session Token and expiration date.
Then a new CLI profile can be created on your machine using the credentials returned from the sts get-session-token command.
The SDK allows you to make calls to AWS from your applications using code.
If you don’t specify a region for the SDK, it will default to us-east-1.
AWS Limits (Quotas)
- API Rate Limits
- How many times you can call the AWS api in a row.
- e.g. DescribeInstances API for EC2 has a rate limit of 100 calls per second.
- Exceeding the limits will cause intermittent errors – so implement exponential backoff.
- For consistent errors – request an API throttling increase.
- Service Quotas (Service Limits)
- How many of a particular service you can run simultaneously.
- e.g. running on-demand standard instances is limited to 1152 vCPU.
- Service limit increase – If you need more CPU then request an increase from AWS through opening a service desk ticket.
- Service quota increase – can be increased using the service quotas api.
Use exponential backoff when getting a ThrottlingException.
It can be applied to any AWS service.
If using the AWS SDK, exponential backoff is automatically part of the SDK.
If using the AWS API then you are responsible for implementing exponential backoff.
Must only implement retries on 5xx server errors and throttling errors. Do not implement retries on the 4xx client errors.
Exponential backoff works by sending a request after 1 second. If it fails then send another after 2 seconds (double). If it fails again send another after 4 seconds (double). If it fails again send another after 8 seconds. Continue until a request succeeds.
This will slowly reduce load on your servers.
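The backoff-and-retry loop described above can be sketched in Python. `ThrottlingException` here is a stand-in for whatever throttling error your client surfaces (the SDK has its own exception types), and the small random jitter is an addition AWS also recommends:

```python
import random
import time

class ThrottlingException(Exception):
    """Stand-in for an AWS throttling / 5xx error."""

def call_with_backoff(api_call, max_retries=5):
    """Retry api_call with exponential backoff: wait 1s, then 2s, 4s, 8s...

    Only throttling/server errors should be retried; 4xx client errors
    should be raised immediately (modelled here by only catching
    ThrottlingException).
    """
    delay = 1  # start at 1 second
    for attempt in range(max_retries):
        try:
            return api_call()
        except ThrottlingException:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            time.sleep(delay + random.uniform(0, 0.1))  # backoff + jitter
            delay *= 2  # double the wait after every failure
```

This is the behaviour the AWS SDKs implement for you automatically; you only need something like this when calling the HTTP API directly.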
AWS CLI Credentials Provider Chain
The CLI will look for credentials in this order:
- Command line options (--region, --output, --profile)
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
- CLI credentials file (aws configure, ~/.aws/credentials)
- CLI configuration file (aws configure)
- Container credentials (ECS tasks)
- Instance profile credentials (EC2 instance profiles)
AWS SDK Credentials Provider Chain
The SDK will look for credentials in this order:
- Java System Properties – aws.accessKeyId, aws.secretKey
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
- Default credential profiles file (~/.aws/credentials)
- Amazon ECS container credentials – for ECS containers
- Instance profile credentials – used on EC2 instances.
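The precedence in these chains can be sketched as a small resolver. This illustrates the ordering only (explicit option, then environment variable, then the shared credentials file), not the real CLI/SDK code; the file path and profile name are the defaults described above:

```python
import configparser
import os
from pathlib import Path

def resolve_access_key(cli_option=None,
                       credentials_path="~/.aws/credentials",
                       profile="default"):
    """Return the first access key found, in chain order."""
    if cli_option:                             # 1. explicit command line option
        return cli_option
    env = os.environ.get("AWS_ACCESS_KEY_ID")  # 2. environment variable
    if env:
        return env
    path = Path(credentials_path).expanduser() # 3. shared credentials file
    if path.exists():
        config = configparser.ConfigParser()
        config.read(path)
        if config.has_option(profile, "aws_access_key_id"):
            return config.get(profile, "aws_access_key_id")
    return None  # fall through to container / instance-profile credentials
```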
AWS Credentials Best Practices
When working within AWS, always use IAM Roles, never hard code credentials into code e.g.
- EC2 Instance Roles for EC2 Instances
- ECS Roles for ECS tasks
- Lambda Roles for Lambda Functions
When working outside of AWS, use Environment Variables, Named profiles etc.
Signing AWS API Requests
When calling the AWS HTTP API directly, the request needs to be signed using your AWS credentials (access key and secret key) so that AWS can identify you.
Some requests to Amazon S3 don’t need to be signed.
If using the SDK or CLI, the HTTP requests are automatically signed for you.
If you are not using the SDK or CLI then you have to manually sign the request using Signature v4 (SigV4). This will allow you to authenticate yourself with AWS.
There are two ways to execute a request with SigV4:
- Include the signature in the HTTP Header option.
- Include the signature as a Query String option.
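The signing-key derivation step of SigV4 is documented by AWS as a chain of HMAC-SHA256 operations over the date, region, service and the literal `aws4_request`. A minimal sketch of just that step (the full process also builds a canonical request and a string-to-sign, which are omitted here):

```python
import hashlib
import hmac

def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key: str, date: str,
                      region: str, service: str) -> bytes:
    """Derive the SigV4 signing key, e.g. for 20150830/us-east-1/iam."""
    k_date = _sign(("AWS4" + secret_key).encode("utf-8"), date)  # date scope
    k_region = _sign(k_date, region)                             # region scope
    k_service = _sign(k_region, service)                         # service scope
    return _sign(k_service, "aws4_request")                      # terminator
```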
Advanced S3 and Athena
MFA forces a user to generate a code on their device before they can perform important operations.
To use MFA-Delete, versioning must be enabled on the S3 bucket.
- You will need MFA to
- Permanently delete an object version
- Suspend versioning on the bucket
- You don’t need MFA for
- Enabling versioning
- Listing deleted versions
Only the bucket owner (root account) can enable/disable MFA-Delete. (Administrator won’t have the correct permissions).
MFA-Delete currently can only be enabled using the CLI.
So to permanently delete an object version within the bucket, you will need to use the CLI or disable MFA-Delete; it can’t currently be done through the console.
S3 Access Logs
For auditing purposes, you can log all requests made to an S3 bucket, from any account, authorized or denied.
The data will be logged to another S3 bucket.
The data logged can be analysed using a tool such as AWS Athena.
When creating a logging bucket it has to be a different bucket to your application bucket. Otherwise you will end up with a logging loop causing the bucket size to grow exponentially.
In order to add logs to the logging bucket, S3 will automatically update the access control list (ACL) to include access to the S3 log delivery group from the application bucket.
S3 Replication – Cross Region Replication (CRR) and Same Region Replication (SRR)
For S3 replication versioning must be enabled.
Buckets can be in different accounts.
The copying is asynchronous.
The buckets must have the correct IAM permissions in S3 to copy the data from one bucket to another bucket.
CRR use case – Compliance, lower latency, replication across accounts.
SRR use case – log aggregation from multiple buckets, live replication between production and test accounts.
- Important notes
- After activating, only new objects are replicated (not retroactive).
- If you delete without a version ID, it adds a delete marker but this is not replicated.
- If you delete with a version ID, it deletes the source but this is not replicated.
- Delete markers can be replicated but this setting must be enabled.
- There is no chaining of replication. i.e. if bucket 1 has replication to bucket 2 which has replication to bucket 3, objects created in bucket 1 are not replicated to bucket 3.
S3 pre-signed URLs
Pre-signed URL’s can be generated using the SDK or the CLI.
By default the pre-signed URL is valid for 3600 seconds.
Users who are given the pre-signed URL inherit the permissions of the person who generated the URL. So they will be able to GET/PUT accordingly.
Use case for pre-signed URL’s
- Allow only logged in users to download a premium video on your S3 bucket (and only want to make the link valid for a short amount of time)
- Allow a changing list of users to download files by generating URL’s dynamically.
- Allow a user to temporarily upload a file to a precise location in a bucket.
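The idea behind a pre-signed URL — an expiry timestamp embedded in the URL and covered by a signature — can be illustrated with a toy sketch. This is NOT S3’s real SigV4 query-string algorithm (S3 signs with your AWS credentials, and in practice you generate these with the SDK or CLI); it only shows why such links are self-authenticating and time-limited. The `SECRET` key is a demo stand-in:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"demo-signing-key"  # stand-in; S3 uses your AWS credentials

def toy_presign(url: str, expires_in: int = 3600, now=None) -> str:
    """Append an expiry and an HMAC signature over (url, expiry)."""
    now = int(now if now is not None else time.time())
    expires = now + expires_in
    sig = hmac.new(SECRET, f"{url}{expires}".encode(),
                   hashlib.sha256).hexdigest()
    return f"{url}?{urlencode({'Expires': expires, 'Signature': sig})}"

def toy_verify(signed_url: str, now=None) -> bool:
    """Valid only if the signature matches and the link hasn't expired."""
    url, _, query = signed_url.partition("?")
    params = dict(p.split("=") for p in query.split("&"))
    expires = int(params["Expires"])
    expected = hmac.new(SECRET, f"{url}{expires}".encode(),
                        hashlib.sha256).hexdigest()
    now = int(now if now is not None else time.time())
    return hmac.compare_digest(expected, params["Signature"]) and now < expires
```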
S3 Storage Classes and Glacier
- Amazon S3 Standard
- General purpose
- Amazon S3 Standard Infrequent Access (IA)
- Amazon S3 One Zone-Infrequent Access
- Amazon S3 Intelligent Tiering
- Amazon Glacier
- Amazon Glacier Deep Archive
Amazon S3 Reduced Redundancy is deprecated.
| S3 Storage Class | Availability Zones | Durability and Availability | Notes | Use Cases |
| --- | --- | --- | --- | --- |
| S3 Standard – General Purpose | Multiple AZs | Durability 99.999999999% (11 9s); Availability 99.99% | Can sustain 2 concurrent facility failures (resistant to failures) | Big data analytics; mobile and gaming applications; content distribution |
| S3 Standard – Infrequent Access (IA) | Multiple AZs | Durability 99.999999999%; Availability 99.9% | For data that is not frequently accessed but requires rapid access when needed; can sustain 2 concurrent facility failures; lower cost compared to General Purpose | Data store for backups, recovery etc. |
| S3 One Zone – Infrequent Access (IA) | Single AZ | Durability 99.999999999%; Availability 99.5% | Since the data is only stored in a single AZ, it is lost if the AZ is destroyed; still has the same low latency and high throughput performance; supports SSL for data in transit and encryption at rest; lower cost than Standard-IA | Storing secondary backup copies of on-premises data; storing data that can be recreated |
| S3 Intelligent Tiering | Multiple AZs | Durability 99.999999999%; Availability 99.9% over a given year | Same low latency and high throughput performance as S3 Standard; small monthly monitoring and auto-tiering fee; automatically moves objects between two access tiers (General Purpose and IA) based on changing access patterns; resilient against events that impact an entire AZ | – |
| Amazon Glacier | – | Durability 99.999999999% | Low cost object storage; alternative to on-premises magnetic tape storage; each item in Glacier is called an Archive (up to 40 TB); archives are stored in Vaults | Archiving and backups; data that needs to be stored long term (10s of years) |
Difference between Amazon Glacier and Glacier Deep Archive
Amazon Glacier Deep Archive is for even longer term storage than Glacier.
| Storage Class | Retrieval Time | Minimum Storage Duration | Cost |
| --- | --- | --- | --- |
| Glacier | Expedited: 1–5 minutes; Standard: 3–5 hours; Bulk: 5–12 hours (the faster the retrieval, the higher the cost) | 90 days | Cheap |
| Glacier Deep Archive | Standard: 12 hours; Bulk: 48 hours | 180 days | Cheaper |
S3 Lifecycle Rules
It is recommended to move objects between the S3 storage classes depending on how frequently/infrequently accessed they are.
This can be done manually but that would be a long process.
However you can automate the moving of objects using a lifecycle configuration (lifecycle rules).
- Transition actions – Defines when objects are transitioned to another storage class
- Move objects to Standard IA after 60 days
- Move objects to Glacier after 6 months.
- Expiration actions – configure objects to expire (delete) after some period of time
- e.g. access log files can be set to delete after 365 days.
- Can be used to delete old versions of files (if versioning is enabled)
- Can be used to delete incomplete multi-part uploads.
Rules can be created for a certain prefix, e.g. objects under `s3://mybucket/logs/`.
Rules can be created for certain object tags, e.g. `Department: Finance`.
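A lifecycle rule combining the transition and expiration examples above might look like the following (the rule ID and prefix are hypothetical); it can be applied with `aws s3api put-bucket-lifecycle-configuration`:

```json
{
  "Rules": [
    {
      "ID": "archive-then-expire-logs",
      "Filter": { "Prefix": "logs/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 60, "StorageClass": "STANDARD_IA" },
        { "Days": 180, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}
```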
S3 automatically scales to high request rates and has a low latency.
S3 can handle 3,500 PUT/COPY/POST/DELETE requests per second per prefix per bucket.
S3 can handle 5,500 GET/HEAD requests per second per prefix per bucket.
Prefix is the object path, e.g. for `bucket/folder1/sub1/file` the prefix is `folder1/sub1/`.
S3 Performance – KMS Limitation
KMS has limits so if you are using SSE-KMS encryption, you could be affected by those limits.
This is because when you upload a file, S3 calls the GenerateDataKey KMS API.
When you download a file, it calls the Decrypt KMS API.
Improving S3 performance – Uploads
- Multi-part upload
- Recommended for files > 100 MB
- Mandatory for files > 5 GB
- It will parallelize uploads and speed up transfers
- S3 will join the parts together to create the completed file.
`kms:GenerateDataKey` permissions are required to perform a multi-part upload.
- S3 Transfer Acceleration
- Transfer the file to an Edge Location
- This will then transfer the data to the S3 bucket in the target region.
- Compatible with multi-part uploads.
Improving S3 performance – Downloads
S3 Byte-Range Fetches parallelize GETs by requesting specific byte ranges.
This provides resilience in case of failures and can speed up downloads.
It is also useful if you only want to retrieve part of the file.
For example if you only want the HEAD of the file.
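Splitting an object into ranges for parallel fetches is simple arithmetic; each returned value can be sent as an HTTP `Range` header on its own GET request (the 8 MB chunk size in the usage note is just an example):

```python
def byte_ranges(content_length: int, chunk_size: int):
    """Split an object of content_length bytes into HTTP Range header
    values, e.g. 'bytes=0-39', for parallel byte-range fetches."""
    ranges = []
    for start in range(0, content_length, chunk_size):
        end = min(start + chunk_size, content_length) - 1  # inclusive end
        ranges.append(f"bytes={start}-{end}")
    return ranges
```

Each range can then be fetched concurrently (e.g. one thread per range) and the parts concatenated in order, which is what gives both the speed-up and the resilience: a failed range is simply re-fetched.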
S3 Select and Glacier Select
S3 Select and Glacier Select allows you to retrieve less data by using SQL to perform server side filtering.
It can perform basic SQL statements and can filter by rows and columns.
This is beneficial as it means there is less network traffic, and less CPU cost client side.
A typical use case: you want to retrieve a subset of data from a CSV file stored in S3. S3 Select filters the CSV server-side and returns only the subset you want, not the whole file.
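What S3 Select does server-side can be sketched locally: filter a CSV so only the matching rows are returned (the real service takes a SQL expression such as `SELECT * FROM s3object s WHERE s.country = 'UK'`; this local stand-in shows the effect, not the API):

```python
import csv
import io

def select_rows(csv_text: str, column: str, value: str):
    """Return only the CSV rows where column == value, mimicking the
    server-side filtering S3 Select performs before data crosses the
    network."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[column] == value]
```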
S3 Event Notifications
Event notifications can be triggered when certain operations are performed in an S3 bucket, e.g. S3:ObjectCreated, S3:ObjectRemoved, S3:ObjectRestore.
This is done using Object Rules and object name filtering is also possible within these rules.
A use case would be to generate thumbnails of images uploaded to S3.
The targets for S3 event notifications are:
- SNS Topics
- SQS Queues
- Lambda Functions
Usually event notifications are delivered within seconds, but sometimes they can take a minute or longer.
- If two writes are made to a single non-versioned object at the same time it is possible that only one event notification will be triggered.
- To ensure that an event notification is sent every single time a successful write occurs then versioning needs to be enabled.
In order for event notifications to trigger a target e.g. add an item to a SQS queue, the correct access policies need to be in place. e.g. S3 needs permissions to write to SQS.
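A notification configuration implementing the thumbnail example above might look like this (the bucket’s queue ARN and account ID are hypothetical); note the suffix filter so only `.jpg` uploads trigger the queue:

```json
{
  "QueueConfigurations": [
    {
      "Id": "thumbnail-jobs",
      "QueueArn": "arn:aws:sqs:eu-west-1:123456789012:thumbnail-queue",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            { "Name": "suffix", "Value": ".jpg" }
          ]
        }
      }
    }
  ]
}
```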
AWS Athena is a serverless service to perform analytics directly against S3 files.
Uses SQL to query the files.
It has a JDBC / ODBC driver to connect your applications to it.
You are charged per query and the amount of data scanned.
Supports a wide range of file types such as CSV, JSON etc.
The use cases are Business Intelligence, Analytics, Reporting, analyze and query etc.
So to analyze data directly in S3 you can use Athena.
Glacier Vault Lock
Adopts a WORM (Write Once Read Many) model: once an object is written into Glacier it cannot be changed.
In addition the policy governing access to the Glacier Vault can also be locked.
This is helpful for compliance, auditing and data retention since no one can edit the data or the policy protecting the data.
S3 Object Lock
Adopts a WORM model.
Versioning must be enabled for this to work.
It allows you to block an object version deletion for a specified amount of time.
- Object retention
- Retention Period – specifies a fixed period
- Legal Hold – same protection, no expiry date
- Governance Mode – users can’t overwrite or delete an object version or alter the lock settings unless they have special permissions.
- Compliance Mode – a protected object version can’t be overwritten or deleted by any user, including the root account. When an object is locked in compliance mode, its retention period can’t be shortened.
CloudFront is a Content Delivery Network (CDN) that improves read performance by caching data in edge locations.
CloudFront has benefits such as:
- Reduce load on central resources such as S3 Buckets.
- DDoS protection
- Integration with Shield
- AWS Web Application Firewall
Can expose external HTTPS connections and talk to internal HTTPS backends.
CloudFront will cache responses in a local cache at the Edge Location so that when similar requests come in, it can respond faster to the request by getting the response from the cache.
- S3 Buckets
- For distributing files and caching them at the edge.
- Enhanced security using Origin Access Identity (OAI), which restricts bucket access to CloudFront only.
- CloudFront can be used as an ingress (to upload files)
- Custom Origin (HTTP)
- Application Load Balancer
- EC2 Instance
- S3 Website (must first enable the bucket as a static S3 website)
- Any HTTP backend you want.
S3 As An Origin
ALB or EC2 as an origin
CloudFront Geo Restriction
Geographic restrictions can be enforced on who can access your distribution
- Allow users to access content only if they’re in one of the countries on the approved countries list.
- Prevent users from accessing content if they’re in one of the countries on a blacklist of banned countries.
AWS uses a 3rd party Geo-IP database.
Use Case: Copyright Law restrictions.
Difference between CloudFront and S3 Cross Region Replication
| Feature | CloudFront | S3 Cross Region Replication |
| --- | --- | --- |
| Locations | Global edge network | Must be set up for each region you want replication in |
| File Age | Files are cached using a TTL, so they may be slightly out of date | Files are updated in near real time |
| Read and Write | Read and write | Read only |
| Use Case | Static content that must be available everywhere | Dynamic content that needs to be available at low latency in a few regions |
Caching can be performed based on:
- Headers
- Session Cookies
- Query String Parameters
The cache lives at each CloudFront Edge Location.
The goal of CloudFront is to maximize the cache hit rate and minimize requests to the origin.
It is common practice in CloudFront to separate Dynamic and Static content in order to maximise cache hits.
TTL Values can be set from 0 Seconds to 1 year. It can be set by the origin using the Cache-Control and Expires Header.
You can invalidate part of the cache using the CreateInvalidation API.
If you update your application and users are still seeing the old website, invalidate the distribution to reset the cache.
CloudFront Security – Geo Restriction
You can restrict who can access your distribution based on their location.
CloudFront Security – HTTPs
- Viewer Protocol Policy
- Redirect HTTP to HTTPS
- Or Use HTTPS only
- Origin Protocol Policy (HTTP or S3)
- HTTPS only
- Or Match Viewer (HTTP => HTTP & HTTPS => HTTPS)
S3 bucket websites do not support HTTPS, only HTTP.
CloudFront Signed URL / Signed Cookies
Use Case: You want to distribute paid shared content to premium users all over the world.
This can be achieved using CloudFront Signed URL / Cookie with an attached policy that includes:
- URL expiration (short-lived for shared content; can be years for private content belonging to a user)
- IP ranges allowed to access the data
- Trusted signers (which AWS accounts can create signed URLs)
Signed URL – Access to individual files
Signed Cookies – Access to multiple files (one signed cookie for many files)
CloudFront Signed URL vs S3 Pre-Signed URL
| Feature | CloudFront Signed URL | S3 Pre-Signed URL |
| --- | --- | --- |
| Origin | Allows access to a path regardless of the origin (S3, EC2, ALB, …) | S3 bucket only |
| Security | Account-wide key-pair; only the root account can manage it | Issues the request as the IAM principal who pre-signed the URL, using that principal’s IAM key |
| Filtering | Can filter by IP, path, date, expiration | Limited lifetime only |
| Caching | Can leverage the caching features of CloudFront | No caching |
CloudFront Signed URL Process
Two types of signers:
- Trusted Key Group (Recommended)
- Can leverage API’s to create and rotate keys (and IAM for API security)
- An AWS Account that contains a CloudFront Key Pair
- Need to manage keys using the root account and the AWS console
- This is NOT RECOMMENDED because the root account should not be used for anything
- Can’t be automated because there are no API’s to manage this CloudFront key pair
In the CloudFront distribution, create one or more trusted key groups.
Then generate your own public/private key. The private key is used by your application (e.g. EC2) to sign URLs and the public key (uploaded) is used by CloudFront to verify URLs
CloudFront – Pricing
- Cost of data out per edge location varies.
- You can reduce the number of edge locations for cost reduction. There are three price classes
- Price Class All: All regions – best performance but more expensive
- Price Class 200: Most regions but excludes the most expensive regions
- Price Class 100: Only the least expensive regions (North America and Europe)
CloudFront – Multiple Origin
Use Case: You may want to route to different origins based on the Content-Type or path etc.
CloudFront – Origin Groups
To increase high-availability and do failover in case one origin has failed.
Origin Groups consist of one primary and one secondary origin – if the primary origin fails, CloudFront fails over to the secondary origin.
CloudFront – Field Level Encryption
Protect user sensitive information through the application stack.
Adds an additional level of security along with HTTPS.
Sensitive information encrypted at the edge closest to the user.
Uses asymmetric encryption.
- How it works
- Specify a set of fields in POST request that you want to be encrypted (up to 10 fields)
- Specify the public key to encrypt them
Docker allows you to package apps into containers. Containers can be run on any OS that can run the Docker engine.
Docker images are stored in Docker Repositories such as Docker Hub or Amazon ECR (Elastic Container Registry).
Containers are managed by a container management platform:
- ECS – AWS platform
- Fargate – AWS Serverless
- EKS – Managed Kubernetes Platform
ECS Clusters (Classic)
ECS Clusters are a logical grouping of EC2 instances.
These EC2 instances run the ECS agent (Docker container)
The ECS agent registers the instances to the ECS cluster.
The EC2 instances run a special AMI made specifically for ECS.
The EC2 instance will require the correct IAM permissions to register with the ECS cluster (ECS Agent).
Creating an ECS Cluster automatically creates an autoscaling group which can be viewed in the EC2 autoscaling dashboard.
ECS Task Definition
Task Definitions are metadata written in JSON to tell ECS how to run a docker container.
- Image name
- Port Binding for the Container and Host
- Memory and CPU required
- Environment Variables
- Networking Information
When creating a task definition a Task Role can be assigned (an IAM role). This is important when troubleshooting because if a Task cannot perform any operations (e.g. pull an image from ECR) then it is missing a task role.
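A minimal task definition showing these fields might look like the following (the family name, role ARN, account ID and ECR image are hypothetical); `"hostPort": 0` requests a dynamically assigned host port:

```json
{
  "family": "web-app",
  "taskRoleArn": "arn:aws:iam::123456789012:role/web-app-task-role",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/web-app:latest",
      "memory": 512,
      "cpu": 256,
      "portMappings": [
        { "containerPort": 80, "hostPort": 0 }
      ],
      "environment": [
        { "name": "ENV", "value": "production" }
      ]
    }
  ]
}
```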
Services allow you to run your task.
It defines how many tasks should run and how they should be run.
It ensures the number of tasks desired is running across the fleet of EC2 instances.
They can be linked to ELB, NLB and ALB if needed.
Service types can be REPLICA, which allows you to specify the number of tasks to run, or DAEMON, which automatically runs 1 task on each EC2 instance of the ECS cluster.
If you specify a fixed port mapping (e.g. 8080:80) and try to increase the number of running tasks, the extra tasks will not start because host port 8080 is already in use. Since the host port is fixed in the task definition, you won’t be able to run more than 1 task per instance.
ECS Service with Load Balancers
When you don’t specify a port mapping, ECS will assign a random port mapping to the container running. The question then becomes how do you direct traffic to all of these running containers when the ports are dynamically changing?
The answer is to use an Application Load Balancer with dynamic port forwarding.
Load Balancers can only be added to a service when creating a new service.
The Application Load Balancer allows containers to use dynamic port mapping (multiple tasks allowed per container instance).
Then the Security Group for the ECS Cluster must be updated to allow traffic from the ALB on any port:
Once this is setup it will allow you to run 4 containers on 2 instances with the ALB automatically directing traffic to those containers.
ECR is a private Docker image repository hosted in AWS.
Access to ECR is controlled using IAM policies. So if an image cannot be pulled it is most likely due to a permissions issue.
How to authenticate with ECR from the CLI:
- AWS CLI v1 login command
$(aws ecr get-login --no-include-email --region eu-west-1)
- The output of the ecr get-login command should be executed.
- AWS CLI v2 login command
- Uses pipes instead.
aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin 1234567890.dkr.ecr.eu-west-1.amazonaws.com
- First part of the command gets the password and then pipes it to the second part of the command.
The standard docker push and pull commands can then be used to push and pull images:
Originally, to run containers on AWS you would have to launch an ECS Cluster and create the EC2 instances.
If more capacity was required to scale, more EC2 instances would have to be added, i.e. you were managing your own infrastructure.
Fargate is serverless.
No need to provision EC2 instances, just create the task definitions and AWS will run the containers for you.
Fargate makes scaling easy: just increase the task count. Fargate means no more managing EC2.
No Host port mapping is required with Fargate, it will do it automatically for you.
- You should put multiple containers into the same task definition if
- Containers share a common lifecycle
- Containers are required to be run on the same underlying host
- Containers are required to share resources
- Containers share data volumes
ECS IAM Roles
EC2 instance runs an ECS Agent, therefore making the EC2 instance part of ECS.
EC2 Instance Profile
- The EC2 instance has an EC2 Instance Profile which is used by the ECS Agent.
- To make API calls to the ECS Service (e.g. register the cluster).
- Send container logs to CloudWatch.
- Pull Docker images from ECR.
ECS Task Role
Allows tasks to interact with other AWS Services.
- Allows each task to have a specific role with minimum permissions.
- Use different roles for different ECS services you run.
- Task role is defined in the task definition.
So you have an IAM role at the EC2 instance level for the ECS agent and then have task roles at each task level so that each task has the correct permissions
ECS Tasks Placement
When a task of type EC2 is launched, ECS must determine where to place it with the constraints of CPU, memory and available port.
When the ECS Service needs to place a new container on the EC2 instances, it needs to be able to figure out where to place it.
This also is true when the service scales in, it needs to be able to determine which task to terminate.
To assist with this task placement strategies and task placement constraints can be defined.
This is only for ECS ON EC2, NOT Fargate.
ECS Task Placement Process
- Task placement strategies are a best effort.
- When ECS places tasks, it uses the following process to select container instances:
- Identify the instances that satisfy the CPU, memory and port requirements in the task definition.
- Identify the instances that satisfy the task placement constraints.
- Identify the instances that satisfy the task placement strategy.
- Select the instance for task placement.
ECS Task Placement Strategies
- Binpack – place tasks based on the least available amount of CPU or memory. This minimises the number of EC2 instances in use (cost savings). Only when an EC2 instance is full will a new EC2 instance be launched.
- Random – place tasks randomly.
- Spread – place tasks evenly based on the specified value, e.g. if you have EC2 instances in multiple availability zones, spread tasks evenly across those zones.
Placement strategies can be mixed together e.g. spread on availability zone then on memory (binpack)
ECS Task Placement Constraints
- distinctInstance – place each task on a different container instance; never have two tasks on the same instance.
- memberOf – place tasks on instances that satisfy an expression, defined using the Cluster Query Language.
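The strategies and constraints above are expressed as JSON when running a task or service. A hypothetical combination (spread across availability zones, then binpack on memory, restricted to t2 instance types via a Cluster Query Language expression):

```json
{
  "placementStrategy": [
    { "type": "spread", "field": "attribute:ecs.availability-zone" },
    { "type": "binpack", "field": "memory" }
  ],
  "placementConstraints": [
    { "type": "memberOf", "expression": "attribute:ecs.instance-type =~ t2.*" }
  ]
}
```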
ECS Auto Scaling
- Service Auto Scaling
- CPU and RAM are tracked in CloudWatch at the ECS Service level
- Target Tracking – target a specific average CloudWatch metric (e.g. CPU usage should be 60% across my service)
- Step Scaling – scale based on CloudWatch alarms
- Scheduled scaling – Scale based on predictable changes
- ECS Service Scaling (task level) does not change the EC2 Auto Scaling (instance level) e.g. if you scale up the ECS Service, then the EC2 Instances will not scale up. It has its own separate scaling.
- Fargate Auto Scaling is much easier to setup because it is serverless.
- ECS Cluster Capacity Provider
- Allows for ECS and EC2 auto scaling at the same time.
- Used in association with a cluster to determine the infrastructure that a task runs on.
- For ECS and Fargate, the FARGATE and FARGATE_SPOT capacity providers are added automatically.
- For ECS on EC2 you need to associate the capacity provider with an auto scaling group. This means that the auto scaling group can add EC2 instances when needed.
- When a task or service is run, you define a capacity provider strategy to prioritize which provider the task runs on.
- This allows the capacity provider to automatically provision infrastructure for you.
ECS Container Termination
- RUNNING state
- When a container in the RUNNING state is terminated, it is automatically removed and deregistered from the cluster.
- STOPPED state
- When a container in the STOPPED state is terminated, it isn’t automatically removed from the cluster.
- It will need to be deregistered manually; then it will no longer appear as a resource in the ECS cluster.
ECS Data Volumes
EBS Volumes (EC2 Tasks only)
EBS Volumes can be mounted to EC2 instances.
Docker containers running on the EC2 instance can mount the EBS volume and extend their storage capacity.
The problem with this though is that if the task moves from one EC2 instance to another EC2 instance, it won’t be the same EBS volume data.
- Use Case
- Mount data volume between different containers on the same instance.
- Extend the temporary storage of a task
EFS File System
Works for both EC2 Tasks and Fargate tasks.
Ability to mount EFS volumes onto tasks.
Tasks can be launched in any availability zone and they will be able to share the same data in the same EFS file system.
Fargate + EFS = Fully Serverless with data storage without managing servers.
- Use case
- Persistent multi-AZ shared storage for your containers.
Bind Mounts Sharing data between containers
Works for both EC2 tasks (using local EC2 instance storage) and Fargate tasks (get 4GB for volume mounts)
Useful to share an ephemeral storage between multiple containers part of the same ECS task.
Good for sidecar container pattern where the sidecar can be used to send metrics or logs to other destinations. Separation of concerns.
- EC2 instances must be created
- The `/etc/ecs/ecs.config` file must be configured with the cluster name.
- `ECS_ENABLE_TASK_IAM_ROLE` must be enabled in the ECS config file to allow ECS tasks to assume IAM roles.
- The EC2 instance must run an ECS agent
- EC2 instances can run multiple containers of the same type provided that:
- A host port is not specified (only container port)
- You should use an ALB with dynamic port mapping
- The EC2 instance security group must allow traffic from the ALB on all ports.
- ECS tasks must have an IAM role to execute against AWS
- Security groups operate at the instance level, not the task level.
- Where private Docker images are stored.
- Integrated with IAM
- AWS CLI v1 login – run special command
- aws ecr get-login generates a docker login command.
- AWS CLI v2 login command
- Uses a pipe instead
- aws ecr get-login-password is piped into docker login
- Docker push and pull works and need the repository name, image name and tag.
- If the EC2 instance cannot pull a Docker image, check the IAM policy.
- Fargate is serverless (no EC2 to manage)
- AWS provisions containers and assigns an ENI (Elastic Network Interface)
- Fargate containers are provisioned by the container spec (CPU/RAM)
- Fargate tasks can have IAM roles to execute actions against AWS
- ECS does integrate with CloudWatch Logs
- Logging needs to be setup at the task definition level.
- Each container will have a different log stream
- The EC2 instance profile needs to have the correct IAM permissions.
- Use IAM Task Roles for your tasks
- Task placement strategies
- BinPack (reduce cost)
- Spread (spread across AZ’s)
- Service Auto Scaling
- Target tracking
- Step scaling
- Cluster Auto Scaling through Capacity Providers allows scaling EC2 instances inline with the ECS service.
Elastic Beanstalk is a Platform as a Service (PaaS): a developer-centric view of deploying an application on AWS.
Elastic Beanstalk is just a layer but underneath it uses EC2, ASG, ELB, RDS etc.
Beanstalk is free to use, you just pay for the resources used underneath.
Elastic Beanstalk is a managed service. It manages the instance configuration and OS and the deployment strategy. Deployment strategy is configurable.
All the developer is responsible for is the code.
- Three architecture models
- Single Instance Deployment – good for development
- LB + ASG – good for production or pre-production web applications
- ASG only – good for non-web apps in production e.g. background workers.
- Elastic Beanstalk has three components
- Application – a collection of Beanstalk components (environments, versions, configurations)
- Application version – each deployment gets assigned a version
- Environment name – dev, test, prod, etc.
Beanstalk gives you full control over the lifecycle of environments, including deploying application versions to environments, promoting application versions to the next environment, and rolling back to previous application versions.
Application versions can be deployed to many environments.
When configuring advanced options for a new environment, once a load balancer type has been chosen it cannot be changed, e.g. if you choose an NLB, you can’t change it to an ALB in the future.
Elastic Beanstalk Deployment Modes
- Single Instance (good for development environments)
- 1 EC2 instance
- 1 Elastic IP
- 1 ASG
- A database
- In 1 AZ
- DNS name maps straight to the elastic IP
- High Availability with or without load balancer (good for prod)
- Multiple AZ’s
- Several EC2 instances each with their own Security Groups
- Multi AZ
- Elastic Load Balancer
- ELB exposes DNS name which will be wrapped by the elastic beanstalk DNS name
Elastic Beanstalk Deployment Options for Updates
- All at once
- Deploy all in one go
- Fastest, but instances aren’t available to serve traffic for a bit.
- Rolling
- Update a few instances at a time (bucket) and then move onto the next bucket once the first bucket is healthy.
- Rolling with additional batches
- Like rolling but spins up new instances to move the batch, so that the old application is still available.
- Immutable
- Spins up new instances in a new ASG
- Deploys the version to these instances
- Then swaps all the instances when everything is healthy
All At Once – Elastic Beanstalk Deployment
- Stops all instances at once, and then deploys the new version all at once
- Fastest deployment
- Application has downtime
- Great for quick iterations
- No additional cost
Rolling – Elastic Beanstalk Deployment
- Application is running below capacity whilst buckets are updated
- Can set the bucket size
- Application will be running both versions simultaneously
- No additional cost
- Deployment time can be long if there are a lot of instances and a small bucket size.
Rolling with additional batches – Elastic Beanstalk Deployment
- Application is running at capacity (sometimes at over capacity)
- Can set the bucket size
- Application is running both versions simultaneously
- Small additional cost
- Additional batch is removed at the end of deployment
- Longer deployment
- Good for production
Immutable – Elastic Beanstalk Deployment
- Zero downtime
- New code is deployed to new instances on a temporary ASG
- High cost – double capacity
- Longest Deployment
- Quick rollback in case of failures (terminate the new ASG)
- Great for prod
Blue / Green – Elastic Beanstalk Deployment
- Zero downtime and release facility
- Create a new “stage” environment and deploy v2 there.
- The new environment (green) can be validated independently and roll back if there are issues.
- Route53 can be setup using weighted policies to redirect a little bit of traffic to the stage environment.
- Using Beanstalk you can “swap URLs” when done with the environment test.
- A very manual process
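The weighted Route 53 setup can be sketched as a change batch for the `aws route53 change-resource-record-sets` CLI command, here sending roughly 10% of traffic to the green (stage) environment (the domain and environment URL are placeholders):

```json
{
  "Comment": "Send ~10% of traffic to the green (stage) environment",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "app.example.com",
      "Type": "CNAME",
      "SetIdentifier": "green",
      "Weight": 10,
      "TTL": 60,
      "ResourceRecords": [{ "Value": "green-env.eu-west-1.elasticbeanstalk.com" }]
    }
  }]
}
```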
Traffic Splitting – Elastic Beanstalk
- Canary testing
- New application version is deployed to a temporary ASG with the same capacity.
- A small amount of traffic is sent to the temporary ASG for a configurable amount of time.
- Deployment health is monitored
- If there’s a deployment failure this triggers automated rollback (very quick)
- No application downtime
- New instances are migrated from the temporary to the original ASG if everything is healthy.
Elastic Beanstalk CLI
There is a CLI that can be installed, called the EB CLI, which makes it easier to work with Beanstalk.
e.g. eb create, eb status
This is useful for automating deployment pipelines.
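A typical EB CLI session might look like the following sketch (requires the EB CLI installed and AWS credentials configured; the app, platform and environment names are placeholders):

```shell
eb init my-app --platform node.js --region eu-west-1  # set up the project folder
eb create dev-env        # provision a new environment
eb status                # check environment health
eb deploy                # zip and upload a new app version, then deploy it
eb logs                  # fetch instance logs
eb terminate dev-env     # tear the environment down
```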
Elastic Beanstalk Deployment Process
- Describe dependencies (e.g. package.json for Node.js)
- Package the code and the dependency descriptions as a zip file
- Console: Upload the zip file using the console (creates a new app version), then deploy
- CLI: Create a new app version using CLI (automatically uploads the zip) and then deploy
Beanstalk will deploy the zip on each EC2 instance, resolve the dependencies and start the application.
Elastic Beanstalk Lifecycle Policy
- Can store at most 1000 application versions
- If you don’t remove old versions, you won’t be able to deploy new versions
- Phase out old application versions using a lifecycle policy
- Based on time (old versions removed)
- Or based on space (when there are too many versions)
- Versions that are currently used won’t be deleted.
- Option not to delete the source bundle in S3 – prevent data loss
- Removes them from the beanstalk interface.
Zip files containing the code are deployed to Beanstalk.
All the parameters set in the UI can be configured with code using files.
- All configuration files must be in the .ebextensions/ directory in the root of the source code.
- YAML / JSON format.
- File extension must be .config (e.g. logging.config).
- You can modify some default settings using option_settings.
- You can add resources such as RDS, ElastiCache etc.
Resources managed by .ebextensions get deleted if the environment gets deleted. e.g. RDS
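A sketch of what such a file can look like — one option_settings block setting an environment variable and a Resources block adding an ElastiCache cluster (names and sizes are illustrative):

```yaml
# .ebextensions/resources.config — must live in .ebextensions/ at the root of the bundle
option_settings:
  aws:elasticbeanstalk:application:environment:
    CACHE_HOST: my-cache            # environment variable visible to the app
Resources:
  MyElastiCache:                    # extra resource, defined in CloudFormation syntax
    Type: AWS::ElastiCache::CacheCluster
    Properties:
      CacheNodeType: cache.t3.micro
      Engine: redis
      NumCacheNodes: 1
```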
Beanstalk and CloudFormation
Under the hood, Beanstalk uses CloudFormation (infrastructure as code).
This allows you to define AWS resources in the .ebextensions files e.g. S3 or anything else.
The Beanstalk UI only has limited options for configuring things, but .ebextensions allows you to configure anything using CloudFormation.
Beanstalk cloning allows you to clone an environment with the exact same configuration.
This is useful for deploying a test version of your application with the same settings (e.g. clone the prod environment to test against).
- All resources and configurations are preserved
- Load Balancer type and configuration
- RDS database type (but not the data)
- Environment variables
Other settings can be changed after cloning.
Elastic Beanstalk Migration
- Load Balancer
- After creating a Beanstalk environment you cannot change the Load Balancer type (only the configuration)
- If you wanted to change the load balancer, you would have to perform a migration.
- Steps to migrate:
- Create a new environment with the same configuration except the Load Balancer (can’t clone)
- Deploy the application on the new environment
- Re-route traffic to the new environment (load balancer) by performing a CNAME swap or Route 53 update
- RDS can be provisioned with Beanstalk which is useful for dev / test environments.
- But this is not ideal for prod as the database lifecycle is now tied to the Beanstalk environment lifecycle. e.g. if you delete prod environment, you lose your prod data.
- The best option for prod is to separately create an RDS database and provide the EB application with the connection string.
- How to decouple RDS from existing beanstalk environment (migrate)
- Create a snapshot of RDS DB (as a safeguard)
- Go to RDS console and protect the database from deletion.
- Create new beanstalk environment without RDS.
- Point new application to existing RDS. (using env variable)
- Perform a CNAME swap (blue/green) or Route 53 update. (confirm it is working)
- Terminate the old environment (RDS won’t be deleted).
- Delete the CloudFormation stack as it will be in the DELETE_FAILED state.
Beanstalk with Docker (Single Docker Container)
- Applications can be run as a single Docker container using either:
- Dockerfile – EB will build and run the Docker container.
- Dockerrun.aws.json (v1) – describes where an already-built Docker image is (e.g. DockerHub / ECR), plus ports, volumes, environment variables etc.
Beanstalk in Single Docker Container does not use ECS, just EC2.
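A sketch of a v1 Dockerrun.aws.json pointing at a pre-built image (the account id, image name and port are placeholders):

```json
{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-app:latest",
    "Update": "true"
  },
  "Ports": [
    { "ContainerPort": 8080 }
  ]
}
```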
Beanstalk with Multi Docker Container
Beanstalk with Multi Docker Container helps run multiple containers per EC2 instance in EB.
- This will create:
- ECS Cluster
- Multiple EC2 Instances, configured to use the ECS cluster.
- Load Balancer (in high availability mode).
- Task definitions and execution
To do this, configure a Dockerrun.aws.json (v2) file at the root of the source code. This file will be used to generate an ECS task definition.
The Docker images must be pre-built and stored in ECR for example.
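A sketch of a v2 Dockerrun.aws.json describing two containers, which Beanstalk turns into an ECS task definition (image names, memory and ports are placeholders):

```json
{
  "AWSEBDockerrunVersion": 2,
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/web:latest",
      "essential": true,
      "memory": 256,
      "portMappings": [
        { "hostPort": 80, "containerPort": 8080 }
      ]
    },
    {
      "name": "worker",
      "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/worker:latest",
      "essential": false,
      "memory": 256
    }
  ]
}
```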
Beanstalk and HTTPS
Load an SSL certificate onto the Load Balancer
- This can be done either by:
- Loading the certificate from the Console (EB console, load balancer configuration)
- Loading the certificate from code, using a config file in .ebextensions (e.g. securelistener-alb.config)
- The certificate can be provisioned using ACM (AWS Certificate Manager) or the CLI
- Must configure a security group rule to allow incoming traffic on port 443
- Beanstalk can redirect HTTP -> HTTPS. This can be done through
- Configuring instances to redirect HTTP to HTTPS
- Configure the ALB (only works on ALB) with a rule to perform the redirect.
- Make sure the health checks are not redirected
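Loading the certificate from code can be sketched as an .ebextensions file (the certificate ARN is a placeholder; the namespace shown applies to an ALB):

```yaml
# .ebextensions/securelistener-alb.config — attach an HTTPS listener on port 443
option_settings:
  aws:elbv2:listener:443:
    ListenerEnabled: 'true'
    Protocol: HTTPS
    SSLCertificateArns: arn:aws:acm:eu-west-1:123456789012:certificate/example-cert-id
```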
Web Server vs Web Worker Environment
If the application performs tasks that are long to complete, offload these tasks to a dedicated worker environment. e.g. processing a video
Decouple application into two tiers.
You can define periodic tasks in a file called cron.yaml.
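For a worker environment the periodic tasks live in a cron.yaml at the root of the source bundle; a sketch (the task name, URL and schedule are illustrative):

```yaml
# cron.yaml — Beanstalk POSTs to the URL on the given schedule
version: 1
cron:
  - name: "nightly-report"
    url: "/tasks/report"      # endpoint on the worker application
    schedule: "0 2 * * *"     # cron expression: every day at 02:00 UTC
```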
Custom platforms are advanced and allow you to define the beanstalk configuration from scratch. e.g. OS, scripts.
The use case for this is an app whose language is incompatible with Beanstalk and that doesn’t use Docker.
- To create your own platform:
- Define an AMI using a platform.yaml file
- Build that platform using the Packer software (open source tool to create AMIs)
- The difference between a Custom Platform vs Custom Image (AMI)
- Custom Image is to tweak an existing Beanstalk platform (Python, Java etc.)
- Custom Platform is to create an entirely new Beanstalk Platform
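The platform definition can be sketched as a platform.yaml at the root of the platform project, pointing Packer at a template (file names and flavor are illustrative):

```yaml
# platform.yaml — entry point for a custom Beanstalk platform build
version: "1.0"
provisioner:
  type: packer                              # Beanstalk invokes Packer to build the AMI
  template: custom_platform_template.json   # Packer template describing the image
  flavor: amazon                            # base OS flavor for the build
```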