Tags Archives: aws

AWS – Core Services

By editor on November 11, 2022 in

AWS Solutions Architect Associate Level

AWS Core Services

Compute: Services that replicate the traditional role of local physical servers for the cloud, offering advanced configurations including autoscaling, load balancing, and also serverless architectures

Networking: Application connectivity, access control, and enhanced remote connections

Storage: Diverse storage platforms which provide for a range of both immediate accessibility and long-term backup needs

Database: Managed data solutions for use cases that require multiple data formats, eg relational, NoSQL, or caching

Application Management: Monitoring, auditing, and configuring AWS account services and running resources

Security and identity: Services for managing authentication and authorization, data and connection encryption, and integration with third-party authentication management systems

Regions: There are currently 21 AWS regions

Example:

Region name Region Endpoint
US East (Ohio) us-east-2 us-east-2.amazonaws.com

Endpoint addresses are used to access AWS resources remotely from within application code or scripts.

Prefixes like ec2 , apigateway , or cloudformation are added to endpoint addresses to specify particular AWS services.

Eg: for the AWS CloudFormation service in us-east2, the endpoint will be:

cloudformation.us-east-2.amazonaws.com

You then organize your resources in a region using one or more virtual private clouds or VPCs.

A VPC is effectively a network address space within which you can create your own network subnets according to your own requirements and then associate them with AWS availability zones.

Since low-latency response times are extremely important in most cases, some AWS services are provided from speficic edge network locations. Such 0services include Amazon CloudFront, Amazon Route 53, AWS Firewall Manager, AWS Shield, and AWS WAF.

AWS Support Plans

The Basic AWS Support Plan is free with every account and provides access to AWS customer service, plus documentation, white papers, and the AWS support forum. Customer service covers billing and account support issues.

The Developer Support Plan starts at $29/month and adds access for one account holder to a Cloud Support associate, plus limited general guidance and “system impaired” response.

The Business Support Plan from $100/month provides faster-guaranteed response times to unlimited users for assistance with “impaired” systems, personal guidance and troubleshooting, plus a support API.

The Enterprise Support Plan costs from $15,000/month and is for larger organizations with mission-critical operations and covers all the above features, plus direct access to AWS solutions architects for operational and design reviews, your own technical account manager or TAM, and a “support concierge”.

Other sources of practical support

The AWS Community Forums are open to anyone with a valid AWS account at
https://forums.aws.amazon.com

Official AWS documentation is available at https://aws.amazon.com/documentation

For the exam

Understand the AWS platform architecture: the regions and availability zones

This structure allows for replication to enhance availability and also allows for process and resource isolation for security and compliance purposes. Design your deployment to take advantage of those features.

Understand how to use the AWS administration tools

Mostly you will probably use the AWS Administration Console Dashboard but you may prefer to use the AWS CLI for some procedures, and possibly also via an AWS SDK from within your application code.

Understand how to choose an AWS Support Plan

According to individual enterprise needs, make sure you are familiar with the various options and their differences.

AWS – The AWS Well-Architected Framework

By editor on November 9, 2022 in

The six pillars of the AWS Well-Architected Framework

The general design principles and specific AWS best practices are organized into six conceptual areas.

These conceptual areas form the pillars of the AWS Well-Architected Framework.

These six pillars are:

operational excellence,

security,

reliability,

performance efficiency,

cost optimization, and

sustainability.

AWS Well-Architected Tool

AWS provides a service for reviewing your workloads at no charge called the AWS Well-Architected
Tool (AWS WA Tool).

This is an AWS cloud service that provides a consistent process for reviewing and
measuring your architecture using the AWS Well-Architected Framework.

The AWS WA Tool provides recommendations for making workloads more reliable, secure, efficient, and cost-effective.

AWS Well-Architected Labs

There is also a repository of code and documentation called AWS Well-Architected Labs, which provides hands-on experience in implementing best practices with the AWS Cloud. See https://www.wellarchitectedlabs.com

AWS – Solutions Architect – Associate Level Syllabus

By editor on November 8, 2022 in

The five core specifications which should be applied when designing AWS Cloud architectures are:

resilience, high-performance, security and cost-optimization.

These also form the structure and syllabus of the AWS Solutions Architect Associate Diploma.

resilient – meaning multi-tier, fault-tolerant

high-performing – meaning elastic and scalable, both computing and storage

secure – meaning access is secure – both the system configuration as well as the data

cost-optimized – meaning compute, network, and storage solutions are cost-effective

Domain 1: Designing Resilient Architectures 30%

involves:

Designing a multi-tier architecture solution
Designing highly available and/or fault-tolerant architectures
Choosing appropriate resilient storage

Domain 2: Designing High-Performing Architectures 28%

involves:

Identifying elastic and scalable compute solutions for a workload
Selecting high-performing and scalable storage solutions for a workload

Domain 3: Designing Secure Applications and Architectures 24%

involves:

Designing secure access to AWS resources

Domain 4: Designing Cost-Optimized Architectures 18%

involves:

Identifying cost-effective storage solutions
Identifying cost-effective compute and database services
Designing cost-optimized network architectures

To pass and qualify for the Solutions Architect Associate (SAA) Diploma, AWS recommends that you have:

a minimum of one year of hands-on experience designing systems on AWS

hands-on experience using the AWS services that provide compute, networking, storage, and databases

Through studying for the SAA Diploma you will develop:

the ability to define a solution using architectural design principles based on customer requirements

the ability to provide implementation guidance

the ability to identify which AWS services meet a given technical requirement

an understanding of the five pillars of the Well-Architected Framework

an understanding of the AWS global infrastructure, including the network technologies used to connect them

an understanding of AWS security services and how they integrate with traditional on-premises security infrastructure

AWS Solutions Architect Associate Diploma Cheat Sheet

By editor on October 24, 2022 in

AWS Solutions Architect (Associate) Cheatsheet

Look for the key/s in every exam – know what the question is asking and expecting from you.

Some common exam keywords

Highly Available = apply Multiple Availability zones

Fault-Tolerant = apply Multiple Availability zones and an application environment replica to provide for fail-over. Application failure in one AZ should automatically start recovery in another AZ/region.

Real-Time Data = this probably means applying Kinesis services

Disaster = Failover to another region

Long Term Storage = means Glacier or Deep Glacier storage

Managed Service = means S3 for data storage, Lambda for computing, Aurora for RDS and DynamoDB in the case of NoSQL databases

Use AWS Network = usually refers to communications between VPCs and AWS services, so you need to implement VPC gateways and endpoints for communication between these services.

EC2 (Elastic Compute Cloud) instances

These are the basic virtual computer building blocks for AWS.

EC2s are built with predefined EC2 images called AMIs (Amazon Machine Images) – eg various Linux distribs, Windows or MacOS.

You can also build EC2s using your own AMIs.

Types of EC2s

On-Demand Instances – cost is fixed price per hour when instance is running.

Pros: High Availability, you pay only for actual use, good for predictable short-term computing tasks.
Cons: may be expensive over longer time periods.

Use-Cases: development, providinng extra capacity in the event of unforeseen computing demand.

Reserved Instances – fixed term usage contract for 1–3 years.

Pros: cheaper than On-Demand, High Availability, suited to reliable long-term computing tasks.
Cons: – pricier if the instance is underutilized over time.
Use-Cases: Webservers, Relational databases, providing core infrastructure with high availability requirements.

Spot Instances – price fluctuates according to overall current market demand. Offered to highest bidder. Tends to be much cheaper than On-Demand or Reserved Instances.

Pros: very low-cost option, good for short-term computing tasks which can be interrupted without serious loss.
Cons: AWS can terminate the instance any time, depending on market price you may need to wait some time until your offer is accepted.

Use-Cases: providing extra computing capacity for unpredictable workloads that can be interrupted. Eg if running a fleet of Reserved instances to house the webservers, then in the event of sudden website demand spikes you can request Spot instances to cover the additional workload.

Dedicated Hosting – for these you reserve a whole hardware server or server rack for your instance/s – in other words, it is not a virtual machine anymore.

Pros: ideal for high-security and isolated computing tasks.
Cons: the most costly EC2 service.
Use-Cases: Enterprise databases that have high-security needs.

Spot-fleets – these are used for combining all the services listed above into a more cost-efficient structure. For example, you could use a set of Reserved instances to cover your core workload, plus a mix of On-Demand and Spot instances to handle workload spikes.

Spot fleets are especially suited for Auto Scaling Groups.

On-Demand: no long-term commitment, fixed price per second. Best for short-term, irregular workloads that shouldnt be interrupted

Reserved: long-term commitment required of 1 to 3 years. Used for high-uptime workloads or basic core-infrastructure

Spot: uses auction-based bidding for unused EC2 resources — very low cost, but workload can be interrupted if your offer falls below market levels

Dedicated Host: a whole rack or server is dedicated to your account, thus no shared hardware involved — best used only for very high-security requirements due to high cost.

Dedicated Instance: instance is physically isolated at host level, but may share hardware with other instances in your AWS account

EC2 Storage

EBS = Elastic Block Storage

EBS – is a virtual file system drive attached to an EC2 instance. Remember you can’t attach the same EBS to multiple instances.

However you can make a snapshot and then attach the snapshot to another instance.

EBS has the following variants:

EBS(GP2) – General Purpose SSD
Pros: provides balanced work performance and is suitable for most applics
Cons: I/O throughput is max. 16,000 IOPS – make sure you remember this figure for the exam
Use-Cases: serve as system boot drive, good for medium-level workloads

EBS(I01, I02) – Provisioned IOPS SSD
Pros: good for low latency and high throughput up to 64,000 IOPS
Cons: is a pricey storage type
Use-Cases: good for relational databases and high-performance computing (HPC)

EBS(ST1) – Throughput Optimized HDD
Pros: is a lower-cost HDD volume good for frequent access, throughput-intensive workloads
Cons: has low IOPS
Use-Cases: good for use as storage for streaming applications and Big Data storage requirements

EBS(SC1) – Cold HDD
Pros: is the cheapest EBS type
Cons: has lower durability and availability
Use-Cases: – providing storage for files that need high throughput and are infrequently accessed.

EBS volumes are located in a single AZ. If you need to provide High Availability and/or Disaster Recovery thren you need to set up a backup process to create regular snapshots and then store them in S3.

EBS is a provisioned storage type. You are charged for the storage capacity you have defined.

If you require network drive storage for your instances, then you need to attach an EFS volume to your instances.

EFS – Elastic File System
EFS operates only with Linux and serves as a network drive that enables data sharing between EC2 instances.
For Windows instances, you must use “Fx for Windows” as the network file system.
EFS is an automatically scalable storage service and you only pay for the storage space you are using, plus data transfer fees.

There are 2 types of EFS:

General Purpose Performance Mode – this is great for low latency, content management, and storage.
Max I/O Performance Mode – this is good for big data and media processing requirements that need high IOPS, but has a trade-off in terms of higher latency.

Finally, Instance Store

Instance Store is temporary high-performance storage with very high IOPS. Instance Store is ideally suited to cache and session storage. However, it is not a durable storage solution. If the instance is terminated, then all data in the instance store is deleted.

Some definitions to memorize for the exam:

User Data – defines the specific commands to be run during the instance boot process, for example to check for software package updates.

Meta data – this describes instance configuration. The meta data can be accessed through SSH by calling this address http://169.254.169.254/latest/meta-data/

Golden AMI – this is a custom instance image, that has all the necessary software and application pre-installed. Ideal for quick recovery and high availability – for the latter you have to copy the desired Golden AMI to different AZs or regions.

Placement Groups – EC2 instances can be grouped together in 3 different configurations:

Cluster = All instances are located in the same hardware rack and have the highest network bandwidth. Great for machine learning and high-performance computing (HPS)

Spread = Instance are distributed across multiple AZ in order to maintain high availability
Partition = The instances are grouped in smaller groups and each group occupies its own hardware rack. It combines the benefits of Cluster placement with High Availability through a spreading approach.

In the exam, you will be tested on different scenarios for the most suitable EC2 configuration. Here are some general tips:

Use EBS(io1/io2) for application with high IOPS requirements (>16,000)
Use User Data to ‘bootstrap’ scripts during instance startup

Do NOT use Spot instances for critical workloads

Strife for High-Availability by using EC2 instances in multiple Availability Zones and regions. You can have your production environment up in running one region, and the backup instances are either in Stop or Hibernate status in a different region. In case your production AZ becomes unavailable, you can switch to your backup instances and keep the production workload running from different region (Later I’ll explain the concept of Elastic Load Balancing and Auto
Scaling on AWS)

Use IAM roles to give permissions to EC2 instance

And last but not least:
DO NOT STORE ANY CREDENTIALS INSIDE THE EC2 INSTANCE

There are better ways to provide secure access by using IAM roles and AWS Secret Manager.

Simple basic setup with Internet – http/s Webserver (EC2) – MySQL Database (RDS)

Webserver has to be located in the public subnet of your VPC, with the MySQL DB instances located in the private subnet.

Security Group for the Webserver will be open for all incoming requests on port 80 (HTTP) and/or 443 for HTTPS to all IP addresses,

The Security Group for the MySQL DB needs only port 3306 open for the Security Group for communication with the Webserver.

Load Balancer and ASG Setup

Internet – ELB – ASG – Webserver (EC2) – MySQL DB (RDS)

This will generally require HA – high availability – plus fault tolerance, together with the correctly configured security requirements.

Re security: you need to locate the ELB in the public subnet and open the HTTP/s ports, and locate Webservers and SQL DB instance in a private subnet.

The difference between the WAF (Web Application Firewall) and AWS Shield:

WAF is used to protect against attacks such as cross-site scripting. AWS Shield is used to protect against DDOS attacks.

Serverless web application:

Internet – API Gateway – Lambda – DynamoDB or/and S3 Storage

Serverless applications tend to be HA since they operate in multiple AZs.

But you have to add some fault-tolerance to provide for more HA. Generally this involves adding Cloudfront to provide better caching and latency, setting up a Multi-region application for disaster recovery, configuring SQS messaging, and deploying Auto Scaling for DynamoDB.

VPC – Virtual Private Cloud

This is an important area of the exam.

VPC holds all your AWS virtual server infrastructure. Remember that a single VPC can span multiple AZs in a region. But – it can’t span multiple regions. For this you need to use VPC peering to connect VPCs located in different regions.

VPC can contain public and private subnets. They are configured to use specific private IP ranges.

Public subnets are open to the Internet, provided you have installed an IGW or Internet Gateway on your VPC.

Private subnets do not have access to the Internet unless you add a NAT Gateway or NAT Instance in the public subnet and enable traffic from private instances to the NAT Gateway.

Bastion Hosts

Bastion hosts are used to permit secure SSH admin access from the internet into the instances of a private subnet.

Bastion host must be located in a public subnet and the private instance must permit traffic from the Bastion host Security Group on Port 22.

network Access Control Lists (NACLs) = define rules which apply for incoming and outgoing network traffic to/from your subnet.

Remember:

Default NACL config ALLOWS ALL traffic in and out – ie all ports are open,
NACLs can block traffic by explicit Deny, for example you can block specific IPs or IP ranges from accessing your own subnet.

Route Tables = these define network traffic routes in and out of your VPC. Route tables are configured on a subnet basis.

VPC Endpoints = provide access to all AWS services that are not part of the VPC.

Remember there 2 types of Endpoints:

Gateway Endpoint = this enables use of S3 and DynamoDB
Interface Endpoint = this permits use of all other AWS services

CIDR = Classless Inter-Domain Routing.

Know how the CIDR range is defined.

/32 = 1 IP
/31 = 2 IPs
/30 = 4 IPs

and doubles for each step…

remember that the first 5 IPs have to be reserved for the subnet system, so cannot effectively be used, so that means you need at least a /29 CIDR of 8 IPs to do anything!

Some examples:

192.168.10/32 = /32 = IP range for a single IP address 192.168.10.255

192.168.10/31 = /31 = IP range for 2 IP addresses 192.168.10.254 and 192.168.10.255

192.168.10/30 = /30 = IP range for 4 IP addresses 192.168.10.252, 192.168.10.253, 192.168.10.254 and 192.168.10.255

VPC Gateway = allows connectivity to your on-premises network via public internet.

NAT Gateway = enables internet connection from instances located in a private subnet (i.e. often used to facilitate software updates).

NAT Gateway and NAT Instance

Know the difference between NAT Gateway and NAT Instance.

Usually NAT Gateway is preferred as NAT instances have to be managed by the user and don’t allow for auto-scaling.

NAT Gateway by contrast is an AWS managed service which also scales automatically on demand.

For HA or high availability of NAT Gateways where your VPC spans multiple AZs, you need to use individual NAT Gateways located in each AZ.

NAT Gateway & NAT Instance

used to connect to public internet from your private subnet/s

2 different types:

NAT Instance — managed by user with no default auto-scaling, more complex, greater overhead

NAT Gateway — is an AWS Managed Gateway, scalable on demand, less overhead, higher availability vs NAT Instance

VPC peering = peering enables you to establish connections between different VPCs.

Remember that the CIDR-ranges of the connected VPCs must not overlap. And also: VPC peering does NOT work as transitively – ie if you have 3 VPCs (A, B and C) and then set up peering between A and B, and between B and C, the instances located in VPC A cannot communicate with instances in VPC C…. To do this you must also peer VPC A with VPC B.

AWS Transit Gateway = this is a VPC hub that connects multiple VPCs in a web-type network.

In this way you can connect VPCs and your on-premises VPCs and they can all communicate transitively. However – remember also that the CIDR ranges of the VPCs must not overlap.

Make sure you revise all of these topics for the exam:

AWS AppSync
Aurora Serverless
AWS ParallelCluster
AWS Global Accelerator
AWS DataSync
AWS Directory Service
Amazon FSx
Elastic Fabric Adapter
Elastic Network Adapter
Amazon FSx

VPC or Virtual Private Cloud (VPC) Networking

Remember VPCs can span multiple availability zones within one region (but not multiple regions).

They can contain multiple private and public subnets.

A public subnet contains a route to an internet gateway of IGW.

A private subnet by itself has no internet access.

If required then create a NATGW – NAT Gateway, or else EC2 NAT Instance (the old way – not so recommended by AWS) You can also whitelist traffic to these.

For SSH access from internet to a private subnet resource you need to set up a bastion host on a public subnet.
Then configure your Security Groups and Network Access Control Lists to forward traffic on SSH port 22.

Disaster Recovery

Recovery Objectives

Recovery Time Objective — time needed to bring services back online after a major incident
Recovery Point Objective — the data loss measured in time

Backup and Restore – low cost

Pilot Light – for storing critical systems as a template which you can use to scale out resources in event of a disaster

Warm Standby — a duplicate version of business-critical systems that must always run.

Multi-Site — for multiple locations, has lowest RTO Recovery Time Objective and RPO Recovery Point Objective but inccurs highest cost

Route Tables

These decide how traffic can flow within your VPC

Routes contain a destination and a target, e.g. 0.0.0.0/0 (CIDR Destination) and igw-1234567890. The CIDR block covers all the defined IPv4 addresses of the subnet and points them to the IGW.

a default or aws main route table exists for each newly created subnet

note the main route table can’t be deleted

but you can add, modify, and remove routes in this table

a subnet can only have one single route table

but – the same route table can be attached to multiple subnets

route tables can also be attached to your VPG Virtual Private Gateway or IGW Internet Gateway so you can define how the traffic flowing in and out of your VPC is to be routed.

Your VPC always contains an implicit router to which your route tables are attached to.

Virtual Private Gateway (VPC Gateway)

VPG is required if you want to connect your AWS VPC to an on-premises lan.

Network Access Control List (NACLs)

these operate at the subnet level and are STATELESS
you can use NACLs to define block & allow rules

By default NACLs allow traffic in for all ports in both directions

but – return traffic must be explicitly allowed

Security Groups (SGs)

SGs operate at the EC2 instance level and are STATEFUL
SGs only define ALLOW rules – no deny!

Default SG allows communication of components within the security group, ie allowing all outgoing traffic and blocking all incoming traffic

BUT – return traffic is implicitly allowed – in contrast with NACLs

SGs can be attached or removed from EC2 instances at any time without having to stop or terminate the instance.

the rules for SGs must always specify CIDR ranges and never just a single IP

BUT – if you want to have a dedicated IP, e.g. 10.20.30.40 then you need to define it as a “CIDR range” for just a single IP by setting its subnet mask with /32 eg 10.20.30.40/32

VPC Endpoints

used for accessing AWS Services which are not part of your VPC

2 different types of VPC endpoint:

Gateway Endpoint — used for DynamoDB and S3
Interface Endpoint — used for all other Services & works via AWS PrivateLink

VPC Peering

This connects different VPCs, including VPCs of other accounts

Very important: CIDR-ranges of the VPCs must not overlap!

VPC Peering connections are not transitive, thus A ← Peering → B ← Peering → C = means no connection between A and C

Transit Gateway
is a hub for VPCs to connect multiple VPCs with each other effectively into one giant VPC (can include on-premises VPC if desired)

Elastic IP Addresses (EIPs)

can be moved from one instance to another within multiple VPCs if within same region

Amazon Machine Image (AMI)

predefined image, e.g. for Linux or Windows

you can create your own AMIs by launching them with an instance, then modifying and saving it as a new custom AMI

AMIs contain one or more EBS snapshots or for instance-store-backed AMIs a template for root volume (ie the OS, an app server, or applications)

contains launch permissions for volumes to attach to the instance upon booting

AMIs can be EBS-backed or Instance Store-backed

Elastic File System (EFS)

is a Network drive

good for sharing data with multiple instances

EFS can also be attached to Lambda functions

Payment modes for EFS:

2 types:

General Purpose Performance Mode — for low latency requirements
Max I/O Performance Mode — best for high IOPS requirements, e.g. big data or media processing workloads; also has higher latency than General Purpose Mode

Elastic Block Storage (EBS)

EBS is a virtual block-based system drive
Important: can’t be used simultaneously for several instances: only one EC2 per time as it is not a network drive

Can make snapshots of EBS

if EBS volume is the root volume, then by default it gets deleted when EC2 instance is terminated

BUT – non-root volumes do not get terminated when instance is terminated

EBS is created in a SINGLE REGION – is not cross-region

For high availability/disaster recovery you need to take snapshots and save to S3

Pricing for EBS is according to defined storage capacity, not per data volumes transferred

EBS has several types: Know these!

Cold HDD: lowest-cost designed for less-frequently accessed workloads

SC1: up to 250 IOPS (250 MiB/s), with 99.8–99.9% Durability

Throughput Optimised HDD: this is low-cost for frequently accessed workloads

ST1: up to 500 IOPS (500 MiB/s), 99.8–99.9% Durability

General Purpose SSD: mid-balance between price and performance

GP2/GP3: provides up to 16.000 IOPS (250–1000MiB/s) with 99.8–99.9% Durability

Provisioned IOPS (Input/Output Operations Per Second): intended for high performance for mission-critical, low-latency, or high-throughput workloads

IO1/IO2: provides up to 64.000 IOPS (1000MiB/s) with 99.8–99.9% Durability

IO2 Block Express: offers up to 256.000 IOPS (4000MiB/s) and 99.999% Durability

Provisioned IOPS can be attached to multiple EC2s – whereas other types don’t support this

The most convenient way of creating backups is to use the Data Lifecycle Manager for creating automated, regular backups

Instance Store

this is an ephemeral storage which gets deleted at EC2 termination or in the event of hardware failure

So use only for a session or cached data
it has very high IOPS

Storage Gateway

gives on-premises site access to unlimited cloud storage

different Storage types:

Stored — use S3 to backup data, but store locally → you are limited in the amount of space you can allocate

Cached — stores all data on S3 and uses local only for caching

and different Storage Gateway Types:

File Gateway — stores files as objects in S3, using NFS and SMB file protocols
Tape Gateway — virtual tape library for storing backups, using iSCSI protocol
Volume Gateway — using EBS volumes, using the iSCSI protocol; data written to those volumes can be backed up asynchronously to EBS snapshots

Elastic Load Balancing (ELB)

Distributes traffic between computing resources.

ELB has four different types:

Classic (CLB): oldest types, works on both layer 4 and layer 7 — no longer featured on AWS exams

Application (ALB): works on layer 7 and routes content based on the content of the request

Network (NLB): works on layer 4 and routes based on IP data

Gateway (GLB): works on layer 3 and 4 — mostly used in front of firewalls

load balancing enhances fault-tolerance as automatically distributes traffic to healthy targets which can also reside in different availability zones

ELB can be either internet-facing (has public IP, needs a public VPC) or internal-only (private VPC, no public IP, can only route to private IP addresses)

EC2 instances or Fargate tasks can be registered to ELB’s target groups

CloudWatch

basic service free
is a monitoring platform integrated with most AWS services

Log events: messages collected by CloudWatch from other services, always with a timestamp
log groups: a cluster of log messages which are related to a service
log streams: fdeeper level of messages, e.g. for specific Lambda micro-container instance or a Fargate task

CloudWatch collects metrics by default from many services, including Lambda, EC2, ECS etc

X-Ray allows for distributed tracing to show how a single request interacts with other services
Alarms can be used to send mails or SMS messages via SNS on triggering of certain events or to trigger actions eg auto-scaling policies

CloudTrail

Monitors and records account activity across AWS infrastructure.

records events in your AWS account as JSON data
you decide which events are tracked by creating trails
a trail will forward your events to an S3 bucket and/or to a CloudWatch log group

CloudTrail records different types of audit events

Management events: infrastructure management operations, e.g. IAM policy adjustments or VPC subnet creations

Data Events: events that retrieve, delete or modify data within your AWS account, e.g. CRUD operations on DynamoDB or a GET for an object in S3

Insight Events: records anomalies in your API usage of your account based on historical usage patterns
you can also define filter rules to NOT track all events of a certain type, but only a subset.

Access Restrictions

are based on Geo-Location

with Signed URLs — require expiration date & time of the URL plus AWS Security Credentials / Bucket Name containing the objects for creation
they restrict access to specific VPCs

S3 stores objects in buckets, Glacier stores objects in vaults

Object Locks: Retention Modes

Governance Mode

users can’t override or delete the objects unless they have specific required rights granted

you protect objects from being deleted by most users, but you can still grant specific users permission to alter retention settings or delete the object if desired

Compliance Mode

a protected object version cannot be overwritten or deleted by any user, this includes your aws root user
its retention mode cannotbe changed, and the retention period can’t be reduced

Placement Groups

A way of grouping your EC2 instances in a certain way to meet specific needs

Spread: distributes instances across availability zones for high availability

Cluster: places instances in same rack for high network bandwidth

Partition: multiple cluster groups, giving a combination of both the above features. high availability through spreading and high bandwidth through clustering

Route 53

This is the AWS DNS service – fully managed by AWS

Failovers

Active-Active Configuration

where you want all of your resources to be available most or all of the time

If a resource becomes unavailable, then route 53 can detect that it’s unhealthy and will stop including it when responding to queries

Active-Passive Configuration

for when you want a primary resource or group of resources to be available most of the time
– and you want a secondary resource or group of resources to be on standby –
in case all the primary resources become unavailable when responding to queries, Route 53 includes only the healthy primary resources

If all the primary resources are unhealthy, then Route 53 includes only the healthy secondary resources in response to DNS queries

Auto-Scaling Policies

remember – these cannot span over multiple regions

different types of policies are available

Target Tracking Policies

Choose a particular scaling metric plus a target value, e.g. CPU utilization

EC2 Auto Scaling looks after creation of CloudWatch alarms that will trigger the scaling policy to activate

You can define the “warm-up” times for when the target tracking won’t be active – eg if your instances start and the CPU util spikes to 100% then you will not want them to auto-scale just because of this sudden short spike in CPU utilization.

Step and Simple Scaling Policies

these scale based on defined metrics

if thresholds are overstepped for the defined number of periods then AutoScaling Policies will activate

Step scaling policies allow you to define steps according to size of the alarm overstepping:

e.g. Scale-out policy [0,10%] -> 0, (10%, 20%] -> +10%, (20%, null] -> +30% with desired 10 instances

if the metric value changes from 50 (desired value) to 60, then the scale-out policy adds 1 more Instance (ie 10% of your desired 10)

if the metric changes later after the “cool down” of the Scaling Policy, for example to 70 in the metric value, then 3 more Instances will be launched (ie 30% of your desired 10)

the same can also be defined for Scale-in policies

Scaling based on SQS

this scaling is based on a metric of an SQS queue, e.g. ApproximateNumberOfMessagesVisible

Scheduled Scaling

this scales your application instances according to a scheduled time which you set

KMS (Key Management Service)

KMS is a service for creating and managing encryption keys used for securely signing your data.

Encryption of AMIs & EBS Volumes

these can be shared across AWS Account boundary by assigning targeted accounts as users of the master encryption key
and is natively integrated with other services eg SQS, S3, or DynamoDB to encrypt data

Server Side Encryption (SSE)

SSE-S3 (Server Side Encryption managed by S3 for the data and the encryption keys

SSE-C (Server Side Encryption managed by the Customer) here you are responsible for your encryption keys

you can also use different encryption keys for different versions of files within an S3 bucket

Amazon recommends regular the rotation of keys by the customer itself as best practice.

SSE-KMS (Server Side Encryption managed by AWS, KMS, and the customer) – here AWS manages the data key but you are responsible for the customer master key stored in KMS (AWS Key Management Service)

S3 (Simple Storage Service) and Glacier

S3 is a durable object storage which asynchronously replicates your data to all AZs within your region

A bucket is the name for an S3 storage container.

There is no limit on the number of files within a bucket

CloudFront

Edge-computing and delivering content closer to your customer’s locations.

AWS distributes your content to more than 225 edge locations & 13 regional mid-tier caches on six continents and 47 different countries

origins define the sources where a distribution retrieves its content if it’s not cached yet
a single distribution can use multiple origins

caching is determined by a cache policy – AWS managed or custom

Lambda@Edge and CloudFront functions allow general-purpose code to run on edge locations closer to the customer
with geo-restrictions, you can set blocking lists for specific countries

CloudFront supports AWS Web Application Firewall (WAF) to monitor HTTP/s requests and control access to content

Databases

DynamoDB
A fully managed NoSQL database service.

a non-relational database, based on Casandra

comes with two different capacity modes

on-demand: scales acc to number of requests, fee per consumed read or write capacity unit
provisioned: you define how many read/write capacity units required, pay fixed price per month

can use auto-scaling policies together with CloudWatch alarms to scale based on workloads
a (unique) primary key is either built via hash key, or via hash key and range key

Global – can be created at any time – and Local – vreated only when your table is created

Secondary Indexes – allow additional access patterns

can use streams to trigger other services like Lambda on create/update/delete events

Time-to-Live (TTL) attribute allows automatic expiry of items

global tables can span multiple regions and automatically syncs data between them

encryption of tables by default: done either via KMS keys managed by AWS or by the customer directly

RDS Relational Database Service DB

choice of db engine:

MariaDB

SQL Server — this does not support read replicas in a separate region

MySQL

supports Multi-AZ deployment

synchronously replicates data between multiple AZs

offers higher availability with failover support
minimal downtime when scaling

Aurora

is MySQL and PostgreSQL compatible, fully managed and serverless

supports read replicas
point-in-time recovery
continuous backups to S3 & replication across AZs

Lambda

serves as core building block for serverless applications.

integrates natively with different services like SQS or SNS
can run on x86 or ARM/Graviton2 architecture

compute-resources (vCPU) are sized based on memory settings

CloudFormation

The fundamental infrastructure-as-code tool at AWS.

templates are the definition of the infrastructure resources that should be created by CloudFormation and how they are composed
for each template, CloudFormation will create a stack which is a deployable unit and contains all your resources

CloudFormation detects change sets for your stacks at deployment time to calculate what create/update/delete command it needs to run via outputs, you can reference other resources dynamically that may not exist yet

Simple Queue Service (SQS)

A queuing service for creating resilient, event-driven architectures.

another core building block for serverless applications
allows you to run tasks in the background

offers different types of queues

First-In-First-Out (FIFO): executes messages in the order SQS receives them

Standard Queues: higher throughput but no guarantee of the right ordering

Dead-Letter-Queues (DLQs) allows you to handle failures and retries

Retention periods define how long a message stays in your queue until it’s either dropped or re-driven to a DLQ

Simple Notification Service (SNS)

is a managed messaging service for sending notifications to customers or other services.

consumers can subscribe to topics and receive all messages published for this topic
two different types, as for SQS: FIFO and Standard

messages can be archived by sending to Kinesis Data Firehose and from then on to S3 or Redshift

Identity and Access Management (IAM)

The basic security building block for all aws applications

ollow best practices:

never use your root user,
ues dedicated IAM users with enable MFA

different entity types: users, groups, policies, and roles

user: end-user, accessing the console or AWS API

group: a group of users, sharing the same privileges

policy: a defined list of permissions, defined as JSON

role: a set of policies, that can be assumed by a user or AWS service to gain all the policies permissions

by default, all actions are denied – must be explicitly allowed via IAM

an explicit deny always overwrites an allow action

AWS – Identity Federation

By editor on October 21, 2022 in

IAM Identity Center (ex AWS-SSO) and IAM

There are two AWS services you can use to federate your users into AWS accounts and applications: AWS IAM Identity Center (the successor to AWS SSO) and AWS Identity and Access Management (IAM).

AWS IAM Identity Center is a good method to define federated access permissions for your users based on their group memberships in a single centralized directory.

AWS IAM Identity Center works with an Identity Provider (IdP) of your choice, eg Okta Universal Directory or Azure Active Directory (AD) via the Security Assertion Markup Language 2.0 (SAML 2.0) protocol.

If you use multiple directories, or want to manage the permissions based on user attributes, then AWS IAM may be a better option.

Note:

IAM Identity Center is the successor to AWS SSO or Single Sign-On. They both perform the same basica functionality, but SSO is now deprecated. However, SSO can still appear in the exam questions.

You have ONE LOGIN for all aws accounts in all aws organizations, as well as apps which are saml 2.o based and EC2 windows instances

The identity provider can be built in via IAM Identity Center or via a 3rd party Idp eg AD, OneLogin etc.

AWS IAM (Identity and Access Management) enables admins to manage access to AWS services and resources within an AWS account securely for what it calls “entities”.

Entities are IAM users created from the AWS IAM admin console, federated users, application code, or other AWS services. Admins can create and manage AWS users and groups directly, and utilize permissions to allow and deny access to AWS resources.

Differences Between IAM and IAM Identity Center

The difference between AWS IAM and AWS IAM Identity Center is that Identity Center manages access for all AWS accounts within an AWS Organization, as well as access to other cloud applications outside of AWS.

AWS IAM supports multiple identity providers per account.

AWS IAM Identity Center by contrast supports only a single identity provider.

This means that with AWS IAM, users need to log in to each AWS account and create a new role(s), or from one AWS account, Account A, create policies that allow a role to be assumed in another AWS account and to set the level of access to resources in Account B.

Using AWS IAM Identity Center, you can reuse existing AWS IAM Identity Center policies and permission sets. Policies and permission sets are defined at an organization level and are applied to groups or users at account level.

If the policies and permission sets already defined are not applicable to these new accounts, then you can create new ones via the AWS IAM Identity Center admin console.

AWS IAM Identity Center is ideal for managing multiple AWS accounts. An added plus of using AWS IAM Identity Center is that any new user added to a group will automatically be granted the same level of access as other members in the group, thus saving admin overhead.

AWS is now encouraging customers to switch from AWS IAM to AWS IAM Identity Center.

IAM Security Tools

IAM Credentials Report (at account-level):

you can generate this report to list all your account users and the status of their credentials

IAM Access Advisor (user-level):

to review policies

shows the service permissions granted to a user and when the services were last accessed

can tell you which resources are shared externally

eg
s3 buckets
iam roles
kms keys
lambda functions, layers
sqs queues
secrets manager secrets

one useful method is to define a zone of trust which can be your aws account or aws organization

and check for access outside the zone of trust. this is very useful to check for security problems

you create an “analyzer with a “zone of trust” in access analyzer

and then run the analyzer.

you will then see the findings displayed

Identity Federation

It lets users from outside of AWS to take on temporary roles for accessing AWS resources.

This is done by assuming an identity provided access role.

they use third party servers for login, and these have to be trusted by aws.

these then perform the authentication and provide the credentials for the user. ie 3rd party authentication: can be ldap, ms-AD, or SAML single sign on or SSO, openID, Cognito.

the users can then directly access aws resources

this is known as identity federation.

An AWS user is an AWS identity created in the AWS IAM or AWS IAM Identity Center admin console. It consists of a name and credentials.

A federated user is a user identity created and centrally managed and authenticated by an external identity provider. Federated users assume a temporary role when accessing AWS accounts and resources.

A role is similar to a user, being an AWS identity with permissions and policies that determine what the identity can and can’t do in an AWS account.

However, instead of being associated with one person, a role is assumable by anyone who requires it and is permitted to use it.

A role does not have the standard usual long-term credentials such as password or access keys associated with it.

Instead, when a user assumes a role, it gives them a set of temporary security credentials valid for that session.

This allows admins to use roles to delegate access to users, applications, or services that wouldn’t otherwise have access to these AWS resources.

SAML Federation

SAML stands for Security Assertion Markup Language 2.0 (SAML 2.0)

this is for enterprises, who use eg AD or ADFS with AWS /SAML 2.0

it provides access to console or cli on aws temporarily

no need to create an iam user for these users

STS Security Token Service

AWS STS is an AWS service that enables you to request temporary security credentials for AWS resources. For example for IAM authenticated users and users that have been authenticated in AWS such as federated users via OpenID or SAML2.0.

You use STS to provide trusted users with temporary access to AWS resources via API calls, the AWS console or the AWS command line interface (CLI)

These temporary security access credentials function like standard permanent security access key credentials granted to IAM users. The only difference is that the life duration of the security access credentials is much shorter.

Usually an application sends an API request to a AWS STS endpoint for credentials. These access credentials are not stored by the user, but rather are created dynamically on request by STS. These STS-generated credentials will then expire after a short duration. The user can then request new ones provided they still have the permission to do so.

Once the generated credentials expire they cannot be reused. This reduces the risk of your resource access becoming compromised and avoids also having to embed security tokens within the application code.

The STS token lifecycle duration is set by you. This can be anything from 15 minutes to 36 hours.

AWS STS security tokens are generally deployed for identity federation, to provide cross-account access and to provide access to resources related to EC2 instances for other applications.

The STS token is valid for up to one hour by default.

ASsumeRole – users use a role within their own account for enhanced security, or alternatively for Cross-Account access to perform roles there.

AssumeRoleWithSAML – this returns credentials for users using SAML procedures

AssumeRoleWithWebIDentity – this returns credentials for users who are using an external IdP eg facebook, google etc

but note: AWS recommends NOT USING this but using Cognito instead

GetSessionToken is for Multi Factor Authentication for a user or aws root user

Most common use case for STS is to assume a role

so, you define an iam role within your account

define which principals ie users can access this iam role

then use STS to retrieve credentials and assume the IAM role you have set – the AssumeRoleAPI

they are valid for 15 mins to 1hr.

Identity Federation Use-Cases

AWS STS enables you to grant access to AWS resources for users that have been authenticated within your own on-premises network. This enterprise identity federation avoids having to create new AWS identities and no new login credentials are needed.

External web identities can be authenticated by a third-party online identity providers (Idps) such as amazon, google, facebook or other open-id connect compatible services.

This web identity federation method also removes the need to distribute long-term security credentials to enable access to your AWS resources.

Enterprise federation can choose between a variety of authentication protocols like SSO and it supports open standards such as SAML (security assertion markup language SAML2.0) with which you can use microsoft active directory federation services ADFS.

Alternatively you can also deploy SAML to build your own authentication service.

Many organisations maintain multiple AWS accounts and they can deploy IAM identities and cross-account roles to allow their users from one account to access resources located in another account. Once the permissions have been delegated to an IAM user, this trusted relationship can then be deployed to request temporary access via AWS STS login credentials.

Using a Custom Identity Broker App – this means enterprises and not SAML! exam!

If you dont have a compatible SAML 2.0 IdP you can use, or dont wish to use one….

then you can program your own custom id broker app. – this is outside AWS eg on premises

this is used to apply your iam policy

it then makes the request to AWS STS, which your users can then use to access AWS API or management console

Note: Identity Federation no longer appears as such in the exam. Instead, it now focuses on Cognito

AWS Cognito

Cognito provides federated identity pools for public applications.

to provide direct access to aws resources from the client side.

You log in to the fed id provider, get temp aws credentials back from the fed id pool, these come with pre-defined iam policy stating their permissions.

Example:

to provide temp access to S3 bucket using a facebook login – use federated identity pools (FIP) and aws cognito.

Note: there is also something called Web Identity Federation – this is an alternative to Cognito, but aws now recommends Cognito instead,

Our app logs into our iDp . in this case facebook login, then we authenticate with the FIP, this gets credentials from the aws STS, and sends them on to the app, which then can use them to access S3 bucket.

Cognito User Pools versus Identity Pools – the official AWS explanation:

User pools are for authentication (identity verification). With a user pool, your app users can sign in through the user pool or federate through a third-party identity provider (IdP).

Identity pools are for authorization (access control). You can use identity pools to create unique identities for users and give them access to other AWS services.

Cognito User Pools (CUP)

CUP creates a simple serverless database of users for apps

CUP then checks the database

Cognito User Pools and Identity Pools appear similar, but they have their differences. Consider what the outcome of the authorization process is in both cases.

CUP or Cognito User Pools provide a JSON Web Token which we use as an authorizer for an existing API.

With Cognito Identity Pools we receive temporary AWS credentials which we use to access AWS resources.

Another way of looking at it:

If you want to access AWS resources directly from the client side eg from a mobile or a web app, then use Cognito Identity Pools (CID).

In short, Identity Pool is a service that issues temporary AWS credentials to users who have authenticated themselves through an identity provider (IdP), which the Cognito Identity Pool has already established a trust relationship with.

CIP is NOT a user database service and it is NOT an identity provider (IdP) — even though its name is Identity Pool.

Both CUP and CIP “federate”, but for different purposes.

CUP — Getting a unified authentication token:

We can use Cognito User Pool (CUP) as our method to federate different identity providers (IdPs).

This makes your system behave in a standard way following the authentication process, no matter via which IdP a user signs in from.

The client receives tokens (id_token, access_token, optionally refresh_token) from CUP — but the client never gets a ‘federated’ IdP token.

With CIP issuing the AWS credentials is done by examining tokens issued from a different IdP.

We can use Cognito Identity Pool (CIP) to federate different identity providers (IdPs).

In this case the user authenticates with the IdP and obtains the token issued from them; it then sends the token to CIP to obtain the AWS temporary credentials.

Otherwise, use Cognito User Pools (CUP) instead.

If you need your users to access AWS resources (beyond API-Gateway / Appsync) directly, then you most likely want to use Cognito Identity Pool.

If you need a low cost, scalable authentication and user database service, then you will most likely want to use Cognito User Pool.

AWS – VPC Gateways

By editor on October 11, 2022 in

VPC Gateways

There are two main ways of connecting on-premises sites to AWS Cloud VPCs:

Customer Gateway – via a dedicated private network.

and

Direct Connect – via a private dedicated network link

Customer Gateway to VPN Gateway

This is used to connect an on-premises site to an AWS VPC.

this is via public internet using an encrypted VPN connection over the internet.

So we need a VGW on the AWS side, connected to the relevant VPC we want to link to.

VGW is Virtual Private Gateway or “VPN Concentrator” device.

If desired, you can customize the ASN Autonomous System Number for the VGW

The CGW or Customer Gateway on the on-premises side can be software-based or alternatively an actual physical gateway device.

Exam Q:

For the CGW: which IP, then if public then you would use your internet routable ip address for your CGW, but – it can have a private ip ie a NAT-enabled… then you would use the public ip of the NAT device/NAT router/gateway.

Also – important exam Q:

you MUST enable Route Propagation for the VPN in the route table of your relevant subnets in the VPC you are wanting to connect to via the link.

plus, if you need to ping your EC2s from on-premises, then you must enable ICMP on the inbound traffic of your security groups – important!

AWS VPN CloudHub Direct Connect DX

Direct Connect DX provides a dedicated PRIVATE connection from your on-premises site to your VPC

you need to use a VPG virtual private gateway on your VPC

You can access public resources eg S3 and private eg EC2 on the same connection.

DX supports ipv4 and ipv6.

Use cases for Direct Connect:

to increase bandwidth, eg when working with large data sets – lowers costs.

to provide more consistency in network experience – eg for applications that use realtime datafeeds.

for hybrid environments – ie on-premises IT and cloud combinations

To set up:

You use a Direct Connect Endpoint at AWS

then you set up a customer router with a firewall on your on-premises site

Types of DX Direct Connect Connection Types:

Dedicated Connection or Hosted Connection.

It takes about 1 month or more to set up the Direct Connect link.

exam Q:

if you want a connection set up quickly ie faster than 1 month, then DX is NOT suitable! – unless there is already a DX link present at the site.

Dedicated Connection:

1,10,100 Gbps capacity possible, request to AWS, then it is completed by DX partners.

you get a dedicated physical ethernet port

takes 1 month+

Hosted Connection:

50mbps, 500 Mbps, to 10Gpbs

takes 1 month+

request made via DX partners.

capacity can be added/removed on demand

Note: data is NOT encrypted, but it is a private link.

But a DX link plus a VPN running on top of the link provides for an IPsec-encrypted private connection

this is good for extra security, but more complicated overhead to set up.

Exam Q:

DX Resiliency:

this is where you have 2 direct connect locations, one connection for multiple locations. This is good for critical workloads

But – for max resiliency locations you must create TWO connection devices at each site, to provide for redundancy.

Exam Q:

Site to Site VPN Connection used as a backup.

if the DX direct connect fails, then you can run a backup DX connection (expensive) or alternatively a site to site VPN CONNECTION.

Remember this for exam!

Also exam Q:

How to enable services in one VPC to access another VPC:

Two possible ways:

1. go via public internet

but managing the access is hard

2. VPC Peering –

you have to create many peering relations if there are multiple VPCs… can be complicated

and it opens up the whole VPC network to another – risky.

whereas you really only want to open up access for one or a few specific services…

Exam Q:

so an alternative option is to use AWS PrivateLink – VPC endpoint Services

AWS PrivateLink – VPC Endpoint Services

Advantage: this is secure and scaleable, and can be used to create access for 1000s of VPCs if needed.

And – it does not require any VPC peering, internet gateways, NAT or route table config.

So, we conceive of it as follows:

we have a Service VPC with our application service running in it

and we have a Customer VPC which has instances or other services which want to access the app service in the Service VPC:

You need a Network Load Balancer for the applic service in the Service VPC

and on the customer VPC side we create an ENI Elastic Network Interface, and this connects privately from ENI to the NLB

AWS Classic Link

AWS Classic Link – this is deprecated but can still come up in the exam…

This is a legacy of the system that existed for AWS before the separation of accounts with VPCs in AWS – all EC2s ran in one network.

Classic Link enables you to connect EC2 Classic Instances to a VPC in your account.

It enables private ipv4 communication.

for this you must create a security group

prior to this you had to use an ENI and public ipv4 – this is now no longer the case if you use Classic Link.

Could come up as a distractor in the exam

AWS Transit Gateway

This is an alternative to VPC Peering. Simpler, and it allows for transitive VPC connections, as they all connect via the same Transit Gateway

can be regional and cross-regional, cross-account, via RAM Resource Access Manager

you can also peer transit gateways across regions

the route tables are used to finely tune or narrow down which VPCs can talk to which other VPCs according to requirements.

It also works with DX Direct Connext Gateway/VPN connections

exam Q:
and supports IP Multicast (not supported by any other AWS service – know this for the exam!)

Another use case for TG;

to increase your bandwidth using ECMP – Equal-cost multi-path routing

connecting multiple VPCs to the TG means you get more total bandwidth available. Of course this also costs more.

VPC Traffic Mirroring

This is a way to do a non-intrusive analysis of our VPC traffic, by routing a copy or mirror of the traffic to security appliances that we run…

so, to do this, we capture the traffic, from source ENI/s, and to targets – ENI/s or network load balancer NLB:

so effectively we are mirroring or sending a copy of our traffic to our NLB..

to do this source and target addresses must be in the same VPC – or different VPCs provided they are using VPC Peering between each other.

IPV6

NOTE that IPv4 can NEVER be disabled in VPCs or subnets on AWS.

but you can enable ipV6 to operate in “dual stack mode”.

All your EC2 instances will get at least a private internal ipv4 and a public ipv6.

They can communicate using either of these to and from the internet via the internet gateway.

This means that if you can’t launch a new EC2 in your subnet, then this may be because there are no ipv4s free left in your subnet.

The solution for this is to create a new ipv4 CIDR in your subnet.

Even if you are using ipv6 for your instances this means you must have ipv4 addresses left in your created range in order to create any EC2s!

Egress-only Internet Gateway (ipv6)

this is used only for ipv6

They are similar to a NAT GW but for ipv6

They allow EC2s in your VPC to make outbound connections over ipv6 while preventing incoming traffic from the internet initiating an ipv6 connection to your EC2s.

exam Q;
Note you MUST UPDATE THE ROUTE TABLES for PRIVATE SUBNETS FOR THIS (not for public subnets)

AWS VPC Peering

By editor on October 10, 2022 in

VPC Peering is a way to privately connect 2 VPCs via the AWS network so they operate as if in the same network – and without going via the public internet.

Important points about VPC Peering – also for exam:

They must not have overlapping CIDRs.

They are NOT transitive – this means you have to connect one-to-one directly,
not via the peering of another VPC.

You must still update the route tables in each VPCs SUBNETS to ensure the EC2s in each VPC subnet can communicate with each other.

You can connect your own AWS account VPCs, and also other account VPCs, and also different regions together using VPC Peering.

You can access or reference the security group of the peered VPC in the other VPC. This is very powerful.

How To Set Up VPC Peering

in AWS Dashboard access:

Virtual Private Cloud -> Peering Connections

Remember to modify the route table of your VPCs

in the source VPC route table:

add route for the CIDR that corresponds to the CIDR of the destination VPC

and then, do the same for the destination VPC roue table (ie vice-versa)

so you have set up a route that runs both ways. This is essential!

VPC Endpoints

VPC Endpoints provide an alternative way for an EC2 instance to connect to another service or instance without going via the public internet.

All AWS services have a public URL – so one way to connect is to go via pubic internet and the public url – but VPC Endpoints provide an alternative way…

We can use VPC Endpoints which operate using AWS PrivateLink.

This is an AWS internal network which you can use instead of going via public internet.

VPC endpoints are redundant and scale horizontally.

They also remove the need for using IGW, NATGW etc to access AWS services from your EC2s.

Remember to check DNS setting resolution for your VPC and also check your route tables so that VPC Endpoints will be able to work correctly.

2 Types of VPC Endpoint

Interface Endpoints – these are powered by Private Link

it supports most AWS services

is charged per hour and per GB of data that passes through the endpoint.

provisions an ENI – private IP address as an entry point – must be attached to a Security Group

It is the preferred method when access is required to AWS services from an on-premises site eg Site to Site VPN or Direct Connect, from a different VPC or different region.

and

Gateway Endpoints

Here you provision a gateway and this must be used as a TARGET in a route table – ie it does NOT use Security Groups

it is only a route table target entry, nothing more!

there are only 2 targets you can use for this – S3 and DynamoDB – important to note!

it is FREE to use.

So which to use?

Gateway Endpoint is usually the question point asked in the exam.

It is free, simple to set up.

VPC Flow Logs

VPC Flow Logs are very useful for analyzing traffic through your interfaces.

There are VPC flow logs, subnet flow logs and ENI elastic network interface flow logs

they can be saved in S3 or cloudwatch logs

they can also capture network traffic info from AWS managed interfaces as well, such as ELB, RDS, ElastiCache, Redshift, NATGW, Transit Gateway etc

you have 2 ways to analyze the flow logs:

using Athena on S3, or

using CloudWatch Logs Insights or set up a CloudWatch Alarm and possibly send an alert message to eg AWS SNS.

AWS Security Groups (SGs) and Network Access Control Lists (NACLs) – What’s the Difference?

By editor on October 10, 2022 in

Security Groups or SGs are an essential part of the task of securing your incoming and outgoing network traffic within a VPC.

However, SGs also have limitations and shouldn’t be thought of as the only line of defence.

AWS provides a range of additional networking security tools, such as Network ACLs, AWS Network Firewall, DNS Firewall, AWS WAF (Web Access Firewall), AWS Shield, as well as monitoring and compliance tools like AWS Security Hub, GuardDuty, and Network Access Analyzer.

Quick Overview of the Main Features of AWS Security Groups

Security groups are access control lists (ACLs) that allow network traffic inbound and outbound from an Elastic Network Interface (ENI). SGs serve as a basic firewall system for all AWS resources to which they are attached.

Note for exam: Security groups implicitly DENY traffic, they only have “allow” rules, and not “deny” rules.

Thus the absence of an “allow” rule will implicitly DENY access

Security group rules are “stateful” – meaning that if a server can communicate outbound to a service, the return traffic to that server will also be automatically permitted.

This behaviour contrasts with Network ACLs which are “stateless”.

Security groups are VPC-specific and thus also region-specific – which means they can only be used within the VPC where they are created.

Sole exception to this is where there is a peering connection to another VPC, in which case they can be referred to in the peered VPC.

Security groups can be applied to multiple instances within a VPC, and can be valid across subnets within that VPC.

When a VPC is first created, a “default” security group is automatically created by AWS for it and within it.

This default SG has no inbound rules and just a single outbound rule, which allows all traffic to any destination (0.0.0.0/0).

If a new resource is launched within the VPC without association to an SG then it will automatically be assigned to this “default” SG.

By default, a new security group attached to a group of instances does not allow these instances to communicate with each other. For this to be possible you need to create new inbound and outbound rules, and then define the source and destination as the security group itself.

Note that SGs are allocated to the actual Elastic Network Interface device (ENI) which is attached to an EC2 instance, as opposed to the EC2 / RDS instance itself.

You can assign up to five SGs to each Elastic Network Interface. Thus an EC2 instance with multiple ENIs could actually therefore have more than five security groups assigned to it, though this is not best practice.

Best Practices for Deploying Security Groups

Avoid using the “default” security group

The “default” security group should not be used for active AWS resources. The reason is that new AWS resources could be inadvertently assigned to it and so be permitted undesired access to your resources. Since you cannot delete the default SG, you should instead delete all inbound and outbound rules in the “default” SG instead.

Then create new SGs for your AWS resources. Then ensure any new AWS resources are assigned to the correct SGs you created.

Keep the number of your created SGs to a minimum

The AWS EC2 Launch Wizard will encourage you to create a new security group for each EC2 instance you create. However, the problem with that is that it can lead to the creation of too many security groups, which then become hard to manage and track.

Instead of relying on the Launch Wizard, you should decide on a strategy for creating security groups based on your actual application and service access needs and then assign them in accordance with your requirements.

Restrict, restrict, restrict

This means you should always strive to ensure all new security groups apply the principle of least privilege. this means for example:

Don’t open inbound traffic for the whole VPC CIDR-IP or subnet range unless absolutely necessary.

Also avoid allowing all IP addresses (0.0.0.0/0) unless absolutely necessary.

Restrict outbound traffic where possible e.g. do not open all ports to the internet, for example, only open HTTP/HTTPS to allow internet webaccess.

If allowing a specific service, only open the ports and protocols required for that service. For example, for DNS, you open port 53 for TCP and UDP, and only to the external DNS provider – for example for Google DNS that would be 8.8.8.8/32.

Open relevant ports and protocols to other security groups, rather than for IP addresses/subnets. This is because AWS recommends using dynamic IP addresses.

Develop a security group strategy

Set a strategy for creating and using security groups based on your application and service needs. Then use AWS tools to enforce this strategy.

Some examples of strategy options:

Decide the best way to set up your security groups.

For example: deploy one security group per service type, such as “mysql-db”, “web”, “ssh-access”,“active-directory”. Then assign the required inbound and outbound ports for that service

Define one security group per application type, such as “web-servers”, “file-servers”, or “db-servers”. Then assign the ports for that application or service.

Define one security group per operating system, such as “linux” or “windows”, then assign the basic ports required for that OS.

Define one or two default SGs to cover access requirements common to all or most of the servers in the VPC. This minimises the total number of security groups required.

Example:
if all instances need outbound using HTTP / HTTPS to access the web
if all instances need an inbound port for eg a monitoring system.

Try to use a naming strategy which provides clarity to help avoid confusion if you manage multiple VPCs. If each VPC has a security group called for example “web-servers”, then it can quickly become difficult to keep track of which is which.

AWS Firewall Manager can help create security group policies and associate them with accounts and resources. It can also be used to monitor and manage the policies for all linked accounts.

AWS Security Hub works with CloudTrail and CloudWatch to monitor and trigger alarms based on security best practice alerts.

One indication to watch out for is the rate of change within SG rules – such as ports being opened and closed again within a very short time period.

AWS Config can be used to ensure compliance with defined best practices. Config Rules can be created to check and alert for non-compliance, and then perform automated remediation steps.

For example, checking for unrestricted security group rules and ensuring the “default” AWS SG has no inbound or outbound rules set.

An EC2 security group plays the role of a firewall. By default, a security group will deny all incoming traffic while permitting all outgoing traffic

You define group behavior by setting policy rules that will either block or allow specified traffic types.

You can update your security group rules and/or apply them to multiple instances.

Security groups control traffic at the instance level.

However, AWS also provides you with network access control lists (NACLs) that are associated with entire subnets rather than individual instances.

Security Groups Simplified

Security Groups are used to control access (SSH, HTTP, RDP, etc.) with EC2. They act as a virtual firewall for your instances to control inbound and outbound traffic.

When you launch an instance in a VPC, you can assign up to five security groups to the instance.

Note that security groups act at the instance level, not the subnet level.

Security groups filters IP & Port according to rules.

This is the basic firewalling system of AWS, this can be modified…

Security Group INBOUND allows inbound port 22 traffic from your computer

Security Group OUTBOUND allows outbound ANY PORT to ANY IP

In other words,

by default in AWS:

All INBOUND traffic is blocked by default
All OUTBOUND traffic is authorized by default

Security groups can be attached to MULTIPLE EC2 instances, not just assigned to a single EC2.

An instance can also belong to multiple security groups.

They are set for a specific region or VPC combination. So they are NOT cross-regional.

Security groups operate OUTSIDE the EC2 not inside it – so if the traffic is already blocked then the EC2 won’t see it. So it is not an “app” running on your EC2 but a service running outside of it.

Best practice is to create a separate security group just for SSH access.

TIP for error-debugging:

if your application gives a “time out” error, then it is not accessible and this means it is most likely a security group issue.

if your application gives a “connection refused” error, then it is an application-internal error or the application is not running.

Security Groups Key Details

Security groups control inbound and outbound traffic for your instances (they act as a Firewall for EC2 Instances) while NACLs control inbound and outbound traffic for your subnets (they act as a Firewall for Subnets). Security Groups usually control the list of ports that are allowed to be used by your EC2 instances and the NACLs control which network or list of IP addresses can connect to your whole VPC.

Every time you make a change to a security group, that change occurs immediately.

Whenever you create an inbound rule, an outbound rule is created immediately. This is because Security Groups are stateful. This means that when you create an ingress rule for a security group, a corresponding egress rule is created to match it. This is in contrast with NACLs which are stateless and require manual intervention for creating both inbound and outbound rules.

Security Group rules are based on ALLOWs and there is no concept of DENY when in comes to Security Groups. This means you cannot explicitly deny or blacklist specific ports via Security Groups, you can only implicitly deny them by excluding them in your ALLOWs list.

Because of the above detail, everything is blocked by default. You must intentionally allow access to certain ports.

Security groups are specific to a single VPC, so you can’t share a Security Group between multiple VPCs. However, you can copy a Security Group to create a new Security Group with the same rules in another VPC for the same AWS Account.

Security Groups are regional and CAN span AZs, but can’t be cross-regional.

Outbound rules exist if you need to connect your server to a different service such as an API endpoint or a DB backend. You need to enable the ALLOW rule for the correct port though so that traffic can leave EC2 and enter the other AWS service.

You can attach multiple security groups to one EC2 instance and you can have multiple EC2 instances under the umbrella of one security group.

You can specify the source of your security group (basically who is allowed to bypass the virtual firewall) to be a single /32 IP address, an IP range, or even a separate security group.

You cannot block specific IP addresses with Security Groups (use NACLs instead).

You can increase your Security Group limit by submitting a request to AWS

Reachability Analyzer

The Reachability Analyzer is an AWS web-dashboard tool you can deploy to check network reachability from a source and to a destination via a specifically named port. There is a cost incurred of currently 10c per check.

This is very useful in debugging any SG or NACL traffic problems.

Elasticache

By editor on September 30, 2022 in

Elasticache is an AWS managed data caching service mainly for databases and applications.

ElastiCache uses one of two open-source in-memory cache engines for its functionality:

Memcached and Redis.

Elasticache is used to reduce traffic overhead for RDS and some other applications. It is extremely fast as db is held in ram memory.

Your cache must have an invalidation strategy defined to ensure only the most currently used data is stored in the cache.

It can also be used to store user sessions for an application for cases where users may be redirected later to different instances of the application, saving having to re-do the user login session.

But it does require code configurations for apps to be able to query the cache.

ElastiCache includes a feature for master/slave replication and multi-AZ, can be used for achieving cross-AZ redundancy and thus high-availability through the use of Redis replication groups.

Memcached

Memcached is an ASCII text file memory object caching system. ElastiCache is protocol compliant with Memcached, thus all the tools used with existing Memcached environments can also be used with ElastiCache. This is the simplest caching model and can also be used when deploying large nodes with multiple cores and threads.

Redis

Redis is an open-source in-memory key-value store that supports information structures such as lists and sorted sets.

Redis can power multiple databases, as well as maintain the persistence of your key store and works with complex data types — including bitmaps, sorted sets, lists, sets, hashes, and strings.

If Cluster-Mode is disabled, then there is only one shard. The shard comprises the primary node together with the read replicas. Read replicas store a copy of the data from the cluster’s primary node.

Elasticache allows for up to 250 shards for a Redis cluster if Cluster-Mode is enabled. Each shard has a primary node and up to 5 read replicas.

When reading or writing data to the cluster, the client determines which shard to use based on the keyspace. This avoids any potential single point of failure.

Implementing ElastiCache

There are three main implementation steps:

Creating an ElastiCache cluster
Connecting to the ElastiCache cluster from an EC2 instance
Managing the ElastiCache environment from the AWS console

Creating an ElastiCache cluster

This involves choosing and configuring the caching engine to be used. This will be either Redis or Memcached. For each caching engine, configuration parameters differ.

Next, we need to choose the location ie in AWS cloud or On-Premise.

AWS Cloud – This uses the AWS cloud for your ElastiCache instances

On-Premises – In this case, you can create your ElastiCache instances using AWS Outpost.

AWS Outposts is a fully managed service that extends AWS infrastructure, services, APIs, and tools to your own on-site infrastructure.

ElastiCache REDIS Replication – Cluster Mode Disabled

There are two possible configuration modes for running ElastiCache and REDIS:

Cluster Mode Disabled, and Cluster Mode Enabled:

In this configuration you run ONE PRIMARY NODE of ElastiCache with up to 5 Read Replicas

Note that uses asynchronous replication to maintain the Read Replicas, so there is a lag.

The primary node is always used for read/write. The other nodes are read-only.

There is just ONE SHARD and all shards hold all the data.

Multi-AZ is enabled by default for failovers.

ElastiCache REDIS Replication – Cluster Mode Enabled

With Cluster Mode Enabled the data is partitioned across MULTIPLE SHARDS

Data is divided across all your shards. This helps especially with scaling write transactions.

Each shard consists of a primary node and up to 5 read replica nodes.

Also has multiple AZ availability

Provides up to 500 nodes per cluster with a single master node.

or 250 nodes with 1 master and 1 replica.

Scaling REDIS with ElastiCache

For “Cluster Mode Disabled”:

Horizontal scaling – you scale out or in by adding or removing read replicas

Vertical scaling – you alter the type of the underlying nodes

Important for exam!

This is done by means of ElastiCache creating a NEW node group with the new type specification for the nodes, then performing a replication to the new node group, and then finally updating the DNS records so that they point to the new node group and not any longer to the old original node group before scaling.

For “Cluster Mode Enabled”:

this can be done in two different ways – online, and offline:

Online: no interruption to service no downtime, but can be some performance degredation during the scaling.

Offline: service is down, but additional configurations are supported

So, when doing horizontal REDIS scaling, you can do online and office rescaling, and you can do resharding or shard rebalancing for this:

Resharding: “resharding” – this means scaling in or out by adding or removing shards.

Shard rebalancing: involves equally redistributing the keyspaces among the shards as balanced as possible.

Vertical Scaling: you are changing to a larger or smaller node type, this is done online only, relatively straightforward.

REDIS Metrics to Monitor

Evictions: this is the number of non-expired items the cache has removed in order to make space for new writes ie the memory was full.

In this case choose an eviction policy to evict expired items eg least recently used items, LRU or scale up to a larger node type with more memory, or else scale out by adding more nodes

CPUUtilization: this monitors CPU usage for the entire host, if too high, then scale up to a larger node type with more memory

SwapUsage: this should not be allowed to exceed 50Mb, if it does then verify you have configured enough reserved memory

CurrConnections: no of current connections – see if a specific app is causing this

DatabaseMemoryUsagePercentage:

NetworkBytesIn/Out & NetworkPAcketsIn/Out

ReplicationBytes: vol of data being replicated

ReplicationLag: how far behind the replica is from the primary node

Some ElastiCache use cases

know these for the exam!

Updating and managing leaderboards in the gaming industry

Conducting real-time analytics on live e-commerce sites

Monitoring status of customers’ accounts on subscription-based sites

Processing and relaying messages on instant messaging platforms

Online media streaming

Performing geospatial processes

Pros and Cons of Using ElastiCache

Pros of ElastiCache

Fully-managed – ElastiCache is a fully-managed cloud-based solution.

AWS takes care of backups, failure recovery, monitoring, configuration, setup, software updating and patches, and hardware provisioning.

Improved application performance – ElastiCache provides in-memory RAM data storage that substantially reduces database query times.

Easily scalable – you can scale up and down with minimal overhead

Highly available – ElastiCache achieves high availability through automatic failover detection and use of standby read replicas.

Cons of ElastiCache

Limited and complex integration – ElastiCache doesn’t provide many easy options for integration. And you can only connect Elasticache to databases and applications hosted by AWS.

High learning curve – the Elasticache user interface is not intuitive and the system requires a high learning overhead to properly understand.

High price – You pay only for what you use but the costs of using Elasticache can swiftly rise according to usage.

Comparison of ElastiCache With Redis, CloudFront, And DynamoDB

ElastiCache is very different to all these services.

AWS ElastiCache versus Redis

ElastiCache is an in-memory cache in the cloud. With very fast retrieval of data from managed in-memory caches, Elasticache improves overall response times, and saves relying wholly on slow disk-based databases for processing queries.

Redis stands for Remote Dictionary Server — a fast, in-memory, open-source, key-value data store that is usually implemented as a queue, message broker, cache, and database.

ElastiCache is developed on open-source Redis to be compatible with the Redis APIs, as well as operating seamlessly with Redis clients.

This means that you can run your self-managed Redis applications and store the data in an open Redis format, without having to change the code.

ElastiCache versus CloudFront

While ElastiCache and CloudFront are both AWS caching solutions, their approaches and framework differ greatly.

ElastiCache enhances the performance of web applications by retrieving information from fully-managed in-memory data stores at high speed.

To do this it utilizes Memcached and Redis, and is able in this way to substantially reduce the time applications need to read data from disk-based databases.

Amazon CloudFront is primarily a Content Delivery Network (CDN) for faster delivery of web-based data through deploying endpoint caches that are positioned closer to the traffic source. This saves too much web traffic from further-flung geolocations from having to source content entirely from the original hosting server.

ElastiCache versus DynamoDB

DynamoDB is a NoSQL fully-managed AWS database service that holds its data on solid state drives (SSDs). These SSDs are then cloned across three availability zones to increase reliability and availability. In this way, it saves the overhead of building, maintaining, and scaling costly distributed database clusters.

ElastiCache is the AWS “Caching-as-a-Service”, while DynamoDB serves as the AWS “Database as a Service”.

Pricing of ElastiCache

To use ElastiCache you have to make a reservation- Pricing for this is based on the caching engine you choose, plus the type of cache nodes.

If you are using multiple nodes (ie replicas) in your cluster, then ElastiCache will require you to reserve a node for each of your cluster nodes.

Difference Between Redis and Memcached

REDIS: similar to RDS

multi AZ with auto failover
read replicas used to scale reads and provide HA.

Data durability

provides backup and restore

REDIS:

Primary use case: In-memory database & cache Cache
Data model: In-memory key-value
Data structures: Strings, lists, sets, sorted sets, hashes, hyperlog
High availability & failover: Yes

Memcached by contrast:

Memcached
Primary use case: Cache
Data model: In-memory key-value
Data structures: Strings, objects
High availability & failover: No

is multi-node data partitioning ie sharding

no HA

non-persistent data

no backup and restore

multi-threaded architecture

Main Points To Remember About REDIS and Memcached

REDIS is for high-availability – memcached has no AZ-failover, only sharding.

Also REDIS provides backup & restore – memcached does not.

Memcached has a multi-threaded architecture, unlike REDIS.

Redis Metrics to Monitor

Evictions: this is the number of non-expired items the cache has removed in order to make space for new writes ie the memory was full.

In this case choose an eviction policy to evict expired items eg least recently used items, LRU

scale up to a larger node type with more memory, or else scale out by adding more nodes

CPUUtilization: this monitors CPU usage for the entire host, if too high, then scale up to a larger node type with more memory

SwapUsage: this should not be allowed to exceed 50Mb, if it does then verify you have configured enough reserved memory

CurrConnections: no of current connections – see if a specific app is causing this

DatabaseMemoryUsagePercentage:

NetworkBytesIn/Out & NetworkPAcketsIn/Out

ReplicationBytes: vol of data being replicated

ReplicationLag: how far behind the replica is from the primary node

Memcached Scaling

Memcached clusters can have 1-40 nodes soft limit

Horizontal scaling: you add or remove nodes from the cluster and use “Auto-discovery” to allow you app to identify the new nodes or new node config.

Vertical scaling: scale up or down to larger or smaller node types

to scale up: you create a new cluster with the new node type

then update your app to use the new cluster endpoints

then delete the old cluster

Important to note that memcached clusters/nodes start out empty, so your data will be re-cached from scratch once again.

there is no backup mechanism for memcached.

Memcached Auto Discovery

automatically detects all the nodes

all the cache nodes in the cluster maintain a list of metadata about all the nodes

note: this is seamless from the client perspective

Memcached Metrics to Monitor

Evictions: the number of non-expired items the cache evicted to allow space for new writes (when memory is overfilled). The solution: use a new eviction policy to evict expired items, and/or scale up to larger node type with more RAM or else scale out by adding more nodes

CPUUtilization: solution: scale up to larger node type or else scale out by adding more nodes

SwapUsage: should not exceed 50MG

CurrConnections: the number of concurrent and active connections

FreeableMemory: amount of free memory on the host

AWS DB Parameters

By editor on September 27, 2022 in

Database or DB parameters specify how your database is configured. For example, database parameters can specify the amount of resources, such as memory, to allocate to a database.

You manage your database configuration by associating your DB instances and Multi-AZ DB clusters with parameter groups. Amazon RDS defines parameter groups with default settings.

A DB Parameter Group is a collection of engine configuration values that you set for your RDS database instance.

It contains the definition of what you want each of these over 400 parameters to be set to.

By default, RDS uses a default parameter group specified by AWS. It is not actually necessary to use a different parameter group.

Each default parameter group is unique to the database engine you select, the EC2 compute class, and the storage allocated to the instance.

You cannot change a default parameter group, so if you want to make modifications then you will have to create your own parameter group.

RDS database engine configuration is managed through the use of parameters in a DB parameter group.

DB parameter groups serve as an effective container for holding engine configuration values that are applied to your DB instances.

A default DB parameter group is created if you make a database instance without specifying a custom DB parameter group. This default group contains database engine defaults and Amazon RDS system defaults based on the engine, compute class, and allocated storage of the instance.

When you create a new RDS database, you should ensure you have a new custom DB parameter group to use with it. If not then you might have to perform an RDS instance restart later on to replace the default DB parameter group, even though the database parameter you want to alter is dynamic and modifiable.

This is the best approach that gives you flexibility further down the road to change your configuration later on.

Creating your own parameter group can be done via the console or the CLI. Self-created parameter groups take as their basis the default parameter group for that particular instance and selected db engine.

After creation, you can then modify the parameters via the console or CLI to suit your needs as you change.

Parameters can either be static or dynamic.

Static means that changes won’t take effect without an instance restart.

Dynamic means a parameter change can take effect without an instance restart.

Dynamic parameters are either session scope or global scope.

Global scope dynamic parameters mean that changes will impact the entire server and all sessions.

Session scope dynamic parameters however are only effective for the session where they were set.

Note however that some parameter variables can have both global and session scope.

In these cases, the global value is used as the default for the session scope and any global change to a parameter that also has a session scope will only affect new sessions.

Another important aspect to bear in mind when creating a DB parameter group:

You should wait at least 5 minutes before creating your first DB instance that uses that DB parameter group as the default parameter group. This allows Amazon RDS to fully complete the create action before the parameter group is used as the default for a new DB instance.

For exam!

Important to know this DB Parameter above all for the exam:

for PostgreSQL and SQLServer:

rds.force_ssl=1

this forces ssl connections to be used

BUT – for MySQL /MariaDB you must instead use a grant select command:

GRANT SELECT ON mydatabase.* TO ‘myuser’@%’IDENTIFED BY ‘…’ REQUIRE SSL;

AWS – Migration of On-Premises Infrastructure to AWS Cloud

By editor on September 22, 2022 in

The Migration Process can be split into three parts:

Before AWS Migration

During AWS Migration

After AWS Migration

AWS Migration: 5 Cloud Migration Steps

These are the 5 principal AWS Migration steps you need to consider:

Planning and Assessment
Migration Tools
AWS Cloud Storage Options
Migration Strategies
Application Migration Options

Planning and Assessment

The planning and assessment phase is divided into:

Financial Assessment
Security & Compliance Assessment
Technical and Functional assessment

Financial Assessment

Before deciding on-premises to cloud migration, you need to estimate the cost of moving data to the AWS cloud. A careful and detailed analysis is required to weigh the financial considerations of on-premises center versus employing a cloud-based infrastructure.

Security and Compliance Assessment

Overall risk tolerance
Main concerns around availability, durability, and confidentiality of your data.
Security threats
Options available to retrieve all data back from the cloud

Classify your data according to these concerns. This will help you decide which datasets to move to the cloud and which ones to keep in-house.

Technical and Functional Assessment

Assess which applications are more suited to the cloud strategically and architecturally.

Points to consider:

Which applications or data should best be moved to the cloud first?
Which data can we transfer later?
Which applications should remain on-premises?
Can we reuse our existing resource management/configuration tools?
What do we do about support contracts for hardware, software, and networking?

For small-scale data migrations

Unmanaged Cloud Data Migration Tools

For simple, low-cost methods for transferring smaller volumes of data:

Glacier command line interface- On-premises data → Glacier vaults
S3 command line interface- Write commands → Data moves directly into S3 buckets
Rsync- Open source tool combined with 3rd party file system tools. Copy data directly → S3 buckets

For large-scale data migrations

AWS Managed Cloud Data Migration tools

For moving larger volumes of data:

how much data to migrate? Which AWS data migration tool is best suited

Migrate petabytes of data in batches to the cloud AWS Import/Export Snowball
Migrate exabytes of data in batches to the cloud AWS Snowmobile
Connect directly to an AWS regional data center AWS Direct Connect
Migrate recurring jobs, plus incremental changes over long distances Amazon S3 Transfer Acceleration

Some Practical Strategies for AWS Migration

Forklift Migration Strategy

This is more suitable for self-contained, tightly-connected or stateless applications. Its a “pick up everything and move it in one go to the cloud” method.

Is best suited to smaller environments.

Hybrid Mixed-Migration Strategy

This involves moving some parts of an application to the cloud while leaving other parts of the application on-premises.

It is best suited to migrating larger systems which run multiple applications. However, it can be more time-consuming to complete the migration in this way.

Configuring and Creating AMI Images

AMIs provide the information needed to launch an EC2 instance.

Online data transfer from on-premises to AWS

Here are the online data transfer options.

AWS Virtual Private Network

There are two options for using AWS VPN:

AWS Site-to-Site VPN
AWS Client VPN

AWS VPN is encrypted, easy to configure and cost-effective for small data volumes. However, it is a shared connection, so not as fast or reliable as other options.

AWS Virtual Private Network (AWS VPN) establishes secure private connection your network to AWS.

AWS VPN is encrypted, easy to configure and cost-effective for small data volumes. However, it is a shared connection, so not as fast or reliable as other options.

AWS Database Migration Service

The AWS Database Migration Service as the name suggests handles database migration to AWS. The big advantage of DMS is that the database remains fully operational and usable during the migration.

AWS S3 Transfer Acceleration

To migrate large quantities of data over longer distances to AWS S3, AWS S3 Transfer Acceleration enables you to do this 50-500% faster yet still using the public internet.

Data is routed to S3 via optimized network paths using Amazon CloudFront Edge Locations situated across the globe. This maximizes available bandwidth. You select this service on the S3 Dashboard console, selecting one of two TA options. The transfer acceleration is then activated without any need for special client applications or additional network protocols.

AWS DataSync

AWS DataSync enables users to automate the migration of on-premises storage to S3 or Amazon EFS and can transfer up to 10 times faster than some open source migration services. It deploys an on-premises software agent which connects to your on-premises storage system via NFS (Network File System) and SMB (Server Message Block) protocols.

DataSync also takes care of much of the transfer overhead such as Running instances, encryption, managing scripts, network optimization, and validating data all while transferring data up to 10 times faster than many open source migration services.

It can be used to copy data via AWS Direct Connect or public internet to AWS, and is suitable for both one-time data migration, and recurring workflows, as well as for automated backup and recovery actions.

AWS Direct Connect

AWS Direct Connect is a dedicated connection from your on-premises to AWS.

As with AWS VPN, Direct Connect provides an encrypted connection between your on-premises environment and AWS.

However, Direct Connect does not use the public internet and instead runs via a private connection it establishes which will be either via a 1 GB or 10 GB fiber-optic Ethernet cable used to connect your router to an AWS Direct Connect router. On other words, the Direct Connect solution is part software and part hardware.

Because of this dedicated connection, Direct Connect is significantly more costly than using just public internet-and-VPN solutions.

But if you need to transfer or stream very large amounts of data back and forth to the AWS Cloud, then a Direct Connect line may be the best solution. However for smaller transfer one-off migrations it is not so suited.

AWS Storage Gateway

Storage Gateway enables users to connect and extend their on-premises applications to AWS storage.

Storage Gateway provides cloud-backed file shares and provides a low-latency cache for on-premises applications to access data in AWS.

This service has three alternative gateways available:

File Gateway: data is stored in S3 using Amazon S3 File Gateway or using fully-managed file shares through Amazon FSx File Gateway.

Tape Gateway: this is a virtual tape library (VTL) which integrates with existing backup software for long-term storage on S3 Glacier and S3 Glacier Deep Archive.

Volume Gateway: this stores data locally, backing up block volumes with EBS snapshots

AWS data transfer pricing

AWS wants to encourage potential customers to use its platform, so generally speaking it doesn’t charge for migrating data to AWS.

However note that there are often charges levied for transferring back out again from AWS.

Generally, the charges for data migration depend on the resources and infrastructure used in facilitating the transfer. This will depend on the method you choose, your region/s used, the instances and other resources you use, and how fast the connection is.

As from April 2022, inter-Availability Zone (AZ) data transfers within the same AWS Region for AWS PrivateLink, AWS Transit Gateway, and AWS Client VPN are now free of charge.

The best way to calculate your exact data transfer costs is to use the AWS Pricing Calculator and the AWS Cost Explorer.

AWS VPN pricing

AWS VPN costs are calculated according to how many hours the connection is active:

$0.05 per Site-to-Site VPN connection per hour and per AWS Client VPN connection per hour for connections to US

AWS Database Migration Service pricing

If you’re using AWS Database Migration Service to transfer existing databases to Amazon Aurora, Redshift, or DynamoDB, then you can enjoy free usage for six months.

After that time, you only pay for the compute resources, ie instances that you use to port databases to AWS, plus any additional log storage space required.

Each DMS database migration instance will include sufficient storage for swap space, replication logs, and data caching to cover the majority of cases.

On-demand EC2 instances are priced by hourly usage, depending on how powerful the instance is, and whether you are choosing single or multiple availability zones for your instances.

Instance pricing is from $0.018 per hour, up to $21.65 per hour for multi-AZ instances with fastest processor performance and lowest network latency.

AWS S3 Transfer Acceleration pricing

Pricing for AWS S3 Transfer Acceleration service is based on the volume of data you are migrating to S3, rather than how long you are using the connection.

examples:

Data accelerated via Edge Locations in the United States, Europe, and Japan: $0.04 per GB

Data accelerated via all other AWS Edge Locations: $0.08 per GB

Transfer Acceleration constantly monitors its own speed, and if speeds are not faster than a standard transfer via public internet then you will not be charged for the service.

AWS DataSync pricing

For AWS DataSync, you are charged according to the amount of data you transfer via the service. This is currently priced at $0.0125 per gigabyte (GB) of data transferred.

AWS Direct Connect pricing

Direct Connect is priced by the hour. There are two cost options according to the capacity of your Dedicated Connection:

1G: $0.30/hour

10G: $2.25/hour

If you wish to transfer data out using Direct Connect, then there are additional charges to pay for this facility.

AWS Storage Gateway pricing

Charges for AWS Storage Gateway are based on the type and amount of storage you use, as well as the requests you make and the volume of data you are transferring out.

Data Transfer out from AWS Storage Gateway service to on-premises gateway device is charged between $0.05-$0.09 per GB.

Data Transfer in via your gateway device to Amazon EC2 costs $0.02 per GB.

Some Tips For Minimizing Data Migration Costs

Keep your data transfer within a single AWS Region and Availability Zone

Utilize cost allocation tags to identify and analyse where you’re incurring your highest data transfer costs

Deploy Amazon CloudFront to reduce EC2 Instance/s to public Internet transfer costs, and utilize CloudFront’s free tier for the first year of use (note this is valid only up to 50 GB of outbound data transfer and 2 million HTTP requests per month)

Reduce the volume of data that you need to transfer whenever possible before starting the migration.

Deploy VPC endpoints to avoid routing traffic via the public Internet when connecting to AWS

AWS suggest the following schema for deciding on which migration method to choose:

Time Overhead for Migrating Data to AWS

This is the formula suggested by AWS to determine how long it will take to transfer data to AWS from your on-premises site.

Number of Days = (Total Bytes)/(Megabits per second * 125 * 1000 * Network Utilization * 60 seconds * 60 minutes * 24 hours)

Let’s consider a very simple example consisting of just one virtual server machine of say 20GB in total size (no separate file server or other devices in this example)

So that will give us following calculation:

don’t forget to convert megabytes to megabits first. So our 20GBytes becomes using the table at https://convertlive.com/u/convert/gigabytes/to/megabits#20 for this:

Total Bytes will be:

20 Gigabytes = 21474836480 Bytes

21 474 836 480 Bytes

that’s just over 21.4 billion Bytes

so Number of Days = (Total Bytes)/(Megabits per second * 125 * 1000 * Network Utilization * 60 seconds * 60 minutes * 24 hours)

Connection & Data Scale Method Duration

Less than 10 Mbps & Less than 100 GB Self-managed ~ 3 days
Less than 10 Mbps & Between 100 GB – 1 TB AWS-Managed ~ 30 days
Less than 10 Mbps & Greater than 1 TB AWS Snow Family ~ weeks
Less than 1 Gbps & Between 100 GB – 1 TB Self-managed ~ days
Less than 1 Gbps & Greater than 1 TB AWS- Managed / Snow Family ~ weeks

Post AWS Migration Stage

After completing the migration process, make sure you run all necessary tests, and confirm everything is working correctly.

In particular you should look at configuring CloudWatch, CloudTrail and other monitoring services, plus AWS Auto Scaling and CloudFront if required.

AWS Security Services

By editor on September 20, 2022 in

Services that provide DDOS Protection on AWS

AWS Shield Standard, free of charge, is activated by default

AWS Shield Advanced – 24×7 premium protetion, fee charged and access to AWS DRP DDOS Response Team v expensive about 3000 USD per month.

AWS WAF filters specific requests based on rules – layer 7 http – for app load balancer, api gateway and CloudFront
you can define web acl : geo block, ip address blocks, sql injection etc

CloudFront and Route 53: uses global edge network, combined with Shield can provide attack mitigation at the edge

You can utilize AWS AutoScaling to leverage up if there is an attack.

You get full DDOS protection by combining Shield, WAF, CloudFront and Route53.

Penetration testing can be carried out by customers for 8 services eg EC2, RDS, CF etc – you don’t need any authorization to do this but you cannot do simulated ddos attacks on your system or dns zone walking on route 53, nor flooding tests

important note:
for any other simulated attacks, contact aws first, to check, otherwise it is not authorized – and could be seen as an infrastructure attack on aws!

AWS Inspector:

chargable, first 15 days free. not cheap. cost per instance or image scanned.

does automated security assessments, eg for EC2

sends reports to security hub and event bridge

leverages the System Manager SSM agent

for Containers pushing to ECR – assesses containers as they are moved to ECR

it is ONLY for EC2 and container infra. But only done when needed.

checks packages against CVE – package vulnerability scan

also does network reachability for EC2

that is all.

Logging on AWS – quick overview

aws services generate a wide range of logs

cloudtrail trails, config rules, cw logs vpc flow logs, elb access logs cloud front logs, waf logs,

exam question!
LOGS can be analyzed using AWS Athena if stored on S3.

you should encrypt logs stored on S3 and control the access to them by deploying iam and bucket policies plus mfa.

and always remember:
don’t log a server that is logging! otherwise you create an endless logging loop!

and move logs to glacier for cost saving

and also use glacier vault which locks the logs so they cant be tampered with.

AWS Guard Duty

this uses intelligent threat discovery and ML learning to detect

no need to install any software, works in the backend, only need to activate, but it chargeable
esp analyses cloudtrail, vpc flow logs, dns logs, kubernetes audit logs, looks for unusual api calls etc

you can set up cloudwatch events rules to connect to labda or sns

exam q
also can protect against cryptocurrency attacks, has a dedicated function for it. – comes up in exam

AWS Macie

a fully managed data security and data privacy service which uses ML pattern matching to protect your data

helps identify and alert esp re PII – personal identifiable information

can notify event bridge

AWS Trusted Advisor – only need to know overview for the exam

no need to install, is a service

core checks and recommendations — available for all customers, these are free

can send you a weekly email notification

full trusted advisor for business and enterprise – fee based
and can then create cloudwatch alarms or use apis

cost optimization

looks for underutilized resources – but cost optimizn is not in the free core checks, so you need to upgrade for this.

performance

ec2s ebs cloud front

security

mfa security used or not, iam key rotation etc, exposed access keys
s3 bucket permissions, security group issues, esp unrestricted ports

fault tolerance — eg abs snapshot age,

service limits

AWS CloudTrail, CloudWatch, and Config Compared

By editor on September 20, 2022 in

CloudTrail is a services which provides governance, compliance, and auditing of your AWS account by logging and monitoring account activities.

What’s the difference between CloudWatch and CloudTrail

AWS CloudWatch

CloudWatch is a monitoring tool used for real-time monitoring of AWS resources and applications. It provides a monitoring service that analyzes the performance of the system.

CloudWatch can be used to detect irregular behaviour in AWS environments. It monitors various AWS resources including EC2, RDS, S3, Elastic Load Balancer, etc. It can also be used with CloudWatch Alarms.

2. AWS CloudTrail

CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. It continuously logs and monitors the activities and actions across your AWS account. It provides the event history of your AWS account including data about who is accessing your system. Remediation actions can also be taken by CloudTrail.

While CloudWatch reports on the activity and health and performance of your AWS services and resources, CloudTrail by contrast is a log of all the actions that have taken place inside your AWS environment.

CloudTrail can record API activity in your AWS account and reports an event within 15 minutes of the API call.

It provides auditing services for AWS accounts. In CloudTrail, Logs are saved in an S3 bucket.

However, you can receive notifications of specific CloudTrail events immediately by sending them via the CloudWatch Event Bus.

While CloudTrail only writes to your S3 bucket once every five minutes, it sends events to the CloudWatch Event bus in near real-time as these API calls are observed.

CloudWatch monitors performance. For example, tracking metrics of an EC2 instance or keeping track of your Amazon DynamoDB performance or seeing how Amazon S3 is performing. CloudWatch allows you to collect default metrics for over 70 AWS services.

It also has a “Custom Metrics” feature that enables you to collect a metric that is specifically important to your system. For example, to measure how people are using your application.

AWS CloudTrail

AWS CloudTrail is principally used for auditing API activity, tracking who did what and when, and securely logging this information to Amazon S3 for later analysis.

Thus CloudTrail keeps track of what is done in your AWS account, when, and by whom. For example, with CloudTrail you can view, search, and download the latest activity in your AWS account to check it there are any abnormal or unusual actions and if so, by whom. This type of reporting is called auditing and it is the core service of CloudTrail.

CloudTrail tracks data events and management events:

Data events are object-level API requests made to your resources. For example, when an item is created or deleted in a DynamoDB table.

Management events log changes (mostly creation or deletion changes) to your environment, such as the creation or deletion of the entire DynamoDB itself.

CloudTrail tracks which applications or persons took these actions and stores the details in logfiles. These logfiles are encrypted and stored in S3.

Note that CloudWatch has CloudWatch Alarms which you can configure and metric data is retained for 15 months. CloudTrail on the other hand has no native alarms. However, you can configure CloudWatch Alarms for CloudTrail, but you have to store logs in S3.

In a nutshell:

CloudWatch is for performance. Think of CloudWatch as monitoring application metrics.

CloudTrail is for auditing. Think of CloudTrail as tracking API activity within an account.

AWS Config vs. CloudTrail

In the configuration and monitoring category AWS, there are two major AWS monitoring tools that are similar and are easy to confuse. They are AWS Config and AWS CloudTrail.

Config and CloudTrail are different tools with different purposes.

What is AWS Config?

AWS Config is a service that lets you set configuration rules for your AWS resources to comply with. It then tracks whether the resources comply with those rules.

Whenever a resource has changed, Config records the change in a configuration history in an S3 bucket. It stores a snapshot of the system at a regular period of time set by you. It also has a dashboard that presents an overview of your resources and their configurations.

What is AWS CloudTrail?

CloudTrail is a logging service that records all API calls made to any AWS service. It records the details of the API call such as which user or application made the call, the time and date it happened and the IP address it originated from.

There is also another AWS logging service called CloudWatch Logs, but unlike CloudWatch Logs which reports application logs, CloudTrail reports on how AWS services are being used in your environment.

Where CloudTrail and Config are Similar

Config and CloudTrail have a number of things in common.

Both are monitoring tools for your AWS resources. Both track changes and store a history of what happened to your resources in the past. Both are used for compliance and governance, auditing and security policies.

If you notice something unusual or going wrong with your AWS resources, then chances are you’ll see it reported in both CloudTrail and Config.

Where CloudTrail and Config are Different

Note that AWS Config Rules is not a cheap service. There is no free tier, you pay a fee per config item per region.

Though both often report on the same events, their approach is different. Config reports on what has changed in the configuration, whereas CloudTrail reports on who made the change, and when, and from which IP address.

Config reports on the configuration of your AWS resources and creates detailed snapshots of how your resources have changed.

CloudTrail focuses on the events or API calls behind those changes, focusing on users, applications, and activities performed in your environment.

Where CloudTrail and Config work together

By taking a different approach to the same events, CloudTrail and Config make a good combination. Config is a great starting point for ascertaining what has happened to your AWS resources, while CloudTrail can give you more information from your CloudTrail logs.

Config watches and reports on instances of rules for your resources being violated. It doesn’t actually allow you to make changes to these resources from its own console.

By contrast, CloudTrail gives you more control by integrating with CloudWatch Events to allow you to set automated rule-based responses to any event affecting your resources.

In the case of security breaches, if multiple changes have been made by an attacker in a short period of time, Config might not report this in detail.

Config stores the most recent and important changes to resources but disregards smaller and more frequent changes.

CloudTrail by contrast records every single change in its logs. It also has an integrity validation feature that checks if the intruder or attacker manipulated the API logs to cover their activity track.

Should You Use AWS Config or CloudTrail for Security?

Both Config and CloudTrail have a role to play together. Config records and notifies about changes in your environment. CloudTrail helps you find out who made the change, from where, and when.

A good way to think of it is that AWS Config will tell you what your resource state is now or what it was at a specific point in the past whereas CloudTrail will tell you when specific events in the form of API calls have taken place.

So you ought to use both. Config Rules triggers on a change in the status of your system, but it will often only give you an update on the state of the system itself.

CloudTrail meanwhile provides you with a log of every event which details everything that has taken place and when and by whom. This helps identify all the causes that may have led to the security problem in the first place.

Remember also that AWS Config Rules does not prevent actions from happening – it is not a “deny”.

But – you can do “remediations” of resources that are identified as non-compliant. This can be done for example via SSM Automation Documents. Config then triggers an auto-remediation action that you define.

Notifications:

you can use EventBridge to receive notifications from Config, from there you can also send the notifications onto eg Lambda functions, SNS or SQS.

AWS Additional Monitoring Tools

By editor on September 17, 2022 in

AWS Config

AWS Config is an AWS fully managed change management solution within AWS. It allows you to track the change history of individual resources and configure notifications when a resource changes.

This is achieved by means of config rules. A config rule represents the desired state that the resource should be in.

Config rules allow you to monitor for systems that fall outside of your set baselines and identify which changes caused the system to fall out of compliance with the baseline. AWS Config is enabled on a per-region basis, so you need to enable it for every region in which you want to use it.

Bear in mind that AWS Config is a monitoring tool and does not actually enforce baselines, nor does it prevent a user from making changes that cause a resource to move out of compliance.

AWS Config enables you to capture the configuration history for your AWS resources, maintain a resource inventory, audit and evaluate changes
in resource configuration, and enable security and governance by integrating notifications with these changes. You can use it to discover AWS resources in your account, continuously monitor resource configuration against desired resource configuration, and check the configuration details for a resource at a given point in time.

AWS Config is used to assess compliance as according to your set internal guidelines for maintaining resource configurations, as well as enabling compliance auditing, security analysis, resource change tracking, and assisting with operational troubleshooting.

AWS Trusted Advisor

AWS Trusted Advisor service analyzes and checks your AWS environment in real-time and gives recommendations for the following four areas:

Cost optimization
Performance
Security
Fault tolerance

Trusted Advisor or TA integrates with AWS IAM so you can control access to checks as well as to categories.

The current status of these checks is displayed in the TA dashboard as follows:

Red: Action recommended
Yellow: Investigation recommended
Green: No problem detected

Where the colour is red or yellow, TA provides alert criteria, recommended actions, and relevant resource details, such as details of the
security groups allowing unrestricted access via specific ports.

Six core checks are available for all AWS customers free of charge.

Five checks for security plus one check for performance:

eg:
service limits
IAM use
security groups-unrestricted ports
MFA on root account
Elastic block storage public snapshot
RDS public snapshot.

AWS Inspector

AWS Inspector provides for the automation of security assessments. The assessments can be set to run on a schedule or when an event occurs that is monitored by Amazon CloudWatch, or also via an API call. The dashboard shows the assessments, as well as the findings from the various scans that have run.

Amazon Inspector makes use of assessment templates that define which sets of rules you want to run against your environment.

Two types of assessments are offered by AWS Inspector: network assessments and host assessments.

Network assessments don’t require any agent to be installed. However if you want detailed information about processes running on a specific port then you need to install the AWS Inspector Agent.

Host assessments however require the Inspector Agent to be installed. These assessments are far more detailed and scan for things such as vulnerable versions of software, violations of security best practices, and areas that should be system hardened. You can select these assessments set up AWS Inspector.

You create an assessment template in Inspector which you then use to assess your environment by means of an Assessment Run which will then report on its findings.

Templates contain one or more rules packages. A rules package defines what you are checking for. Note that you can’t create custom rules packages; you can use only the rules packages provided by AWS. Currently, these are the rules packages available, listed by assessment type:

Network assessments

Network Reachability: This rules package checks your environment’s network configurations, including your security groups, network access control lists (NACLs), route tables, subnets, virtual private cloud (VPC), VPC peering, AWS Direct Connect and virtual private gateways (VPGs), Internet gateways (IGW), EC2 instances, elastic load balancers (ELBs), and elastic network interfaces (ENIs).

Host assessments

Common Vulnerabilities and Exposures (CVE): This rules package checks your systems to see if they are vulnerable to any of the CVEs reported.

Center for Internet Security (CIS) Benchmarks: This rules package assesses your systems against CIS benchmarks specific to your OS.

There are Level 1 and Level 2 checks. Level 1 is usually safe to implement; Level 2 is more risky as the settings in Level 2 may have unintended side effects. Level 2 is usually used in environments where a very high level of security is required.

Security Best Practices: This rules package assesses how well your environment confirms to security best practices. Eg, it will check that a Linux EC2 instance cannot be logged into via SSH.

Runtime Behavior Analysis: This rules package identifies risky behaviors on your systems, such as using insecure protocols for connecting or open ports that are not in use.

AWS GuardDuty

GuardDuty is the AWS intrusion detection system (IDS) or intrusion prevention system (IPS). It uses threat intelligence feeds and analyzes logs from multiple sources, such as VPC flow logs, AWS CloudTrail event logs, and DNS logs.

GuardDuty can alert you to suspicious activity that could indicate potential issues such as leaked user account credentials, privilege escalation attacks, and possible command-and-control type activities.

GuardDuty scans specifically for three types of activity:

Reconnaissance
Instance compromise
Account compromise

Reconnaissance is the first step of an attack and was defined in the “Cyber Kill Chain”, developed by Lockheed Martin. During the reconnaissance
phase, an attacker is learning about your environment through actions such as vulnerability scans to probe for IP addresses, hostnames, open ports, and misconfigured protocols.

GuardDuty can detect also utilize threat intelligence feeds to detect IP addresses known to be malicious. You can use findings reported by GuardDuty to automatically remediate the vulnerability become it develops into a security violation.

The next type of activity is instance compromise. This consists of several indicators that may be present, such as malware command and control, crypto miners, unusual traffic levels or unusual network protocols, or communication with a known malicious IP.

AWS CloudWatch Monitoring Overview

By editor on September 16, 2022 in

AWS CloudWatch is the basic AWS monitoring service that collects metrics on your resources in AWS, including your applications, in real time.

You can also collect and monitor log files with AWS CloudWatch. You can set alarms for metrics in CloudWatch to continuously monitor performance, utilization, health, and other parameters of your AWS resources and take action when metrics cross set thresholds.

CloudWatch is a global AWS service, so it can monitor resources and services across all AWS regions via a single dashboard.

CloudWatch provides basic monitoring free of charge at 5-minute intervals as a serverless AWS service, thus there is no need to install any additional software to use it.

For an additional charge, you can set detailed monitoring that provides data at 1-minute intervals.

AWS CloudWatch has a feature that allows you to publish and retain custom metrics for a 1-second or 1-minute duration for your application, services, and resources, known as high-resolution custom metrics.

CloudWatch stores metrics data for 15 months, so even after terminatíng an EC2 instance or deleting an ELB, you can still retrieve historical metrics for these resources.

How CloudWatch Works

CW Monitoring Is Event-Driven

All monitoring in AWS is event-driven. An event is “something that happens in AWS and is captured.”

For example, when a new EBS volume is created, the createVolume event is triggered, with a result of either available or failed. This event and its result are sent to CloudWatch.

You can create a maximum of 5000 alarms in every region in your AWS account.

You can create alarms for functions such as starting, stopping, terminating, or recovering an EC2 instance, or when an instance is experiencing a service issue.

Monitoring Is Customizable

You can define custom metrics easily. A custom metric behaves just like a predefined one and can then be analyzed and interpreted in the same way as standard metrics.

One important limitation of CloudWatch – exam question!

CloudWatch functions below the AWS Hypervisor, which means it functions below the virtualization layer of AWS.

This means it can report on things like CPU usage and disk I/O…but it cannot see beyond what is happening *above* that layer.

This means CloudWatch CANNOT tell you what tasks or application processes are affecting performance. Remember this point!

Thus it cannot tell you about disk usage, unless you write code that checks disk usage and send that as a custom metric to CloudWatch.

This is an important aspect that can appear in the exam. You might be asked if CloudWatch can report on memory or disk usage by default; it cannot.

Monitoring Drives Action

The final piece of the AWS monitoring puzzle is alarms – this is what occurs after a metric has reported a value or result outside a set “everything is okay” threshold.

When this happens, an alarm is triggered. Note that an alarm is not necessarily the same as “something is wrong”; an alarm is merely a notification that something has happened at a particular point.

For example, it could be running some code in Lambda, or sending a message to an Auto Scaling group telling it to scale in, or sending an email via the AWS SNS message service.

Think of alarms as saving you from having to sit monitoring the CloudWatch dashboard 24×7.

One of your tasks as SysOp is to define these alarms.

CloudWatch Is Metric- and Event-Based

Know the difference between metrics and events.

An event is predefined and is something that happens, such as bytes coming into a network interface.

The metric is a measure of that event eg how many bytes are received in a given period of time.

Events and metrics are related, but they are not the same thing.

CloudWatch Events Are Lower Level

An event is something that happens, usually a metric changing or reporting to CloudWatch, but at a system level.

An event can then trigger further action, just as an alarm can.

Events are typically reported constantly from low-level AWS resources to CloudWatch.

CloudWatch Events Have Three Components

CloudWatch Events have three key components: events, rules, and targets.

An event:

the thing being reported. Events describe change in your AWS resources. They can be thought of as event logs for services, applications and resources.

A rule:

an expression that matches incoming events. If the rule matches an event, then the event is forwarded to a target for processing.

A target:

is another AWS component, for example, a piece of Lambda code, or an Auto Scaling group, or an email or SNS/SQS message that is sent out.

Both alarms and events are important and it is essential to monitor both.

CloudWatch Namespaces

A CloudWatch Namespace is a container for a collection of related CloudWatch metrics. This provides for a way to group metrics together for easier understanding and recognition.

AWS provides a number of predefined namespaces, which all begin with AWS/[service].

Eg, AWS/EC2/CPUUtilization is CPU utilization for an EC2 instance,

AWS/DynamoDB/CPUUtilization is the same metric but for DynamoDB.

You can add your own custom metrics to existing AWS namespaces, or else create your own custom namespaces in CloudWatch.

exam question:

CloudWatch can accept metric data from 2 weeks earlier and 2 hours into the future but make sure your EC2 instance time is set accurately for this to work correctly!

Monitoring EC2 Instances

CloudWatch provides some important often-encountered metrics for EC2.

Here are some of the most common EC2 metrics which you should be familiar with for the exam:

CPUUtilization – one of the fundamental EC2 instance metrics. It shows the percentage of allocated compute units currently in use.

DiskReadOps – reports a count of completed read operations from all instance store volumes.

DiskWriteOps – opposite of DiskReadOps, reports a count of completed read operations from all instance store volumes.

DiskReadBytes – reports the bytes read from all available instance store volumes.

DiskWriteBytes – reports the total of all bytes written to instance store volumes.

NetworkIn – total bytes received by all network interfaces.

NetworkOut – total bytes sent out across all network interfaces on the instance.

NetworkPacketsIn – total number of packets received by all network interfaces on the instance (available only for basic monitoring).

NetworkPacketsOut – number of packets sent out across all network interfaces on the instance. Also available only for basic monitoring.

S3 Metrics

There are many S3 metrics, but these are the most common ones you should know:

BucketSizeBytes – shows the daily storage of your buckets as bytes.

NumberOfObjects – the total number of objects stored in a bucket, across all storage classes.

AllRequests – the total number of all HTTP requests made to a bucket.

GetRequests – total number of GET requests to a bucket. There are also similar metrics for other requests: PutRequests , DeleteRequests , HeadRequests , PostRequests , and SelectRequests.

BytesDownloaded – total bytes downloaded for requests to a bucket.

BytesUploaded – total bytes uploaded to a bucket. These are the bytes that contain a request body.

FirstByteLatency – per-request time for a completed request, by first-byte millisecond.

TotalRequestLatency – the elapsed time in milliseconds from the first to the last byte of a request.

CloudWatch Alarms

Alarms Indicate a Notifiable Change

A CloudWatch alarm initiates action. You can set an alarm for when a metric is reported with a value outside of a set level.

Eg, for when your EC2 instance CPU utilization reaches 85 percent.

Alarms have three possible states at any given point in time:

OK : means the metric lies within the defined threshold.

ALARM : means the metric is below or above the defined threshold.

INSUFFICIENT_DATA : can have a number of reasons. The most common reasons are that the alarm has only just started or been created, that the metric it is monitoring is not available for some reason, or there is not enough data at this time to determine whether the alarm is OK or in ALARM state.

CloudWatch Logs

CloudWatch Logs stores logs from AWS systems and resources and can also handle the logs for on-premises systems provided they have the Amazon Unified CloudWatch Agent installed.

If you are monitoring AWS CloudTrail activity through CloudWatch, then that activity is sent to CloudWatch Logs.

If you need a long retention period for your logs, then CloudWatch Logs can also do this.

By default logs are kept forever and never expire. But you can adjust this based on your own retention policies.

You can choose to keep logs for only a single day or go up to 10 years.

Log Groups and Log Streams

You can group logs together that serve a similar purpose or from a similar resource type. For
example, EC2 instances that handle web traffic.

Log streams refer to data from instances within applications or log files or containers.4

CloudWatch Logs can send logs to S3, Kinesis Data Streams and Kinesis Data Firehose, Lambda and ElasticSearch

CloudWatch Logs – sources can be:

SDKs,

CloudWatch Logs Agent,

CloudWatch Unified Agent

Elastic Beanstalk

ECS – Elastic Container Service

Lambda function logs

VPC Flow Logs – these are VPC specific

API Gateway

CloudTrail based on filters

Route53 – logs DNS queries

Define Metric Filters and Insights for CloudWatch Logs

You can apply a filter expression eg to look for a specific IP in a log or the number of occurrences of “ERROR” in the log

Metric filters can be used to trigger CloudWatch Alarms

CloudWatch Logs Insights can be used to query logs and add queries to CloudWatch Dashboards

CloudWatch Logs – Exporting to S3

NOTE

this can take 12 hours for the data to become available for export – so it is not real time. For this you should use Log Subscriptions.

The API call for this is “CreateExportTask”

CloudWatch Log Subscriptions

You apply a “subscription filter” to the CloudWatch Log before sending it to eg a Lambda function managed by AWS/or to a custom-designed Lambda function and then from there as real-time data on to eg ElasticSearch. Or, you might send it from Subscription Filter and then to Kinesis.

You can also send or aggregate logs from different accounts and different regions to a subscription filter in each region and from there to a common single Kinesis Data Stream and Firehose and from there in near-real time on to eg S3.

Unified CloudWatch Agent

The AWS Unified CloudWatch Agent provides more detailed information than the standard free CloudWatch service.

You can also use it to gather logs from your on-premises servers in the case of a hybrid environment and then centrally manage and store them from within the CloudWatch console.

The agent is available for Windows and Linux operating systems.

When installed on a Windows machine, you can forward in-depth information to CloudWatch from the Windows Performance Monitor, which is built into the Windows operating system.

When CloudWatch is installed on a Linux system, you can receive more in-depth metrics about CPU, memory, network, processes, and swap memory usage. You can also gather custom logs from applications installed on servers.

To install the CloudWatch agent, you need to set up the configuration file.

AWS NACLs – Network Access Control Lists

By editor on September 14, 2022 in

The AWS Network Access Control List (NACL) is a security layer for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets.

NACLs vs. Security Groups

NACLs and Security Groups (SGs) both have similar purposes. They filter traffic according to rules, to ensure only authorized traffic is routed to its destination.

NACLs

NACLs are used to control access to network resources. They reside on subnets and evaluate traffic based on defined rules which you set, and use these rules to determine whether or not traffic should be allowed to pass through the subnet.

NACLs are “STATELESS” which means they require you to create separate rules for BOTH INCOMING AND OUTGOING traffic. Just because a particular data stream is allowed into the subnet, this doesn’t mean it will automatically be allowed out.

NACLs are processed in numerical ie serial order. Thus if you want traffic to be permitted both in and out of a subnet, you have to set network access rules for both directions.

NACLs are automatically applied to everything within that subnet, so there is no need to apply NACLs to individual resources as they are created. This means less network admin overhead for managers.

Security Groups

Security Groups apply to EC2 instances and operate like a host-based firewall. As with NACLs they apply rules that determine whether traffic to or from a given EC2 instance should be allowed.

This provides for more finely tuned traffic control for resources that have specific network traffic requirements.

Security Groups unlike NACLs are stateful; this means that any traffic that is allowed into your EC2 instance will automatically be allowed out again and vice versa.

All security groups rules are evaluated according to a default “deny everything unless allowed” policy. This means that if no ALLOW exists, then traffic will be blocked.

Security Groups must be applied at the time of resource creation and have to be explicitly configured.

Similarities and Differences Between NACLs and Security Groups

Both NACLs and Security Groups utilize rules that prevent unwanted traffic from accessing your network. The rules themselves also look similar. But a notable difference between them is that NACLs allow for DENY rules to be explicitly created.

It is important to ensure that your security group rules and your NACLs are not working against one another. Thus it is important to understand when it is best to use NACLs and when it is best to use SGs.

The major difference between them is in where they are applied. NACLs are applied at the SUBNET level, while Security Groups are applied at the EC2 instance level.

NACLs protect the network while Security Groups protect the resource.

As NACLs are higher up in the architecture, they apply to a much wider set of resources. Any NACL rule you create will therefore impact the operation of every resource located within the subnet.

Security Groups on the other hand only affect the EC2 instances to which they are attached.

When to Use NACLs

NACLs are best used sparingly. Because NACLs apply to the full set of resources in a subnet, their impact is wide and substantial.

NACLs are most effective for filtering external traffic to internal subnets. They can also be useful for applying traffic controls between the subnets themselves.

Best Practices for Using NACLs

Use NACLs sparingly and deploy them based on the function of the subnet they are attached to

Keep NACLs simple and only use them to deny traffic if possible

Restrict who can create or modify NACLs through IAM rules

Build your Security Group rules into your NACLs

Ensure that your inbound and outbound rules make sense ie that they match

When numbering your NACLs, be sure to leave room for future rules

Audit your rules frequently and delete any rules that are unused or redundant

Deploy NACLs to also control your subnet-to-subnet traffic and ensure logical separation between them

NACLS – Essential Points To Remember For Exam

One NACL per subnet

New subnets always get assigned to a default NACL – this ALLOWS all traffic in and out by default!

BUT – newly created NACLS DENY by default! – if last rule is an * (asterisk)

Rules are numbered from 1 to 32766 – and LOWEST numbers have the HIGHEST priority!

Number your rules in 100s steps for ease of admin.

So if a rule numbered 100 allows and another rule numbered 200 denies for same traffic, then the rule nunber 100 will win – ie the traffic will be allowed.

Remember the basic essential differences between NACLs and SGs

SGs:

operate at EC2 instance level

support allow rules only

are STATEFUL – which means return traffic is ALWAYS automatically allowed, regardless of rules

ALL RULES ARE EVALUATED BEFORE DECIDING WHETHER TO ALLOW TRAFFIC

Applies to an EC2 instance when specified

NACLS:

operate at SUBNET level

support both ALLOW AND DENY rules

are STATELESS – which means return traffic has to be explicitly allowed by setting appropriate NACL rules – using ephemeral ports

Rules are evaluated in order from LOWEST to HIGHEST, lowest first match wins

Automatically apply to all EC2s in the respective subnet.

Reachability Analyzer

This is very useful in debugging any SG or NACL traffic problems.

AWS – Choosing an AWS Database

By Admin on August 28, 2022 in

AWS offers several db solutions.

A number of questions need to be considered when choosing an AWS DB solution:

Questions to Consider

What is your need – read heavy, write heavy, balanced?

Throughput volume?

Will this change, does it need to scale, will it fluctuate up and down?, is it predictable or not?

How much data volume to store – and for how long

will it grow?

what is the average object size?

how do you need to access it?

what data durability do you require? what is the “source of truth” for the data?

are there compliance requirements?

what latency requirements – how many concurrent users

what is the data model – how will you query the data, will there be data joins? structured, semi-structured?

strong schema, flexible, reporting needs? searching? RBDMS or NoSQL?

what about license costs – cloud native db such as Aurora possible or not?

Overview of Database Types on AWS

RDBMS such as sql, oltp, this means: RDS, or Aurora, esp good for joins

NoSQL DBs such as DynamoDB – json, elasticache – key/value pairs or Nepture – good for graphs, but no joins and no sql

Object Stores: S3 for big objects, Glacier for backup and archives – may not seem like a DB but it works like one

Data Warehouse solutions eg SQL Analytics/BI, Redshift (OLAP), Athena

Search solutions: eg ElasticSearch (json), for free text unstructureed searches

Graph solutions: Neptune – this displays relationships between data

Overviews of AWS DB Solutions

RDS Overview

its a managed db on the postgresql/myswl/Oracle/SQL level

you must however an ec2 instance and ebs vol type and sufficient size

it supports read replicas and multi-AZ
security is via iam and security groups, kms, and ssl in transit
backup, snapshot and point in time restores all possible

managed and scheduled maintanance

monitoring available via cloudwatch

use cases include:

storing relational datasets rdbms/oltp performing sql queries, transactional inserts, update, delete is possible

rds for solutions architect, considerations include these “5 pillars”:

operations
security
reliability
performance
cost

operations_ small downtimes when failover happens, when maintenance happens, when scaling read replicas, ec2 instances, and restoring from ebs, this requires manual intervention, and when application changes

security: aws is responsible for os security, but we are responsible for setting up kms, security groups, iam policies, authorizing users in db and using ssl

reliability: the multi-az feature makes rds v reliable, good for failover in failure situations

performance: dependent on ec2 instance type, ebs vol type, can add read replicas, storage autoscaling is possible, and manual scaling of instances is also possible

costs: is pay per hour based on provisioned number and type of ec2 instances and ebs usage

Aurora Overview

Compatible API for PostgreSQL and MySQL

Data is held in 6 replicas across 3 AZs
It has auto-healing capability
Multi-AZ Auto Scaling Read Replicas
Read Replicas can be global

Aurora DB can be global for DR or latency purposes
auto Scaling of storage from 10GB to 128 TB

Define EC2 instance type for Aurora instances

same security/monitoring/maintenance as for RDS but also has

Aurora Serverless for unpredictable/intermittent workloads
Aurora Multi-Master for continuous writes failover

use cases: same as for RDS but with less maintenance/more flexibility and higher performance

Operations: less operations, auto scaling storage

security: AWS as per usual, but we are responsible for kms, security groups, iam policy, user auth and ssl

reliability: multi AZ, high availability, serverless and multimaster options

performance 5x performance, plus max 15 read replicas (vs only 5 for rds)

cost: pay per hour acc to ec2 and storage usage, can be lower cost than eg oracle

ElastiCache Overview

it is really just a cache not a database

is a managed Redis or Memcached – similar to RDS but for caches

its an in-memory data store with very low latency
you must provision an ec2 instance type to use it

supports redis clustering and multi-az, plus read replicas using sharding

security is via iam, security groups, kms, and redis auth – NO IAM

backup, snapshot, point in time restores
managed and scheduled maintenance
monitoring via cloudwatch

use case: key/value store, frequent reads, less writes, cache results for db queries, storing of session data for websites, but cannot use sql – latter is important to be aware of.

you retrieve the data only by key value, you can’t query on other fields

operations: same as rds
security: usual can use iam policies,for users Redis Auth, and ssl

reliability: clustering and multi AZ

performance: in memory so extremely fast, read replicas for sharding, very efficient

cost: similar to rds pricing based on ec2 and storage usage

DynamoDB

proprietary to AWS
a managed NoSQL DB
serverless, provisioned, auto-scaling, on-demand capacity

can replace elasticache as a key-value store, eg for storing session data

performance is slower than eg rds

highly available, multi-AZ by default, read–writes decoupled, DAW available for read cache

2 options for reads: eventually consistent or strongly consistent

security, authorization-authentication all done via iam

dynamodb streams integrate with lambda

backup-restore and global table feature – if you enable streams

monitoring via cloudwatch

important
but you can only query on primary key, sort key or indexes – exam q!
so you cannot query on “any” attribute – only the above.

use case:

serverless apps development, small docs 100s kb, distrib serverless cache, doesn’t have sql query language available, has transaction capability now built in

S3 Overview

acts as a simple key/value store for objects

great for big objects up to 5TB, not so good for small objects

serverless, scales infinitely, strong consistency for every operation

tiers for migrating data: s3 standard, s3 IA, s3 one-zone IA, Glacier for backups

features include: versioning, encryption, CRR – cross-region replication

security: IAM, bucket policies, ACL

encryption: SSE-S3, SSE-KMS, SSE-C, client side encryption, SSL in transit

use case: static files, key-value stores for big files and website hosting

operations: no operations necessary!
security: IAM, bucket policies, ACL, encryption set up correctly,

reliability: extremely high, durability also extremely good, multi-AZ and CRR

performance: scales easily, very high read/write, multipart for uploads

cost: only pay for storage used, no need to provision in advance, plus network costs for transfer/retrieval, plus charge per number of requests

Athena Overview

fully serverless database with SQL capability
used to query data in S3
pay per query
outputs results back to S3
is secured via IAM

use cases: one-time sql queries, serverless queries on S3, log analytics

Operations: no ops needed! is serverless

security: via S3 using bucket policies, IAM

reliability: is managed, uses Presto engine, highly available
performance: queries scale based on data size

cost: pay per query/per TB of data scanned, serverless

Redshift Overview

is a great data warehouse system

based on PostgreSQL but not used for oltp

instead, its an OLAP – online analytical processing — analytics and data warehousing

10x better performance than other data warehouses, scales to PBs of data

columnar storage of data rather than row-based

it is MPP – uses a massively parallel query execution engine which makes it extremely fast

pay as you go acc to no of instances provisioned
has an SQL interface for performing the queries.

data is loaded from S3, DynamoDB, DMS and other DBs,
1 to 128 nodes possible with up to 128 TB of space per node!

leader node used for query planning and results aggregation
compute node : performs the queries and sends results to leader node

Redshift Spectrum: performs queries directly against S3 with no load to load the data into the redshift cluster

Backup & Restore, SecurityVPC, IAM, KMS, monitoring

Redshift Enhanced VPC Routing: Copy / Unload goes through VPC, this avoids public internet

Redshift Snapshots and DR

has no multi-AZ node

the snapshots are p-i-t- point in time backups of a cluster, stored internally in S3

they are incremental — only changes are saved, makes it fast

can restore to a new cluster

automated snapshots: every 8 hrs every 5gb or according to a schedule, with a set retention period

manual snapshots: snapshot is retained until you delete it

neat feature:
you can configure Redshift to auto copy snapshots to a cluster of another region
either manually or automatically

loading data into redshift:

there are three possible ways:

1. use kinesis data firehouse loads data into redshift cluster via an s3 copy automatically to an S3 bucket

2. using the copy command manually without Kinesis
from S3 bucket via internet – ie without using enhanced vpc routing to redshift cluster
or via vpc not via internet, using enhanced vpc routing

3. from ec2 instance to redshift cluster using jdbc driver
in this case it is much better to write the data in batches rather than all at once.

Redshift Spectrum

must already have a redshift cluster operational

Spectrum is a way to query data that is already in S3 without having to load it into Redshift

you submit the query which is submitted to thousands of Redshift spectrum nodes for processing

Operations: similar to rds
security: uses iam, vpc, kms, ssl as for rds
reliability: auto healing, cross region snapshop copies possible
performance: 10x performance of other data warehousing systems, uses compression
cost: pay per node provisioned about 10% of cost of other dw systems
vs athena: is faster at querying, does joins, aggregations thanks to indexes

redshift = analytics/BI/Data Warehouse

Glue Overview

is a managed extract transform and load ETL service

used to prepare and transform data for analytics
fully serverless service

fetch data from s3 bucket or rds, send to glue which does extracting, transforming and loading to redshift data warehouse

glue data catalog: catalog of all the datasets you have in aws – ie metadata info about your data

you deploy glue data crawler to navigate s3, rds, dynamo DB, it then writes the metadata to the glue data catalog.

this can then be used b Glue Jobs eTL

data discovery -> Athena, Redshift Spectrum, EMR for analytics

all you need to know about Glue for exam

Finally,

Neptune Overview

tip:
if you hear graphs mentioned in exam, this refers to neptune!

is a fully managed graph database

used for
high relationship data
social networking, eg users friends with users, replied to comment on post of user and likes other comments

knowledge graphs eg for wikipedia – links to other wiki pages, lots of these links
this is graph data – giving a massive graph

highly available across 3 AZs with up to 15 read replicas available

point in time recovery, continuous backup to S3

support for kms encryption, https, and ssl

operations: similar to rds
security: iam, vpc, kms, ssl – similar to rds, plus iam authentication
reliability: multi-AZ and clustering
performance: best suited for graphs, clustering to improve performance

cost: similar to rds, pay per provisioned node

remember neptune = graphs database

AWS OpenSearch

– the successor to AWS ElasticSearch

eg dynamodb only allows search by primary key or index,

whereas with OpenSearch you can search ANY field

often used as a complement to other db

also has usage for big data apps
can provision a cluster of instances
built-in integrations: various – kinesis data firehose, aws IoT, CloudWatch Logs

comes with visualization dashboard

operations: similar to rds

security: is via Cognito, IAM, KMS encryption, SSL and VPC
reliability: multi-AZ and clustering
performance: based on elasticsearch project – open source, pbyte scale
cost: pay per provisioned node

all you need to remember:

used to search and index data

Question 1:
Which database helps you store relational datasets, with SQL language compatibility and the capability of processing transactions such as insert, update, and delete?

RDS

Question 2:
Which AWS service provides you with caching capability that is compatible with Redis API?

ElastiCache

Good job!
Amazon ElastiCache is a fully managed in-memory data store, compatible with Redis or Memcached.

Question 3:
You want to migrate an on-premises MongoDB NoSQL database to AWS. You don’t want to manage any database servers, so you want to use a managed NoSQL database, preferably Serverless, that provides you with high availability, durability, and reliability. Which database should you choose?

Amazon DynamoDB

Good job!

Amazon DynamoDB is a key-value, document, NoSQL database.

Question 4:
You are looking to perform Online Transaction Processing (OLTP). You would like to use a database that has built-in auto-scaling capabilities and provides you with the maximum number of replicas for its underlying storage. What AWS service do you recommend?

Amazon Aurora

Good job!

Amazon Aurora is a MySQL and PostgreSQL-compatible relational database. It features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 128TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across 3 AZs.

Question 5:
As a Solutions Architect, a startup company asked you for help as they are working on an architecture for a social media website where users can be friends with each other, and like each other’s posts. The company plan on performing some complicated queries such as “What are the number of likes on the posts that have been posted by the friends of Mike?”. Which database do you recommend?

Amazon Neptune

Good job!
Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets.

Question 6:
You have a set of files, 100MB each, that you want to store in a reliable and durable key-value store. Which AWS service do you recommend?

Amazon S3

Good job!
Amazon S3 is indeed a key-value store! (where the key is the full path of the object in the bucket)

Question 7:
You would like to have a database that is efficient at performing analytical queries on large sets of columnar data. You would like to connect to this Data Warehouse using a reporting and dashboard tool such as Amazon QuickSight. Which AWS technology do you recommend?

Amazon Redshift

Good job!
Amazon Redshift

Question 8:
You have a lot of log files stored in an S3 bucket that you want to perform a quick analysis, if possible Serverless, to filter the logs and find users that attempted to make an unauthorized action. Which AWS service allows you to do so?

Amazon Athena

Good job!
Amazon Athena is an interactive serverless query service that makes it easy to analyze data in S3 buckets using Standard SQL.

Question 9:
As a Solutions Architect, you have been instructed you to prepare a disaster recovery plan for a Redshift cluster. What should you do?

enable automated snapshops, then configure your redshift cluster to autocopy the snapshots to another aws region

Good job!

Question 10:
Which feature in Redshift forces all COPY and UNLOAD traffic moving between your cluster and data repositories through your VPCs?

Enhanced VPC Routing

Good job!

Question 11:
You are running a gaming website that is using DynamoDB as its data store. Users have been asking for a search feature to find other gamers by name, with partial matches if possible. Which AWS technology do you recommend to implement this feature?

ElasticSearch

Good job!
Anytime you see “search”, think ElasticSearch.

Question 12:
An AWS service allows you to create, run, and monitor ETL (extract, transform, and load) jobs in a few clicks.

AWS Glue

Good job!
AWS Glue is a serverless data-preparation service for extract, transform, and load (ETL) operations.

AWS DynamoDB

By Admin on August 26, 2022 in

DynamoDB is a fully managed, highly-available database with replication across multiple AZs

NoSQL – not a relational database! – just simple key:value

single digit millisecond performance.

scales to massive workloads

100TBs of storage

fast and consistence, low latency retrieval

integrated with iam for security authorization and admin
enables event-driven programming via DynamoDB Streams

low cost and auto-scaling

important for exam:

no db provisioning needed you just create a table

Has two able classes: standard and infrequent access IA table class

Basics of DynamoDB

v important – also exam q:
made up of tables – there is NO SUCH THING AS A DATABASE – you only need to create tables!
the db already exists!

you just create tables –

each table has a primary key – must be set at creation time
each table can have an infinite number of items ie rows

each item has attributes can be added over time or they can be null

this is much easier to do at any time than with a conventional relational db.

max size of an item is 400KB so not good for large objects

data types supported are:

scalar types: string, number, binary, boolean, null
document types: list, map
set types: string set, number set, binary set

tables:

have primary key – containing partition key (eg user_id) and sort key (eg game_id)
and
attributes: eg score, result – these are items which are NOT the primary key

Is a great choice for a schema where you need to rapidly evolve the schema, better than the others for this.

read/write capacity modes

control how you manage your table capacity – ie the read and write throughput

2 Capacity Modes for DynamoDB

important for exam:

Provisioned mode – is default, provisioned in advance

we have 2 possible capacity modes:

provisioned mode, where you specify number of read and writes per second, this is the default.

very good for predictable modes.

You have to plan the capacity for this in advance, you pay for provisioned read capacity units RCUs and write capacity units (WCUs)

these “capacity units” are used to set your desired capacity for your database! – set these in the web dashboard of dynamodb when you are provisioning.

you can also add auto scaling mode for both rcu and wcu

and

on-demand mode – much more expensive, but better for unpredictable workloads or sudden transaction spikes where provisioned mode cant scale sufficiently

on-demand mode:

read write automatically scales up and down acc to workload, so no capacity planning needed
you pay for your use, is more expensive 2-3x more

good for unpredictable workloads which can be large or small varying

need to know for exam!

remember always you just create tables, never a database with dynamodb!

you can specify read and write capacity autoscaling separately

DynamoDB Accelerator DAX

a fully manageable, highly available seamless in-memory cache for dynamodb

helps solve read congestion by means of memory caching

microseconds latency for cached data

does not require any application logic modification – so it is fully compatible with existing dynamodb api

applications can access the database via DAX for much faster reads

TTL is 5 mins default for dax data

dax is different from elasticache in that it is meant for dynamodb and does not change the api
good for individual object cache

elasticache
good for storing aggregation result where further processing required

DynamoDB Stream

very easy – is an ordered stream of data that represents items when created or updated or deleted

can then be sent on to kinesis datastreams
can be read by lambda or kinesis client library apps

data is retained for 24 hrs

use cases:

reacting to changes in real time eg welcome emails to new users
to do analytics
to insert into derivative tables or into elastic serarch
or implement cross region replication

Summary of DynamoDB Streams

So, to summarize DynamoDB Streams

Application -> create/update/delete actions to the TABLE -> Dynamo DB Streams

and from the TABLE -> Kinesis Data Streams -> Kinesis Data Firehose

Kinesis Data Firehose

-> for analytics -> Redshift
-> for archiving -> S3
-> for indexing -> OpenSearch

DynamoDB Global Tables

are cross-region

two way or multi-way replication

to provide for low latency across regions

it is
active-active replication

this means apps can read and write to the table from any region

must enable dynamodb streams for this

TTL expiry: automatically deletes items after an expiry timestamp

eg 1 month for each row in the table

DynamoDB Indexes – only need to know at high level for exam

basically, these allow you to query on attributes other than the primary key

2 types:

gsi – global secondary and lsi – local secondary

(all you need to know for now)

by default you query on the primary key, but you can use gsi or lsi to query on other attributes.

Transactions in DynamoDB

these allow you to write to 2 tables at the same time:

the transaction however MUST write to both or else none – in order to keep the tables accurate – so data consistency is maintained

AWS Lambda

By Admin on August 26, 2022 in

Serverless Services in AWS include:

Lambda
DynamoDB
Cognito
API Gateway
S3
SNS/SQS
Kinesis Data Firehose
Aurora Serverless
Step Functions
Fargate

exam tests heavily on serverless knowledge!

AWS Lambda

features:

virtual functions – server to manage
limited by time – short execution processes
runs on demand only, only billed when you are actually using it
the scaling is automated

the benefits of Lambda:

easy pricing – pay per request and compute time

free tier covers 1 million Lambda requests and 400k of GB compute time

integrated with all AWS services and programming languages
easy monitoring via CloudWatch
easy to allocate more resources per function –
up to 10GB of RAM! possible

also, increasing RAM improves CPU and network

Lambda language support:

node.js – javascript
python
java 8
c .net core
golang
c powershell
ruby
custom runtime api eg rust

the Lambda container image — this must implement the Lambda runtime api

note that ecs and fargate are preferred for running arbitrary docker images

Lambda integrates with

api gateway
kinesis
dynamodb
s3

cloudfront
cloudwatch events and eventbridge

cloudwatch logs
sns and sqs
cognito – reacts when a user logs in eg to a database

REMEMBER:

Lambda’s maximum execution time is 15 minutes. If you need longer, you can run your code somewhere else such as an EC2 instance or use Amazon ECS.

Lambda use case:

thumbnail image creation

new image uploaded to s3 then triggers a Lambda function to generate a thumbnail of the image
this is pushed to s3 and meta data to dynamo db.

another example:

a very useful practical example….

a serverless CRON job to run jobs

but for cron you usually need to have a server running, but with Lambda you can do this without a server! – this saves having to implement an EC2 instance for this.

eg cloudwatch events or eventbridge every hour triggers a Lambda function, this is instead of the cronjob!

Lambda Pricing

pay per calls first 1mill requests are free

then 20c per 1 mill requests

pay per duration in increments of 1 ms

400k GBseconds of compute time per month is free, charges thereafter on rising scale

very cheap to run Lambda so it is very popular

you can run jobs using many different program languages

you enter your code in Lambda web console and Lambda then runs the code for you.

you can have Lambda respond to events from various sources – eg data processing, streaming analytics, mobile or iot backends

Lambda takes care of scaling for your load, you don’t have to do anything here!
ie seamless scaling

to create a Lambda function you have 4 possibilities:

author from scratch
use a blueprint – these are pre-configured functions
container image
browse serverless app repository

Lambda Limits per region

important for exam…

for execution:

mem allocation 128 mb to 10 gb in 1mb increments

max exec time is 900 secs

env variables 4kb

disk capacity in the function container in /tmp is 512 mb

concurrency executions 1000 – can be increased

for deployment:

function deployment size compressed .zip is 50mb but size of uncompressed deployment code plus dependencies is 250mb

can use the /tmp to load other files at startup

size of env variables is 4kb

the exam may ask you question to see if you think Lambda can be used or not acc to the requirement for the task… you need to know these above limits in order to judge suitability of Lambda for the task.

Lambda@Edge

if you are deploying a CloudFront cdn and you want to deploy Lambda globally

how to implement request filtering

you can use Lambda@edge for this

you deploy it alongside each region in your cloudfront cdn

you can use Lambda to modify the viewer/origin requests and responses of cloudfront:

this can be:

after cloud front receives a request – viewer request
before cloud front forwards the request to the origin – origin request

after cloudfront receives the response from the origin – origin response
before cloudfront forwards the response to the viewer – viewer response

plus, you can also generate responses to viewers without having to send a request to the origin!

important to know this high level overview for exam.

use cases:

website security/privacy

dynamic web application at the Edge

SEO

intelligent routing across origins and data centers

bot mitigation at the Edge

real-time image transformation
a/b testing
user authentication and authorization

user prioritization
user tracking and analytics

Lambda in VPC

important!
by default Lambda functions are launched in an internal AWS VPC – not in one of your own VPCs.

an important consequence of that is that resources in your own VPC CANNOT BE ACCESSED! – exam q!

If you want that functionality, then you have to launch Lambda in your own VPC…

this requires

you define the VPC ID,, subnets and security groups

Lambda will create an ENI – Elastic Network Interface in your subnets..

this gives private connectivity in your own VPC.

a typical use case for this is using Lambda with an RDS Proxy.

but – this can open a very large no of connections under high loads on your database leading to timeouts and other problems

RDS Proxy for Lambda

to avoid this you can create an RDS Proxy, Lambda functions then connect to the proxy and then to your RDS DB.

improves scalability and availability

you can enforce iam authentication and store credentials in secrets manager

remember though the rds proxy is NEVER PUBLICLY accessible,, only private,

and so the Lambda function must therefore to use this proxy always be deployed in your own aws VPC and not in the AWS own VPC.

AWS S3 Storage

By Admin on August 22, 2022 in

S3 is the AWS object storage system. S3 stores objects in flat volumes or containers called buckets, rather than a hierarchical file system. There are no file directories as such in S3!

Buckets must have globally unique name – across ALL accounts of AWS – not just your own!

defined at region level – important!

bucket naming convention – must know this for exam!

no uppercase, no underscore, 3-63 chars long,

must start with lowercase letter or number

must not be an ip address

objects:

are files, must have a key – the full path to the file

s3://my-bucket/my_file.txt -> my_file.txt is the key

you can add “folder” names but they are just a prefix ie tag not a real file directory system

so if you have

s3://my-bucket/my_folder/my_file.txt

then /my_folder/my_file.txt

is the object key for this file

object max size is 5TB but you can only upload 5GB in one go, to upload larger objects, you have to use multi-part upload

metadata can be added, also tags

and version id system if enabled

you can block public access if you want for a bucket

you receive an ARN or amazon resource name for the bucket

2 ways to open an S3

in the console click on object action and open

or via the url public object url

but for this you must set the permission for access to the bucket

bucket must be public access

pre-signed url – you give the client temp credentials to open the file

you can version your files, but have to enable versioning at the bucket level. Best practice is to use versioning, this protects against unintended deletes and you can roll back.

Any files existing before versioning is enabled will not have previous versions available.

if you click on objects -> list versions, you will see the available versions listed.

you can easily roll back in this way. So versioning should be enabled!

S3 Encryption

Exam question!

Know the 4 methods of encryption for S3:

SSE-S3 – encrypts S3 objects using keys managed by AWS

SSE-KMS – uses AWS key management service to manage the keys

SSE-C – to manage own keys yourself

Client-Side Encryption

important to know which is best suited for each situation!

SSE-S3

managed by AWS S3

object encrypted server side

uses AES-256 algorithm

must set header “x-amx server-side-encryption”: “AES256”

SSE-KMS

managed by AWS KMS

gives you control over users and an audit trail

object is encrypted server side

set header “x-amx server-side-encryption”: “aws:kms”

SSE-C

server side with your own keys outside of aws

so s3 does NOT store the key

https has to be used for this as you will be sending your key

the actual client side data encryption key in every http header

s3 then uses the key to encrypt the data on s3 at the bucket.

Client Side Encryption

this happens before transmitting data to s3 at the client side

and decryption on the client side.

customer fully manages the keys and encryption/decryption.

there is a client library called aws s3 encryption client which you can use on your clients.

encryption in transit ssl or tls

https is “in flight” – you always use https for your in flight: mandatory

uses ssl or tls certificates

S3 Security

User-based security

first there is user-based

uses IAM policies, which api calls are allowed from a specific user

Resource-based security

sets bucket-wide rules from S3 console, allows cross-account

not very common (NOT in exam)

oacl – object access control list acl – this is finer grain

bacl – bucket access control list acl – this is less common

IAM principal can access an S3 object if

user iam permissions allow it or the resource policy allows it

and no explicit deny exists

S3 Bucket Policies

they are json based policies

actions: they allow a set of api to allow or deny

principle is the account or user the policy applies to

use the s3 bucket policy to

grant public access to the bucket

force encryption at upload to the bucket

grant access to another account – cross account access

Bucket settings for block public access

used to block public access

3 kinds:

new acls
any acl
new public bucket or access point policies

block public or cross-account access to buckets or objects through ANY public bucket or access point policies.

but exam will not test you on these.

created to prevent company data leaks

can also be set at account level

networking: supports vpc endpoints

S3 Logging and Audit

s3 access logs can be stored in other s3 buckets
api calls can be logged in CloudTrail

user security:

MFA Delete can be required to delete objects for versioned buckets

pre-signed urls valid for a limited time only

use case

eg to download a premium product or service eg video if user is logged in as a paid up user or has purchased the vid or service

S3 Websites

s3 can host static websites for public access

url will be

bucket-name.s3-website-aws-region.amazonaws.com

if you get 403 forbidden error then make sure bucket policy allows public reads – bucket must be publicly accessible for this.

CORS Cross-Origin Resource Sharing

web browser mechanism to allow requests to other origins while visiting the main origin

eg http://example.com/app1 and http://example.com/app2

a CORS header is needed for this – and the other origin must also allow the request.

the web-browser does a “pre-flight request” first – asking the cross-origin site if it is permitted – then if yes, then eg get put delete

these are the cors method access-control-allow-methods

S3 CORS

exam question!
if a client does a cross-origin request to an “3 bucket then you must enable the correct CORS headers.

you can allow for a specific origin, or for * ie all origins

S3 MFA Delete

forces a user to generate a code from a device eg mobile phone before doing some operations on S3

to activate MFA-Delete enable versioning on the S3 bucket

it is required to permanently delete an object version

and to suspend versioning on the bucket

not needed to enable versioning or list deleted versions

only bucket owner ie root account can enable/disable MFA-Delete

only possible via CLI at present.

first create an access key for the bucket in the web console of iam

then configure the aws cli to use this key

download the key file, and then set up a cli with your access key id and secret access key

command:

aws configure –profile root-mfa-delete-demo

you are then prompted to enter the access key id and secret access key

then you run

aws s3 ls –profile root-mfa-delete-demo to display

then do:

aws s3api put-bucket-versioning –bucket demo-mfa-2020 –versioning-configuration Status=Enabled, MFADelete=Enabled –mfa “<here enter the arn-of-mfa-device and-the-mfa-code-for-the-device>” –profile root-mfa-delete-demo

you can then test by uploading an object

and try deleting the version – you should get a message saying you cannot delete as mfa authentication delete is enabled for this bucket…

so to delete you must use the cli mfa-delete command and your chosen device eg mobile phone mfa – or alternatively for this demo just disable the mfa-delete again. then you can delete as per usual.

To force encryption you can set a bucket policy that refuses any api “put” calls to an object that does not have encryption headers

alternatively, you can use the default encryption option of S3

important for exam: bucket policies are evaluated *before* default encryption settings!

S3 Access Logs

– you can log to another bucket, or you can use AWS Athena

first, very important – and potential exam question!

NEVER set your logging bucket to be the monitored bucket or one of the monitored buckets! because this will create a big infinite loop! which means a huge AWS bill!

always keep logging bucket and the monitored bucket/s separate! – ie set a separate different target bucket that is NOT being logged!

tip: make sure you define a bucket with the word “access-logs” or similar in the name, so that you can easily identify your login bucket to avoid logging it by mistake.

S3 Replication – Cross Region (CRR) and Single Region (SRR)

– must enable versioning for this

the copying is asynchronous

buckets can belong to different accounts

must grant proper iam permissions to S3 for this

CRR: you synchronize against different regions

used for compliance, lower latency access, replicating across different accounts

SRR: used for log aggregation, live replication between eg production and test and development accounts

note: after activating only the new objects get replicated. to replicate existing objects, you need to use…

S3 Batch Replication feature

for DELETEs: you can replicate the delete markers from source to target

but deletions with a version id are not replicated – this is to avoid malicious deletes

and there is no “chaining” of replication…

this means eg if bucket 1 replicates to bucket 2 and 2 replicates to bucket 3, then bucket 1 is not automatically replicated to 3 – you have to explicitly set each replication for each pair of buckets.

first you have to enable versioning for the bucket

then create your target bucket for the replication if not already in existence – can be same region for SRR or a different region for CRR

then select in origin bucket:

management -> replication rules, you create a replication rule and you set the source and destination.

then you need to create an iam role:

and specify if you want to replicate existing objects or not

for existing objects you must use a one-time batch operation for this

S3 Pre-Signed URLs

can generate using sdk or cli

uploads: must use sdk
downloads can use cli, easy

valid fo 3600 secs ie 1hr default, can change

users are given a pre-signed url for get or put

use cases:
eg to only allow logged-in or premium uses to download a product eg video or service
allow a user temporary right to upload a file to your bucket

S3 Storage Classes

need to know for exam!

S3 offers

Standard General Purpose, and
Standard-Infrequent-Access (IA)
One Zone-IA
Glacier Instant Retrieval
Glacier Flexible Retrieval
Glacier Deep Archive
Intelligent Tiering

can move objects between classes or use S3 Lifecycle Management service

Standard S3:

Durability and Availability

difference:

S3 has very high durability, 99.9 11×9!

Availability

how readily available the service is available.

S3 standard is about 1 hr per year out of availability

big data, mobile, gaming, content distribution

Standard-IA:

less frequent access
lower cost than standard

99.9% available

good for DR and backups

1-Zone-IA

v high durability but 99.5% availability – not so high

thus best used for secondary backup copies or recreatable data

Glacier storage classes:

low-cost object storage for archiving or backup

you pay for storage plus a retrieval charge

Glacier Instant Retrieval IR

millisecond retrieval, min storage 90 days

Glacier Flexible Retrieval

1-5 mins to recover

standard 3-5 hrs

bulk 5-12 hrs – is free
min storage duration 90 days

Glacier Deep Archive

best for long-term only storage

12 hrs or bulk 48 hrs retrieval
lowest cost

min 180 days storage time

Intelligent Tiering

allows you to move objects between tiers based on monitoring
no retrieval charges

enables you to leave the moving between tiers to the system

S3 Lifecycle Rules

These automate the moving of storage objects from one storage class to another.

Lifecycle rules are in two parts: Transition Actions and Expiration Actions

Transition Actions:

configures objects for transitioning from one storage class to another.

This can be used to move objects to another class eg 60 days after creation, and to eg Glacier for archiving after 6 months

Expiration Actions:

configures objects to be deleted (“expired”) after a specified period of time

Access logs set to delete after 365 days

can also be used to delete old versions of files where you have file versioning enabled.

or to delete incomplete multi-part uploads after a specific time period

rules can be created for certain prefixes in the file object names or specific “folder” names (remember there are no real directory folders in S3!)

Exam Q:

often about image thumbnails…

Your application on EC2 generates thumbnail images from profile photos after they are uploaded to S3.

These thumbnails can be easily regenerated when needed and only need to be kept for 60 days. The source images should be able to be immediately retrieved during those 60 days. After this time period, users are ok with waiting up to 6 hours.

How would you design a lifecycle policy to allow for this?

Solution:

store the source images on Standard S3, with a lifecycle config to transition them to Glacier after 60 days.

The S3 thumbnails that are generated by the application can be stored on One-Zone-IA, with a lifecycle config to expire ie delete them after 60 days.

Another Exam Q scenario:

A company rule states you should be able to recover your deleted S3 objects immediately for 30 days, though this happens rarely in practice.

After this time, and for up to 365 days, deleted objects should be recoverable within 48 hours.

Solution:

First enable S3 Versioning so you will have multiple object versions, and so that “deleted” objects are hidden by a “delete market” and can be recovered if necessary.

Then, create a lifecycle rule to move non-current versions of the objects to Standard-IA

and then transition these non-current versions from there to Glacier Deep Archive later on.

How To Calculate The Optimum Number of Days To Transition Objects From One Storage Class To Another:

You can use S3 Storage Class Analytics which will give you recommendations for Standard and Standard-IA

But note – exam Q This does NOT work for One-Zone-IA or Glacier

The S3 Storage Class Analytics Report is updated daily, after 24-48 hours you will start to see the data analysis results

This is a useful tool for working out your optimum storage class lifecycle rule according to your actual real storage patterns.

1 2 Next →