
AWS CloudTrail, CloudWatch, and Config Compared

CloudTrail is a service that provides governance, compliance, and auditing for your AWS account by logging and monitoring account activity.

 

What’s the difference between CloudWatch and CloudTrail?

 

AWS CloudWatch

 

CloudWatch is a tool for real-time monitoring of AWS resources and applications. It collects metrics that let you analyze the performance of your systems.

 

CloudWatch can be used to detect irregular behaviour in AWS environments. It monitors a range of AWS resources, including EC2, RDS, S3, and Elastic Load Balancing, and it can be combined with CloudWatch Alarms.

 

 

AWS CloudTrail

 

CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. It continuously logs the activities and actions across your account and provides an event history, including data about who is accessing your system. CloudTrail does not remediate anything itself, but its events can be used to trigger remediation actions through CloudWatch Events.

 

While CloudWatch reports on the activity, health, and performance of your AWS services and resources, CloudTrail is a log of all the actions that have taken place inside your AWS environment.

 

CloudTrail records API activity in your AWS account and typically delivers an event within 15 minutes of the API call.

 

 

CloudTrail provides auditing for AWS accounts, and its logs are saved in an S3 bucket.

 

However, you can receive notifications of specific CloudTrail events immediately by sending them via the CloudWatch Event Bus.

 

While CloudTrail only writes to your S3 bucket once every five minutes, it sends events to the CloudWatch Event bus in near real-time as these API calls are observed.
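
As a rough sketch of how such a near real-time notification could be wired up, the Python below builds the parameters for an event rule that matches EC2 TerminateInstances calls recorded by CloudTrail. The rule name is a hypothetical example. The snippet only builds the request, and the boto3 call is left as a comment because it needs AWS credentials.

```python
import json

# Event pattern for an API call recorded by CloudTrail.
event_pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["TerminateInstances"],
    },
}

put_rule_kwargs = {
    "Name": "notify-on-ec2-terminate",  # hypothetical rule name
    "EventPattern": json.dumps(event_pattern),  # the API expects a JSON string
    "State": "ENABLED",
}

# With credentials configured, the rule could be created with:
#   import boto3
#   boto3.client("events").put_rule(**put_rule_kwargs)
print(put_rule_kwargs["EventPattern"])
```

A target such as an SNS topic would still have to be attached to the rule before anything is delivered.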

 

CloudWatch monitors performance, for example by tracking the metrics of an EC2 instance, keeping track of your Amazon DynamoDB performance, or showing how Amazon S3 is performing. CloudWatch allows you to collect default metrics for over 70 AWS services.

 

It also has a “Custom Metrics” feature that enables you to collect a metric that is specifically important to your system, for example to measure how people are using your application.

 

 

AWS CloudTrail

 

AWS CloudTrail is principally used for auditing API activity, tracking who did what and when, and securely logging this information to Amazon S3 for later analysis.

 

Thus CloudTrail keeps track of what is done in your AWS account, when, and by whom. For example, with CloudTrail you can view, search, and download the latest activity in your AWS account to check whether there are any abnormal or unusual actions and, if so, who performed them. This type of reporting is called auditing, and it is the core service of CloudTrail.

 

 

CloudTrail tracks data events and management events:

 

Data events are object-level API requests made to your resources, for example when an item is created or deleted in a DynamoDB table.

 

Management events log changes to your environment (mostly creations and deletions), such as the creation or deletion of the DynamoDB table itself.

 

CloudTrail tracks which applications or persons took these actions and stores the details in log files. These log files are encrypted and stored in S3.

 

 

Note that CloudWatch has CloudWatch Alarms, which you can configure, and that metric data is retained for 15 months. CloudTrail, on the other hand, has no native alarms. You can still configure CloudWatch Alarms for CloudTrail activity, but the trail has to store its logs in S3.

 

In a nutshell:

 

CloudWatch is for performance. Think of CloudWatch as monitoring application metrics.

CloudTrail is for auditing. Think of CloudTrail as tracking API activity within an account.

 

 

 

 

AWS Config vs. CloudTrail

 

In the configuration and monitoring category of AWS, there are two major monitoring tools that are similar and easy to confuse: AWS Config and AWS CloudTrail.

 

Config and CloudTrail are different tools with different purposes.

 

What is AWS Config?

 

AWS Config is a service that lets you set configuration rules for your AWS resources to comply with. It then tracks whether the resources comply with those rules.
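
To make the idea of a rule concrete, here is a minimal sketch in Python. It builds the parameters for enabling the AWS managed rule S3_BUCKET_VERSIONING_ENABLED and adds a toy local function that mimics what such a rule decides. The rule name and the shape of the bucket dictionary are assumptions for illustration, and the boto3 call is shown only as a comment.

```python
# Parameters for an AWS managed Config rule scoped to S3 buckets.
config_rule = {
    "ConfigRuleName": "s3-bucket-versioning-enabled",  # hypothetical name
    "Source": {
        "Owner": "AWS",
        "SourceIdentifier": "S3_BUCKET_VERSIONING_ENABLED",
    },
    "Scope": {"ComplianceResourceTypes": ["AWS::S3::Bucket"]},
}

# With credentials configured, the rule could be registered with:
#   import boto3
#   boto3.client("config").put_config_rule(ConfigRule=config_rule)


def evaluate_versioning(bucket):
    """Toy stand-in for the rule's check (bucket is a hypothetical dict)."""
    if bucket.get("versioning") == "Enabled":
        return "COMPLIANT"
    return "NON_COMPLIANT"


print(evaluate_versioning({"name": "logs", "versioning": "Enabled"}))
print(evaluate_versioning({"name": "scratch"}))
```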

 

Whenever a resource changes, Config records the change in a configuration history stored in an S3 bucket. It also stores a snapshot of the system at regular intervals that you set, and it has a dashboard that presents an overview of your resources and their configurations.

 

 

What is AWS CloudTrail?

 

CloudTrail is a logging service that records all API calls made to any AWS service. It records the details of the API call such as which user or application made the call, the time and date it happened and the IP address it originated from.
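
The fields described above can be seen in a small example. The sketch below parses a hand-written sample in the shape of a CloudTrail log file (a JSON document with a top-level "Records" list) and prints who did what, when, and from which IP address. The account ID, user, and IP address are made-up sample values.

```python
import json

# Hand-written sample in the shape of a CloudTrail log file.
sample_log = json.dumps({
    "Records": [{
        "eventTime": "2024-01-15T10:30:00Z",
        "eventSource": "dynamodb.amazonaws.com",
        "eventName": "DeleteTable",
        "awsRegion": "us-east-1",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {
            "type": "IAMUser",
            "arn": "arn:aws:iam::123456789012:user/alice",
        },
    }]
})


def summarize(log_text):
    """Return (who, what, when, from_ip) for each record in the log text."""
    records = json.loads(log_text).get("Records", [])
    return [
        (
            r.get("userIdentity", {}).get("arn", "unknown"),
            f'{r["eventSource"]}:{r["eventName"]}',
            r["eventTime"],
            r.get("sourceIPAddress", "unknown"),
        )
        for r in records
    ]


for who, what, when, ip in summarize(sample_log):
    print(f"{when} {who} called {what} from {ip}")
```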

 

There is also another AWS logging service called CloudWatch Logs, but unlike CloudWatch Logs which reports application logs, CloudTrail reports on how AWS services are being used in your environment.

 

Where CloudTrail and Config are Similar

 

Config and CloudTrail have a number of things in common.

 

Both are monitoring tools for your AWS resources. Both track changes and store a history of what happened to your resources in the past. Both are used for compliance and governance, auditing and security policies.

 

 

If you notice something unusual or going wrong with your AWS resources, then chances are you’ll see it reported in both CloudTrail and Config.

 

Where CloudTrail and Config are Different

 

 

Note that AWS Config Rules is not a cheap service. There is no free tier: you pay a fee per configuration item, per region.

 

Though both often report on the same events, their approach is different. Config reports on what has changed in the configuration, whereas CloudTrail reports on who made the change, and when, and from which IP address.

 

Config reports on the configuration of your AWS resources and creates detailed snapshots of how your resources have changed.

 

CloudTrail focuses on the events or API calls behind those changes, focusing on users, applications, and activities performed in your environment.

 

Where CloudTrail and Config work together

 

By taking a different approach to the same events, CloudTrail and Config make a good combination. Config is a great starting point for ascertaining what has happened to your AWS resources, while CloudTrail can give you more information from your CloudTrail logs.

 

Config watches and reports on instances of rules for your resources being violated. It doesn’t actually allow you to make changes to these resources from its own console.

 

By contrast, CloudTrail gives you more control by integrating with CloudWatch Events to allow you to set automated rule-based responses to any event affecting your resources.

 

In the case of security breaches, if multiple changes have been made by an attacker in a short period of time, Config might not report this in detail.

 

Config stores the most recent and important changes to resources but disregards smaller and more frequent changes.

CloudTrail, by contrast, records every single change in its logs. It also has an integrity validation feature that checks whether an intruder or attacker has manipulated the API logs to cover their tracks.

 

 

Should You Use AWS Config or CloudTrail for Security?

 

 

Both Config and CloudTrail have a role to play together. Config records and notifies about changes in your environment. CloudTrail helps you find out who made the change, from where, and when.

 

A good way to think of it is that AWS Config will tell you what your resource state is now or what it was at a specific point in the past whereas CloudTrail will tell you when specific events in the form of API calls have taken place.

 

So you ought to use both. Config Rules triggers on a change in the status of your system, but it will often only give you an update on the state of the system itself.

 

CloudTrail meanwhile provides you with a log of every event which details everything that has taken place and when and by whom. This helps identify all the causes that may have led to the security problem in the first place.

 

 

Remember also that AWS Config Rules does not prevent actions from happening – it is not a “deny”.

 

But – you can do “remediations” of resources that are identified as non-compliant. This can be done for example via SSM Automation Documents. Config then triggers an auto-remediation action that you define.

 

Notifications:

 

You can use EventBridge to receive notifications from Config. From there you can forward the notifications to, for example, Lambda functions, SNS, or SQS.


AWS CloudWatch Monitoring Overview

 

AWS CloudWatch is the basic AWS monitoring service that collects metrics on your resources in AWS, including your applications, in real time.

 

You can also collect and monitor log files with AWS CloudWatch. You can set alarms for metrics in CloudWatch to continuously monitor performance, utilization, health, and other parameters of your AWS resources and take action when metrics cross set thresholds.

 

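CloudWatch itself is a regional service, and metrics are stored in the region where they are generated. CloudWatch dashboards, however, can display metrics from multiple regions, so you can monitor resources and services across AWS regions via a single dashboard.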

 

 

CloudWatch provides basic monitoring free of charge at 5-minute intervals. Because it is a serverless AWS service, there is no need to install any additional software to use it.

 

 

For an additional charge, you can set detailed monitoring that provides data at 1-minute intervals.

 

 

AWS CloudWatch also lets you publish custom metrics for your applications, services, and resources at the standard 1-minute resolution or at 1-second resolution. The latter are known as high-resolution custom metrics.

 

CloudWatch stores metrics data for 15 months, so even after terminating an EC2 instance or deleting an ELB, you can still retrieve historical metrics for these resources.

 

 

How CloudWatch Works

 

CloudWatch Monitoring Is Event-Driven

 

All monitoring in AWS is event-driven. An event is “something that happens in AWS and is captured.”

 

For example, when a new EBS volume is created, the createVolume event is triggered, with a result of either available or failed. This event and its result are sent to CloudWatch.

 

You can create a maximum of 5000 alarms in every region in your AWS account.

 

You can create alarms that take actions such as stopping, terminating, rebooting, or recovering an EC2 instance, for example when an instance is experiencing a service issue.

 

Monitoring Is Customizable

 

You can define custom metrics easily. A custom metric behaves just like a predefined one and can then be analyzed and interpreted in the same way as standard metrics.

One important limitation of CloudWatch – exam question! 

 

CloudWatch collects its default metrics at the level of the AWS hypervisor, which means it functions below the virtualization layer of AWS.

 

This means it can report on things like CPU usage and disk I/O, but it cannot see what is happening *above* that layer, inside the operating system of the instance.

 

This means CloudWatch CANNOT tell you what tasks or application processes are affecting performance. Remember this point!

 

Thus it cannot tell you about memory or disk usage, unless you write code that checks them and sends the values to CloudWatch as custom metrics.

 

This is an important aspect that can appear in the exam. You might be asked if CloudWatch can report on memory or disk usage by default; it cannot.
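
A minimal sketch of that workaround is shown below: it measures disk usage from inside the instance with Python's standard library and builds the request for publishing it as a custom metric. The namespace and metric name are hypothetical, and the boto3 call is left as a comment because it needs AWS credentials.

```python
import shutil
from datetime import datetime, timezone


def disk_usage_metric(path="/"):
    """Build put_metric_data parameters for the disk usage of `path`."""
    usage = shutil.disk_usage(path)
    percent_used = usage.used / usage.total * 100 if usage.total else 0.0
    return {
        "Namespace": "Custom/System",  # hypothetical custom namespace
        "MetricData": [{
            "MetricName": "DiskUsedPercent",  # hypothetical metric name
            "Dimensions": [{"Name": "Path", "Value": path}],
            "Timestamp": datetime.now(timezone.utc),
            "Value": round(percent_used, 2),
            "Unit": "Percent",
        }],
    }


kwargs = disk_usage_metric("/")
# With credentials configured, the value could be published with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**kwargs)
print(kwargs["MetricData"][0]["Value"])
```

In practice a script like this would be run on a schedule, for example from cron, so that the metric has regular data points.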

 

Monitoring Drives Action

 

The final piece of the AWS monitoring puzzle is alarms – this is what occurs after a metric has reported a value or result outside a set “everything is okay” threshold.

 

When this happens, an alarm is triggered. Note that an alarm is not necessarily the same as “something is wrong”; an alarm is merely a notification that something has happened at a particular point.

 

The alarm can then trigger an action, for example running some code in Lambda, sending a message to an Auto Scaling group telling it to scale in, or sending an email via the AWS SNS message service.

 

Think of alarms as saving you from having to sit monitoring the CloudWatch dashboard 24×7.

 

One of your tasks as a SysOps administrator is to define these alarms.

 

 

CloudWatch Is Metric- and Event-Based

 

Know the difference between metrics and events.

An event is predefined and is something that happens, such as bytes coming into a network interface.

 

The metric is a measure of that event, for example how many bytes are received in a given period of time.

 

Events and metrics are related, but they are not the same thing.

 

CloudWatch Events Are Lower Level

 

An event is something that happens, usually a metric changing or reporting to CloudWatch, but at a system level.

 

An event can then trigger further action, just as an alarm can.

 

Events are typically reported constantly from low-level AWS resources to CloudWatch.

 

CloudWatch Events Have Three Components

 

CloudWatch Events have three key components: events, rules, and targets.

 

An event:

 

the thing being reported. Events describe change in your AWS resources. They can be thought of as event logs for services, applications and resources.

 

A rule:

 

 

an expression that matches incoming events. If the rule matches an event, then the event is forwarded to a target for processing.

 

 

A target:

 

 

is another AWS component, for example, a piece of Lambda code, or an Auto Scaling group, or an email or SNS/SQS message that is sent out.
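
The interplay of the three components can be sketched with a small simulation. This is a deliberately simplified model, not the service's real matching engine: a list in a pattern means "any of these values", and nested dictionaries are matched recursively. The rule, the target, and the instance IDs are made up for illustration.

```python
def matches(pattern, event):
    """Return True if every field named in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: the event field must be a dict that matches too.
            if not isinstance(event[key], dict):
                return False
            if not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


delivered = []  # stands in for a real target such as Lambda or SNS

rules = [{
    "pattern": {"source": ["aws.ec2"], "detail": {"state": ["stopped"]}},
    "target": lambda e: delivered.append(("notify", e["detail"]["instance-id"])),
}]


def dispatch(event):
    """Forward the event to the target of every rule that matches it."""
    for rule in rules:
        if matches(rule["pattern"], event):
            rule["target"](event)


dispatch({"source": "aws.ec2", "detail": {"instance-id": "i-0abc", "state": "stopped"}})
dispatch({"source": "aws.ec2", "detail": {"instance-id": "i-0def", "state": "running"}})
print(delivered)  # [('notify', 'i-0abc')]
```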

 

 

Both alarms and events are important and it is essential to monitor both.

 

CloudWatch Namespaces

 

A CloudWatch Namespace is a container for a collection of related CloudWatch metrics. This provides for a way to group metrics together for easier understanding and recognition.

AWS provides a number of predefined namespaces, which all begin with AWS/[service].

 

For example, the CPUUtilization metric in the AWS/EC2 namespace is the CPU utilization of an EC2 instance,

 

while CPUUtilization in the AWS/RDS namespace is the same metric but for an RDS database instance.

 

 

You cannot publish your own custom metrics into the predefined AWS/ namespaces, which are reserved for AWS services. Instead, you create your own custom namespaces in CloudWatch for them.

 

 

exam question:

CloudWatch accepts metric data with timestamps from up to 2 weeks in the past and up to 2 hours into the future. Make sure the clock on your EC2 instance is set accurately for this to work correctly!
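
The accepted time window can be expressed as a small check. This is only an illustration of the rule stated above, with a fixed "now" so the results are reproducible. It is not code that CloudWatch itself exposes.

```python
from datetime import datetime, timedelta, timezone

MAX_PAST = timedelta(weeks=2)
MAX_FUTURE = timedelta(hours=2)


def timestamp_accepted(ts, now=None):
    """Return True if `ts` falls inside the accepted window around `now`."""
    now = now or datetime.now(timezone.utc)
    return now - MAX_PAST <= ts <= now + MAX_FUTURE


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
print(timestamp_accepted(now - timedelta(days=13), now))  # True
print(timestamp_accepted(now - timedelta(days=15), now))  # False
print(timestamp_accepted(now + timedelta(hours=3), now))  # False
```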

 

 

Monitoring EC2 Instances

 

CloudWatch provides some important often-encountered metrics for EC2.

 

Here are some of the most common EC2 metrics which you should be familiar with for the exam:

 

 

CPUUtilization – one of the fundamental EC2 instance metrics. It shows the percentage of allocated compute units currently in use.

 

DiskReadOps – reports a count of completed read operations from all instance store volumes.

 

DiskWriteOps – the counterpart of DiskReadOps, reports a count of completed write operations to all instance store volumes.

 

DiskReadBytes – reports the bytes read from all available instance store volumes.

 

DiskWriteBytes – reports the total of all bytes written to instance store volumes.

 

NetworkIn – total bytes received by all network interfaces.

 

NetworkOut – total bytes sent out across all network interfaces on the instance.

 

NetworkPacketsIn – total number of packets received by all network interfaces on the instance (available only for basic monitoring).

 

NetworkPacketsOut – number of packets sent out across all network interfaces on the instance. Also available only for basic monitoring.

 

 

 

S3 Metrics

 

There are many S3 metrics, but these are the most common ones you should know:

BucketSizeBytes – shows the daily storage of your buckets as bytes.

NumberOfObjects – the total number of objects stored in a bucket, across all storage classes.

 

AllRequests – the total number of all HTTP requests made to a bucket.

 

GetRequests – total number of GET requests to a bucket. There are also similar metrics for other requests: PutRequests, DeleteRequests, HeadRequests, PostRequests, and SelectRequests.

 

BytesDownloaded – total bytes downloaded for requests to a bucket.

 

BytesUploaded – total bytes uploaded to a bucket. These are the bytes that contain a request body.

 

FirstByteLatency – the per-request time, in milliseconds, from when the bucket receives the complete request until the first byte of the response is returned.

 

TotalRequestLatency – the elapsed time in milliseconds from the first to the last byte of a request.

 

 

 

CloudWatch Alarms

 

 

Alarms Indicate a Notifiable Change

 

 

A CloudWatch alarm initiates action. You can set an alarm for when a metric is reported with a value outside of a set level.

 

For example, you can set an alarm for when your EC2 instance CPU utilization reaches 85 percent.
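
As a sketch, the parameters for an 85 percent CPU alarm might look like the following. The instance ID, alarm name, and SNS topic ARN are placeholders, and the choice of two 5-minute evaluation periods is an assumption made for the example. The boto3 call is left as a comment because it needs AWS credentials.

```python
INSTANCE_ID = "i-0123456789abcdef0"  # placeholder instance ID

alarm_kwargs = {
    "AlarmName": f"high-cpu-{INSTANCE_ID}",  # hypothetical name
    "Namespace": "AWS/EC2",
    "MetricName": "CPUUtilization",
    "Dimensions": [{"Name": "InstanceId", "Value": INSTANCE_ID}],
    "Statistic": "Average",
    "Period": 300,            # seconds, matches basic 5-minute monitoring
    "EvaluationPeriods": 2,   # assumption: two consecutive periods
    "Threshold": 85.0,
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    # Placeholder SNS topic that would receive the notification.
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
}

# With credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm_kwargs)
print(alarm_kwargs["AlarmName"])
```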

 

 

Alarms have three possible states at any given point in time:

 

OK: means the metric lies within the defined threshold.

ALARM: means the metric has breached the defined threshold, either above or below it depending on how the alarm is configured.

 

INSUFFICIENT_DATA : can have a number of reasons. The most common reasons are that the alarm has only just started or been created, that the metric it is monitoring is not available for some reason, or there is not enough data at this time to determine whether the alarm is OK or in ALARM state.
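
The three states can be illustrated with a simplified model. The function below assumes an alarm that fires when every one of the last few data points is above the threshold and treats a missing data point (None) as insufficient data. Real alarms have more options for evaluation and missing data, so treat this as a sketch only.

```python
def alarm_state(datapoints, threshold, evaluation_periods=3):
    """Return the state of a simple 'greater than threshold' alarm."""
    recent = datapoints[-evaluation_periods:]
    # Too few points, or a gap in the data: the alarm cannot decide.
    if len(recent) < evaluation_periods or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


print(alarm_state([40, 55, 60], threshold=85))       # OK
print(alarm_state([70, 90, 92, 97], threshold=85))   # ALARM
print(alarm_state([90], threshold=85))               # INSUFFICIENT_DATA
```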

 

 

CloudWatch Logs

 

CloudWatch Logs stores logs from AWS systems and resources and can also handle the logs for on-premises systems provided they have the Amazon Unified CloudWatch Agent installed.

 

If you are monitoring AWS CloudTrail activity through CloudWatch, then that activity is sent to CloudWatch Logs.

 

If you need a long retention period for your logs, then CloudWatch Logs can also do this.

 

By default logs are kept forever and never expire. But you can adjust this based on your own retention policies.

You can choose to keep logs for only a single day or go up to 10 years.

 

Log Groups and Log Streams

 

Log groups let you group together logs that serve a similar purpose or come from a similar resource type, for example EC2 instances that handle web traffic.

 

 

Log streams refer to the log data from a single source within a log group, such as an application instance, a log file, or a container.

 

 

CloudWatch Logs can send logs to S3, Kinesis Data Streams, Kinesis Data Firehose, Lambda, and Elasticsearch.

 

 

CloudWatch Logs – sources can be:

 

SDKs,

CloudWatch Logs Agent,

CloudWatch Unified Agent

Elastic Beanstalk

ECS – Elastic Container Service

Lambda function logs

VPC Flow Logs – these are VPC specific

API Gateway

CloudTrail based on filters

Route53 – logs DNS queries

 

 

Define Metric Filters and Insights for CloudWatch Logs

You can apply a filter expression, for example to look for a specific IP address in a log or to count the occurrences of “ERROR” in the log.

Metric filters can be used to trigger CloudWatch Alarms

 

CloudWatch Logs Insights can be used to query logs and add queries to CloudWatch Dashboards
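
The sketch below shows what these three pieces could look like for the “ERROR” example: a local count over sample log lines, the parameters for a metric filter that turns each match into a data point, and a Logs Insights query string. The log group, filter, metric, and namespace names are hypothetical, and the sample lines are made up.

```python
sample_lines = [
    "2024-01-15T10:30:00Z INFO request from 203.0.113.10",
    "2024-01-15T10:30:02Z ERROR upstream timeout",
    "2024-01-15T10:30:05Z ERROR disk full",
]


def count_matches(lines, term):
    """Count the lines containing `term` (case-sensitive, like a simple term filter)."""
    return sum(term in line for line in lines)


# Parameters for a metric filter that emits 1 for every matching log event.
metric_filter_kwargs = {
    "logGroupName": "/myapp/web",   # hypothetical log group
    "filterName": "error-count",    # hypothetical filter name
    "filterPattern": "ERROR",
    "metricTransformations": [{
        "metricName": "ErrorCount",
        "metricNamespace": "MyApp",
        "metricValue": "1",
    }],
}
# boto3.client("logs").put_metric_filter(**metric_filter_kwargs)

# An equivalent ad hoc search written as a Logs Insights query.
insights_query = (
    "fields @timestamp, @message "
    "| filter @message like /ERROR/ "
    "| sort @timestamp desc "
    "| limit 20"
)

print(count_matches(sample_lines, "ERROR"))  # 2
```

A CloudWatch Alarm can then be set on the resulting ErrorCount metric in the same way as on any other metric.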

 

 

CloudWatch Logs – Exporting to S3

 

 

NOTE

Log data can take up to 12 hours to become available for export, so exporting is not real time. For real-time delivery you should use Log Subscriptions instead.

 

 

The API call for this is “CreateExportTask”
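
A minimal sketch of the request parameters for such an export is shown below. The time range is given in milliseconds since the Unix epoch. The task name, log group, bucket, and prefix are placeholders, and the boto3 call is left as a comment because it needs AWS credentials and a bucket that CloudWatch Logs is allowed to write to.

```python
from datetime import datetime, timezone


def to_millis(dt):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return int(dt.timestamp() * 1000)


export_kwargs = {
    "taskName": "export-web-logs-2024-01-01",   # hypothetical task name
    "logGroupName": "/myapp/web",               # hypothetical log group
    "fromTime": to_millis(datetime(2024, 1, 1, tzinfo=timezone.utc)),
    "to": to_millis(datetime(2024, 1, 2, tzinfo=timezone.utc)),
    "destination": "my-log-archive-bucket",     # placeholder bucket name
    "destinationPrefix": "cloudwatch/web",
}

# With credentials configured, the export could be started with:
#   import boto3
#   boto3.client("logs").create_export_task(**export_kwargs)
print(export_kwargs["fromTime"], export_kwargs["to"])
```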

 

 

 

CloudWatch Log Subscriptions

 

You apply a “subscription filter” to the CloudWatch log to stream matching log events in real time to a destination. For example, the filter can send them to a Lambda function (either one managed by AWS or one you write yourself), which forwards them as real-time data to, for example, Elasticsearch. Alternatively, the subscription filter can send the events to Kinesis.
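
When a subscription filter delivers to Lambda, the log events arrive base64-encoded and gzip-compressed under the "awslogs" key of the event. The sketch below shows a handler that unpacks such a payload, plus a locally built sample event so it can be run without AWS. The log group, stream, and message are made-up sample values, and forwarding to a downstream system is left out.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Unpack the compressed payload of a CloudWatch Logs subscription event."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Lambda-style entry point: return the messages of the delivered log events."""
    payload = decode_subscription_event(event)
    return [log_event["message"] for log_event in payload["logEvents"]]


# Build a sample event locally in the same encoding.
sample_payload = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "/myapp/web",
    "logStream": "i-0abc",
    "logEvents": [
        {"id": "1", "timestamp": 1704067200000, "message": "ERROR disk full"},
    ],
}
sample_event = {
    "awslogs": {
        "data": base64.b64encode(
            gzip.compress(json.dumps(sample_payload).encode("utf-8"))
        ).decode("ascii")
    }
}

print(handler(sample_event, None))  # ['ERROR disk full']
```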

 

 

You can also aggregate logs from different accounts and different regions. A subscription filter in each region sends its logs to a single, common Kinesis Data Stream, which feeds Kinesis Data Firehose and from there delivers the data in near-real time to, for example, S3.

 

 

 

 

 

 

 

Unified CloudWatch Agent

 

 

The AWS Unified CloudWatch Agent provides more detailed information than the standard free CloudWatch service.

 

You can also use it to gather logs from your on-premises servers in the case of a hybrid environment and then centrally manage and store them from within the CloudWatch console.

 

The agent is available for Windows and Linux operating systems.

 

When installed on a Windows machine, you can forward in-depth information to CloudWatch from the Windows Performance Monitor, which is built into the Windows operating system.

 

When the agent is installed on a Linux system, you can receive more in-depth metrics about CPU, memory, network, processes, and swap memory usage. You can also gather custom logs from applications installed on servers.

 

To install the CloudWatch agent, you need to set up the configuration file.
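
As an illustration of what that configuration file can contain, the sketch below builds a small agent configuration in Python and prints it as JSON. It collects memory and disk usage and ships one application log file. The file path and log group name are hypothetical, and a real configuration will usually need more settings than this.

```python
import json

agent_config = {
    "metrics": {
        "metrics_collected": {
            # Memory and disk usage are not available from CloudWatch by default.
            "mem": {"measurement": ["mem_used_percent"]},
            "disk": {"measurement": ["used_percent"], "resources": ["/"]},
        }
    },
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [{
                    "file_path": "/var/log/myapp/app.log",  # hypothetical path
                    "log_group_name": "/myapp/web",         # hypothetical group
                    "log_stream_name": "{instance_id}",
                }]
            }
        }
    },
}

config_text = json.dumps(agent_config, indent=2)
print(config_text)
```

The printed JSON would be saved as the agent's configuration file before the agent is started.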

 

 
