AWS Challenge Project

This document outlines a project to design and implement an AWS EBS root volume disaster recovery solution that meets RTO and RPO requirements. The solution includes taking automated snapshots of the EBS root volume, configuring CloudWatch monitoring and SNS notifications, simulating a disaster scenario by deleting application files, and recovering by replacing the root volume with a recent snapshot. The implementation is split into phases using Terraform, Ansible, and scripts to automate infrastructure provisioning, snapshotting, monitoring, and disaster recovery.

https://linkedin.com/in/prafulpatel16

https://github.com/

https://medium.com/@prafulpatel16

Date: June 21, 2022


AWS PROJECT: AWS-EBS ROOT VOLUME DISASTER RECOVERY, MONITORING & NOTIFICATION
SOLUTION DESIGN & IMPLEMENTATION BY: PRAFUL PATEL

 Project:

 Project Description:

Application Name: Praful’s webportfolio application

Cloud: AWS Cloud


Cloud Services: AWS EC2, AWS EBS (volumes, snapshots), CloudWatch, SNS
Web server: Apache web server

An IT services provider, PRAfect Systems Inc., provides Cloud/DevOps and software
development solutions. The company recently migrated its entire workload to the AWS Cloud. The
workload runs on an EC2 virtual machine that hosts the application server, and the web
application is accessed through this server. The company has configured monitoring with
AWS CloudWatch and integrated an SNS notification, so that the cloud engineer is notified
whenever a system status check fails for the EC2 machine.

One morning the cloud engineer received a system failure notification by email. The EC2
machine's root volume had been corrupted by a faulty system patch, and the monitoring system
reported a failed status check for the instance.


RTO & RPO Requirements:


To maintain business continuity, meeting the RPO and RTO targets is critical during a disaster.
RTO = 10 min. The system must be recovered from failure within 10 minutes of downtime.
RPO = 05 min. The system must be recoverable from a backup taken within the last 05 minutes.

This project demonstrates the design and implementation of a disaster recovery scenario that
fulfils the defined business RPO & RTO requirements, along with CloudWatch monitoring and an
SNS notification system.

 Project Cost Estimation:


(Note: This is not an actual cost; it is only an estimate based on high-level requirements. The price may vary
as services are added or removed to meet the requirements.)

 Tools & Technologies covered:

AWS Cloud
AWS Identity & Access Management (IAM)
AWS EC2 Machine
AWS CloudWatch
AWS SNS
Terraform (Automated Cloud Provisioning Tool)
Ansible
Visual Studio Code IDE
GitHub
Git Bash
Draw.io


[Figure: AWS services used: Amazon Simple Notification Service (Amazon SNS) with email notification, Amazon Elastic Block Store (Amazon EBS) volume and snapshot, Amazon Elastic Compute Cloud (Amazon EC2), Amazon CloudWatch alarm and Metrics Insights]

Resilience in Amazon EC2


https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/disaster-recovery-resiliency.html

https://disaster-recovery.workshop.aws/en/intro/disaster-recovery.html


In addition to the AWS global infrastructure, Amazon EC2 offers the following features to support your
data resiliency:

1. Copying AMIs across Regions


2. Copying EBS snapshots across Regions
3. Automating EBS-backed AMIs using Amazon Data Lifecycle Manager
4. Automating EBS snapshots using Amazon Data Lifecycle Manager
5. Maintaining the health and availability of your fleet using Amazon EC2 Auto Scaling
6. Distributing incoming traffic across multiple instances in a single Availability Zone or multiple
Availability Zones using Elastic Load Balancing

 Instance status checks


Instance status checks monitor the software and network configuration of your individual
instance. Amazon EC2 checks the health of the instance by sending an address resolution
protocol (ARP) request to the network interface (NIC). These checks detect problems that
require your involvement to repair. When an instance status check fails, you typically must
address the problem yourself (for example, by rebooting the instance or by making
instance configuration changes).

The following are examples of problems that can cause instance status checks to fail:

Failed system status checks


Incorrect networking or startup configuration
Exhausted memory
Corrupted file system
Incompatible kernel
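
Both check results can also be read from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials and a default region. The function name instance_checks is a placeholder chosen for illustration and is not part of the original project.

```shell
# Print the system status check and the instance status check of one EC2 instance.
# Assumes the AWS CLI is installed and configured (credentials and default region).
instance_checks() {
  local instance_id="$1"
  # --include-all-instances also reports instances that are not in the running state.
  aws ec2 describe-instance-status \
    --instance-ids "$instance_id" \
    --include-all-instances \
    --query 'InstanceStatuses[0].[SystemStatus.Status, InstanceStatus.Status]' \
    --output text
}
```

Calling instance_checks with the instance ID of the web server prints the two states on one line, system status first.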


 Solution Architecture:

AWS EBS Root Volume Disaster Recovery by RPO & RTO Architecture


RPO & RTO Architecture

This project will be completed in the following implementation phases.


 Project implementation Phase:

Phase 1: Deploy EC2 machine using Terraform automation
o Write Terraform script to launch EC2 machine
o Write main.tf, variable.tf and output.tf
o Prepare user data for the web server and application source code packages in a shell script
o Add the user data file to the Terraform configuration
o Verify that the web application is successfully accessed from a web browser

Phase 2: Take a snapshot of the root volume manually.
o Go to snapshots and take a snapshot of existing root volume A
o Verify that the snapshot process is complete
OR
Phase 2.1: Take a snapshot of the root volume the automated way with Ansible.
o Go to VS Code IDE
o Gather the instance and volume information manually
o Write snapshot YAML file
o Run Ansible playbook file
o Verify from the AWS console that the snapshot has been created.

Phase 3: Configure CloudWatch monitoring & SNS topic for failure notification
o Go to CloudWatch and create an alarm
o Select an EC2 metric: StatusCheckFailed_System
o Create a new SNS topic and provide an email address
o Complete the CloudWatch alarm setup
o Go to the SNS topic and confirm the subscription by following the link in the confirmation email

Phase 4: Simulate and trigger a Disaster recovery scenario.
o Prepare a manual system failure script
o Write a script to remove the application files from the Apache root directory /var/www/html/
o Run the script from the EC2 machine.
o Verify that all application files are removed
o Verify that the web application is not accessible.

Phase 5: Simulate CloudWatch monitoring “In-Alarm”
o Log in to the EC2 machine.
o Become the root user
o Run aws configure
o Run the CloudWatch set-alarm-state script to put the alarm into “In-Alarm” status
o Go to CloudWatch and verify that the status has changed from “OK” to “In-Alarm”
o Go to email and verify that an email is received with the necessary information.

Phase 6: Recover from Disaster condition (RPO 5 min. RTO 10 min.)
o Go to the EC2 machine
o Make sure that the EC2 machine is running, then go to Actions.
o Select the option “Monitor and troubleshoot”
o Select the option “Replace root volume”
o Select a recent snapshot taken within the last 5 minutes to meet the RPO condition
o Complete the replacement with the selected snapshot
o Go to Volumes and verify that the new volume created from the snapshot is attached and in “In-Use” status
o Verify that the old root volume is in “Available” status; it is corrupted and no longer in use.
o Verify that the web application is now accessible

 Pre-Requisite:


o VS Code installed and configured on Windows
o Terraform installed and configured in VS Code
o AWS IAM user account with “AmazonEC2FullAccess” permission
o User-data script ready for the web app source code
o Bash script for application removal

AWS IAM user account with “AmazonEC2FullAccess” permission

Create a new IAM user with the AmazonEC2FullAccess permission and programmatic access

 Implementation in an Action:

Phase 1: Deploy EC2 machine using Terraform automation
o Write Terraform script to launch EC2 machine
o Write main.tf, variable.tf and output.tf
o Prepare user data for the web server and application source code packages in a shell script
o Add the user data file to the Terraform configuration
o Verify that the web application is successfully accessed from a web browser

terraform init

terraform plan

terraform apply


Apply complete
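
The three Terraform commands can be chained so that a failed step stops the run. This is a minimal sketch, assuming Terraform 0.14 or later (for the -chdir option) and a directory that holds main.tf, variable.tf and output.tf. The function name deploy_ec2 and the plan file name tfplan are placeholders chosen for illustration.

```shell
# Provision the EC2 instance from the directory that holds the Terraform files.
# Each step runs only if the previous one succeeded.
deploy_ec2() {
  local tf_dir="$1"
  terraform -chdir="$tf_dir" init -input=false &&
  # Save the plan to a file so that apply executes exactly what was reviewed.
  terraform -chdir="$tf_dir" plan -input=false -out=tfplan &&
  terraform -chdir="$tf_dir" apply -input=false tfplan
}
```

Saving the plan with -out and applying that file avoids the interactive approval prompt while still applying only the reviewed changes.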


Verify from the AWS console that the EC2 instance is launched

Verify that the web application is accessible from the browser


https://github.com/prafulpatel16/terraform-projects-aws.git

Push source code to GitHub


Verify that the code is updated on GitHub

Phase 2: Take a snapshot of the root volume
o Go to snapshots and take a snapshot of existing root volume A
o Verify that the snapshot process is complete

Volume

Go to Volume


Take a snapshot


Snapshot complete
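
The same snapshot can be taken with the AWS CLI instead of the console. This is a minimal sketch, assuming the AWS CLI is configured and that the root device name of the instance (for example /dev/xvda) is already known. The function name snapshot_root_volume and the snapshot description text are placeholders chosen for illustration.

```shell
# Snapshot the root volume of an instance and wait until the snapshot completes.
# Arguments: instance ID, root device name. Prints the new snapshot ID.
snapshot_root_volume() {
  local instance_id="$1" device_name="$2" volume_id snapshot_id
  # Find the volume attached to the instance under the given device name.
  volume_id="$(aws ec2 describe-volumes \
    --filters "Name=attachment.instance-id,Values=$instance_id" \
              "Name=attachment.device,Values=$device_name" \
    --query 'Volumes[0].VolumeId' --output text)" || return 1
  # The CLI prints "None" when the query matches nothing.
  [ -n "$volume_id" ] && [ "$volume_id" != "None" ] || return 1
  snapshot_id="$(aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "Root volume backup of $instance_id" \
    --query 'SnapshotId' --output text)" || return 1
  # Block until the snapshot reaches the completed state.
  aws ec2 wait snapshot-completed --snapshot-ids "$snapshot_id" || return 1
  echo "$snapshot_id"
}
```

Running the function on a schedule shorter than the RPO would keep a sufficiently recent snapshot available; the scheduling mechanism is outside this sketch.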


OR

Phase 2.1: Take a snapshot of the root volume the automated way with Ansible.
o Go to VS Code IDE
o Gather the instance and volume information manually
o Write snapshot YAML file
o Run Ansible playbook file
o Verify from the AWS console that the snapshot has been created.

Go to VS code IDE

Gather the instance and volume information manually

aws_region:

Instance id:


Device_name:

Write main snapshot yaml file

Run Ansible playbook file

Verify from the AWS console that snapshot has been created.


Phase 3: Configure CloudWatch monitoring & SNS topic for failure notification
o Go to CloudWatch and create an alarm
o Select an EC2 metric: StatusCheckFailed_System
o Create a new SNS topic and provide an email address
o Complete the CloudWatch alarm setup
o Go to the SNS topic and confirm the subscription by following the link in the confirmation email

System Monitoring

Configure Cloudwatch

1. Go to Cloudwatch “Alarm – Create Alarm”


2. Select Metric – EC2
3. Select Per-Instance Metrics
4. Find the metric name: StatusCheckFailed_System
5. Create New SNS Topic
6. Complete the Cloudwatch process
7. Confirm SNS Subscription
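
The console steps above have a command-line equivalent. This is a minimal sketch, assuming the AWS CLI is configured with permissions for SNS and CloudWatch. The function name create_status_alarm is a placeholder, and the 60-second period with 2 evaluation periods is an assumed setting, not a value taken from the original project.

```shell
# Create an SNS topic with an email subscription and a CloudWatch alarm on
# StatusCheckFailed_System that notifies the topic.
# Arguments: instance ID, email address, alarm name, topic name. Prints the topic ARN.
create_status_alarm() {
  local instance_id="$1" email="$2" alarm_name="$3" topic_name="$4" topic_arn
  topic_arn="$(aws sns create-topic --name "$topic_name" \
    --query 'TopicArn' --output text)" || return 1
  # The recipient must still confirm the subscription from the email that SNS sends.
  aws sns subscribe --topic-arn "$topic_arn" \
    --protocol email --notification-endpoint "$email" >/dev/null || return 1
  # Assumed thresholds: alarm when the check fails in 2 consecutive 1-minute periods.
  aws cloudwatch put-metric-alarm \
    --alarm-name "$alarm_name" \
    --namespace AWS/EC2 \
    --metric-name StatusCheckFailed_System \
    --dimensions "Name=InstanceId,Value=$instance_id" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$topic_arn" || return 1
  echo "$topic_arn"
}
```

Passing WebServer_Alarm as the alarm name matches the name used later by the set-alarm-state command.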

Create Alarm


Create a New SNS Topic


Go to EC2 Action

Choose: Recover this Instance


Go to SNS Service to confirm


Confirm Subscription

Go to Gmail and copy the confirmation URL

Click to confirm subscription


Go to Cloudwatch and observe the Alarm status


Phase 4: Simulate and trigger a Disaster recovery scenario.
o Prepare a manual system failure script
o Write a script to remove the application files from the Apache root directory /var/www/html/
o Run the script from the EC2 machine.
o Verify that all application files are removed
o Verify that the web application is not accessible.

Log in to the EC2 machine

Configure the AWS CLI on the EC2 machine with the new IAM user

aws configure


Simulate Web Server app failure by removing application code:


Create an app remove script

Run the script
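
The original removal script is shown only as a screenshot, so the following is a minimal sketch of what such a script could look like, not the author's script. The function name remove_app_files is a placeholder; the default path /var/www/html comes from the phase description.

```shell
# Simulate an application failure by deleting everything inside the web root.
# The directory itself is kept, so Apache keeps running but has no application to serve.
remove_app_files() {
  local web_root="${1:-/var/www/html}"
  [ -d "$web_root" ] || { echo "not a directory: $web_root" >&2; return 1; }
  # -mindepth 1 spares the web root itself; -delete removes files first, then emptied directories.
  find "$web_root" -mindepth 1 -delete
}
```

On the EC2 machine the function would be run as root with no argument, which empties /var/www/html.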


Verify that all files are removed from the root directory

Verify that the web application is no longer served

Phase 5: Simulate CloudWatch monitoring “In-Alarm”
o Log in to the EC2 machine.
o Become the root user
o Run aws configure
o Run the CloudWatch set-alarm-state script to put the alarm into “In-Alarm” status
o Go to CloudWatch and verify that the status has changed from “OK” to “In-Alarm”
o Go to email and verify that an email is received with the necessary information.

Simulate the EC2 web server system failure with CloudWatch

Become the root user

sudo su -

Prepare a simulation script that puts the alarm into the “In-Alarm” status

CloudWatch alarm trigger:

aws cloudwatch set-alarm-state \
  --alarm-name "WebServer_Alarm" \
  --state-value ALARM \
  --state-reason "Simulate an EC2 HW failure"

Email Received

Cloudwatch Alarm status changed to In-Alarm


Phase 6: Recover from Disaster condition (RPO 5 min. RTO 10 min.)
o Go to the EC2 machine
o Make sure that the EC2 machine is running, then go to Actions.
o Select the option “Monitor and troubleshoot”
o Select the option “Replace root volume”
o Select a recent snapshot taken within the last 5 minutes to meet the RPO condition
o Complete the replacement with the selected snapshot
o Go to Volumes and verify that the new volume created from the snapshot is attached and in “In-Use” status
o Verify that the old root volume is in “Available” status; it is corrupted and no longer in use.
o Verify that the web application is now accessible


Recover Root Instance from Failure

Go to Snapshot

Verify that snapshot is available

Attach new volume to WebServer EC2 instance

Go to EC2

Replace root volume
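
The root volume replacement can also be started from the AWS CLI, with a small check that the chosen snapshot satisfies the RPO. This is a minimal sketch, assuming the AWS CLI is configured and the instance is running. The function names within_rpo and replace_root_from_snapshot are placeholders chosen for illustration.

```shell
# Succeed when a snapshot is no older than the RPO window.
# Arguments, all in seconds: snapshot start time (epoch), current time (epoch), RPO.
within_rpo() {
  local snapshot_epoch="$1" now_epoch="$2" rpo_seconds="$3"
  [ $(( now_epoch - snapshot_epoch )) -le "$rpo_seconds" ]
}

# Replace the root volume of a running instance with a volume restored from a snapshot.
# Arguments: instance ID, snapshot ID. Prints the ID of the replace-root-volume task.
replace_root_from_snapshot() {
  local instance_id="$1" snapshot_id="$2"
  aws ec2 create-replace-root-volume-task \
    --instance-id "$instance_id" \
    --snapshot-id "$snapshot_id" \
    --query 'ReplaceRootVolumeTask.ReplaceRootVolumeTaskId' \
    --output text
}
```

For the 5-minute RPO the third argument of within_rpo is 300. With GNU date, a snapshot start time could be converted to epoch seconds with date -d "$start_time" +%s, where start_time is a hypothetical variable holding the snapshot's StartTime value.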


The new volume is created, attached to the EC2 instance and “In-Use”; the older one is “Available”


Go to Web browser and verify that web application is up and running again after recovery

Go to Monitoring and verify that Cloudwatch Alarm is in “OK” Status


Congratulations!!!! 🔥🚀


Clean up Project:

terraform destroy

Remove volume


Remove snapshots

Remove Cloudwatch Alarm

Remove SNS Topic
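
The resources created outside Terraform can be removed with the AWS CLI as well. This is a minimal sketch, assuming the AWS CLI is configured and that any volume to delete is already detached ("Available"). The function name cleanup_dr_resources is a placeholder chosen for illustration.

```shell
# Delete the alarm, the SNS topic, and the given snapshots and volumes.
# Arguments: alarm name, SNS topic ARN, then any number of snapshot and volume IDs.
cleanup_dr_resources() {
  local alarm_name="$1" topic_arn="$2" id
  shift 2
  aws cloudwatch delete-alarms --alarm-names "$alarm_name"
  aws sns delete-topic --topic-arn "$topic_arn"
  for id in "$@"; do
    # The ID prefix tells snapshots and volumes apart.
    case "$id" in
      snap-*) aws ec2 delete-snapshot --snapshot-id "$id" ;;
      vol-*)  aws ec2 delete-volume --volume-id "$id" ;;
      *)      echo "skipping unrecognized ID: $id" >&2 ;;
    esac
  done
}
```

The old root volume left in “Available” status after the recovery is not tracked by Terraform, so it is a typical candidate for the volume ID argument.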

Resources:
https://wellarchitectedlabs.com/reliability/300_labs/300_testing_for_resiliency_of_ec2_rds_and_s3/6_failure_injection_app/
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/replace-root.html#view-replacement-tasks
https://github.com/terraform-aws-modules/terraform-aws-cloudwatch
https://docs.ansible.com/ansible/2.5/modules/ec2_snapshot_module.html
