CC Unit 1 - 2 notes

Introduction to Cloud Computing

🔹 What is Cloud Computing?


Cloud computing refers to the delivery of computing services—like servers, storage, databases,
networking, software, and more—over the Internet (“the cloud”) to offer faster innovation, flexible
resources, and economies of scale.

It turns traditional computing services into a self-service utility, just like electricity or water.
Users can access technology on demand, without needing to understand or manage the
underlying infrastructure.

🌥️ Two Core Concepts of Cloud Computing:


1.​ Abstraction​

○​ Hides the complexity of system implementation from users and developers.


○​ Users don’t know where apps run or where data is stored.
○​ System administration is outsourced, and user access is universal.
2.​ Virtualization​

○​ Combines resources into a single system that can be shared.


○​ Enables dynamic provisioning of systems and storage.
○​ Pay-per-use model, scalable, supports multiple users (multi-tenancy).

🔹 Evolution of Cloud Computing


●​ Not just the Internet renamed: Although network diagrams have long drawn the Internet and intranets as clouds, cloud computing represents a genuinely new service model.
●​ Utility Computing Dream: The idea of computing as a utility has been around for
decades and is now being realized thanks to enabling technologies.

🔹 Key Real-World Examples


●​ Google: Offers Software as a Service (SaaS) through free apps backed by their global
infrastructure.
●​ Microsoft Azure: A Platform as a Service (PaaS) for .NET developers to run
applications online.
●​ Amazon Web Services (AWS): An Infrastructure as a Service (IaaS) offering virtual
machines and storage on demand.

🌐 Cloud Types
To better understand cloud computing, we divide it into two main categories:

1. Deployment Models

Define where the cloud infrastructure is located and who manages it.

●​ Public Cloud: Open to the public or a large group. Owned by service providers (e.g.,
AWS, Azure).
●​ Private Cloud: Exclusively used by a single organization. Managed internally or by a
third party.
●​ Hybrid Cloud: A combination of two or more clouds (public, private, community) that
remain separate but are linked.
●​ Community Cloud: Shared infrastructure for a specific group or organization with
shared concerns (e.g., government agencies).

🔍 Example: The U.S. Government’s Apps.gov is a community cloud serving federal agencies.
2. Service Models

Define what type of service is being offered on the cloud.

●​ Infrastructure as a Service (IaaS)​
🧱 Provides virtual machines, storage, and infrastructure.​
You manage the OS and applications.​
📌 Examples: Amazon EC2, Linode, Rackspace.​

●​ Platform as a Service (PaaS)​
🛠️ Provides OS, runtime, and development tools.​
You deploy your apps; the provider manages the platform.​
📌 Examples: Google App Engine, Microsoft Azure, Force.com.​

●​ Software as a Service (SaaS)​
📋 Delivers ready-to-use apps via a browser.​
You just use the software; everything else is handled.​
📌 Examples: Google Workspace, Salesforce.com, QuickBooks Online.​
This layered model is also called the SPI Model (Software, Platform, Infrastructure).

🏛️ The NIST Model


The National Institute of Standards and Technology (NIST) provides a widely accepted
framework for understanding cloud computing. It separates cloud into:

●​ Service Models: IaaS, PaaS, SaaS (explained above)


●​ Deployment Models: Public, Private, Hybrid, Community

✨ NIST’s Key Characteristics of Cloud Computing:


1.​ On-demand self-service
2.​ Broad network access
3.​ Resource pooling
4.​ Rapid elasticity
5.​ Measured service

Initially, the NIST model didn't require virtualization or multi-tenancy, but newer versions
include both. It also doesn't fully cover service brokers, provisioning, or integration services,
which are becoming more important in modern cloud computing.
📦 XaaS – Everything as a Service
Beyond IaaS, PaaS, and SaaS, many new service models are emerging:

●​ StaaS – Storage as a Service


●​ IdaaS – Identity as a Service
●​ CmaaS – Compliance as a Service

But most can be grouped under the core SPI model.

📄 1. Characteristics of Cloud Computing


🌐 Key Characteristics
1.​ On-demand Self-service​

○​ Users can automatically access computing resources like storage and processing
power without human intervention from the provider.
2.​ Broad Network Access​

○​ Services are accessible over the network via standard platforms (e.g., phones,
tablets, laptops, etc.).
3.​ Resource Pooling​

○​ Cloud resources are pooled to serve multiple users using a multi-tenant model.
Resources are dynamically assigned and reassigned based on demand.
4.​ Rapid Elasticity​

○​ Resources can be scaled up or down quickly. To users, resources appear to be unlimited and can be purchased in any quantity at any time.
5.​ Measured Service​

○​ Cloud systems automatically control and optimize resource use through metering
(e.g., bandwidth, storage, processing). Users are billed based on usage.
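
As a back-of-the-envelope illustration of measured service, a metered bill is just usage multiplied by per-unit rates. The sketch below uses invented rates, not any provider's actual pricing:

```python
# Toy pay-per-use bill for a measured service (rates are illustrative,
# not any provider's real pricing).
usage = {"compute_hours": 720, "storage_gb_month": 50, "egress_gb": 120}
rates = {"compute_hours": 0.023, "storage_gb_month": 0.02, "egress_gb": 0.09}

bill = sum(units * rates[item] for item, units in usage.items())
print(f"Monthly charge: ${bill:.2f}")  # $28.36
```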

🛠️ Additional Features
●​ Lower Costs: Efficient operations lead to reduced costs for users.
●​ Ease of Use: Services are typically plug-and-play.
●​ Quality of Service (QoS): Guaranteed performance levels.
●​ Reliability: Redundancy and failover systems ensure high availability.
●​ Outsourced IT: Management and maintenance are handled by the provider.
●​ Simplified Maintenance: Centralized software updates and patching.
●​ Low Barrier to Entry: Minimal upfront investment needed.

📄 2. Benefits of Cloud Computing


✅ Major Advantages
1.​ Cost Efficiency​

○​ Reduced capital expenses. Pay only for what you use.


2.​ Scalability and Flexibility​

○​ Easily adjust computing resources as business needs grow or shrink.


3.​ Accessibility​

○​ Access applications and data from anywhere with an internet connection.


4.​ Disaster Recovery​
○​ Cloud-based backup solutions simplify recovery in case of system failure.
5.​ Automatic Updates​

○​ Providers handle software updates and security patches.


6.​ Collaboration Efficiency​

○​ Teams can access, edit, and share documents in real time, from anywhere.
7.​ Environmentally Friendly​

○​ Shared resources lead to less energy consumption and carbon output.


8.​ Faster Deployment​

○​ Services and applications can be deployed quickly.


9.​ High Availability​

○​ Most providers offer 99.9% uptime and robust disaster recovery options.

📄 3. Disadvantages of Cloud Computing


⚠️ Key Concerns
1.​ Limited Control​

○​ Users may have less control over infrastructure and services compared to
on-premise systems.
2.​ Security and Privacy Risks​

○​ Data stored offsite is vulnerable to breaches, government surveillance, and mismanagement.
3.​ Internet Dependency​

○​ Cloud access requires a stable and fast internet connection.


4.​ Latency Issues​

○​ WAN-based services may experience delays in high-speed, data-heavy operations.
5.​ Compliance and Legal Risks​

○​ Regulations like GDPR, HIPAA, and SOX may be difficult to comply with due to
data crossing borders.
6.​ Downtime​
○​ Even top providers can experience outages, affecting availability.
7.​ Vendor Lock-In​

○​ Migrating from one provider to another can be complex and expensive.


8.​ Customization Limitations​

○​ SaaS applications may lack the flexibility of custom-built on-premise software.


9.​ Performance Variability​

○​ Shared infrastructure can lead to performance fluctuations, especially during peak usage.

📘 Understanding Abstraction and Virtualization


🧩 1. Using Virtualization Technologies
✅ Definition
●​ Virtualization is the process of abstracting physical resources (like CPU, memory,
storage, and network) into logical, manageable units.
●​ It enables resource pooling and efficient resource management in cloud computing.

🧠 Key Concept
●​ Virtualization allows multiple virtual systems to run on a single physical system.
●​ Users access cloud services through virtualized interfaces, not the actual physical
machines.

💡 How Virtualization Works


●​ Logical Naming: Physical resources are given logical names and accessed through pointers.
●​ Dynamic Mapping: The link between virtual and physical resources is flexible and responsive to load changes.
●​ Facile Changes: Mapping can be updated instantly without service interruption.
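
A minimal Python sketch of this lookup-and-remap idea (the volume and SAN names are invented; real hypervisors remap at the block or page level):

```python
# Toy logical-to-physical mapping: clients only ever address "vol-a";
# the provider can remap it without the client noticing (names invented).
mapping = {"vol-a": "san-1/disk07", "vol-b": "san-2/disk03"}

def read(logical_name: str) -> str:
    return f"reading from {mapping[logical_name]}"

print(read("vol-a"))               # san-1/disk07
mapping["vol-a"] = "san-3/disk11"  # facile change: remap under load
print(read("vol-a"))               # same logical name, new physical home
```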

📚 Types of Virtualization in Cloud Computing


●​ Access: Users can access cloud services from anywhere via virtual interfaces.
●​ Application: Multiple instances of an application run in the cloud, and requests are routed based on load.
●​ CPU: Physical CPUs are divided into virtual machines, or workloads are distributed using load balancing.
●​ Storage: Data is distributed across multiple storage devices and replicated for availability.

🔄 Mobility Patterns in Virtualization


These patterns define how workloads move between environments:

●​ P2V: Physical to Virtual
●​ V2V: Virtual to Virtual
●​ V2P: Virtual to Physical
●​ P2P: Physical to Physical
●​ D2C: Datacenter to Cloud
●​ C2C: Cloud to Cloud
●​ C2D: Cloud to Datacenter
●​ D2D: Datacenter to Datacenter

🧱 Gartner’s Five Cloud Attributes Enabled by Virtualization


1.​ Service-Based – Abstracted through interfaces.
2.​ Scalable & Elastic – Adjusts based on demand.
3.​ Shared Services – Resource pooling.
4.​ Metered Usage – Pay-as-you-use model.
5.​ Internet Delivery – Access through internet protocols.

🌐 2. Load Balancing and Virtualization


🚦 What is Load Balancing?
●​ Distributes workloads across multiple resources (servers, networks, apps).
●​ Ensures high availability, fault tolerance, and efficient performance.

🛠️ Load Balancing Techniques


●​ Hardware-Based: Devices like F5 BIG-IP, Cisco ADCs.
●​ Software-Based: Tools like Apache mod_proxy_balancer, Pound, Squid.

🎯 What Can Be Load Balanced?


●​ Network Services: DNS, FTP, HTTP
●​ Connections: Using intelligent switches
●​ Processing: By server allocation
●​ Storage: Across devices
●​ Application Access: Routes user sessions

⚙️ Load Balancing Algorithms


●​ Round Robin: Cycles through resources equally.
●​ Weighted Round Robin: Considers resource capacity.
●​ Least Connections: Chooses the server with the fewest active connections.
●​ Fastest Response Time: Routes based on latency.
●​ Custom: Based on workload, health, priority, etc.
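
Two of these algorithms are small enough to sketch directly. The snippet below is illustrative only; production balancers also track health checks, weights, and timeouts:

```python
# Minimal sketches of round robin and least connections (illustrative).
from itertools import cycle

servers = ["app1", "app2", "app3"]

# Round robin: hand requests to servers in a fixed rotation.
rotation = cycle(servers)
def round_robin() -> str:
    return next(rotation)

# Least connections: pick the server with the fewest active sessions.
active = {s: 0 for s in servers}
def least_connections() -> str:
    server = min(active, key=active.get)
    active[server] += 1  # caller decrements when the session closes
    return server

print([round_robin() for _ in range(4)])  # ['app1', 'app2', 'app3', 'app1']
```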

🔁 Session Persistence
Maintains user sessions across load-balanced systems using:

●​ Session Cookies (client-side)


●​ Server-Side DB Replication
●​ URL Rewrite Engines
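
A minimal sketch of the first option, cookie-based persistence, assuming a hypothetical LB_BACKEND cookie name:

```python
# Sketch of cookie-based session persistence: the balancer records its
# backend choice in a cookie and honors it on later requests.
import random

BACKENDS = ["app1", "app2", "app3"]

def route(request_cookies: dict) -> tuple[str, dict]:
    backend = request_cookies.get("LB_BACKEND")
    if backend not in BACKENDS:  # first request, or backend retired
        backend = random.choice(BACKENDS)
    return backend, {"LB_BACKEND": backend}  # cookie sent back to client

backend, cookies = route({})  # new session: a backend is picked
later, _ = route(cookies)     # repeat visits stick to that backend
assert backend == later
```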

💎 Advanced Load Balancers / Application Delivery Controllers (ADCs)


●​ ADC = Load Balancer + Application Layer Control
●​ Functions:
○​ Health checks
○​ Traffic shaping & filtering
○​ Data compression
○​ TCP offload
○​ Authentication
○​ SSL termination

🏢 Examples of ADC Vendors:


●​ F5 Networks, Cisco, Citrix, Akamai, Juniper, Barracuda, A10 Networks

☁️ 3. Case Study: Google Cloud Infrastructure


🌍 Why Google is a Benchmark
●​ Most visited site
●​ Runs 1M+ servers
●​ Processes 1B+ search requests/day
●​ Generates 20 petabytes of data daily

🏗️ Google Data Center Strategy


●​ Cheap/Renewable Energy: High priority
●​ Low-Latency Site Connections: High priority
●​ Peering with Internet Hubs: High priority
●​ Cooling Availability: Medium priority
●​ Large Land Purchase: Medium priority
●​ Tax Concessions: Medium priority
🔄 How Google Uses Load Balancing
1.​ DNS Load Balancing (IP Virtualization)​

○​ Requests resolved to nearest datacenter.


○​ Uses round robin DNS.
2.​ Cluster-Level Load Balancing​

○​ Incoming traffic distributed across server racks.


3.​ Proxy Cache Layer (Squid Server)​

○​ Cached queries answered instantly.


4.​ Application Server Load Balancing​

○​ Real-time server utilization measured.

🧠 Google's "Secret Sauce"


●​ Inverted Index: Maps keywords to document IDs (sketched in code after this list)
●​ Page Rank: Determines importance of pages
●​ Data Compression: Efficient storage of “shards”
●​ Fault Tolerance: Automatically reassigns failed tasks
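
A toy version of the inverted index referenced above (real search indexes add term positions, compression, and sharding):

```python
# Toy inverted index: maps each keyword to the IDs of documents
# containing it.
from collections import defaultdict

docs = {
    1: "cloud computing delivers services over the internet",
    2: "virtualization enables cloud resource pooling",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

print(index["cloud"])  # {1, 2}: both documents contain "cloud"
```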

🧰 Other Google Services in Action


●​ Specialized Servers for calculations, reverse lookups
●​ AdSense / AdWords for monetization
●​ Spelling Servers for intelligent suggestions
🔹 Understanding Hypervisors in Cloud Computing
A hypervisor, also known as a Virtual Machine Monitor (VMM), is a low-level program that
enables the creation and management of virtual machines (VMs). It abstracts and isolates the
underlying physical hardware from the operating systems, allowing multiple VMs to run on a
single physical system.

📌 Purpose of Hypervisors
Hypervisors play a central role in virtualization, which is a foundational technology in cloud
computing. They allow cloud providers to:

●​ Run multiple operating systems on a single physical server.


●​ Dynamically allocate and manage resources.
●​ Improve server utilization and reduce costs.
●​ Enable workload isolation and mobility.

🔹 Types of Hypervisors
Hypervisors are primarily categorized into two types based on how they interact with hardware
and host operating systems.

✅ Type 1 Hypervisor (Bare-Metal)


●​ Installed directly on physical hardware.
●​ Does not require a host operating system.
●​ Provides better performance and efficiency.
●​ Commonly used in enterprise environments and cloud data centers.

✅ Type 2 Hypervisor (Hosted)


●​ Installed on top of an existing operating system (host OS).
●​ Suitable for desktop and development environments.
●​ Easier to install and use but with more overhead and lower performance.

🔽 Comparison of Type 1 and Type 2 Hypervisors


●​ Installation: Type 1 installs directly on hardware (bare-metal); Type 2 installs on top of a host OS.
●​ Performance: Type 1 is high (near-native); Type 2 is moderate to low.
●​ Overhead: Type 1 is minimal; Type 2 is higher due to the host OS.
●​ Use Case: Type 1 suits data centers, servers, and cloud infrastructure; Type 2 suits development, testing, and personal use.
●​ Examples: Type 1 includes VMware ESXi, Microsoft Hyper-V (bare-metal), Xen, and Oracle VM; Type 2 includes VMware Workstation, VirtualBox, Parallels, KVM, and Hyper-V (hosted).
●​ Resource Allocation: Type 1 has direct hardware control; Type 2 is managed via the host OS.

🔹 Types of Virtual Machines


Hypervisors support two major types of VMs:

●​ System Virtual Machine: Emulates an entire hardware system with its own OS and applications.
●​ Process Virtual Machine: Designed to run a single process or application (e.g., JVM, .NET CLR).

🔹 Virtualization Techniques
Hypervisors implement different virtualization methods to manage guest operating systems:

✅ Full Virtualization
●​ Emulates the complete hardware environment.
●​ Guest OS runs without modification.
●​ Allows running multiple OS types on the same hardware.
●​ Common in Type 1 hypervisors.

✅ Paravirtualization
●​ Guest OS is modified to interact with the hypervisor via an API (para-API).
●​ Requires support from both the host and guest OS.
●​ Offers better performance than full virtualization.

✅ Emulation
●​ Software completely simulates hardware.
●​ Guest OS does not need to match host hardware.
●​ Useful for cross-platform compatibility.
●​ Typically slower due to overhead.
●​ Full Virtualization: Guest OS modification not required; moderate to high performance; general-purpose virtualization.
●​ Paravirtualization: Guest OS modification required; high performance; cloud systems needing optimized I/O.
●​ Emulation: Guest OS modification not required; low performance; legacy system support and testing.

🔹 Hypervisor in Cloud Computing


In cloud platforms, hypervisors enable:
●​ Resource isolation and multi-tenancy.
●​ Dynamic provisioning and cloning of VMs.
●​ Support for failover, load balancing, and replication.
●​ Efficient management through virtual infrastructure tools.

For example, Amazon Web Services (AWS) uses Xen and KVM hypervisors for their Amazon
Machine Instances (AMIs), while Microsoft Azure uses Hyper-V.

🔹 Operating System Virtualization


Apart from hardware-level virtualization, some OSes support OS-level virtualization, also
known as container-based virtualization.

●​ Creates virtual environments (VEs) or virtual private servers (VPS).


●​ All VEs share the same kernel.
●​ Lightweight and allows higher density of instances.
●​ Examples: Solaris Zones, IBM AIX Workload Partitions (WPARs), Docker (Linux
containers).

●​ Kernel Sharing: OS-level is shared; hypervisor-based is separate per VM.
●​ Overhead: OS-level is low; hypervisor-based is higher.
●​ Isolation: OS-level is moderate; hypervisor-based is strong.
●​ Performance: OS-level is high; hypervisor-based is moderate to high.
●​ Use Case: OS-level suits microservices and containers; hypervisor-based suits VMs and legacy OS support.
🔹 VMware vSphere:
🔸 What is VMware vSphere?
VMware vSphere is a cloud computing virtualization platform developed by VMware. It
serves as the foundation for building and managing virtualized data centers. In essence,
vSphere abstracts and pools hardware resources—compute, storage, and networking—and
provides tools to manage these resources effectively in a cloud environment.

vSphere is the successor to VMware Infrastructure and includes both infrastructure services
(like ESXi hypervisor and vCenter Server) and application services (like High Availability,
DRS, etc.).

🔸 Core Components of VMware vSphere


1.​ VMware ESXi:​

○​ A Type 1 hypervisor that installs directly on physical hardware (bare metal).


○​ Boots a Linux kernel initially, which then loads the vmkernel; the vmkernel takes over and handles all virtualization tasks.
○​ Allows multiple virtual machines (VMs) to run on a single physical machine.
2.​ vCenter Server:​

○​ A centralized management console used to provision, manage, and monitor


vSphere environments.
○​ Enables cluster management, performance tuning, automation, and alerting.
3.​ VMFS (Virtual Machine File System):​

○​ A clustered file system optimized for storing virtual machine disk images.
○​ Supports concurrent access by multiple ESXi hosts.
4.​ VMotion:​

○​ Enables live migration of VMs from one physical server to another with zero
downtime.
○​ Maintains VM state and memory contents during transfer.
5.​ Storage VMotion:​

○​ Moves a VM’s virtual disks from one datastore to another while the VM
remains active.
6.​ vNetwork Distributed Switch (DVS):​

○​ Creates and manages virtual network configurations across multiple hosts.


○​ Supports advanced features like firewall, load balancing, and integration with
third-party switches like Cisco Nexus 1000V.
7.​ DRS (Distributed Resource Scheduler):​

○​ Automatically balances workloads by moving VMs between hosts based on CPU


and memory usage.
○​ Can include Distributed Power Management (DPM) to reduce power usage
during low loads.
8.​ Virtual SMP (Symmetric Multi-Processing):​

○​ Allows a VM to utilize multiple physical CPUs, improving performance for


compute-intensive workloads.
9.​ vCompute, vStorage, vNetwork Services:​

○​ Abstract physical resources into pools:


■​ vCompute: CPU and RAM
■​ vStorage: Disk and file systems
■​ vNetwork: Virtual switches, VLANs, and NICs

🔸 vSphere Architecture (Conceptual Overview)


A typical vSphere environment includes:

●​ Multiple physical hosts running ESXi


●​ A shared storage system (SAN, NAS, iSCSI, etc.)
●​ A management server (vCenter)
●​ Virtual Machines deployed on hosts, managed in resource pools
●​ Datastores that act as shared storage for VM files

These VMs can be dynamically moved and scaled according to business needs without being
tied to a specific piece of hardware.

🔸 Storage and Network Virtualization


Storage Virtualization:

●​ Involves creating logical representations of physical storage devices.


●​ ESXi maps a logical unit (LUN) to a Logical Block Address (LBA), effectively
abstracting storage.
●​ Enables features like Storage VMotion and thin provisioning.

Network Virtualization:

●​ Uses virtual NICs (vNICs) and virtual switches to mimic physical network interfaces.
●​ Allows network policies (like security, QoS) to be enforced virtually.
●​ External virtualization can include VLANs and network hardware abstraction using
software-defined networking (SDN) principles.

🔸 Key Advantage: Flexibility and Speed


●​ Rapid Deployment: New VMs can be spun up in seconds using pre-defined templates.
●​ Scalability: Easily scale up by adding hosts or VMs.
●​ Resiliency: HA and DRS provide failover and automatic load balancing.

🔹 Understanding Machine Imaging in Cloud Computing


🔸 What is Machine Imaging?
Machine Imaging is the process of creating a snapshot or clone of a virtual machine (VM),
including its operating system, applications, configurations, and data. The image serves as
a template for rapidly deploying multiple instances of identical environments.

In cloud computing, this is often referred to as a server image, machine image, or VM image.

🔸 Why Machine Imaging is Important


✅ Rapid Deployment: Deploy new VMs instantly using pre-configured images.
✅ Consistency: Ensure all instances have identical environments—eliminates configuration drift.
✅ Scalability: Easily scale up services by launching more instances from the same image.
✅ Disaster Recovery: Recover systems quickly using stored machine images.
✅ Automation: Integral part of DevOps and Infrastructure-as-Code (IaC).

🔸 Key Terms
●​ Image: A read-only template of a system's disk.
●​ Snapshot: A point-in-time copy of a VM's state, including memory and disk.
●​ Golden Image: A fully configured, secured, and tested image used as a base template.
●​ AMI (Amazon Machine Image): AWS-specific image format used to launch EC2 instances.
●​ Custom Image: A user-created image tailored to a specific use case or app.

🔸 Components of a Machine Image


A complete machine image includes:

📦 Operating System (e.g., Linux, Windows)
⚙️ System Configurations (e.g., registry settings, network configs)
🧩 Installed Software and Services
🔐 Security Settings (firewall rules, user permissions)
🧾 Startup Scripts or Metadata (for initialization tasks)
🔸 How Machine Imaging Works (General Steps)
1.​ Configure a VM: Install OS, configure system, deploy applications.
2.​ Stop the VM (optional): Ensures consistency during image creation.
3.​ Create Image:
○​ On AWS: Use Create Image to make an AMI.
○​ On VMware: Use Clone to Template or Export OVF.
4.​ Store Image: Stored in object storage or image registries (like AWS S3, Azure Blob, or
Docker registry).
5.​ Launch Instances: Use the image to spin up as many identical VMs as needed.
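
For steps 3 and 5 on AWS, a boto3 sketch looks roughly like this (assumes credentials are configured; the instance ID, names, and counts are placeholders, not a definitive recipe):

```python
# Hedged boto3 sketch: create an AMI from a configured instance (step 3)
# and launch identical copies from it (step 5).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Step 3: create an AMI from a configured instance.
image = ec2.create_image(
    InstanceId="i-0123456789abcdef0",  # hypothetical instance ID
    Name="webapp-golden-image-v1",
    NoReboot=False,                    # reboot for filesystem consistency
)

# Step 5: launch identical instances from the stored image.
ec2.run_instances(
    ImageId=image["ImageId"],
    InstanceType="t3.micro",
    MinCount=3,
    MaxCount=3,
)
```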

🔸 Types of Machine Images


●​ Base Image: Clean OS install with minimal configuration. Use case: start fresh with a custom setup.
●​ Custom Image: Includes specific software and settings. Use case: deploy app-ready environments.
●​ Golden Image: Secured, patched, tested image. Use case: enterprise deployments at scale.
●​ Vendor Image: Provided by cloud providers or third parties. Use case: standardized environments (e.g., LAMP stack).

🔸 Machine Imaging in Different Platforms


✅ AWS
●​ Uses Amazon Machine Images (AMIs)
●​ Each AMI includes:
○​ One or more EBS snapshots (for volumes)
○​ Launch permissions
○​ Block device mapping

✅ Azure
●​ Uses Managed Images and Shared Image Gallery
●​ Support for image versioning, regions, and replication

✅ Google Cloud
●​ Uses Custom Images
●​ Can be stored and used in multiple regions

✅ VMware
●​ Create VM templates or OVF (Open Virtualization Format) exports
●​ Used in vCenter to deploy cloned VMs or deploy via automation

🔸 Best Practices
✅ Use golden images for production environments.
✅ Automate image creation with scripts (e.g., Packer).
✅ Keep images updated with security patches.
✅ Avoid hardcoding sensitive information in images.
✅ Use version control for managing image changes.

🔸 Real-World Example
Suppose you’re deploying a web app that runs on Ubuntu with Apache, MySQL, and PHP.
Rather than configuring each server manually:

1.​ You set up the full stack once on a VM.


2.​ Create a golden image.
3.​ Launch 10 more VMs using that image—each one is production-ready in minutes.
4.​ Update the image when new security patches or app versions are released.
🔹 Capacity Planning in Cloud Environments
🔸 What is Capacity Planning?
Capacity Planning is the process of predicting and managing the computing resources
(like CPU, memory, storage, and network bandwidth) needed by applications or systems to
handle current and future workloads efficiently and cost-effectively.

In cloud computing, capacity planning ensures that your resources are:

🔄 Scalable on demand
💰 Cost-optimized
⚙️ Aligned with performance and availability requirements

🔸 Why is Capacity Planning Important in the Cloud?


⚡ Scalability: Ensures resources are enough to handle peak loads without overprovisioning.
💵 Cost Efficiency: Avoids paying for unused resources.
📈 Performance: Maintains application performance during traffic spikes.
🔒 Reliability: Prevents downtime or system crashes due to resource shortages.
📊 Forecasting: Helps plan for future growth and expansion.


🔸 Key Concepts in Cloud Capacity Planning
●​ Provisioning: Allocating resources based on current or expected need.
●​ Over-Provisioning: Allocating more resources than required (wasteful).
●​ Under-Provisioning: Allocating fewer resources than needed (leads to performance issues).
●​ Elasticity: Ability to scale resources up/down automatically.
●​ Auto Scaling: Cloud feature to automatically adjust resource capacity.
●​ Utilization Metrics: CPU, memory, disk, and network usage statistics used for decision-making.

🔸 Capacity Planning Process (Step-by-Step)


1.​ Understand Application Requirements​

○​ Identify workload patterns (e.g., constant, bursty, seasonal)


○​ Know your app’s CPU, memory, storage, and network needs
2.​ Collect Historical Data​

○​ Analyze resource usage over time (via monitoring tools)


○​ Track trends in user growth, transaction volume, etc.
3.​ Forecast Future Demands​

○​ Use predictive analytics or linear projections


○​ Consider upcoming features or events that may cause traffic spikes
4.​ Define SLAs & Performance Targets​

○​ E.g., 99.99% uptime, response time < 2 seconds


5.​ Select Right Instance Types & Services​

○​ Choose appropriate compute instances (e.g., EC2, Azure VMs)


○​ Consider managed services (e.g., RDS, Lambda)
6.​ Implement Auto-Scaling Policies​

○​ Set rules to add/remove instances based on metrics (CPU > 70%, etc.); a minimal sketch follows this list.
7.​ Continuously Monitor and Adjust​

○​ Use tools like AWS CloudWatch, Azure Monitor, Google Cloud Operations Suite
○​ Adapt based on real-time and predictive metrics
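
Step 6 can be pictured as a simple threshold rule. The sketch below is a toy version of a policy like "add an instance when CPU > 70%, remove one when CPU < 30%"; the bounds are invented for illustration:

```python
# Toy threshold-based auto-scaling rule (illustrative bounds).
def desired_capacity(current: int, cpu_percent: float,
                     lo: float = 30.0, hi: float = 70.0,
                     min_n: int = 2, max_n: int = 16) -> int:
    if cpu_percent > hi:
        current += 1  # scale out under load
    elif cpu_percent < lo:
        current -= 1  # scale in when idle
    return max(min_n, min(max_n, current))

print(desired_capacity(4, 85.0))  # 5: busy, add an instance
print(desired_capacity(4, 12.0))  # 3: idle, remove one
```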

🔸 Tools for Capacity Planning


●​ AWS: CloudWatch, Trusted Advisor, Cost Explorer, Compute Optimizer
●​ Azure: Azure Monitor, Advisor, Cost Management + Billing
●​ GCP: Operations Suite (Stackdriver), Recommender API
●​ VMware: vRealize Operations Manager

🔸 Common Challenges
●​ Inaccurate Forecasting: Leads to over- or under-provisioning.
●​ Ignoring Performance Trends: Causes bottlenecks or slowdowns.
●​ Static Resource Allocation: Doesn't adapt to changing demand.
●​ Lack of Monitoring: Misses real-time issues.

🔸 Example Scenario
Let’s say you're running an e-commerce website. During regular days, 4 VMs are enough. But
during a festival sale:

●​ Traffic spikes by 4×
●​ You need to scale up to 16 VMs
●​ After the sale, scale back to 4 VMs

With proper capacity planning:

●​ You forecast this pattern based on previous years


●​ Use Auto Scaling to automatically handle the load
●​ Monitor metrics in real time to adjust thresholds

🔸 Best Practices
✅ Use Auto Scaling and Elastic Load Balancing
✅ Perform load testing before major events
✅ Maintain a buffer margin (usually 10–20%)
✅ Use cost calculators to estimate spend
✅ Periodically review and adjust capacity plans

🔹 Defining Baselines and Metrics in Cloud Monitoring


Monitoring in cloud computing ensures that cloud resources are functioning efficiently, securely,
and within expected performance ranges. Two core elements in monitoring are metrics and
baselines. Metrics provide raw data points, and baselines define what values are considered
“normal.” Together, they enable effective monitoring, troubleshooting, and optimization.

🔸 What Are Metrics?


Metrics are numerical measurements collected from cloud resources. They indicate system
health, performance, and usage patterns.

📌 Common Types of Metrics


●​ System Metrics: Track hardware and infrastructure performance (e.g., CPU usage, memory usage, disk I/O).
●​ Application Metrics: Monitor software/application performance (e.g., request latency, error rate, API throughput).
●​ Business Metrics: Reflect operational or business-related indicators (e.g., transactions per second, active users).
●​ Custom Metrics: User-defined metrics for specific needs (e.g., queue depth, job processing time).

Cloud platforms like AWS CloudWatch, Azure Monitor, and Google Cloud Monitoring collect
both default and custom metrics for analysis and visualization.
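
As a hedged illustration, fetching a default EC2 CPU metric from AWS CloudWatch with boto3 might look like this (credentials assumed configured; the instance ID is a placeholder):

```python
# Sketch: pull hourly average CPU for one instance over the last day.
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

stats = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=now - timedelta(hours=24),
    EndTime=now,
    Period=3600,  # one datapoint per hour
    Statistics=["Average"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```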

🔸 What Is a Baseline?
A baseline is a reference pattern or average measurement that reflects “normal” system
behavior over time. It acts as a benchmark for comparing real-time data to detect anomalies or
abnormal performance.

📌 Key Characteristics of a Baseline


●​ Normal Range: The acceptable range of a metric under typical conditions.
●​ Time-Dependent: Baselines vary by time of day, week, or season (e.g., higher usage on Mondays).
●​ Dynamic or Static: Baselines can be fixed (static) or adapt over time using machine learning.
●​ Data-Driven: Built using historical data and analysis of trends.

For instance, if average CPU usage during peak hours is consistently 60–70%, that range
becomes the CPU baseline for those hours.

🔸 Why Baselines and Metrics Matter


Defining accurate baselines and tracking relevant metrics enables proactive monitoring and
efficient incident response. Without them, teams may either overlook genuine issues or react
to normal variations unnecessarily.

📌 Benefits of Using Baselines and Metrics


●​ Anomaly Detection: Identify unusual spikes or drops in performance metrics.
●​ Performance Optimization: Spot bottlenecks and tune systems for better efficiency.
●​ Capacity Planning: Use trends to forecast resource needs and scale appropriately.
●​ SLA Monitoring: Ensure services meet the agreed Service Level Agreements.
●​ Alert Configuration: Set up alerts based on threshold breaches compared to baselines.
🔸 Example Scenario
Suppose a cloud-based e-commerce site sees the following typical CPU usage patterns:

●​ Weekdays (9 AM – 6 PM): CPU usage is around 40%–60%


●​ Weekends: CPU usage drops to 20%–30%

These observed ranges become baselines. If on a Tuesday afternoon the CPU spikes to 95%,
an alert is triggered — indicating a possible system overload or abnormal traffic.
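
A toy check against the baselines in this example (the 10-point alert margin is invented for illustration):

```python
# Compare a live reading against the observed baseline for its period.
BASELINES = {
    "weekday_peak": (40.0, 60.0),  # 9 AM - 6 PM
    "weekend": (20.0, 30.0),
}

def check(cpu_percent: float, period: str, margin: float = 10.0) -> str:
    lo, hi = BASELINES[period]
    if cpu_percent > hi + margin or cpu_percent < lo - margin:
        return f"ALERT: {cpu_percent}% outside baseline {lo}-{hi}%"
    return "OK"

print(check(95.0, "weekday_peak"))  # triggers the alert from the example
```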
