
Cloud architecture

Cloud architecture refers to how various cloud technology components, such as hardware, virtual resources, software capabilities, and virtual network systems, interact and connect to create cloud computing environments. It acts as a blueprint that defines the best way to strategically combine resources to build a cloud environment for a specific business need.

Cloud architecture components

Cloud architecture components include:

 A frontend platform
 A backend platform
 A cloud-based delivery model
 A network (internet, intranet, or intercloud)

In cloud computing, frontend platforms contain the client infrastructure—user interfaces, client-side applications, and the client device or network that enables users to interact with and access cloud computing services. For example, you can open the web browser on your mobile phone and edit a Google Doc. All three of these things describe frontend cloud architecture components.

On the other hand, the back end refers to the cloud architecture
components that make up the cloud itself, including computing
resources, storage, security mechanisms, management, and
more.

Below is a list of the main backend components:

Application: The backend software or application the client is accessing from the front end to coordinate or fulfill client requests and requirements.

Service: The service is the heart of cloud architecture, taking care of all the tasks being run on a cloud computing system. It manages which resources you can access, including storage, application development environments, and web applications.

Runtime cloud: The runtime cloud provides the environment where services are run, acting as an operating system that handles the execution of service tasks and their management. Runtimes use virtualization technology to create the virtualized environments in which your services run, including apps, servers, storage, and networking.

Storage: The storage component in the back end is where data to operate applications is stored. While cloud storage options vary by provider, most cloud service providers offer flexible, scalable storage services that are designed to store and manage vast amounts of data in the cloud. Storage may include hard drives, solid-state drives, or persistent disks in server bays.

Infrastructure: Infrastructure is probably the most commonly known component of cloud architecture. In fact, you might have thought that cloud infrastructure is cloud architecture. However, cloud infrastructure comprises all the major hardware components that power cloud services, including the CPU, graphics processing unit (GPU), network devices, and other hardware components needed for systems to run smoothly. Infrastructure also refers to all the software needed to run and manage everything.

Cloud architecture, on the other hand, is the plan that dictates how cloud resources and infrastructure are organized.

Management: Cloud service models require that resources be managed in real time according to user requirements. It is essential to use management software, also known as middleware, to coordinate communication between the backend and frontend cloud architecture components and allocate resources for specific tasks. Beyond middleware, management software will also include capabilities for usage monitoring, data integration, application deployment, and disaster recovery.

Security: As more organizations continue to adopt cloud computing, implementing cloud security features and tools is critical to securing data, applications, and platforms. It’s essential to plan and design data security and network security to provide visibility, prevent data loss and downtime, and ensure redundancy. This may include regular backups, debugging, and virtual firewalls.

How does cloud architecture work?


In cloud architecture, each of the components works together to
create a cloud computing platform that provides users with on-
demand access to resources and services.

The back end contains all the cloud computing resources, services, data storage, and applications offered by a cloud service provider. A network is used to connect the frontend and backend cloud architecture components, enabling data to be sent back and forth between them. When users interact with the front end (or client-side interface), it sends queries to the back end using middleware, where the service model carries out the specific task or request.

The types of services available to use vary depending on the cloud-based delivery model or service model you have chosen. There are three main cloud computing service models:
 Infrastructure as a service (IaaS): This model provides
on-demand access to cloud infrastructure, such as servers,
storage, and networking. This eliminates the need to
procure, manage, and maintain on-premises infrastructure.
 Platform as a service (PaaS): This model offers a
computing platform with all the underlying infrastructure
and software tools needed to develop, run, and manage
applications.
 Software as a service (SaaS): This model offers cloud-
based applications that are delivered and maintained by the
service provider, eliminating the need for end users to
deploy software locally.

Cloud architecture layers

A simpler way of understanding how cloud architecture works is to think of all these components as various layers placed on top of each other to create a cloud platform.

Here are the basic cloud architecture layers:


1. Hardware: The servers, storage, network devices, and
other hardware that power the cloud.
2. Virtualization: An abstraction layer that creates a virtual
representation of physical computing and storage
resources. This allows multiple applications to use the same
resources.
3. Application and service: This layer coordinates and
supports requests from the frontend user interface, offering
different services based on the cloud service model, from
resource allocation to application development tools to web-
based applications.

Types of cloud architecture


Cloud adoption is not one-size-fits-all. You’ll need to consider
what type of cloud you want to build based on your existing
technology investments, your specific business requirements,
and the overall goals you hope to achieve.

There are three main types of cloud architecture you can choose
from: public, private, and hybrid.

Public cloud architecture uses cloud computing resources and physical infrastructure that are owned and operated by a third-party cloud service provider. Public clouds enable you to scale resources easily without having to invest in your own hardware or software, but they use multi-tenant architectures that serve other customers at the same time.

Private cloud architecture refers to a dedicated cloud that is owned and managed by your organization. It is privately hosted on-premises in your own data center, providing more control over resources and more security over data and infrastructure. However, this architecture is considerably more expensive and requires more IT expertise to maintain.
Hybrid cloud architecture uses both public and private cloud
architecture to deliver a flexible mix of cloud services. A hybrid
cloud allows you to migrate workloads between environments,
allowing you to use the services that best suit your business
demands and the workload. Hybrid cloud architectures are often
the solution of choice for businesses that need control over their
data but also want to take advantage of public cloud offerings.

In recent years, multicloud architecture has also emerged as more organizations look to use cloud services from multiple cloud providers. Multicloud environments are gaining popularity for their flexibility and ability to better match use cases to specific offerings, regardless of vendor.

What does a cloud architect do?

A cloud architect is an IT expert responsible for developing, implementing, and managing an organization’s cloud architecture. As cloud strategies continue to become more complex, the skills and expertise of cloud architects are becoming more vital for helping companies navigate the complexities of cloud environments, implement successful strategies, and keep cloud systems running smoothly.

The Five Levels of Implementing Virtualization


Virtualization is not that easy to implement. A computer runs an OS that is configured for that particular hardware, so running a different OS on the same hardware is not directly feasible. To tackle this, there exists the hypervisor, which acts as a bridge between the virtual OS and the hardware to enable the smooth functioning of the virtual instance.

There are five levels of virtualization that are most commonly used in the industry. These are as follows:

1. Instruction Set Architecture Level (ISA)

In ISA, virtualization works through ISA emulation. This is helpful for running heaps of legacy code that was originally written for different hardware configurations.

These codes can be run on the virtual machine through an ISA.

A binary code that might need additional layers to run can now
run on an x86 machine or with some tweaking, even on x64
machines. ISA helps make this a hardware-agnostic virtual
machine.

The basic emulation, though, requires an interpreter. This interpreter interprets the source code and converts it to a hardware-readable format for processing.
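
To make the interpretation step concrete, here is a minimal sketch in Python of a fetch-decode-execute loop for a made-up three-instruction guest ISA; the opcodes and register names are invented purely for illustration and do not correspond to any real architecture.

```python
# Minimal sketch of ISA emulation by interpretation.
# The guest "ISA" here (LOAD/ADD/PRINT) is invented for illustration only.

def interpret(program):
    """Fetch-decode-execute loop over a list of guest instructions."""
    registers = {"r0": 0, "r1": 0}
    for opcode, *operands in program:
        if opcode == "LOAD":          # LOAD reg, immediate
            reg, value = operands
            registers[reg] = value
        elif opcode == "ADD":         # ADD dst, src
            dst, src = operands
            registers[dst] += registers[src]
        elif opcode == "PRINT":       # PRINT reg
            (reg,) = operands
            print(registers[reg])
        else:
            raise ValueError(f"unknown opcode: {opcode}")

# A tiny guest program: r0 = 2, r1 = 3, r0 += r1, print r0  -> prints 5
interpret([("LOAD", "r0", 2), ("LOAD", "r1", 3), ("ADD", "r0", "r1"), ("PRINT", "r0")])
```

A real ISA emulator works on binary instruction encodings rather than tuples, but the fetch-decode-execute pattern is the same.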

2. Hardware Abstraction Level (HAL)

As the name suggests, this level performs virtualization at the hardware level. It uses a bare-metal hypervisor for its functioning. This level helps form the virtual machine and manages the hardware through virtualization.

It enables virtualization of each hardware component, such as I/O devices, processors, and memory. This way, multiple users can use the same hardware with numerous instances of virtualization at the same time.

IBM pioneered this approach in the 1960s, work that led to the IBM VM/370 system. This level is well suited to cloud-based infrastructure. Thus, it is no surprise that Xen hypervisors currently use HAL to run Linux and other OSs on x86-based machines.

3. Operating System Level

At the operating system level, the virtualization model creates an abstraction layer between the applications and the OS.

It is like an isolated container on the physical server and operating system that utilizes hardware and software. Each of these containers functions like a server.

When the number of users is high and no one is willing to share hardware, this level of virtualization comes in handy.

Here, every user gets their own virtual environment with dedicated virtual hardware resources. This way, no conflicts arise.

4. Library Level

OS system calls are lengthy and cumbersome, which is why applications often opt for APIs from user-level libraries. Most of the APIs provided by systems are rather well documented. Hence, library-level virtualization is preferred in such scenarios.

Library interfacing virtualization is made possible by API hooks. These API hooks control the communication link from the system to the applications.

Some tools available today, such as vCUDA and WINE, have successfully demonstrated this technique.
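
As a rough illustration of the API-hook idea (not how vCUDA or WINE are actually built), the Python sketch below wraps a user-level library call so that every invocation is intercepted, logged, and then forwarded to the real implementation; the choice of os.listdir as the hooked function is arbitrary.

```python
# Illustrative sketch: "hooking" a user-level API by wrapping it.
# Real systems such as vCUDA or WINE intercept native library calls; here we
# simply wrap Python's os.listdir to show the interception pattern.
import functools
import os

def hook(original):
    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        print(f"[hook] {original.__name__} called with {args or '()'}")
        return original(*args, **kwargs)   # forward the call to the real API
    return wrapper

os.listdir = hook(os.listdir)   # install the hook

os.listdir(".")                  # the hook logs the call, then delegates
```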

5. Application Level

Application-level virtualization comes in handy when you wish to virtualize only an application; it does not virtualize an entire platform or environment. On an operating system, an application works as one process, which is why this is also known as process-level virtualization.

It is generally useful for running virtual machines that execute high-level languages. Here, the virtualization layer itself runs as an application program on top of the operating system and exports an abstract virtual machine; the application then sits on top of this virtualization layer. Programs written in high-level languages and compiled for an application-level virtual machine can run fluently here.

Virtualization Structures, Tools, and Mechanisms


Virtualization Structures

1. Hypervisors:
o Type 1 (Bare-Metal) Hypervisors: These hypervisors run directly on
the physical hardware without a host operating system. They are
used in enterprise environments for their high performance and
efficiency. Examples include VMware ESXi and Microsoft Hyper-V.
o Type 2 (Hosted) Hypervisors: These hypervisors run on top of a host
operating system and are typically used in desktop or development
environments. Examples include Oracle VirtualBox and VMware
Workstation.

2. Containers:
o Containers provide lightweight virtualization by packaging
applications and their dependencies together. They share the host OS
kernel but run in isolated user spaces. Docker is the most popular
containerization tool, and Kubernetes is commonly used for
orchestrating containers.

3. Virtual Machines (VMs):
o VMs simulate entire physical computers, including the CPU, memory, and storage. Each VM runs its own operating system, which can be different from the host OS. Hypervisors manage the creation and execution of VMs.
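
To make the Containers entry above concrete, here is a minimal sketch using the Docker SDK for Python (the docker package); it assumes a local Docker daemon is available, and the image and command are illustrative.

```python
# Minimal sketch using the Docker SDK for Python (pip install docker).
# Assumes a local Docker daemon is running; image and command are illustrative.
import docker

client = docker.from_env()                      # connect to the local daemon

# Run a throwaway container: it shares the host kernel but has its own
# filesystem and process namespace, which is what makes containers lightweight.
output = client.containers.run("alpine:latest",
                               "echo hello from a container",
                               remove=True)
print(output.decode().strip())
```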

Virtualization Tools
1. Management Tools:
o VMware vCenter: Centralized management of VMware vSphere
environments, allowing for efficient control of multiple virtual
machines and hosts.
o Microsoft System Center Virtual Machine Manager (SCVMM):
Manages Hyper-V environments, providing tools for deploying,
configuring, and managing VMs.

2. Orchestration Tools:
o Kubernetes: Manages the deployment, scaling, and operations of
containerized applications. It automates the distribution and
scheduling of containers across a cluster.
o OpenStack: An open-source cloud platform that controls large pools
of compute, storage, and networking resources.

3. Automation Tools:
o Ansible: Automates IT operations, including configuration
management, application deployment, and task automation.
o Terraform: Allows for infrastructure as code, enabling the creation,
management, and updating of infrastructure resources in a
repeatable manner.
4. Monitoring Tools:
o Prometheus: Collects and stores metrics from applications and
infrastructure, providing powerful querying and alerting capabilities.
o Nagios: Monitors systems, networks, and infrastructure, alerting
administrators to potential issues.

Virtualization Mechanisms
1. Hardware Virtualization Extensions:
o Intel VT-x and AMD-V: These CPU extensions improve virtualization
performance by offloading certain virtualization tasks to the
hardware, reducing overhead.

2. Paravirtualization:
o Paravirtualization modifies the guest OS to communicate directly
with the hypervisor, enhancing performance by reducing the need for
full hardware emulation. The Xen hypervisor uses paravirtualization
techniques.

3. Emulation:
o Emulation simulates hardware so that software designed for one
type of hardware can run on another. QEMU is a popular emulator
that can simulate different CPU architectures.

4. Snapshotting:
o Snapshotting captures the state of a VM or container at a specific
point in time. This is useful for backups, recovery, and testing changes
without permanent alterations.

5. Live Migration:
o Live migration involves moving a running VM or container from one
physical host to another with minimal downtime. Technologies like
VMware vMotion and KVM support live migration.

6. Virtual Networks:
o Software-Defined Networking (SDN): Decouples the network control
plane from the data plane, allowing for more flexible and dynamic
network management. OpenFlow and VMware NSX are examples of
SDN technologies.
o Virtual LANs (VLANs): Segregate network traffic in virtualized
environments, providing security and improving network
performance.

7. Storage Virtualization:
o Storage Area Networks (SAN): Consolidate storage resources into a
single pool, making them easier to manage and allocate. VMware
vSAN and NetApp ONTAP are examples of storage virtualization
solutions.

CPU virtualization
CPU virtualization allows multiple operating systems and applications to
run on a single physical machine by sharing the CPU's resources. The
primary component managing this is the hypervisor, which can be of two
types: Type 1 (bare-metal) and Type 2 (hosted). Type 1 hypervisors, like
VMware ESXi and Microsoft Hyper-V, run directly on the hardware,
providing high performance and efficiency. Type 2 hypervisors, such as
Oracle VirtualBox, run on top of an existing operating system, suitable for
desktop and development environments.

Each virtual machine (VM) is allocated one or more virtual CPUs (vCPUs),
which the hypervisor schedules and manages. Techniques used in CPU
virtualization include full virtualization, where the hypervisor completely
simulates the hardware, allowing unmodified guest operating systems to
run. Paravirtualization involves modifying the guest OS to interact more
efficiently with the hypervisor, improving performance.

Modern CPUs have hardware-assisted virtualization features like Intel VT-x and AMD-V, which enhance virtualization by offloading some tasks to the CPU, reducing the overhead. The hypervisor also handles context switching, saving and restoring the state of each vCPU, and employs mechanisms like trap-and-emulate to manage privileged instructions and shadow page tables for efficient memory management.

Overall, CPU virtualization improves resource utilization, allows for workload consolidation, and provides the ability to run multiple isolated environments on a single physical machine, making it essential for modern data centers and cloud computing.
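
As a toy illustration of how a hypervisor time-slices a physical CPU among vCPUs, the Python sketch below runs a simple round-robin scheduler; the round-robin policy, time slice, and vCPU names are assumptions made for illustration, and real hypervisor schedulers are far more sophisticated.

```python
# Toy round-robin scheduler: time-slicing one physical CPU among several vCPUs.
# This illustrates only the scheduling idea, not a real hypervisor.
from collections import deque

def schedule(vcpus, time_slice=2):
    """vcpus: dict mapping vCPU name -> remaining work units."""
    ready = deque(vcpus.items())
    while ready:
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)
        print(f"{name}: ran {ran} unit(s), {remaining - ran} left")
        if remaining - ran > 0:
            ready.append((name, remaining - ran))   # not finished, requeue

schedule({"vm1-vcpu0": 3, "vm2-vcpu0": 5, "vm3-vcpu0": 2})
```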
Memory Virtualization
Memory virtualization is a technique that allows the abstraction and efficient
management of physical memory resources in a virtualized environment. It
enables multiple virtual machines (VMs) to share the physical memory of a
single host system, providing each VM with the illusion of having its own
dedicated memory.

Key Concepts

1. Virtual Memory:
o Each VM operates with its own virtual memory space, which appears
to be contiguous and exclusive to that VM. The hypervisor or host
operating system maps these virtual memory addresses to the actual
physical memory addresses.

2. Hypervisor:
o The hypervisor is responsible for managing memory allocation and
ensuring isolation between VMs. It handles memory allocation,
mapping, and swapping between physical and virtual memory.

Techniques

1. Paging:
o Memory is divided into fixed-size pages, and the hypervisor maps
these pages from virtual memory to physical memory. This allows for
efficient use of memory and easier management of memory
allocation.

2. Memory Overcommitment:
o The hypervisor can allocate more virtual memory to VMs than the
available physical memory. This is possible because not all VMs use
their allocated memory simultaneously. Techniques like ballooning
and swapping help manage overcommitted memory.

3. Ballooning:
o A balloon driver within the guest OS can "inflate" to consume
memory, which the hypervisor can then reclaim and allocate to other
VMs that need it more.

4. Swapping:
o When physical memory is exhausted, the hypervisor can move less
frequently used data to disk storage (swap space) to free up physical
memory for more active VMs.

5. Transparent Page Sharing (TPS):
o The hypervisor identifies identical memory pages used by different VMs and consolidates them into a single shared page. This reduces memory redundancy and improves efficiency.
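
The following Python sketch illustrates the idea behind transparent page sharing by hashing page contents and keeping one physical copy per unique hash; the page contents and VM names are illustrative, and real hypervisors do this inside the memory-management layer rather than in application code.

```python
# Toy sketch of transparent page sharing: identical pages from different VMs
# are detected by hashing their contents and stored only once.
import hashlib

def share_pages(vm_pages):
    """vm_pages: dict mapping vm name -> list of page contents (bytes)."""
    store = {}                       # content hash -> single shared copy
    mapping = {}                     # (vm, page index) -> hash it points to
    for vm, pages in vm_pages.items():
        for i, page in enumerate(pages):
            digest = hashlib.sha256(page).hexdigest()
            store.setdefault(digest, page)
            mapping[(vm, i)] = digest
    return store, mapping

store, mapping = share_pages({
    "vm1": [b"\x00" * 4096, b"kernel code page"],
    "vm2": [b"\x00" * 4096, b"app data page"],
})
print(f"logical pages: {len(mapping)}, physical copies kept: {len(store)}")
# -> 4 logical pages, but only 3 physical copies (the zero page is shared)
```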

Tools and Mechanisms

1. Hypervisors:
o VMware ESXi: Uses techniques like TPS and memory compression to
optimize memory usage.
o Microsoft Hyper-V: Manages memory allocation dynamically,
allowing for efficient utilization and high performance.
o KVM (Kernel-based Virtual Machine): An open-source hypervisor
that leverages Linux kernel features for memory management.

2. Memory Management Unit (MMU):


o The MMU is a hardware component that translates virtual addresses
to physical addresses. Virtualization extensions in modern CPUs (like
Intel EPT and AMD RVI) enhance MMU performance, reducing the
overhead of memory virtualization.
3. Page Tables:
o Page tables store the mapping between virtual and physical memory
addresses. The hypervisor maintains shadow page tables or uses
nested paging to efficiently manage these mappings.

Benefits

1. Isolation:
o Each VM's memory is isolated from others, enhancing security and
stability. Memory virtualization ensures that one VM cannot directly
access another VM's memory.

2. Efficiency:
o Memory virtualization allows for more efficient use of physical
memory through techniques like paging, ballooning, and transparent
page sharing, reducing waste and improving performance.

3. Scalability:
o Memory virtualization supports the creation and management of
large numbers of VMs on a single physical host, making it essential
for cloud computing and data center operations.

I/O Devices Virtualization

I/O (Input/Output) device virtualization allows multiple virtual machines (VMs) to share and efficiently utilize the physical I/O devices of a host system. It abstracts and manages access to I/O resources, ensuring isolation, performance, and flexibility for virtualized environments.
Key Concepts

1. Virtual I/O Devices:


o Each VM interacts with virtual representations of I/O devices (such as
virtual network adapters, virtual disk controllers) provided by the
hypervisor. These virtual devices are mapped to physical I/O
resources.

2. Direct Assignment:
o Some I/O devices can be directly assigned to VMs, bypassing the
hypervisor. This provides near-native performance but reduces
flexibility and management features.

Techniques

1. Device Emulation:
o The hypervisor emulates standard I/O devices that guest VMs
recognize. This ensures compatibility across different guest OSs and
hardware configurations.

2. Para-virtualization:
o Guest VMs use para-virtualized drivers provided by the hypervisor to
communicate with virtualized I/O devices, improving performance
and reducing overhead compared to full virtualization.

3. Direct I/O Passthrough:


o Also known as PCI passthrough, this technique assigns a physical I/O
device directly to a VM. It bypasses the hypervisor's emulation or
virtualization layer, offering near-native performance.

4. Virtual I/O Acceleration (VIA):


o Enhances I/O performance by optimizing data transfer between VMs
and physical I/O devices. Techniques like batching, caching, and
offloading are used to reduce latency and improve throughput.

Tools and Mechanisms

1. Hypervisors:
o VMware ESXi: Supports various I/O virtualization techniques like
para-virtualization and direct I/O passthrough.
o Microsoft Hyper-V: Provides capabilities for virtualizing network
adapters, storage controllers, and other I/O devices.
2. Virtual I/O Controllers:
o These are software-based controllers provided by the hypervisor,
managing access and communication between VMs and physical I/O
devices.

3. I/O Memory Management Unit (IOMMU):


o Hardware technology (e.g., Intel VT-d, AMD-Vi) that enhances I/O
virtualization by allowing direct access to physical memory and I/O
devices, improving performance and security.

Benefits

1. Resource Efficiency:
o I/O virtualization optimizes the use of physical I/O devices, enabling
multiple VMs to share them without compromising performance or
security.

2. Isolation:
o Each VM's access to I/O resources is isolated, ensuring that one VM
cannot interfere with or access another VM's I/O operations.

3. Flexibility:
o Virtualizing I/O devices provides flexibility in managing and scaling
virtualized environments, supporting dynamic resource allocation
and workload migration.

Virtual Clusters and Resource Management

A virtual cluster refers to a logical grouping of virtual machines (VMs) or containers that collectively function as a unified computing resource. Unlike traditional physical clusters, which are composed of interconnected physical servers, virtual clusters leverage virtualization technology to pool together computing resources from a single physical host or multiple hosts across a network. Here are key aspects and benefits of virtual clusters:

Key Aspects of Virtual Clusters

1. Resource Pooling: Virtual clusters pool together CPU, memory, storage, and network resources from the underlying physical infrastructure. This pooling allows for efficient utilization and management of resources across multiple VMs or containers.
2. Isolation: Each VM or container within a virtual cluster operates
independently and is isolated from others. This isolation ensures that
applications and data running on one VM do not interfere with those
running on others, enhancing security and stability.
3. Flexibility and Scalability: Virtual clusters provide flexibility to scale
resources up or down based on workload demands. Administrators can add
or remove VMs or containers dynamically to meet changing application
requirements without being constrained by physical hardware limitations.
4. Virtual Networking: Virtual clusters often include virtual networks that
facilitate communication between VMs or containers within the cluster.
These networks can be configured to provide different levels of isolation
and security, ensuring efficient data transfer and communication.

Resource Management
1. CPU and Memory Allocation:
o Allocation Policies: Define how CPU and memory resources are
distributed among VMs or containers within the cluster.
o Overcommitment: Techniques like memory ballooning and CPU
overcommitment allow more efficient utilization of physical
resources.

2. Storage Management:
o Virtual Storage: Manage virtual disks and storage volumes allocated
to each VM or container.
o Storage Virtualization: Pool storage resources and allocate
dynamically based on workload requirements.
3. Network Management:
o Virtual Networks: Create isolated networks for VMs or containers,
ensuring secure communication and efficient data transfer.
o Bandwidth Management: Allocate network bandwidth to prioritize
traffic and optimize performance.

4. Load Balancing:
o Traffic Distribution: Distribute incoming network traffic or workload
across multiple VMs or containers to optimize resource usage and
ensure high availability.
o Application-Level Load Balancers: Route requests based on
application-specific criteria to improve performance and reliability.

5. Monitoring and Optimization:


o Resource Monitoring: Continuously monitor CPU utilization, memory
usage, network traffic, and storage performance to identify
bottlenecks and optimize resource allocation.
o Auto-scaling: Automatically adjust resources based on predefined
metrics or thresholds to maintain performance and efficiency during
varying workloads.
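
As a small illustration of the traffic-distribution idea in the Load Balancing item above, here is a Python sketch of a round-robin balancer spreading requests across cluster nodes; the node names and request IDs are placeholders.

```python
# Toy round-robin load balancer distributing requests across cluster nodes.
# Node names and request IDs are illustrative only.
import itertools

class RoundRobinBalancer:
    def __init__(self, nodes):
        self._nodes = itertools.cycle(nodes)   # endless rotation over the pool

    def route(self, request_id):
        node = next(self._nodes)
        print(f"request {request_id} -> {node}")
        return node

balancer = RoundRobinBalancer(["vm-a", "vm-b", "vm-c"])
for request_id in range(6):
    balancer.route(request_id)     # traffic spreads evenly across the three VMs
```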

Benefits

1. Scalability: Easily scale applications and services by adding or removing VMs or containers within the virtual cluster.
2. Resource Efficiency: Optimize resource utilization across VMs or
containers to maximize performance and reduce costs.
3. High Availability: Ensure application availability and reliability through
redundancy and load balancing within the virtual cluster.

Virtualization in a Data Automation Center

Virtualization in a data automation center refers to the use of virtualization technologies to optimize and manage data processing, storage, and automation tasks within a centralized environment. Here’s how virtualization can be leveraged in such a setup:

1. Server Virtualization:
o Consolidation: Virtualizing servers allows multiple virtual machines
(VMs) to run on a single physical server, reducing hardware costs and
space requirements.
o Resource Allocation: Allocating CPU, memory, and storage
resources dynamically based on workload demands improves
efficiency and scalability.
o High Availability: Implementing failover and redundancy
mechanisms through virtualization ensures continuous operation of
critical data automation processes.
2. Storage Virtualization:
o Pooling: Aggregating physical storage resources into virtual pools
simplifies management and improves utilization.
o Data Protection: Implementing snapshots, replication, and backup
solutions at the virtualization layer enhances data integrity and
disaster recovery capabilities.
o Performance Optimization: Utilizing features like tiered storage and
caching improves access speeds for frequently accessed data.
3. Network Virtualization:
o Isolation: Creating virtual networks with dedicated resources for
different automation tasks ensures security and prevents network
congestion.
o Software-Defined Networking (SDN): Automating network
configurations and policies through virtualization streamlines
deployment and management of network resources.
4. Desktop Virtualization:
o Centralized Management: Hosting virtual desktops for data
analysts, developers, and automation engineers centralizes
management and enhances security.
o Flexibility: Providing remote access to virtual desktops allows for
flexible working arrangements and improves collaboration on data
automation projects.
5. Application Virtualization:
o Compatibility: Virtualizing applications ensures compatibility across
different operating systems and hardware platforms, facilitating
integration within the automation center.
o Isolation: Running applications in isolated containers or VMs
enhances security and stability, minimizing potential disruptions to
data processing workflows.

Benefits of Virtualization in Data Automation Centers

 Cost Efficiency: Reducing hardware and maintenance costs through server and storage virtualization.
 Scalability: Easily scaling resources up or down to accommodate changing
data processing requirements.
 Improved Performance: Optimizing resource utilization and leveraging
advanced features like caching and load balancing.
 Enhanced Security: Implementing isolation and access controls to protect
sensitive data and automation workflows.
 Operational Efficiency: Streamlining management tasks and automating
routine operations through virtualization technologies.

Use Cases

 Big Data Processing: Using virtualization to manage distributed computing resources for processing large datasets.
 Automation Workflows: Integrating virtual machines and containers to
automate data ingestion, transformation, and analysis tasks.
 Cloud Integration: Extending data automation capabilities to hybrid or
multi-cloud environments through virtualization.

Unit-IV
Features of Cloud and Grid Platforms

Cloud Platforms

1. On-Demand Self-Service:
o Users can provision and manage computing resources (e.g., virtual
machines, storage) without human intervention from the service
provider.
2. Broad Network Access:
o Services are accessible over the network and can be accessed through
standard mechanisms, enabling diverse client devices (e.g., laptops,
smartphones) to use cloud services.
3. Resource Pooling:
o Computing resources are pooled to serve multiple consumers using a
multi-tenant model, with different physical and virtual resources
dynamically assigned and reassigned according to demand.
4. Rapid Elasticity:
o Resources can be scaled up or down quickly and automatically to
accommodate changes in demand. This elasticity provides the ability
to scale out during peak times and scale in during periods of low
demand.
5. Measured Service:
o Cloud systems automatically control and optimize resource use by
leveraging a metering capability at some level of abstraction
appropriate to the type of service (e.g., storage, processing,
bandwidth, active user accounts). Resource usage can be monitored,
controlled, and reported, providing transparency for both the provider
and consumer of the utilized service.
6. Examples:
o Amazon Web Services (AWS), Microsoft Azure, Google Cloud
Platform (GCP)

Grid Platforms

1. Resource Sharing and Coordination:


o Grid platforms focus on sharing computing resources across multiple
administrative domains to achieve a common goal, typically large-
scale computation or data-intensive tasks.
2. Distributed Computing:
o Grids enable distributed computing and parallel processing of tasks
across multiple nodes (often geographically dispersed) connected via
a high-speed network.
3. Virtual Organizations:
o Grid platforms support dynamic, virtual organizations that collaborate
to share resources, with mechanisms for authentication, authorization,
and resource access control.
4. Resource Management:
o Grids employ sophisticated resource management and scheduling
algorithms to optimize resource utilization, prioritize tasks, and
ensure efficient workload distribution.
5. Heterogeneity:
o Grids accommodate heterogeneous resources (different hardware,
operating systems, and network architectures) and integrate them into
a unified computing environment.
6. Examples:
o European Grid Infrastructure (EGI), Open Science Grid (OSG),
XSEDE (Extreme Science and Engineering Discovery
Environment)

Difference between Cloud Computing and Grid Computing

1. Cloud computing follows a client-server computing architecture, while grid computing follows a distributed computing architecture.
2. Cloud computing uses a centralized model of management and execution, while grid computing uses a decentralized one.
3. In cloud computing, resources are used in a centralized pattern, while in grid computing resources are used in a collaborative pattern.
4. Cloud computing is more flexible than grid computing; grid computing is less flexible than cloud computing.
5. In cloud computing, users pay for what they use, while in grid computing users do not pay for use.
6. Cloud computing is a highly accessible service, while grid computing offers lower accessibility.
7. Cloud computing is highly scalable compared to grid computing, which is less scalable.
8. Cloud computing can be accessed through standard web protocols, while grid computing is accessed through grid middleware.
9. Cloud computing is service-oriented, while grid computing is application-oriented.
10. Cloud computing offers services like IaaS, PaaS, and SaaS, while grid computing offers services like distributed computing, distributed information, and distributed pervasive systems.
Difference between Cloud Computing and Distributed Computing

1. Cloud Computing:
Cloud computing refers to providing on-demand IT resources/services like servers, storage, databases, networking, analytics, and software over the internet. It is a computing technique that delivers hosted services over the internet to its users/customers. Cloud computing provides services such as hardware, software, and networking resources through the internet. Some characteristics of cloud computing are a shared pool of configurable computing resources, on-demand service, pay-per-use, and provisioning by the service providers.
It is classified into 4 different types:
 Public Cloud
 Private Cloud
 Community Cloud
 Hybrid Cloud

2. Distributed Computing:
Distributed computing refers to solving a problem over distributed autonomous computers that communicate with each other over a network. It is a computing technique that allows multiple computers to communicate and work together to solve a single problem. Distributed computing helps achieve computational tasks faster than a single computer, which would take a lot of time. Some characteristics of distributed computing are distributing a single task among computers so the work progresses at the same time, and using Remote Procedure Calls and Remote Method Invocation for distributed computations.
It is classified into 3 different types:
 Distributed Computing Systems
 Distributed Information Systems
 Distributed Pervasive Systems
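
Since Remote Procedure Calls are listed as a characteristic of distributed computing, here is a minimal sketch using Python's standard xmlrpc modules; for simplicity both the server and the client run in one local process, whereas in a real distributed system they would run on different machines (the port number is arbitrary).

```python
# Minimal remote-procedure-call sketch using Python's standard xmlrpc modules.
# The server would normally run on a different machine; here both halves run
# locally in one process purely for illustration (port 8000 is arbitrary).
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(lambda a, b: a + b, "add")   # expose add() remotely
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy("http://localhost:8000")
print(client.add(2, 3))   # the call is executed on the "remote" server -> 5
```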

Difference between Cloud Computing and Distributed Computing:

1. Cloud computing refers to providing on-demand IT resources/services like servers, storage, databases, networking, analytics, and software over the internet, whereas distributed computing refers to solving a problem over distributed autonomous computers that communicate with each other over a network.
2. In simple terms, cloud computing is a computing technique that delivers hosted services over the internet to its users/customers, whereas distributed computing is a computing technique that allows multiple computers to communicate and work together to solve a single problem.
3. Cloud computing is classified into 4 different types (Public Cloud, Private Cloud, Community Cloud, and Hybrid Cloud), whereas distributed computing is classified into 3 different types (Distributed Computing Systems, Distributed Information Systems, and Distributed Pervasive Systems).
4. Benefits of cloud computing include cost effectiveness, elasticity and reliability, economies of scale, and access to the global market, whereas benefits of distributed computing include flexibility, reliability, and improved performance.
5. Cloud computing provides services such as hardware, software, and networking resources through the internet, whereas distributed computing helps achieve computational tasks faster than a single computer could.
6. The goal of cloud computing is to provide on-demand computing services over the internet on a pay-per-use model, whereas the goal of distributed computing is to distribute a single task among multiple computers and solve it quickly while maintaining coordination between them.
7. Characteristics of cloud computing include a shared pool of configurable computing resources, on-demand service, pay-per-use, and provisioning by the service providers, whereas characteristics of distributed computing include distributing a single task among computers so the work progresses at the same time, with Remote Procedure Calls and Remote Method Invocation for distributed computations.
8. Disadvantages of cloud computing include less control (especially with public clouds), restrictions on available services, and cloud security, whereas disadvantages of distributed computing include chances of node failure and slow networks, which can create problems in communication.

What is Parallel Computing?


It is also known as parallel processing. It utilizes several processors, each of which completes the tasks that have been allocated to it. In other words, parallel computing involves performing numerous tasks simultaneously. A shared memory or distributed memory system can be used to support parallel computing. In shared memory systems, all CPUs share the same memory; in distributed memory systems, each processor has its own local memory and exchanges data with the others over an interconnect.

Parallel computing provides numerous advantages. It helps increase CPU utilization and improve performance because several processors work simultaneously. Moreover, the failure of one CPU has no impact on the other CPUs' functionality. However, if one processor needs instructions from another, latency can be introduced.

Advantages and Disadvantages of Parallel Computing

There are various advantages and disadvantages of parallel computing. Some of the advantages and disadvantages are as follows:

Advantages

1. It saves time and money because many resources working together cut down on time and costs.
2. It can solve larger problems that are difficult to handle with serial computing.
3. You can do many things at once using many computing resources.
4. Parallel computing is much better than serial computing for modeling, simulating, and comprehending complicated real-world events.

Disadvantages

1. Multi-core architectures consume a lot of power.
2. Parallel solutions are more difficult to implement, debug, and prove correct due to the complexity of communication and coordination, and they frequently perform worse than their serial equivalents.
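
To illustrate shared-memory-style parallelism on a single machine, here is a minimal Python sketch using a multiprocessing pool; the number of workers and the squaring task are arbitrary choices for illustration.

```python
# Minimal sketch of parallel computing: a pool of worker processes computes
# squares of numbers simultaneously on separate CPU cores.
from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:             # four workers run in parallel
        results = pool.map(square, range(10))   # work is split across workers
    print(results)
```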

What is Distributing Computing?

It comprises several software components that reside on different systems but operate as a single system. A distributed system's computers can be physically close together and linked by a local network, or geographically distant and linked by a wide area network (WAN). A distributed system can be made up of any number of different configurations, such as mainframes, PCs, workstations, and minicomputers. The main aim of distributed computing is to make a network work as a single computer.

There are various benefits of using distributed computing. It enables scalability and makes it simpler to share resources. It also aids in the efficiency of computation processes.

Advantages and Disadvantages of Distributed Computing

There are various advantages and disadvantages of distributed computing. Some of the advantages and disadvantages are as follows:

Advantages

1. It is flexible, making it simple to install, use, and debug new services.
2. In distributed computing, you may add multiple machines
as required.
3. If the system crashes on one server, that doesn't affect
other servers.
4. A distributed computer system may combine the
computational capacity of several computers, making it
faster than traditional systems.

Disadvantages

1. Data security and sharing are the main issues in distributed systems due to the characteristics of open systems.
2. Because of the distribution across multiple servers,
troubleshooting and diagnostics are more challenging.
3. The main disadvantage of distributed computer systems is
the lack of software support.

Difference between Parallel Computing and Distributed Computing:

1. In parallel computing, many operations are performed simultaneously, whereas in distributed computing, system components are located at different locations.
2. Parallel computing requires a single computer, whereas distributed computing uses multiple computers.
3. In parallel computing, multiple processors perform multiple operations, whereas in distributed computing, multiple computers perform multiple operations.
4. Parallel computing may use shared or distributed memory, whereas distributed computing has only distributed memory.
5. In parallel computing, processors communicate with each other through a bus, whereas in distributed computing, computers communicate with each other through message passing.
6. Parallel computing improves system performance, whereas distributed computing improves system scalability, fault tolerance, and resource sharing capabilities.

Programming Support of Google App Engine

Google App Engine (GAE):

 Description: GAE is a Platform as a Service (PaaS) that allows developers to build and deploy applications on Google’s infrastructure.
 Languages Supported: Python, Java, PHP, Go, Node.js, .NET, Ruby.
 Key Features:
o Automatic Scaling: Automatically adjusts the number of instances
based on the application's traffic.
o Managed Services: Integrated with Google Cloud services such as
Datastore, Cloud SQL, and Memcache.
o Flexible Environments:
 Standard Environment: Pre-configured runtime environments,
fast deployments.
 Flexible Environment: Custom runtime environments using
Docker containers, supports native libraries, background
processes.
o Security: Built-in security features, including HTTPS, authentication
via Google accounts, and integration with Google Cloud Identity and
Access Management (IAM).
o Monitoring and Logging: Integrated with Google Cloud’s
Stackdriver for monitoring, logging, and diagnostics.
o Development Tools: Google Cloud SDK, Cloud Shell, and a web-
based Cloud Console for managing applications.
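
Below is a hedged sketch of what a minimal App Engine standard environment application in Python might look like, following the shape of Google's quickstart: a main.py exposing a Flask app (Flask is assumed to be declared as a dependency), normally deployed alongside an app.yaml that names the runtime.

```python
# Minimal sketch of an App Engine standard environment app (Python runtime).
# Assumes Flask is listed as a dependency; on App Engine, the platform serves
# the app, while the __main__ block only runs a local development server.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from App Engine!"

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8080, debug=True)
```

Deployment is typically done with the Google Cloud SDK (for example, gcloud app deploy), with the exact configuration depending on the chosen runtime.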

Programming on Amazon AWS and Microsoft Azure

Amazon Web Services (AWS):

 Key Services for Programming:


o EC2 (Elastic Compute Cloud): Scalable virtual servers for running
applications.
o Lambda: Serverless computing, runs code in response to events,
automatically manages compute resources.
o S3 (Simple Storage Service): Scalable object storage for any amount
of data.
o RDS (Relational Database Service): Managed relational databases.
o DynamoDB: Fully managed NoSQL database.
 Programming Support:
o SDKs: AWS SDKs for multiple languages including Java, Python
(Boto3), JavaScript (Node.js), Ruby, PHP, .NET, and Go.
o Infrastructure as Code (IaC): AWS CloudFormation for defining
and provisioning infrastructure using templates.
o Development Tools: AWS Cloud9 (integrated development
environment), AWS CodePipeline (CI/CD), AWS CodeDeploy
(automated deployments), and AWS CodeBuild (build service).
 Deployment and Management:
o Elastic Beanstalk: Platform for deploying and scaling web
applications and services.
o AWS CLI (Command Line Interface): Command-line tools for
managing AWS services.
o AWS Management Console: Web-based user interface for managing
AWS resources.
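
As a brief example of the SDK support mentioned above, here is a sketch using boto3 to list S3 buckets and upload an object; it assumes boto3 is installed and AWS credentials are already configured, and the bucket name is a placeholder.

```python
# Minimal sketch using the AWS SDK for Python (boto3). Assumes credentials are
# configured (e.g., via environment variables); the bucket name is a placeholder.
import boto3

s3 = boto3.client("s3")

# List existing buckets in the account.
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

# Upload a small object to an (illustrative) bucket.
s3.put_object(Bucket="my-example-bucket",
              Key="hello.txt",
              Body=b"Hello from boto3")
```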

Microsoft Azure:
 Key Services for Programming:
o Azure Virtual Machines: Scalable VMs for running applications and
workloads.
o Azure Functions: Serverless computing, executes code in response to
events.
o Azure Blob Storage: Scalable object storage for unstructured data.
o Azure SQL Database: Fully managed relational database service.
o Cosmos DB: Globally distributed, multi-model NoSQL database.
 Programming Support:
o SDKs: Azure SDKs for languages including .NET, Java, Node.js,
Python, Ruby, PHP, and Go.
o Infrastructure as Code (IaC): Azure Resource Manager (ARM)
templates for defining infrastructure as code.
o Development Tools: Visual Studio, Visual Studio Code, Azure
DevOps for CI/CD, and Azure CLI.
 Deployment and Management:
o Azure App Services: Platform for building and hosting web apps,
RESTful APIs, and mobile backends.
o Azure Portal: Web-based interface for managing Azure resources.
o Azure CLI: Command-line tools for managing Azure services.
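
Similarly, here is a sketch using the Azure SDK for Python (the azure-storage-blob package) to upload and list blobs; the connection string and container name are placeholders that would come from your own storage account.

```python
# Minimal sketch using the Azure SDK for Python (pip install azure-storage-blob).
# The connection string and container name are placeholders.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<your-connection-string>")

container = service.get_container_client("example-container")
# container.create_container()   # uncomment on first use

blob = container.get_blob_client("hello.txt")
blob.upload_blob(b"Hello from Azure Blob Storage", overwrite=True)

for item in container.list_blobs():
    print(item.name)
```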
Emerging Cloud Software Environments:

 Kubernetes:
o Description: Open-source platform for automating deployment,
scaling, and operations of application containers.
o Key Features: Container orchestration, automatic bin packing, self-
healing, service discovery and load balancing, automated rollouts and
rollbacks, secret and configuration management.
o Integration: Compatible with multiple cloud providers and on-
premises infrastructure.
 OpenStack:
o Description: Open-source cloud computing platform for building and
managing public and private clouds.
o Key Features: Modular architecture with components for compute
(Nova), storage (Swift, Cinder), networking (Neutron), and identity
(Keystone).
o Usage: Popular for private clouds, hybrid clouds, and as a basis for
public cloud services.
 Serverless Computing:
o Platforms: AWS Lambda, Azure Functions, Google Cloud
Functions.
o Description: Execution model where the cloud provider runs the
server, dynamically manages resource allocation, and charges only for
the time the code is running.
o Key Features: No server management, automatic scaling, built-in
fault tolerance, and pay-per-use pricing model.
o Use Cases: Event-driven applications, microservices, real-time data
processing, and backend services.
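
To show the general shape of serverless code, here is a minimal AWS Lambda handler in Python; the event field used (name) is an illustrative assumption, since the actual event structure depends on whatever triggers the function.

```python
# Shape of an AWS Lambda function in Python: the platform invokes
# lambda_handler for each event and bills only for execution time.
# The "name" field in the event is an illustrative assumption.
import json

def lambda_handler(event, context):
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Local test invocation (on AWS, the service supplies event and context).
if __name__ == "__main__":
    print(lambda_handler({"name": "cloud"}, None))
```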

Storage Systems
Evolution of Storage Technology:

 Early Storage: Magnetic tapes and floppy disks, which were limited in capacity and speed.
 Modern Storage: Hard Disk Drives (HDDs) and Solid State Drives
(SSDs), offering higher capacities, faster access times, and improved
reliability.
 Cloud Storage: Remote storage accessible over the internet, providing
virtually unlimited storage capacity and scalability.

Storage Models:
 Block Storage: Provides raw storage volumes that can be attached to
virtual machines. Commonly used for databases and applications requiring
low-latency access.
 File Storage: Manages data as files in a hierarchical structure. Suitable for
file sharing and storage of documents, images, and other files.
 Object Storage: Stores data as objects, each with a unique identifier. Ideal
for storing large amounts of unstructured data such as multimedia files,
backups, and logs.
File Systems and Databases (short type)

 File Systems: NTFS, ext4, FAT32.


 Databases:
o SQL Databases: Structured Query Language databases such as MySQL, PostgreSQL, and Oracle.
o NoSQL Databases: Non-relational databases designed for unstructured data, such as MongoDB, Cassandra, and DynamoDB.

Distributed File Systems:

 HDFS (Hadoop Distributed File System): Designed for high throughput and large
data sets, used in Hadoop.
 Ceph: A scalable, distributed storage system providing object, block, and file storage.

General Parallel File Systems:

 GPFS (General Parallel File System): IBM’s high-performance shared-disk file system used in large-scale computing environments.

Specific Technologies:

1. Google File System (GFS): Designed for large-scale data processing, supports high
fault tolerance and handles large files.
2. Apache Hadoop:
o HDFS: Distributed file system for storing large data sets.
o MapReduce: Programming model for parallel processing of large data sets.
3. BigTable: A distributed storage system for managing structured data, designed to
scale to very large sizes.
4. Megastore: Google’s storage system that combines the scalability of NoSQL
databases with the consistency of traditional databases.
5. Amazon S3: Scalable object storage service offering high availability, durability, and
security.
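
To illustrate the MapReduce programming model listed above, here is a plain-Python word-count sketch; it mimics the map, shuffle, and reduce phases in a single process and is not Hadoop code itself.

```python
# Plain-Python sketch of the MapReduce model (not Hadoop itself): map emits
# (word, 1) pairs, pairs are grouped by key, and reduce sums each group.
from collections import defaultdict

def map_phase(document):
    return [(word, 1) for word in document.split()]

def reduce_phase(word, counts):
    return word, sum(counts)

documents = ["the cloud stores data", "the grid shares the work"]

# Shuffle/sort: group intermediate pairs by key, as the framework would.
groups = defaultdict(list)
for doc in documents:
    for word, count in map_phase(doc):
        groups[word].append(count)

print(dict(reduce_phase(w, c) for w, c in groups.items()))
# {'the': 3, 'cloud': 1, 'stores': 1, 'data': 1, 'grid': 1, 'shares': 1, 'work': 1}
```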
File Systems and Databases
File Systems

1. NTFS (New Technology File System):


o Developed by: Microsoft.
o Features: Support for large files, file compression, encryption, disk
quotas, and enhanced security with ACL (Access Control Lists).
o Usage: Commonly used in Windows operating systems.

2. ext4 (Fourth Extended File System):


o Developed by: The Linux community.
o Features: Support for large files and volumes, journaling for
improved reliability, extents for better performance, backward
compatibility with ext3.
o Usage: Widely used in Linux distributions.

3. FAT32 (File Allocation Table 32):


o Developed by: Microsoft.
o Features: Simple and widely compatible file system, limited to 4GB
file size and 8TB volume size.
o Usage: Commonly used in USB flash drives, memory cards, and other
portable storage devices for compatibility across different operating
systems.

Databases

SQL Databases:

 MySQL:
o Description: Open-source relational database management system.
o Features: ACID compliance, support for various storage engines,
replication, partitioning, and strong security features.
o Usage: Web applications, data warehousing, and e-commerce
platforms.

 PostgreSQL:
o Description: Open-source relational database system known for its
robustness and advanced features.
o Features: ACID compliance, support for complex queries, JSON data
types, full-text search, and extensibility.
o Usage: Enterprise applications, geographic information systems (GIS),
and data analysis.

 Oracle:
o Description: Commercial relational database management system
known for its advanced features and performance.
o Features: ACID compliance, clustering, partitioning, advanced
security features, support for large databases, and extensive tools for
data management.
o Usage: Large enterprises, financial institutions, and mission-critical
applications.

NoSQL Databases:

 MongoDB:
o Description: Document-oriented NoSQL database.
o Features: Schema flexibility, horizontal scalability, high availability
through replication, and support for complex queries.
o Usage: Content management systems, real-time analytics, and
mobile applications.

 Cassandra:
o Description: Wide-column store NoSQL database designed for high
availability and scalability.
o Features: Decentralized architecture, linear scalability, fault
tolerance, and support for large volumes of data across multiple data
centers.
o Usage: Real-time big data applications, IoT, and social media
analytics.

 DynamoDB:
o Description: Fully managed NoSQL database service provided by
Amazon Web Services.
o Features: Automatic scaling, high availability, low latency, support for
key-value and document data models.
o Usage: Web applications, gaming, IoT applications, and serverless
computing.

Distributed File Systems

1. HDFS (Hadoop Distributed File System):


o Designed for: High throughput and large data sets.
o Features: Fault tolerance through replication, scalability to thousands
of nodes, optimized for sequential data access.
o Usage: Backbone of the Hadoop ecosystem, used for big data
analytics.

2. Ceph:
o Description: Scalable, distributed storage system providing object,
block, and file storage.
o Features: High performance, high availability, self-healing, and
automatic data distribution.
o Usage: Cloud infrastructure, data centers, and enterprises needing
scalable and reliable storage solutions.

General Parallel File Systems

1. GPFS (General Parallel File System):


o Developed by: IBM.
o Features: High-performance shared-disk file system, scalable,
supports large clusters, data striping, fault tolerance, and fine-
grained locking.
o Usage: Large-scale computing environments, high-performance
computing (HPC), and enterprise storage solutions.

Specific Technologies

1. Google File System (GFS):


o Designed for: Large-scale data processing.
o Features: High fault tolerance, handles large files, data replication for
reliability, optimized for large sequential reads and writes.
o Usage: Underpins Google’s data storage needs, used in distributed
applications such as search indexing and data mining.

2. Apache Hadoop:
o Components:
 HDFS: Distributed file system designed for storing large data
sets across multiple machines.
 MapReduce: Programming model for processing large data sets
with a distributed algorithm.
o Features: Scalability, fault tolerance, designed for batch processing.
o Usage: Big data analytics, data processing workflows, and data
warehousing.

3. BigTable:
o Description: Distributed storage system for managing structured
data, designed to scale to very large sizes.
o Features: Sparse, distributed, multi-dimensional sorted map,
optimized for high read and write throughput.
o Usage: Used by Google applications such as web indexing, Google
Earth, and Google Finance.

4. Megastore:
o Description: Google’s storage system combining the scalability of
NoSQL databases with the consistency of traditional databases.
o Features: Partitioned data storage, synchronous replication, support
for ACID transactions.
o Usage: Applications requiring high availability, strong consistency,
and scalability, such as Google App Engine applications.

5. Amazon S3 (Simple Storage Service):


o Description: Scalable object storage service provided by AWS.
o Features: High availability, durability, security, low latency, supports a
variety of use cases including backups, archiving, big data analytics,
and disaster recovery.
o Usage: Cloud storage for applications, data backup and recovery,
content distribution, and data lakes.
