Unit-II CC
Cloud computing offers vast benefits like scalability, flexibility, and cost-efficiency, but it
also comes with several challenges. Here are some of the main challenges:
1. Security and Privacy
Data Security: Storing data on cloud platforms raises concerns about data breaches,
unauthorized access, and leaks.
Compliance: Certain industries have strict data privacy regulations (e.g., GDPR,
HIPAA), making it challenging to ensure compliance when data is stored in multiple
jurisdictions.
Access Control: Managing and enforcing access control across cloud services can be
complex, increasing vulnerability to security risks.
2. Downtime and Reliability
Service Outages: Even the largest cloud providers experience occasional service
outages, which can disrupt access to critical applications.
Dependency on Provider: The reliability of cloud-based applications heavily
depends on the provider's infrastructure, which may not always meet uptime
guarantees.
3. Cost Management
Unpredictable Costs: Cloud costs can escalate quickly due to unexpected usage
spikes, making budgeting difficult.
Hidden Fees: Some cloud providers have complex pricing models with hidden fees
for data storage, retrieval, and other services, which can lead to unexpected costs.
4. Performance
Latency: Depending on the geographical location of users and data centers, latency
can affect application performance.
Data Transfer Costs: Moving data to and from the cloud can incur high costs,
especially with large data volumes or frequent transfers.
5. Vendor Lock-In
Limited Portability: Different cloud providers have unique APIs, tools, and services,
making it hard to migrate workloads between providers.
Dependency on Proprietary Tools: If a company heavily relies on a provider’s
proprietary tools, it may become locked into that ecosystem, limiting flexibility and
bargaining power.
6. Complexity in Multi-Cloud Management
Operational Overhead: Running workloads across multiple providers means dealing
with different consoles, APIs, and billing models, complicating monitoring, security,
and cost control.
8. Skill Gaps
Shortage of Expertise: Cloud platforms evolve rapidly, and many organizations
struggle to hire or train staff who can design, secure, and optimize cloud deployments.
Addressing these challenges involves strategic planning, choosing the right cloud provider,
implementing robust security measures, and ensuring ongoing cloud management and
optimization.
Existing Cloud Applications
The layered diagrams below illustrate three common categories of existing cloud
applications.
1. SaaS Applications
+--------------------------------+
| SaaS Provider |
| +---------------------------+ |
| | Application Layer | |
| +---------------------------+ |
| | Platform Layer | |
| +---------------------------+ |
| | Infrastructure Layer | |
+--------------------------------+
|
|
+----v----+
|  User   |
+---------+
2. Cloud Storage Services
+----------------------------------+
| Cloud Storage Provider |
| +------------------------------+ |
| | Data Storage Layer | |
| +------------------------------+ |
| | Access Control Layer | |
+----------------------------------+
|
|
+-----v-----+
|   User    |
+-----------+
3. Machine Learning (ML) Cloud Platforms
+----------------------------+
| ML Cloud Platform |
| +-----------------------+ |
| | Model Training | |
| +-----------------------+ |
| | Data Storage | |
| +-----------------------+ |
| | Compute Resources | |
+----------------------------+
|
|
+-------v--------+
| Data Scientist |
+----------------+
New Cloud Application Opportunities
1. AI-Driven Smart City Management Platform
o Description: A cloud platform that collects data from city-wide sensor networks
and applies AI analytics to urban management tasks such as traffic control, air
quality monitoring, and energy usage.
o Components: Data storage, data processing, and AI analytics, fed by traffic, air
quality, and energy sensors.
+--------------------------+
| Cloud Platform |
| +--------------------+ |
| | Data Storage | |
| +--------------------+ |
| | Data Processing | |
| +--------------------+ |
| | AI Analytics | |
+--------------------------+
|
+-----------------+------------------+
| | |
+----v----+ +----v----+ +----v----+
| Traffic | |   Air   | | Energy  |
| Control | | Quality | |  Usage  |
| Sensors | | Sensors | | Sensors |
+---------+ +---------+ +---------+
2. Cloud-Based Telemedicine Platform
o Description: A cloud platform that connects doctors and patients remotely,
combining video consultations with AI-assisted diagnostics and secure record
keeping.
o Components: Video conferencing, AI diagnostics, and health record storage.
+-----------------------------+
| Telemedicine Cloud |
| +------------------------+ |
| | Video Conferencing | |
| +------------------------+ |
| | AI Diagnostics | |
| +------------------------+ |
| | Health Record Storage | |
+-----------------------------+
|
+--------+---------+
| |
+---v----+ +----v----+
| Doctor | | Patient |
+--------+ +---------+
3. AI-Powered Virtual Personal Assistant for Enterprises
o Description: A cloud-based personal assistant designed to streamline business
tasks, scheduling, and information retrieval.
o Components: Natural Language Processing (NLP), data processing,
integration with enterprise tools (e.g., Microsoft 365, CRM), and user
authentication.
+----------------------------+
| Cloud-based AI Assistant |
| +------------------------+ |
| | Natural Language | |
| | Processing (NLP) | |
| +------------------------+ |
| | Enterprise Integration | |
| +------------------------+ |
| | Task Automation | |
+----------------------------+
|
+--------+--------+
| |
+----v-----+ +----v----+
| Employee | | Manager |
+----------+ +---------+
Each of these opportunities leverages the scalability and AI capabilities of cloud computing
to solve complex problems in urban management, healthcare, and enterprise efficiency. These
diagrams help illustrate the data flow and structure of each potential application.
WORKFLOWS:
1. Sequential Workflows
o Activities occur in a specific order, with each task beginning only after the
previous one is completed.
o Example: Order Processing Workflow
Steps: Order Received → Payment Processed → Order Packed →
Order Shipped → Delivery Confirmation
2. Parallel Workflows
o Multiple activities run simultaneously, with tasks only synchronizing at
defined points.
o Example: Product Development Workflow
Steps: Design and Prototyping can happen in parallel with Market
Research → After both are completed, they move to Product Testing.
3. Conditional Workflows
o Paths diverge based on conditions or decisions, allowing different workflows
based on specific criteria.
o Example: Customer Support Workflow
Steps: Issue Reported → Triage (Determine Issue Severity) → Low
Severity (Email Support) or High Severity (Immediate Call Center
Support).
4. Iterative Workflows
o Activities are repeated in cycles, often with evaluations or refinements after
each iteration.
o Example: Software Development Workflow (Agile)
Steps: Planning → Development → Testing → Review →
(Refinement/Feedback) → Deployment.
Workflows often combine these patterns. For example, market research can trigger
product development and marketing, which run in parallel but synchronize at defined
points before moving on to testing and promotion, as sketched in the code below.
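As a minimal sketch of this combined coordination, the following Python snippet uses
the standard concurrent.futures module; the stage names are illustrative and not from
any specific workflow engine.

from concurrent.futures import ThreadPoolExecutor

# Illustrative stage functions; a real workflow would call actual services.
def market_research():
    return "research findings"

def product_development(findings):
    return f"product built from {findings}"

def marketing(findings):
    return f"campaign based on {findings}"

# Sequential step: market research runs first.
findings = market_research()

# Parallel steps: development and marketing run concurrently.
with ThreadPoolExecutor() as pool:
    dev = pool.submit(product_development, findings)
    mkt = pool.submit(marketing, findings)
    # Synchronization point: wait for both before moving on.
    product, campaign = dev.result(), mkt.result()

# Sequential again: testing and promotion happen after the join.
print(f"Testing {product!r}; promoting via {campaign!r}")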
In ZooKeeper, the state machine model controls the lifecycle of nodes and client
interactions. Each node in a distributed system can move through well-defined states (e.g.,
LOOKING, FOLLOWING, LEADING) and transitions are triggered by ZooKeeper’s coordination
mechanisms to achieve consensus. By coordinating node states, ZooKeeper provides reliable
services like distributed locking, configuration management, and leader election.
1. ZNodes
o ZooKeeper stores data in hierarchical nodes called znodes, which clients can
read, write, and watch for changes. These znodes act as markers in the system,
coordinating access to shared resources or data states.
o Each znode maintains a state (e.g., created, updated, deleted) that clients can
monitor, providing a way to synchronize distributed components.
2. Leader Election
o In distributed systems, a leader node may be needed to coordinate activities
among follower nodes.
o ZooKeeper uses the state machine model to elect a leader through consensus.
All nodes start in the LOOKING state and transition to FOLLOWING or LEADING
after the leader is elected. This helps maintain consistent decision-making in
distributed applications.
3. Watches and Notifications
o ZooKeeper allows clients to set watches on znodes, so they are notified when
the znode’s state changes.
o This asynchronous, event-driven mechanism lets distributed applications react
dynamically to changes in shared data, coordinating based on the current state of
resources or configurations (see the client sketch after this list).
4. Sessions and Ephemeral Nodes
o Each client session in ZooKeeper represents a connection with a set state and a
timeout. If the session times out, ZooKeeper automatically removes the
client’s ephemeral nodes, which are temporary znodes tied to the session’s
lifecycle.
o This approach is helpful for distributed systems to detect and handle client
failures gracefully, releasing resources automatically and keeping the system
state consistent.
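As a minimal client-side sketch of znodes, watches, and ephemeral nodes, assuming
the third-party kazoo Python library and a ZooKeeper server at 127.0.0.1:2181 (the
paths and values here are illustrative):

from kazoo.client import KazooClient

# Connect to a ZooKeeper ensemble (the address is an assumption for this sketch).
zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Create a persistent znode holding shared configuration data.
zk.ensure_path("/app")
if not zk.exists("/app/config"):
    zk.create("/app/config", b"max_workers=4")

# Register a watch: the callback fires whenever the znode's data changes.
@zk.DataWatch("/app/config")
def on_config_change(data, stat):
    if data is not None:
        print("Config is now:", data.decode())

# Create an ephemeral znode tied to this session; ZooKeeper removes it
# automatically if the session times out, signaling client failure.
zk.create("/app/workers/worker-1", b"alive", ephemeral=True, makepath=True)

# Updating the config triggers the watch on every watching client.
zk.set("/app/config", b"max_workers=8")

zk.stop()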
Below is a diagram that illustrates the leader election process using ZooKeeper’s state
machine model, which coordinates the roles of each server in the cluster.
+--------------------+
| LOOKING |
| (Searching for |
| a leader) |
+--------------------+
|
|
+----------v----------+
| |
| Leader Election |
| using consensus |
| |
+----------+----------+
|
+--------------+--------------+
| |
+-------v-------+     +------v------+
|    LEADING    |     |  FOLLOWING  |
|  (Elected as  |     | (Following  |
|  the leader)  |     | the leader) |
+---------------+     +-------------+
In this state machine model for leader election, every server starts in the LOOKING
state, participates in a consensus round, and then transitions to LEADING (one server)
or FOLLOWING (all others).
This coordination model is fault-tolerant: if the leader fails, the followers re-enter the
LOOKING state, triggering a new election. A client-side sketch follows.
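As a minimal sketch of contending for leadership from the client side, assuming the
kazoo Python library, whose Election recipe hides the state transitions behind a
blocking call (the path and identifier are illustrative):

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

def lead():
    # Only the elected leader runs this; the other candidates keep blocking
    # inside election.run() and contend again if the leader fails.
    print("This node is now LEADING; coordinating followers...")

# All candidates contend under the same election znode path.
election = zk.Election("/app/election", identifier="server-1")
election.run(lead)  # blocks until this client wins, then calls lead()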
1. Distributed Locks
o By setting a lock in a znode, ZooKeeper coordinates access among distributed
clients.
o When the lock's state changes, waiting clients are notified, enabling smooth
transitions and consistent access control across nodes (see the lock sketch after
this list).
2. Configuration Management
o ZooKeeper can manage configuration data for distributed systems, storing
configurations in znodes.
o Clients watch for configuration updates, and when changes occur, the new
state is distributed to all nodes in real-time.
3. Coordination of Distributed Queues
o ZooKeeper helps maintain queues by using znodes to manage task order and
availability.
o A state change (e.g., a task added to or removed from the queue) triggers other
nodes to update their actions accordingly.
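As a minimal sketch of the distributed-lock recipe, again assuming the kazoo Python
library (the lock path and identifier are illustrative); waiting clients are notified
through watches when the holder releases the lock:

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# All clients contend for the lock under the same znode path.
lock = zk.Lock("/app/locks/resource-1", identifier="client-42")

# Entering the block acquires the lock; other clients queue on watches
# and proceed one at a time as the lock znode is released.
with lock:
    print("Lock held; safe to update the shared resource.")

zk.stop()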
ZooKeeper’s state machine model and its distributed consensus protocols provide powerful
coordination mechanisms. These allow for reliable distributed system functionalities, such as
leader election, distributed locking, and consistent state management, essential for building
robust and scalable applications.
MAPREDUCE:
The MapReduce programming model is built around two steps, Map and Reduce.
These two steps enable the model to distribute tasks across many nodes, process data
in parallel, and then combine results to generate a single cohesive outcome.
MapReduce Workflow
1. Input Splitting: The data is split into multiple chunks, with each chunk being
processed independently.
2. Mapping Phase: Each chunk is processed by the Map function to produce
intermediate key-value pairs.
3. Shuffling and Sorting: The framework organizes the key-value pairs by key,
ensuring that each key's values are grouped together.
4. Reducing Phase: The Reduce function processes each group of key-value pairs to
produce final outputs.
5. Output Storage: The results are saved to a distributed storage system.
Imagine a simple use case for counting the occurrences of each word in a large set of
documents.
1. Input Data:
o Text documents to analyze, split across multiple files.
2. Mapping Phase:
o Each document is read, and the Map function emits a key-value pair for each
word: (word, 1).
Input text: "cat bat cat rat"
Output of Map: (cat, 1), (bat, 1), (cat, 1), (rat, 1)
3. Shuffling and Sorting:
o The framework groups the intermediate key-value pairs by key.
Grouped Data: (cat, [1, 1]), (bat, [1]), (rat, [1])
4. Reducing Phase:
o The Reduce function adds up the counts for each word key, producing a total
count for each.
Output of Reduce: (cat, 2), (bat, 1), (rat, 1)
5. Final Output:
o The result is saved, showing the count of each word in the documents.
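A minimal single-machine sketch of this word count in Python (a real framework such
as Hadoop distributes these phases across nodes; here they are simulated in-process):

from collections import defaultdict

documents = ["cat bat cat rat", "rat cat"]  # illustrative input splits

# Mapping phase: emit (word, 1) for every word in every split.
mapped = []
for doc in documents:
    for word in doc.split():
        mapped.append((word, 1))

# Shuffling and sorting: group all emitted values by key.
grouped = defaultdict(list)
for word, count in mapped:
    grouped[word].append(count)

# Reducing phase: sum the grouped counts for each word key.
reduced = {word: sum(counts) for word, counts in grouped.items()}

print(reduced)  # e.g. {'cat': 3, 'bat': 1, 'rat': 2}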
Advantages of MapReduce
Scalability: Jobs scale out across large clusters of commodity machines.
Fault Tolerance: Failed tasks are detected and re-executed on other nodes.
Simplicity: Developers write only the Map and Reduce functions; the framework
handles distribution, scheduling, and recovery.
Limitations of MapReduce
I/O Intensive: Shuffling and sorting involve a large amount of disk I/O, which can
slow down performance.
Limited Expressiveness: The Map and Reduce paradigm is limited for more complex
workflows and iterative tasks.
Latency: Not ideal for real-time processing; MapReduce is more suited to batch
processing.
MapReduce Diagram
Input Data (Splits)
        |
   +----v----+
   |   Map   |
   +---------+
        |
  +-----------+-----------+
  |           |           |
(key1, value1) (key2, value2) ... (keyN, valueN)
        |
 Shuffling & Sorting
        |
   +----v----+
   | Reduce  |
   +---------+
        |
  Final Output
The MapReduce model revolutionized data processing by allowing distributed, parallel, and
scalable operations across large datasets. Despite its limitations, it remains influential,
forming the basis for newer big data frameworks such as Apache Hadoop and Apache Spark.
CASE STUDY:
"Grep the Web" is a term originally coined by Google to describe large-scale text processing
across the web. In cloud computing, it involves using distributed computing systems to
search, analyze, and process vast amounts of text data efficiently across many servers. This
approach is inspired by the traditional Unix grep command, which searches for patterns
within text files, but scaled up for the internet.
Objective: Find occurrences of the phrase "machine learning" in a large dataset of text files.
Example Output:
(machine learning, 4523)
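A minimal sketch of this phrase count in Python (the corpus path and phrase are
illustrative; a GrepTheWeb-style system would run the same map and reduce logic
over a distributed store rather than local files):

import glob

PHRASE = "machine learning"

# Map: count occurrences of the phrase in one document.
def map_count(text):
    return text.lower().count(PHRASE)

# Reduce: sum the per-document counts into a single total.
total = 0
for path in glob.glob("corpus/*.txt"):  # illustrative location of the dataset
    with open(path, encoding="utf-8") as f:
        total += map_count(f.read())

print((PHRASE, total))  # e.g. (machine learning, 4523)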
The "Grep the Web" concept has evolved with cloud computing into a core data processing
task, serving as the foundation for large-scale search, data mining, and real-time data analysis
applications.
HPC on Cloud:
1. Scalability
o Cloud providers offer a virtually unlimited pool of resources, allowing users to
scale up or down as their workloads demand. This elasticity is essential for
HPC workloads, which may require massive parallel processing capabilities
for short periods.
o Users can provision thousands of CPUs or GPUs in minutes, accommodating
the needs of complex simulations or intensive computations.
2. Cost-Effectiveness
o Cloud HPC follows a pay-as-you-go pricing model, meaning users pay only
for the resources they consume, which can be far more economical than
maintaining dedicated, on-premises supercomputers.
o This model is particularly attractive for organizations with periodic or project-
based HPC needs, avoiding large upfront investments.
3. Specialized Hardware and Infrastructure
o Many cloud providers offer specialized hardware, such as high-memory
instances, Graphics Processing Units (GPUs), Tensor Processing Units
(TPUs), and Field-Programmable Gate Arrays (FPGAs), which can
significantly speed up HPC workloads.
o Options like high-performance storage (e.g., SSDs, parallel file systems), low-
latency networking, and direct interconnects (such as AWS Elastic Fabric
Adapter) are available for faster data access and efficient parallel processing.
4. Managed Services and Tools
o HPC on the cloud often includes managed services, such as workload
schedulers, cluster management, and monitoring tools, making it easier to
deploy and manage HPC clusters.
o Cloud providers also offer HPC-specific libraries, software packages, and
integrations with popular HPC tools like SLURM, OpenMPI, and Lustre,
which streamline operations for HPC users.
Key Components of Cloud HPC
1. Compute Resources
o Cloud providers offer a wide range of compute instance types, from general-
purpose to compute-optimized or GPU-enabled instances, which can be
configured to suit the demands of various HPC workloads.
2. Storage Systems
o HPC applications require high-throughput, low-latency storage for
input/output data. Cloud storage options include network-attached storage
(NAS), parallel file systems (e.g., Amazon FSx for Lustre), and object storage
(e.g., Amazon S3).
3. Networking Infrastructure
o High-speed, low-latency networking is essential for efficient communication
between compute nodes. Cloud providers offer specialized networking
solutions, like AWS Elastic Fabric Adapter (EFA) and Azure InfiniBand, to
meet the needs of HPC workloads that require high levels of data exchange.
Leading Cloud Providers for HPC
Amazon Web Services (AWS): Provides specialized HPC services such as AWS
ParallelCluster, FSx for Lustre, and Elastic Fabric Adapter (EFA).
Microsoft Azure: Offers Azure CycleCloud for HPC cluster management, InfiniBand
networking, and support for GPU and FPGA instances.
Google Cloud Platform (GCP): Provides HPC solutions with Compute Engine,
custom machine types, and integration with open-source HPC tools.
IBM Cloud and Oracle Cloud: Both provide HPC environments with support for
InfiniBand, bare-metal servers, and optimized HPC storage options.