
Distributed Computing

Uploaded by

Edgar Osoro
Copyright
© All Rights Reserved

DISTRIBUTED COMPUTING

QUESTION ONE
a) Describe.
i) Distributed OS (2marks).
- It is system software that runs over a collection of independent, networked, communicating, and
physically separate computational nodes and manages them so that they appear to users as a single system.

ii) Scalability (2marks).


- The capability of a system to adapt to increased service load.

b) Define redundancy and explain its use in distributed systems (4marks).


- Replicating critical hardware and software components, so that if one of them fails, the others can
be used to continue.

c) Centralized systems vs distributed systems (6marks).

d) Motivation for development of RPC (4marks).


Abstraction of network communication - so as to enable transparent interaction between
processes across a network.
Interoperability - facilitates interoperability between heterogeneous systems and platforms
by providing a standardized mechanism for communication.

Distributed computing - there was a need for a communication mechanism that would
enable efficient and reliable interaction between distributed components.
Productivity and efficiency - enhances developer productivity and code maintainability by
abstracting the complexities of network programming.

e) Challenges of designing a scalable distributed system (6marks).


Scalability - ensuring that the system can handle increasing workloads and user demands
without sacrificing performance or reliability.
Consistency and coordination - maintaining data consistency and coordination across
distributed nodes while ensuring high availability and fault tolerance is a significant
challenge.
Fault Tolerance - building fault-tolerant distributed systems that can withstand node failures,
network partitions, and other forms of faults is crucial for ensuring system reliability and
availability.
Communication overhead - communication overhead can impact the performance and
scalability of distributed systems.
Consensus and coordination - achieving consensus and coordination among distributed
nodes in the absence of a centralized authority is challenging.

f) Security mechanisms used in distributed computing (6marks).


Encryption - transforms data into something an attacker cannot understand.

Authentication - used to verify the claimed identity of a user, client, server, host, or other entity.

Authorization - checks whether the client is authorized to perform the action requested.

Auditing - auditing tools are used to trace which clients accessed what, and in which way.
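
The following is a minimal sketch of how three of the mechanisms above (authentication, authorization, auditing) might fit together on a server. The client names, secrets, permission table, and function names are all hypothetical, invented for illustration. Encryption is left out because the Python standard library has no general-purpose cipher; HMAC is used only to verify identity.

```python
import hashlib
import hmac

# Hypothetical shared secrets and permissions, invented for this sketch.
SECRETS = {"alice": b"alice-secret", "bob": b"bob-secret"}
PERMISSIONS = {"alice": {"read", "write"}, "bob": {"read"}}
audit_log = []  # (client, action, outcome) tuples


def sign(client, message):
    """Client side: compute an HMAC-SHA256 tag over the message."""
    return hmac.new(SECRETS[client], message, hashlib.sha256).hexdigest()


def handle_request(client, action, message, tag):
    """Server side: authenticate, then authorize, then audit one request."""
    secret = SECRETS.get(client)
    if secret is None:
        outcome = "unknown client"
    else:
        expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            outcome = "authentication failed"   # claimed identity not proven
        elif action not in PERMISSIONS.get(client, set()):
            outcome = "not authorized"          # identity proven, action denied
        else:
            outcome = "ok"
    audit_log.append((client, action, outcome))  # every attempt is recorded
    return outcome
```

Note that the audit entry is written for failed attempts as well as successful ones, since tracing rejected requests is usually the point of auditing.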

g) Describe.
i) Role of a distributed OS designer (2marks).

- Designing the overall architecture of the distributed operating system.

- Incorporating security mechanisms such as authentication, encryption, access control, and
audit.
- Developing algorithms and policies for efficient allocation and utilization of distributed
resources.

- Implementing mechanisms for synchronization and coordination among distributed
processes.

ii) Main components of a computer clock (2marks).


Clock Generator - produces a stable and accurate clock signal that serves as a reference for
timing operations within the computer system.
Clock Divider - divides the output frequency of the clock generator to derive lower-
frequency clock signals used by various components of the computer system.
Clock Distribution Network - distributes the clock signals generated by the clock generator
and divider to different parts of the computer system, including the CPU.
Clock Control Circuitry - manages the operation of the clock generator and divider.

h) Define.
i) Bully algorithm (2marks).

- Each process has a unique ID, and the live process with the highest ID is elected as the leader.
When a process detects the absence of a leader, it initiates an election by sending an election
message to all processes with higher IDs; if none of them responds, it becomes the leader itself.
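
The election described above can be sketched as a small simulation. This is an illustrative sketch, not a networked implementation: the function name `bully_election` and the `alive` set are assumptions, and message timeouts are modelled simply as "the process is not in `alive`".

```python
def bully_election(initiator, alive):
    """Return the ID of the coordinator elected when `initiator` starts an election.

    `alive` is the set of process IDs that currently respond to messages.
    """
    higher = [p for p in alive if p > initiator]
    if not higher:
        # Nobody with a higher ID answered, so the initiator wins.
        return initiator
    # Any higher process that answers takes over and holds its own election.
    # Following one of them is enough for this simulation.
    return bully_election(min(higher), alive)
```

For example, with processes 1 to 5 where 5 has crashed, `bully_election(2, {1, 2, 3, 4})` ends with process 4 as coordinator.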

ii) RPC (2marks).


- Remote Procedure Call (RPC) is a protocol that allows a program to request a service from a
program located on another computer on a network without having to understand network details.
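
To illustrate how RPC hides network details, here is a minimal in-process sketch of a client stub and a server dispatcher. The procedure table, the JSON message format, and the class and function names are assumptions made for this example; the "network" is a direct function call, where a real system would use sockets.

```python
import json

# Hypothetical server-side procedures, invented for this sketch.
PROCEDURES = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}


def server_dispatch(request_bytes):
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["method"]](*request["params"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Client stub: makes a remote procedure look like a local method call."""

    def __init__(self, transport):
        self._transport = transport  # callable: bytes -> bytes

    def __getattr__(self, method):
        # Called only for names that are not real attributes, i.e. remote methods.
        def call(*params):
            request = json.dumps({"method": method, "params": list(params)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return call


# The transport here is a plain function call standing in for the network.
stub = ClientStub(server_dispatch)
```

The caller writes `stub.add(2, 3)` as if `add` were local; marshalling and transport stay inside the stub.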

i) Explain how you can ensure your web service remains active even if there is a hardware or
software issue (4marks).
Recovery - process of restoring a system or data to a consistent and usable state after a
failure or disruption.
Redundancy - replicating critical hardware and software components, so that if one of them
fails, the others can be used to continue.
Distributed control - to avoid single points of failure.
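
As a rough sketch of how redundancy keeps a service answering, a client (or load balancer) can try each replica in turn. The replica functions below are hypothetical stand-ins for real servers, and `ConnectionError` is assumed to be how an unreachable replica shows up.

```python
def call_with_failover(replicas, request):
    """Try each replica in turn and return the first successful response."""
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            continue  # this replica is down; fall through to the next one
    raise RuntimeError(f"all {len(replicas)} replicas failed")


def failed_replica(request):
    """Stand-in for a server with a hardware or software fault."""
    raise ConnectionError("replica unreachable")


def healthy_replica(request):
    """Stand-in for a working server."""
    return f"handled {request}"
```

The service fails only when every replica fails, which is why the replicas should not share a single point of failure.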

j) Factors for selecting a victim to kill or rollback during deadlock (6marks).


Resource utilization - prioritizing processes that have consumed fewer resources or have
lower resource utilization levels to minimize the overall impact on the system.
Priority - processes with lower priority to the system's operation may be selected as victims.
Wait time - processes that have been waiting for resources for a longer duration may be
considered as victims to reduce the waiting time for other processes and help alleviate the
deadlock situation more quickly.
Resource holding time - processes that have held resources for a longer period without
making progress may be selected as victims to release the resources they hold and break the deadlock.
Transaction state - processes involved in transactions that can be safely aborted or rolled
back without causing data inconsistencies or integrity violations may be chosen as victims to
resolve deadlock situations.
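
One way to combine such factors is a simple ordering over the candidate processes. The sketch below is illustrative only: the `Candidate` fields and the tie-breaking order (priority first, then work lost, then resources released) are assumptions, and a real system would weight the factors to suit its workload.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    pid: int
    priority: int        # higher means more important to the system
    resources_held: int  # resources that would be freed by aborting
    work_done: float     # work that would be lost on rollback
    safe_to_abort: bool  # transaction can be rolled back consistently


def choose_victim(candidates):
    """Pick the cheapest safe-to-abort process, or None if none qualifies."""
    safe = [c for c in candidates if c.safe_to_abort]
    if not safe:
        return None
    # Lower priority first, then less work lost, then more resources released.
    return min(safe, key=lambda c: (c.priority, c.work_done, -c.resources_held))
```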

k) Explain the key security mechanisms in distributed computing (4marks).


Encryption - transforms data into something an attacker cannot understand.

Authentication - used to verify the claimed identity of a user, client, server, host, or other entity.

Authorization - checks whether the client is authorized to perform the action requested.

Auditing - auditing tools are used to trace which clients accessed what, and in which way.

l) Explain major security threats facing e-banking systems (6marks).


Interception - a situation in which an unauthorized party gains access to a service or data.
Interruption - a situation in which services or data become unavailable, unusable, or
destroyed.
Modification - involves the unauthorized changing of data.
Fabrication - a situation in which additional data or activity is generated that would normally not exist.

m) Need for synchronization in distributed systems (6marks).


Consistency - where multiple processes can access shared data concurrently, synchronization
prevents data corruption and ensures that all replicas or copies of the data remain
consistent.
Concurrency Control - synchronization is essential for coordinating access to shared resources
to avoid conflicts and maintain correctness.
Deadlock Prevention - synchronization helps prevent deadlock by imposing order and coordination
among processes accessing shared resources.
Ordering and Serialization - synchronization ensures that operations are executed in a
well-defined order, preserving causality and consistency in distributed systems.
Resource Management - synchronization facilitates efficient resource management by regulating
access to scarce or limited resources.
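
The ordering point can be made concrete with a Lamport logical clock, a standard technique that these notes do not name but which is commonly used to order events without synchronized physical clocks. The class below is a minimal sketch; its name and method names are chosen for illustration.

```python
class LamportClock:
    """Logical clock: if event a causally precedes event b, then time(a) < time(b)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock by one."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """On receipt, move past both the local time and the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

A receiver's clock always ends up later than the timestamp on the message it received, so a send is always ordered before the matching receive.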

n) Major characteristics of distributed systems applications (6marks).


Concurrency - they often involve multiple concurrent processes or components executing
simultaneously across different nodes or machines.
Distribution - they span multiple nodes or machines interconnected over a network,
enabling communication across geographical locations.
Fault Tolerance - they are designed to withstand failures and disruptions in individual
components or nodes without compromising system availability or functionality.
Scalability - they are designed to scale out horizontally to accommodate growing workloads
and user demands.
Heterogeneity - they may operate in heterogeneous environments with diverse hardware
platforms, operating systems, programming languages, and communication protocols.

QUESTION TWO

a) Assumptions on which election algorithms are based (6marks).


- Each process in the system has a unique priority number.
- Whenever an election is held, the process having the highest priority number among the
currently active processes is elected as the coordinator.
- On recovery, a failed process can take appropriate actions to rejoin the set of active
processes.

b)
i) Define deadlock (2marks).

- Deadlock is a situation in a concurrent system where two or more processes are unable to
proceed because each is waiting for another to release a resource.

ii) Conditions that lead to deadlock (4marks).


Mutual Exclusion - once a process acquires a resource, it cannot be shared with other
processes until it's released.
Hold and Wait - a process must hold at least one resource and wait for additional resources
that are currently held by other processes.
No Preemption - a resource cannot be forcibly taken away from the process holding it; it is
released only when that process gives it up voluntarily, so a process that needs resources
held by others must wait for them.
Circular Wait - there exists a circular chain of processes, with each process holding at least
one resource that is requested by the next process in the chain.
iii) Important issues in recovery from deadlock (4marks).
Detection - deadlock detection algorithms periodically check the system's state to identify if
deadlock has occurred.
Pre-emption - resources are forcibly taken away from some processes to allow others to
proceed.
Termination - terminating one or more deadlocked processes releases the resources they hold,
allowing other processes to proceed.
Recovery Mechanism - choosing a deadlock recovery mechanism involves balancing trade-offs
between system performance, reliability, and complexity.
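
Detection is often described in terms of a wait-for graph, where an edge from P to Q means P is waiting for a resource that Q holds; a cycle in that graph indicates deadlock. The following is a minimal sketch using depth-first search. The function name and the dictionary representation of the graph are assumptions for this example.

```python
def find_deadlock(wait_for):
    """Return a list of processes forming a cycle in the wait-for graph, or None.

    `wait_for` maps each process to the processes it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    colour = {}
    stack = []

    def visit(p):
        colour[p] = GREY
        stack.append(p)
        for q in wait_for.get(p, ()):
            state = colour.get(q, WHITE)
            if state == GREY:
                return stack[stack.index(q):]  # q is on the current path: cycle
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        stack.pop()
        colour[p] = BLACK
        return None

    for p in list(wait_for):
        if colour.get(p, WHITE) == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None
```

The processes returned are the candidates for pre-emption or termination.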

c) Define middleware and explain roles of middleware in distributed systems (10marks).


Middleware - is software that provides services beyond those provided by the operating system to
enable the various components of a distributed system to communicate and manage data.

d) A process runs on one node and accesses data on another node; for load balancing, it
relocates to a different node. Describe the transparencies for this process (6marks).
Access - hides differences in data representation and how a resource is accessed by a user.
Location - hides where exactly the resource is located physically.
Scaling - allows the system and applications to expand in scale without change to the system
structure or the application algorithms.

e)
i) Major threats facing distributed systems (4marks).
Interception - a situation in which an unauthorized party gains access to a service or data.
Interruption - a situation in which services or data become unavailable, unusable, or
destroyed.
Modification - involves the unauthorized changing of data.
Fabrication - a situation in which additional data or activity is generated that would normally not exist.
ii) Security mechanisms (4marks).
Encryption - transforms data into something an attacker cannot understand.

Authentication - used to verify the claimed identity of a user, client, server, host, or other entity.

Authorization - checks whether the client is authorized to perform the action requested.

Auditing - auditing tools are used to trace which clients accessed what, and in which way.

iii) Discuss the CIA triad with regards to distributed systems (6marks).

Confidentiality - refers to ensuring that sensitive information is accessible only to authorized
users or processes and protected from unauthorized access or disclosure.

Integrity - refers to maintaining the accuracy, consistency, and trustworthiness of data and
resources throughout their lifecycle, even in the presence of malicious attacks, errors, or
failures.

Availability - refers to ensuring that resources, services, and applications are accessible and
operational when needed, despite failures, disruptions, or attacks.

QUESTION THREE

a)

i) Election algorithms used for selecting a coordinating process (8marks).

Bully - processes have unique IDs, and the process with the highest ID is elected as the
leader. When a process detects the absence of a leader, it initiates an election by sending an
election message to all processes with higher IDs.

Ring - processes are organized in a logical ring structure. When a process detects the
absence of a leader, it initiates an election by sending an election message along the ring.

Centralized - a single process is responsible for coordinating the election process. When a
failure occurs or a new coordinator needs to be elected, processes send their candidacy
requests to the centralized coordinator.

Distributed - the election process is distributed among multiple processes without relying on a
centralized coordinator. The processes exchange messages and reach a consensus to
elect a new coordinator.
Token ring - processes pass a token around the ring. The process holding the token acts as
the coordinator. When a process detects the failure of the current coordinator, it initiates an
election by circulating the election token around the ring.
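
The ring variant above can be sketched as a simulation in which the election message travels once around the ring and collects the IDs of live processes. The function name, the list representation of the ring, and the `alive` set are assumptions for this example; the initiator is assumed to be alive.

```python
def ring_election(ring, initiator, alive):
    """Circulate an election message around `ring` and return the new coordinator.

    `ring` lists process IDs in ring order; `alive` is the set of live IDs.
    Crashed processes are skipped, as if the sender had timed out on them.
    """
    start = ring.index(initiator)
    candidates = []
    for step in range(len(ring)):
        pid = ring[(start + step) % len(ring)]
        if pid in alive:
            candidates.append(pid)  # each live process appends its own ID
    # Back at the initiator: the highest collected ID becomes coordinator.
    return max(candidates)
```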

ii) Assumptions the algorithms are based on (2marks).


- Each process in the system has a unique priority number.
- Whenever an election is held, the process having the highest priority number among the
currently active processes is elected as the coordinator.
- On recovery, a failed process can take appropriate actions to rejoin the set of active
processes.

iii) Conditions that lead to a deadlock (4marks).


Mutual Exclusion - once a process acquires a resource, it cannot be shared with other
processes until it's released.
Hold and Wait - a process must hold at least one resource and wait for additional resources
that are currently held by other processes.
No Preemption - a resource cannot be forcibly taken away from the process holding it; it is
released only when that process gives it up voluntarily, so a process that needs resources
held by others must wait for them.
Circular Wait - there exists a circular chain of processes, with each process holding at least
one resource that is requested by the next process in the chain.

b) Why synchronization is needed in distributed systems (6marks).


Consistency - where multiple processes can access shared data concurrently, synchronization
prevents data corruption and ensures that all replicas or copies of the data remain
consistent.
Concurrency Control - synchronization is essential for coordinating access to shared resources
to avoid conflicts and maintain correctness.
Deadlock Prevention - synchronization helps prevent deadlock by imposing order and coordination
among processes accessing shared resources.
Ordering and Serialization - synchronization ensures that operations are executed in a
well-defined order, preserving causality and consistency in distributed systems.
Resource Management - synchronization facilitates efficient resource management by regulating
access to scarce or limited resources.

c)
i) Define distributed shared memory (2marks).

- It is a service that manages memory across multiple nodes so that running applications
have the illusion that they are running on a single shared memory.

ii) Difference between shared and distributed memory systems (4marks).

Memory Model.

Shared Memory System - in a shared memory system, all processors or cores access a single,
global address space.

Distributed Memory System - in a distributed memory system, each processor or node has
its own local memory.

Communication.

Shared Memory System - communication and data sharing occur implicitly through shared
memory.

Distributed Memory System - communication and data sharing occur explicitly through
message passing.

Programming Model.

Shared Memory System - Programming models for shared memory systems often use
threading APIs.

Distributed Memory System - programming models for distributed memory systems often
use message passing libraries.

Scalability and Performance.

Shared Memory System - scalability is limited by the number of processors or cores sharing
the global memory.

Distributed Memory System - can scale to a larger number of nodes and processors by
adding more nodes to the network.
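
The communication and programming-model differences can be seen in a small sketch that computes the same sum both ways. Threads inside one Python process stand in for processors or nodes here, which is an assumption made to keep the example self-contained; the function names are illustrative.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Shared-memory style: threads add into one shared variable under a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # implicit sharing still needs explicit synchronization
            total += partial

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing_sum(values, workers=4):
    """Message-passing style: workers keep private state and send results back."""
    mailbox = queue.Queue()

    def work(chunk):
        mailbox.put(sum(chunk))  # explicit send

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in range(workers))  # explicit receive
```

In the first version the workers communicate by touching the same variable; in the second, nothing is shared and every exchange is a message.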

iii) Advantages of distributed shared memory (4marks).

- Provides a large virtual memory space.

- Can handle large and complex databases without sending data to the processor.

- Shields programmers from explicit send and receive primitives.

- Less expensive as compared to a multiprocessor system.

- DSM scales well to a large number of nodes.

- DSM programs are portable as they use a common interface.

d) Transparencies for Zetech Digital System (10marks).

QUESTION FOUR

a)

i) Discuss election algorithm used in selecting a coordinating process (4marks).

Bully - processes have unique IDs, and the process with the highest ID is elected as the
leader. When a process detects the absence of a leader, it initiates an election by sending an
election message to all processes with higher IDs.
Ring - processes are organized in a logical ring structure. When a process detects the
absence of a leader, it initiates an election by sending an election message along the ring.

Centralized - a single process is responsible for coordinating the election process. When a
failure occurs or a new coordinator needs to be elected, processes send their candidacy
requests to the centralized coordinator.

Distributed - the election process is distributed among multiple processes without relying on a
centralized coordinator. The processes exchange messages and reach a consensus to
elect a new coordinator.

Token ring - processes pass a token around the ring. The process holding the token acts as
the coordinator. When a process detects the failure of the current coordinator, it initiates an
election by circulating the election token around the ring.

ii) Assumptions election algorithms are based on (6marks).


- Each process in the system has a unique priority number.
- Whenever an election is held, the process having the highest priority number among the
currently active processes is elected as the coordinator.
- On recovery, a failed process can take appropriate actions to rejoin the set of active
processes.

b) Types of distributed file systems (4marks).

- File Transfer Protocol (FTP).

- Network File System (NFS).

- Andrew File System (AFS).

- Google File System (GFS).

- Amazon Dynamo.

- Hadoop Distributed File System (HDFS).

c)

i) Define distributed shared memory (2marks).


- It is a service that manages memory across multiple nodes so that running applications
have the illusion that they are running on a single shared memory.

ii) Benefits of distributed shared memory (4marks).

- Provides a large virtual memory space.

- Can handle large and complex databases without sending data to the processor.

- Shields programmers from explicit send and receive primitives.

- Less expensive as compared to a multiprocessor system.

- DSM scales well to a large number of nodes.

- DSM programs are portable as they use a common interface.

d) Major differences between call-by-value and call-by-reference (4marks).

Behaviour.

Call-by-Value - a copy of the actual parameter's value is passed to the function or
subroutine.

Call-by-Reference - a reference or address of the actual parameter is passed to the function
or subroutine.

Memory Usage.

Call-by-Value - requires memory to store a copy of the parameter's value.

Call-by-Reference - avoids creating copies of the parameter's value.

Data Type Compatibility.

Call-by-Value - suitable for passing primitive data types such as integers, floating-point
numbers, and characters.

Call-by-Reference - suitable for passing large data structures, arrays, or objects where
copying the entire value is inefficient.

Modification Visibility.

Call-by-Value - changes made to the parameter within the function are not visible outside
the function.
Call-by-Reference - changes made to the parameter within the function are immediately
visible outside the function.
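
The visibility difference can be demonstrated with a short sketch. Python itself passes object references, so call-by-value is imitated here by handing the callee a deep copy; the function names are illustrative assumptions.

```python
import copy


def append_item(items):
    """Callee: mutates whatever list it is given."""
    items.append(99)


def call_by_value_style(data):
    """The callee works on a private copy, so the caller's list is unchanged."""
    append_item(copy.deepcopy(data))
    return data


def call_by_reference_style(data):
    """The callee receives the caller's own list, so the change is visible."""
    append_item(data)
    return data
```

Calling `call_by_value_style([1, 2, 3])` leaves the list as it was, while `call_by_reference_style([1, 2, 3])` returns it with 99 appended.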

e) Cloud computing models provided by Google (10marks).

Infrastructure as a Service (IaaS) - Google Compute Engine (GCE) provides virtual machines
(VMs) running on Google's infrastructure.

Platform as a Service (PaaS) - Google App Engine (GAE) is a fully managed platform for
building and deploying web applications and services.

Software as a Service (SaaS) - Google Workspace (formerly G Suite) offers a suite of cloud-
based productivity and collaboration tools for businesses.

Database as a Service (DBaaS) - Google Cloud SQL provides managed database services for
MySQL, PostgreSQL, and SQL Server.

Function as a Service (FaaS) - Google Cloud Functions allows developers to deploy and run
event-driven serverless functions.

Container as a Service (CaaS) - Google Kubernetes Engine (GKE) provides managed
Kubernetes clusters for deploying, managing, and scaling containerized applications.
