CS3551 Distributed Computing Question Bank
The Chandy and Lamport snapshot algorithm records a consistent global state of a distributed system while the system keeps running. Such a snapshot is necessary for system reliability and for recovering from faults. The algorithm uses two rules. The marker sending rule requires a process, once it has recorded its own state, to send a marker on each outgoing channel before it sends any other message on that channel. The marker receiving rule requires a process that has not yet recorded its state to do so when it receives a marker; a process that has already recorded its state instead records the state of that incoming channel as the messages that arrived on it after the process recorded its state and before the marker.
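The two marker rules can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: channels are reliable and FIFO, the "state" of a process is a single account balance, and the names `Process`, `Network`, `MARKER` and `run_example` are hypothetical and not taken from any library.

```python
from collections import deque

MARKER = object()  # special marker message (hypothetical name)

class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.recorded_state = None  # local state captured by the snapshot
        self.channel_state = {}     # sender -> messages recorded as in transit
        self.recording = set()      # incoming channels still being recorded

class Network:
    """Reliable FIFO channel between every ordered pair of processes."""
    def __init__(self, processes):
        self.procs = {p.name: p for p in processes}
        self.channels = {(a, b): deque()
                         for a in self.procs for b in self.procs if a != b}

    def send(self, src, dst, amount):
        self.procs[src].balance -= amount
        self.channels[(src, dst)].append(amount)

    def record_and_send_markers(self, name):
        # Marker sending rule: record the local state, then place a marker on
        # every outgoing channel before any later application message.
        p = self.procs[name]
        p.recorded_state = p.balance
        p.recording = {src for (src, dst) in self.channels if dst == name}
        p.channel_state = {src: [] for src in p.recording}
        for (src, dst), channel in self.channels.items():
            if src == name:
                channel.append(MARKER)

    def deliver(self, src, dst):
        msg = self.channels[(src, dst)].popleft()
        p = self.procs[dst]
        if msg is MARKER:
            # Marker receiving rule: record now if not yet recorded; in either
            # case the state of the channel the marker arrived on is final.
            if p.recorded_state is None:
                self.record_and_send_markers(dst)
            p.recording.discard(src)
        else:
            p.balance += msg
            if src in p.recording:
                p.channel_state[src].append(msg)

def snapshot_total(net):
    # Sum of recorded local states plus recorded in-transit messages.
    return sum(p.recorded_state + sum(sum(m) for m in p.channel_state.values())
               for p in net.procs.values())

def run_example():
    net = Network([Process("p", 100), Process("q", 50)])
    net.send("p", "q", 10)            # in transit when the snapshot starts
    net.record_and_send_markers("p")  # p initiates the snapshot
    net.send("q", "p", 5)             # sent before q sees the marker
    for _ in range(2):
        net.deliver("p", "q")         # the transfer of 10, then the marker
    for _ in range(2):
        net.deliver("q", "p")         # the transfer of 5, then q's marker
    return net

net = run_example()
print(snapshot_total(net))  # 150: no money is lost or duplicated
```

In this run the snapshot records p = 90, q = 55 and the message of 5 in transit from q to p, so the recorded total matches the 150 units that exist in the system.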
Nested transactions offer advantages in both concurrency and fault tolerance. Subtransactions at the same level of a nested transaction can execute concurrently, which increases the overall concurrency of the system. Subtransactions can also commit or abort independently, which improves fault tolerance: the failure of a subtransaction does not force the whole transaction to fail, and the parent transaction can decide how to proceed.
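The independence of subtransactions can be illustrated with a minimal sketch. It assumes that a committed subtransaction simply hands its writes to its parent and that an aborted one discards them; the `Transaction` class and its methods are hypothetical names, and concurrency control is left out entirely.

```python
class Transaction:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.writes = {}        # tentative updates made by this transaction
        self.status = "active"

    def begin_sub(self, name):
        # Start a subtransaction nested inside this transaction.
        return Transaction(name, parent=self)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        self.status = "committed"
        if self.parent is not None:
            # Assumption: a subtransaction's commit only passes its writes
            # up to the parent, which still decides the final outcome.
            self.parent.writes.update(self.writes)

    def abort(self):
        # An abort discards only this transaction's own tentative writes.
        self.status = "aborted"
        self.writes.clear()

top = Transaction("T")
t1 = top.begin_sub("T1")
t2 = top.begin_sub("T2")
t1.write("x", 1)
t1.commit()          # succeeds independently of T2
t2.write("y", 2)
t2.abort()           # fails without aborting T
top.commit()         # the parent chooses to proceed with T1's work only
print(top.writes)    # {'x': 1}
```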
An object in a distributed system is considered garbage when no reference to it remains anywhere in the system, whether local or remote. Once an object is identified as garbage, the memory it occupies can be reclaimed. This process is known as distributed garbage collection.
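One simple way to picture this definition is a table that tracks which processes hold a reference to each object. The sketch below is an assumption made for illustration, not a description of any particular collector; `ReferenceTable` and its methods are hypothetical names.

```python
class ReferenceTable:
    """Tracks, per object, the set of processes that hold a reference to it."""
    def __init__(self):
        self.holders = {}

    def register(self, obj):
        self.holders.setdefault(obj, set())

    def add_ref(self, obj, process):
        self.holders.setdefault(obj, set()).add(process)

    def remove_ref(self, obj, process):
        self.holders.get(obj, set()).discard(process)

    def garbage(self):
        # An object with no remaining holders can be reclaimed.
        return {obj for obj, procs in self.holders.items() if not procs}

table = ReferenceTable()
table.add_ref("account_17", "client_a")
table.add_ref("account_17", "client_b")
table.remove_ref("account_17", "client_a")
print(table.garbage())   # set(): client_b still holds a reference
table.remove_ref("account_17", "client_b")
print(table.garbage())   # {'account_17'}
```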
The two-phase commit protocol is designed to reach agreement on committing or aborting a transaction across the servers of a distributed system. In the first phase, a coordinator asks every participant to vote on whether to commit or abort the transaction. In the second phase, the coordinator enforces the collective decision: the transaction commits only if every participant voted to commit, and otherwise all participants abort. This protocol ensures consistency and agreement across distributed transactions, even in the presence of failures.
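The two phases can be sketched in a few lines. This is a simplified illustration that assumes no messages are lost and no process crashes mid-protocol; timeouts and recovery logs are omitted, and the names `Vote`, `Participant` and `two_phase_commit` are hypothetical.

```python
from enum import Enum

class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"

class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote. A participant that votes to commit must be able
        # to commit later, so it moves to the "prepared" state.
        if self.can_commit:
            self.state = "prepared"
            return Vote.COMMIT
        self.state = "aborted"
        return Vote.ABORT

    def finish(self, decision):
        # Phase 2: carry out the coordinator's decision.
        self.state = "committed" if decision is Vote.COMMIT else "aborted"

def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes to commit."""
    votes = [p.prepare() for p in participants]
    decision = (Vote.COMMIT if all(v is Vote.COMMIT for v in votes)
                else Vote.ABORT)
    for p in participants:
        p.finish(decision)
    return decision

servers = [Participant("A"), Participant("B", can_commit=False)]
print(two_phase_commit(servers))       # Vote.ABORT
print([p.state for p in servers])      # ['aborted', 'aborted']
```

A single abort vote is enough to abort the transaction at every participant, which is what keeps the outcome uniform.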
Coordinated Universal Time (UTC) is an international standard for timekeeping that is based on atomic time. UTC signals are synchronized and broadcast regularly from land-based radio stations and satellites covering many parts of the world, providing a consistent reference time for synchronization across distributed systems.
External synchronization aligns the clocks of the processes in a distributed system with an authoritative external time source, such as UTC, so that all processes operate under a consistent time reference. For a synchronization bound D > 0 and a source S of UTC time, the clocks Ci are externally synchronized if |S(t) - Ci(t)| < D for every process i and for all real times t in a given interval I.
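The bound can be checked directly for a set of sampled times. The sketch below models clocks as plain functions of real time, which is an assumption made for illustration; `externally_synchronized` and `internally_synchronized` are hypothetical helper names. The second helper illustrates the standard consequence that clocks externally synchronized within D agree with each other within 2D.

```python
from itertools import combinations

def externally_synchronized(source, clocks, times, bound):
    """True if |S(t) - Ci(t)| < bound for every clock Ci and sampled time t."""
    return all(abs(source(t) - clock(t)) < bound
               for clock in clocks for t in times)

def internally_synchronized(clocks, times, bound):
    """True if |Ci(t) - Cj(t)| < bound for every pair of clocks and time t."""
    return all(abs(a(t) - b(t)) < bound
               for a, b in combinations(clocks, 2) for t in times)

# Hypothetical clocks with constant offsets (in milliseconds) from the source.
utc = lambda t: t
fast = lambda t: t + 2
slow = lambda t: t - 3
samples = range(0, 1000, 100)

print(externally_synchronized(utc, [fast, slow], samples, bound=5))  # True
print(externally_synchronized(utc, [fast, slow], samples, bound=3))  # False
print(internally_synchronized([fast, slow], samples, bound=10))      # True
```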
Clock skew is the instantaneous difference between the readings of two clocks. It arises because computer clocks, like all clocks, are imperfect. Clock drift occurs when clocks count time at different rates and therefore diverge from a reference clock, which is typical of crystal-based clocks. The drift rate is the change in the offset between a clock and an ideal reference clock per unit of time. Skew and drift cause inconsistencies and inaccuracies in time-based operations.
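The three quantities can be made concrete with a short calculation. The helper names below are hypothetical, and the drift rate of 10^-6 seconds per second is an illustrative figure of the order usually quoted for ordinary quartz clocks, not a measured value.

```python
import math

def skew(reading_a, reading_b):
    """Instantaneous difference between the readings of two clocks."""
    return reading_a - reading_b

def drift_rate(offset_start, offset_end, elapsed):
    """Change in offset from an ideal reference clock per unit of time."""
    return (offset_end - offset_start) / elapsed

def accumulated_offset(rate, elapsed):
    """Offset built up after `elapsed` seconds at a constant drift rate."""
    return rate * elapsed

# Two clocks read at the same instant (seconds).
print(skew(1000.25, 1000.0))                      # 0.25

# A clock whose offset grows from 0 s to 0.5 s over 1000 s of real time.
print(drift_rate(0.0, 0.5, 1000.0))               # 0.0005 s per second

# Assumed drift rate of 1e-6 s/s, accumulated over one day (86400 s).
one_day = accumulated_offset(1e-6, 86_400)
print(math.isclose(one_day, 0.0864))              # True: about 86 ms per day
```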
Checkpoints support failure recovery by providing a saved state of the system from which recovery can start after a failure. They let the system revert to a known good state, which limits both data loss and recovery time. A checkpoint algorithm typically saves the state of processes and messages at specific intervals or under specific conditions, so that after a crash or error the system can resume from the most recent checkpoint.
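The save-and-restore idea can be sketched for a single process. This is a deliberately small illustration: it keeps the checkpoint in memory rather than on stable storage and does not coordinate checkpoints across processes, and `CheckpointedCounter` is a hypothetical name.

```python
import copy

class CheckpointedCounter:
    """A process whose state can be saved and later restored."""
    def __init__(self):
        self.state = {"processed": 0, "total": 0}
        self._checkpoint = copy.deepcopy(self.state)

    def apply(self, value):
        self.state["processed"] += 1
        self.state["total"] += value

    def checkpoint(self):
        # Save an independent copy of the current state.
        self._checkpoint = copy.deepcopy(self.state)

    def recover(self):
        # Roll back to the most recent checkpoint after a failure.
        self.state = copy.deepcopy(self._checkpoint)

proc = CheckpointedCounter()
proc.apply(5)
proc.apply(7)
proc.checkpoint()
proc.apply(100)        # work done after the checkpoint
proc.recover()         # simulated crash: that work is lost
print(proc.state)      # {'processed': 2, 'total': 12}
```

Only the work done since the last checkpoint is lost, which is why the checkpoint interval trades runtime overhead against recovery cost.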
Two transactions are serially equivalent if all pairs of conflicting operations (for example, a read and a write, or two writes, on the same object) are executed in the same order at every object that both transactions access. This ensures that executing the transactions concurrently produces the same result as executing them one after the other, which preserves data integrity and consistency.
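The condition can be checked mechanically for an interleaving of two transactions. The sketch below assumes a schedule is a list of `(transaction, operation, object)` tuples and treats two operations as conflicting when they come from different transactions, touch the same object, and at least one is a write; the function names are hypothetical. It is a pairwise check, so it is only intended for two transactions.

```python
def conflict_orders(schedule):
    """Return the set of (first, second) transaction orders implied by
    each pair of conflicting operations in the schedule."""
    orders = set()
    for i, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[i + 1:]:
            if t1 != t2 and obj1 == obj2 and "write" in (op1, op2):
                orders.add((t1, t2))
    return orders

def serially_equivalent(schedule):
    """True if all conflicting pairs order the two transactions the same way."""
    orders = conflict_orders(schedule)
    return not any((b, a) in orders for (a, b) in orders)

# T finishes with object a before U touches it: equivalent to T then U.
good = [("T", "read", "a"), ("T", "write", "a"),
        ("U", "read", "a"), ("U", "write", "a")]

# Lost update: both read a before either writes it.
bad = [("T", "read", "a"), ("U", "read", "a"),
       ("T", "write", "a"), ("U", "write", "a")]

print(serially_equivalent(good))  # True
print(serially_equivalent(bad))   # False
```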
Strict two-phase locking ensures that transactions are serialized by making a transaction wait until it holds the locks it needs on the objects it accesses, with an exclusive lock required for every write. All locks acquired during a transaction are held until it commits or aborts, so no other transaction can read or modify data written by an uncommitted transaction, which prevents conflicts and keeps the data consistent. The cost is that transactions may have to wait for locked objects, which can reduce system throughput.
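The locking discipline can be sketched with a small lock table. This is a simplified, single-threaded illustration: a refused request returns `False` instead of blocking, deadlock handling is omitted, and `LockManager` and its methods are hypothetical names.

```python
class LockManager:
    """Shared read locks and exclusive write locks, all held until the
    owning transaction commits or aborts (strict two-phase locking)."""
    def __init__(self):
        self.readers = {}   # object -> set of transactions holding a read lock
        self.writer = {}    # object -> transaction holding the write lock

    def acquire_read(self, txn, obj):
        holder = self.writer.get(obj)
        if holder is not None and holder != txn:
            return False                      # must wait for the writer
        self.readers.setdefault(obj, set()).add(txn)
        return True

    def acquire_write(self, txn, obj):
        holder = self.writer.get(obj)
        if holder is not None and holder != txn:
            return False                      # another writer holds the lock
        if self.readers.get(obj, set()) - {txn}:
            return False                      # other readers hold the lock
        self.writer[obj] = txn
        return True

    def release_all(self, txn):
        # Called only at commit or abort: no lock is released earlier.
        for holders in self.readers.values():
            holders.discard(txn)
        for obj in [o for o, t in self.writer.items() if t == txn]:
            del self.writer[obj]

locks = LockManager()
print(locks.acquire_write("T", "x"))  # True
print(locks.acquire_read("U", "x"))   # False: U waits until T finishes
locks.release_all("T")                # T commits
print(locks.acquire_read("U", "x"))   # True
```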