INT1414 LamNhatMinh Baigiuaky CanDich
Involve the same transactions with the exact same set of operations for each transaction.
Order all pairs of conflicting operations from different transactions in the same way (i.e., have identical precedence graphs).
Since T2's operations in H1 (containing R2(x)) differ from T2's operations in H2, H3, and H4 (containing R2(z) instead), H1 cannot be conflict equivalent to H2, H3, or H4.
Now, let's compare H2, H3, and H4. These three histories do have the same set of operations for each transaction (T1, T2, T3). We need to compare their precedence graphs:
Since all their precedence graphs are distinct, H2, H3, and H4 are not conflict equivalent to each other.
Conclusion: none of the given histories is conflict equivalent to any other history in the list.
Which of the above histories H1-H4 are serializable? Refer to the following table, which shows the conflicting operations and their ordering in each of the histories:
H1 is not serializable, since it has the following serialization graph, which contains a cycle:
H2 is not serializable, since it has the following serialization graph, which contains a cycle:
H3 is serializable, since it has the following serialization graph, which is equivalent to a serial history:
H4 is serializable, since it has the following serialization graph, which is equivalent to the serial history T2 → T1 → T3:
Give a history of two complete transactions which is not allowed by a strict 2PL scheduler but is accepted by the basic 2PL scheduler.
Basic 2PL allows this: T1 can write X and release its lock on X (entering its shrinking phase) before committing. T2 can then acquire a lock on X, read the (uncommitted) value written by T1, and commit. T1 commits later.
Strict 2PL does NOT allow this: T1 must hold its exclusive (write) lock on X until T1 commits (C1). Therefore, T2 would be blocked and could not read X before T1 commits. The sequence where T2 reads X and commits (R2(X); C2) before T1 commits (C1) is impossible under strict 2PL.
H1: T3 reads from T1 (it reads x); T3 reads from T2 (it reads y).
Therefore, H3 is not recoverable.
H4: T3 reads from T2 (it reads x); T3 reads from T2 (it reads y).
(*) Give the algorithms for the transaction managers and the lock managers for the distributed two-phase locking approach.
2PC - Phase 1 (Prepare): Send Prepare to participating sites. Collect Vote_Commit or Vote_Abort.
If all vote Commit: Send Global_Commit to participants. Log commit.
Wait for Acks. Then send Release_Locks(TID) to relevant LMs.
Abort_Transaction(T): Send Global_Abort to participating sites. Log abort. Wait for Acks. Then send Release_Locks(TID) to relevant LMs.
Check the lock table for conflicts on X with other transactions.
If no conflict: Grant the lock, update the lock table, send Lock_Granted.
Else: Add the request to a wait queue for X. (Manage deadlocks if they arise.)
Remove all locks for TID from the lock table.
Check wait queues for data items freed by TID; grant compatible locks to waiting transactions.
If the site can commit T's changes locally (e.g., log updates): Send Vote_Commit.
(For strict 2PL, ensure locks are released only after the local commit is durable, often triggered by this or Release_Locks.)
(**) Modify the centralized 2PL algorithm to handle phantom reads. A phantom read occurs when two reads are executed within a transaction and the result returned by the second read contains tuples that do not exist in the first one. Consider the following example based on the airline reservation database discussed earlier in this chapter: transaction T1, during its execution, searches the FC table for the names of customers who have ordered a special meal. It gets a set of CNAME for customers who satisfy the search criteria. While T1 is executing, transaction T2 inserts new tuples into FC with the special meal request, and commits. If T1 were to re-issue the same search query later in its execution, it would get back a set of CNAME that is different from the original set it had retrieved. Thus, "phantom" tuples have appeared in the database.
When a transaction T issues a query with a predicate (e.g., SELECT CNAME FROM FC WHERE SpecialMeal = 'Yes'), the TM needs to communicate this predicate to the central Lock Manager.
Any individual tuples read can still be locked with regular shared locks if needed for other consistency reasons, though the predicate lock itself aims to prevent phantoms.
Request an exclusive lock on the specific tuple being modified.
The LM will verify whether this tuple (or its new/old state) falls within the range of any existing predicate locks held by other transactions. If so, T must wait until those predicate locks are released.
Lock Table Enhancement: The lock table must now store predicate locks. Each entry might look like (TID, TableName, Predicate, LockMode, Status).
Check for conflicting predicate locks or conflicting tuple locks. A new predicate lock P1 (from T1) conflicts with an existing predicate lock P2 (from T2) if their predicates are not mutually exclusive and their modes conflict (e.g., one is exclusive, or P1 is exclusive and P2 is shared on the same predicate range). More importantly, P1 conflicts if there is an existing exclusive lock on a tuple that satisfies P1.
Check whether any existing tuple that is exclusively locked by another transaction Tj satisfies the requested Predicate.
Store (TID, Table, Predicate, Mode, Granted) in the lock table.
The write operation must wait until Tj releases Pj. Add TID's request to a wait queue associated with Pj or Xtuple.
Remove all tuple locks and predicate locks held by TID from the lock table.
Process relevant wait queues for newly available tuples or predicate ranges.
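The core predicate-lock conflict check can be sketched as below. This is a hedged simplification (names are illustrative): predicates are reduced to a single attribute-equality test, whereas a real lock manager would test predicate satisfiability in general.

```python
# Entries mirror the (TID, TableName, Predicate, LockMode) shape from the text.
predicate_locks = []

def satisfies(row, predicate):
    """predicate is simplified to one (attribute, value) equality test."""
    attr, value = predicate
    return row.get(attr) == value

def conflicts(tid, table, row, mode):
    """Does touching `row` conflict with another transaction's predicate lock?"""
    for ltid, ltable, pred, lmode in predicate_locks:
        if ltid == tid or ltable != table:
            continue  # own locks and other tables never conflict
        # a write into a locked range, or any access against an X predicate
        if satisfies(row, pred) and (mode == "X" or lmode == "X"):
            return True
    return False

# T1 holds a shared predicate lock on FC WHERE SpecialMeal = 'Yes'
predicate_locks.append(("T1", "FC", ("SpecialMeal", "Yes"), "S"))

new_row = {"CNAME": "Smith", "SpecialMeal": "Yes"}
print(conflicts("T2", "FC", new_row, "X"))  # True: insert would be a phantom
other = {"CNAME": "Jones", "SpecialMeal": "No"}
print(conflicts("T2", "FC", other, "X"))    # False: outside the predicate
```

T2's blocked insert is exactly the phantom from the airline example: the new tuple satisfies T1's search predicate, so it must wait until T1 releases the predicate lock.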
Clock Tick Interval: The clock ticks every 0.1 seconds. This is the smallest time difference the system can natively distinguish using just the clock value.
Site A's clock drifts the maximum permissible amount fast. Let this drift be Dmax seconds per 24 hours.
Site B's clock drifts the maximum permissible amount slow. Let this drift also be Dmax seconds per 24 hours. The total relative drift between the clock at Site A and the clock at Site B over 24 hours would be Dmax − (−Dmax) = 2 × Dmax.
Condition for Successful Synchronization: To ensure that an event at Site A that occurs globally just before an event at Site B (but close enough that their ideal timestamps might be only one tick apart) is not incorrectly timestamped as later, the maximum relative drift (2 × Dmax) must be strictly less than one clock tick interval. If the relative drift were equal to or greater than a tick, an earlier event on a fast-drifting clock could appear to have a later timestamp tick value than a slightly later event on a slow-drifting clock, potentially disrupting the transaction order if site IDs weren't used or if the clock values themselves were expected to maintain order for non-simultaneous events.
The maximum permissible drift per 24 hours for any single local site (relative to a perfect clock) is less than 0.05 seconds.
(**) Incorporate the distributed deadlock strategy described in this chapter into the distributed 2PL algorithms that you designed in Problem 5.5.
Explain the relationship between transaction manager storage requirement and transaction size (number of operations per transaction) for a transaction manager using an optimistic timestamp ordering for concurrency control.
For a transaction manager using optimistic timestamp ordering, its per-transaction storage requirement is directly proportional to the transaction size (number of operations). This is primarily due to the need to maintain two key sets for each transaction until the validation phase:
The TM must store identifiers of all data items read by the transaction, along with their versions or timestamps at the time of reading.
A larger number of read operations on distinct items means a larger Read Set, thus requiring more storage.
A larger number of write operations, or writes involving large data values, means a larger Write Set, thus requiring more storage.
During the validation phase, the TM uses these stored Read and Write Sets to check for conflicts with other concurrently running or recently committed transactions. If the transaction commits, the Write Set is used to update the database.
Therefore, as a transaction performs more read or write operations (i.e., its size increases), the TM needs correspondingly more temporary storage to track these operations and their associated data/metadata for that specific transaction.
(*) Give the scheduler and transaction manager algorithms for the distributed optimistic concurrency controller described in this chapter.
Assign TID; initialize empty local Read Set (RSi) and Write Set (WSi).
For each participating site Sp, send Validate_Request(TID, ts(Ti), RSi^Sp, WSi^Sp) to its SC.
Local Validation: Check Ti^Sp's RSi^Sp and WSi^Sp for conflicts against other transactions Tk, based on their Write Sets and execution/commit phases relative to Ti^Sp (e.g., ensuring RSi^Sp did not read items subsequently overwritten by a committed Tk before Ti's validation, and WSi^Sp does not conflict with WS(Tk^Sp) if the phases overlapped significantly).
Apply the writes from WSi^Sp to the local database items, using tscommit(Ti) as the version timestamp.
Discard any state for Ti^Sp (no writes were made to the DB yet).
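The local validation step can be sketched as a backward-validation check (a simplification; names are illustrative): the validating subtransaction's read set must not intersect the write set of any transaction that committed during its read phase.

```python
def validate(read_set, write_set, committed_overlapping):
    """read_set/write_set: item sets of the validating subtransaction.
    committed_overlapping: write sets of transactions that committed after
    this transaction started its read phase (the Tk of the text)."""
    for other_ws in committed_overlapping:
        if read_set & other_ws:
            return "Vote_Abort"  # read an item that was later overwritten
    return "Vote_Commit"

rs, ws = {"x", "y"}, {"z"}
print(validate(rs, ws, [{"a"}]))       # Vote_Commit: no overlap
print(validate(rs, ws, [{"y", "b"}]))  # Vote_Abort: y was overwritten
```

On Vote_Commit the site applies the write set with tscommit(Ti) as the version timestamp; on Vote_Abort it simply discards the transaction's state, since no writes touched the database yet.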
If a distributed execution model were used for Distributed 2PL (D2PL), the transaction manager and lock manager algorithms would adapt to a more hierarchical or decentralized control flow for individual transactions. In this model, a global transaction T might be decomposed into several subtransactions, each potentially coordinated by a local Transaction Manager (TM) at the site where its operations are primarily executed.
The TM role would likely split into a Global TM (GTM) for the overall transaction and Local TMs (LTMs) for subtransactions at participating sites.
Global TM (Originating/Coordinating Site for the parent transaction T):
Instead of directly requesting all locks, it delegates lock acquisition for operations within a subtransaction to the respective LTM.
Manages the commit/abort protocol hierarchically (e.g., nested 2PC). It acts as the coordinator for the LTMs.
Commit: Sends Prepare to the LTMs. If all LTMs vote Commit, the GTM sends Global_Commit to the LTMs. Locks are released after the global commit is finalized.
Abort: If any LTM votes Abort, or the GTM decides to abort, it sends Global_Abort to the LTMs.
Still responsible for assigning global transaction IDs and timestamps (if used for deadlock resolution).
Lock Acquisition: Requests the necessary locks from the Lock Manager(s) (LM) at the site(s) where Tj's data resides (often its local LM). It acts like the original D2PL TM but for a smaller scope.
Reporting to GTM: Reports status (e.g., readiness to commit) to the GTM.
On Prepare from GTM: Ensures its subtransaction Tj is ready (local logs flushed, etc.), then sends Vote_Commit or Vote_Abort to the GTM.
On Global_Commit from GTM: Instructs local LM/DPs to commit Tj's changes and eventually release the locks associated with Tj (on behalf of the global T).
On Global_Abort from GTM: Instructs local LM/DPs to abort Tj's changes and release the locks.
Receives lock requests primarily from LTMs coordinating subtransactions at its site, or from LTMs at other sites if a subtransaction needs to access data managed by this LM.
Locks are still acquired on behalf of the global transaction ID to ensure overall isolation.
Global Deadlock Detection: Becomes more complex. Paths in Wait-For Graphs might involve chains of subtransactions waiting for each other across different sites. For example, Ti,siteA → Tj,siteA (subtransactions of global Ti, Tj at site A), where Tj,siteA might be waiting for another subtransaction Tj,siteB of the same global transaction Tj, which in turn waits for Tk,siteB.
Locks are typically still held until the Global TM decides on the final commit or abort status of the parent transaction T. The Release_Locks command would originate from the GTM and cascade through the LTMs to the LMs.
Yes, serializability can be quite restrictive. There are distributed histories that are correct—meaning they maintain local database consistency and overall mutual consistency—but are not serializable. These often occur when operations have semantic properties (like commutativity) that serializability doesn't consider, or when slightly relaxed consistency models are acceptable.
R1(Y from S2, reading Y0+2); (Note: T1 sees T2's update to Y.)
R2(X from S1, reading X0+1); (Note: T2 sees T1's update to X.)
Local Consistency: Each site performs its increments correctly based on the values read. For example, Site S1 correctly updates X from X0 to X0+1 by T1, and then from X0+1 to X0+3 by T2. The integrity of local data is maintained.
Mutual Consistency: The final state (X = X0+3, Y = Y0+3) is the same state that would be achieved by any serial execution of T1 and T2: (Y0+2)+1 = Y0+3. Since the non-serializable history H yields a final state equivalent to all possible serial executions for these specific operations (due to the commutative nature of the increments), it can be considered "correct" for this application.
We need to look at the conflicts between the low-level operations:
For data item X (at Site S1): T1's write (W1(X)) occurs before T2's read (R2(X)) of X. This establishes a dependency T1 → T2.
For data item Y (at Site S2): T2's write (W2(Y)) occurs before T1's read (R1(Y)) of Y. This establishes a dependency T2 → T1.
The precedence graph contains a cycle: T1 → T2 → T1. Therefore, history H is not serializable.
(*) Discuss the site failure termination protocol for 2PC using a distributed communication topology.
The coordinator and the other participants will time out waiting for Pj's vote.
Thus, the coordinator and all other operational participants will decide to abort the transaction.
If Pj fails after sending its vote (e.g., Vote-Commit, entering the READY state) but before making a final local decision:
Pj's vote has already been broadcast to the other participants and the coordinator.
The other operational sites will proceed with their decision-making based on Pj's vote and all the other votes. The transaction will reach a consistent outcome (commit or abort) at all operational sites.
When Pj recovers, it will be in an uncertain (READY) state. It must query other participants or the coordinator to determine the global transaction outcome and then act accordingly (commit or abort locally). The distributed topology allows Pj to ask any other participant.
This is the more complex scenario and highlights 2PC's potential for blocking.
Participants will time out waiting for any message and will abort the transaction unilaterally.
If C fails after sending Prepare but before its own decision/vote is effectively known to all participants for them to make a final decision:
Participants might have already sent their votes to each other.
If Pi is in the INITIAL state (it hasn't voted yet, perhaps because it didn't receive Prepare due to C's failure during the broadcast): It can time out and abort unilaterally.
If Pi has received Prepare and has collected votes from the other participants:
If Pi receives a Vote-Abort from any other participant Pk: Pi can immediately decide to abort the transaction, log it, and inform the other participants if necessary.
If Pi has received Vote-Commit from all other known operational participants and has itself voted Commit (is in the READY state):
This is the blocking scenario. Pi (and all other operational participants in READY) cannot unilaterally decide to commit, because the failed coordinator (or another failed participant whose vote was not yet received by all) might have caused an abort. They also cannot unilaterally abort, because the coordinator (and all other participants) might have decided to commit.
The participants are blocked until the coordinator recovers or a more advanced mechanism (like electing a new coordinator, which has its own complexities and limitations in standard 2PC) is employed. The distributed communication helps them all realize they are in this blocked state, but it does not inherently resolve it in standard 2PC.
If C fails after its global decision (Commit/Abort) could be inferred or was known by at least one participant (e.g., its vote was part of the distributed exchange, enabling a decision):
Participants in the READY state that haven't yet made a final decision can query the other participants.
If any participant Pk determined the final outcome (e.g., it received all the votes, including the coordinator's implicit or explicit one, and decided), it can inform Pi. Pi can then proceed to commit or abort.
If no participant could independently determine the outcome before C's failure became critical (e.g., C's vote was the last one needed for a commit decision), they remain blocked.
(+) Enhanced Information Sharing: Participants can directly query each other's states or votes if the coordinator fails, potentially resolving uncertainty faster than in a centralized 2PC, where they can only talk to the coordinator.
(-) Does Not Eliminate Blocking: The fundamental blocking issue of 2PC (when all operational participants are READY but the coordinator's final decision or a critical missing vote prevents resolution) is not solved by simply distributing participant communication. Participants still cannot safely commit or abort unilaterally in this specific state without risking inconsistency.
To overcome the blocking issue, a non-blocking commit protocol like Three-Phase Commit (3PC) would be required.
(*) Design a 3PC protocol using the linear communication topology.
Forward Pass (Pi from Pi−1 to Pi+1; PN initiates the backward pass):
Each Pi: If it can commit, logs READY and forwards the "Vote-Yes" signal. If it aborts, logs ABORT and forwards the "Vote-No" signal.
Each Pi: Forwards the aggregated vote. If any "Vote-No" was received or sent, Pi logs ABORT and forwards "Vote-No".
If C receives "Vote-Yes" from P1: All participants are READY. Proceed to Phase 2.
If C receives "Vote-No": At least one participant aborted. Initiate Global Abort (Phase 3b).
Coordinator (C): Logs PRE_COMMIT. Sends PRE_COMMIT_ORDER to P1. Enters PRECOMMIT_WAIT.
Forward Pass (Pi from Pi−1 to Pi+1; PN initiates the backward pass):
Each Pi (if READY): Logs PRE_COMMITTED. Enters PRECOMMIT. Forwards PRE_COMMIT_ORDER.
If C receives ACK_PRE_COMMIT from P1: All participants are in PRECOMMIT. Proceed to Phase 3a.
If C times out: Initiate Global Abort (Phase 3b).
Phase 3a: Global Commit
Coordinator (C): Logs COMMIT. Sends GLOBAL_COMMIT_ORDER to P1. Enters COMMITTED.
Forward Pass (Pi from Pi−1 to Pi+1):
Each Pi (if PRECOMMIT): Commits locally (applies changes, releases locks). Logs COMMITTED. Enters COMMITTED. Forwards GLOBAL_COMMIT_ORDER.
(PN completes the chain.)
Phase 3b: Global Abort
Coordinator (C): Logs ABORT. Sends GLOBAL_ABORT_ORDER to P1. Enters ABORTED.
Each Pi: Aborts locally (discards changes, releases locks). Logs ABORTED. Enters ABORTED. Forwards GLOBAL_ABORT_ORDER.
Participant Failure: The chain is broken. The Coordinator (if alive) or an elected new Coordinator (among the participants) takes charge. It polls the reachable participants for their states.
If all reachable participants are PRECOMMIT (or some mix of READY and PRECOMMIT): Drive to PRECOMMIT, then COMMIT.
If all reachable participants are READY (no PRECOMMIT): Drive to ABORT (safest) or attempt PRECOMMIT.
(*) In our presentation of the centralized 3PC termination protocol, the first step involves sending the coordinator's state to all participants. The participants move to new states according to the coordinator's state. It is possible to design the termination protocol such that the coordinator, instead of sending its own state information to the participants, asks the participants to send their state information to the coordinator. Modify the termination protocol to function in this manner.
The modification involves changing the centralized 3PC termination protocol so that when recovery or timeout handling is initiated by the Coordinator (C), it polls the participants for their states instead of immediately sending its own. This allows C to make a decision based on the collective state of the system.
Here is how the modified termination protocol (primarily from the perspective of Coordinator C recovering, or of a participant initiating contact with C) would function:
Modified Centralized 3PC Termination Protocol
A Participant (Pi) times out waiting for a message from C and contacts C.
Coordinator-Driven Termination (e.g., upon C's recovery):
C consults its own log for the last known state of T.
If C's log shows a definitive final state (COMMITTED or ABORTED), C resends the corresponding final message (Global_Commit or Global_Abort) to any participant that hasn't acknowledged it. The protocol then effectively terminates for T.
If C's log shows an intermediate state (e.g., WAIT, PRECOMMIT_DECISION, PRECOMMIT_WAIT) or is unclear, C polls the participants for their states.
C waits for Report_State(TID, P_k_State) from each participant, collecting their current states (e.g., INITIAL, READY, PRECOMMIT, COMMITTED, ABORTED).
C sends Global_Commit(TID) to all participants (especially those not yet COMMITTED).
Else if all responsive Pk report READY (none are PRECOMMIT):
Upon receiving Ack_PreCommit from all (or timing out), C proceeds: if all acknowledge, C sends Global_Commit(TID); if it times out, C sends Global_Abort(TID).
Else (e.g., all responsive Pk are INITIAL, or some are INITIAL and some READY, but C's own state indicates it never passed WAIT): C sends Global_Abort(TID).
Participant-Initiated Termination (e.g., Pi times out and queries C):
Participant Pi (in state SPi, e.g., READY or PRECOMMIT) sends Query_Outcome(TID, My_State_is_S_Pi) to C.
If C knows the final outcome for T (its state is COMMITTED or ABORTED): C sends the corresponding global message (Global_Commit or Global_Abort) back to Pi.
If C is uncertain (e.g., it has just recovered and hasn't finished polling, or is in an intermediate state): C can use Pi's reported state as input. It may trigger the Coordinator-Driven Termination procedure (polling the other participants if needed) to determine the global outcome and then inform Pi.
This modified approach ensures the coordinator makes its recovery decision based on the most up-to-date information from the entire system, reducing the chances of making a decision that conflicts with a state already reached by a participant (though 3PC is designed to prevent such irreconcilable conflicts).
(**) In Sect. 5.4.6 we claimed that a scheduler which implements a strict concurrency control algorithm will always be ready to commit a transaction when it receives the coordinator's "prepare" message. Prove this claim.
A scheduler implementing a strict concurrency control algorithm ensures that a transaction T1 cannot read or write a data item X that has been written by another transaction T2 until T2 commits or aborts. Crucially, this means that if T1 writes X, it holds an exclusive lock (or an equivalent mechanism) on X until T1 itself commits or aborts.
Let T be a transaction for which a coordinator sends a Prepare message to a participant scheduler S. For S to be "ready to commit," it must be able to make T's changes permanent if the global decision is to commit, without being forced to abort due to dependencies on other active (uncommitted) transactions.
According to the definition of strictness, if X was previously written by another transaction T′, T′ must have already committed for T to be allowed to read X. T cannot read data written by an active (uncommitted) transaction (no "dirty reads").
Therefore, the data read by T is stable and will not be rolled back due to a subsequent abort of another transaction. T's validity is not dependent on any other uncommitted transaction through its reads.
Under strictness, T must have acquired an exclusive lock on X before writing, and this lock is held until T commits or aborts.
This means no other transaction T′ can read or write X while T is active and uncommitted.
Thus, T's writes are isolated and have not been seen or overwritten by any other active transaction.
When the scheduler S receives the Prepare message for T:
All of T's operations at site S have been successfully executed according to the strict concurrency control rules.
Its reads (RT(X)) are from already committed data.
Its writes (WT(X)) are protected by locks that prevent other active transactions from accessing them.
There is no risk of cascading aborts involving T that would be triggered by another transaction aborting, because T has not read any uncommitted data.
Since T's state at site S is not contingent on the outcome of any other active transaction, the scheduler S has no data consistency reason (stemming from dependencies on other uncommitted work) to refuse to commit T's local operations. All necessary locks are held by T, and all data it has read is from a committed state. Therefore, if T has completed its operational phase successfully under these strict rules, the scheduler S is in a position to guarantee that T's local changes can be made permanent.
Thus, a scheduler implementing a strict concurrency control algorithm will always be "ready to commit" a transaction when it receives the coordinator's Prepare message, meaning it can vote "yes" to the prepare request. (This readiness is from a data consistency standpoint; other issues like resource exhaustion are separate.)
(**) Assuming that the coordinator is implemented as part of the transaction manager and the participant as part of the scheduler, give the transaction manager, scheduler, and local recovery manager algorithms for a nonreplicated distributed DBMS under the following assumptions.
The scheduler implements a distributed (strict) two-phase locking concurrency control algorithm.
The commit protocol log records are written to a central database log by the LRM when it is called by the scheduler.
The LRM may implement any of the protocols that have been discussed (e.g., fix/no-flush or others). However, it is modified to support the distributed recovery procedures as we discussed in Sect. 5.4.6.
The TM at the originating site coordinates the global transaction.
Wait for Lock_Granted(TID, X) from SCSX. (Handle timeouts or deadlock abortion signals.)
If the lock is granted, instruct the Data Processor (DP) at SX to perform the actual read/write.
Phase 1 (Prepare): Send a Prepare(TID) message to the Schedulers (SCpart) at all participating sites.
Set a timer and wait for Vote(TID, decision) (COMMIT/ABORT) from all SCpart.
Send Global_Abort(TID) to all participating SCpart (if any operations were sent).
Request its local LRM to log (TID, END_TRANSACTION).
Scheduler (SC) Algorithm (Participant, includes Lock Manager)
The SC at each participating site manages local locks (strict D2PL) and acts as a 2PC participant.
Grant lock: Update the lock table (e.g., add (TID, X, Shared/Exclusive)).
Else: Add the request to a wait queue for X. (Manage local deadlocks or participate in distributed deadlock detection.)
If TID can commit locally (all operations completed, no local integrity violations):
Request local LRM.LogWrite(TID, 'PREPARE'). This log record must be forced to stable storage.
Release all locks held by TID at this site (due to strictness, locks are held until this point).
On Timeout in the READY state: Initiate the participant termination protocol (e.g., contact the coordinator; if the coordinator is down, contact the other participants as per the defined protocol – though 2PC can still block here).
Local Recovery Manager (LRM) Algorithm (At each site)
The LRM manages the site's log and performs recovery.
Called by the local SC (for 2PC records: PREPARE, COMMIT, ABORT, END_TRANSACTION) or by the DP (for BEGIN and operation-specific redo/undo info).
Analysis Pass: Scan the log to identify all transactions, their last known states, dirty pages, etc.
Redo Pass: Redo the operations of all transactions that have a COMMIT log record, to ensure their changes are on disk (especially for fix/no-flush type policies).
Undo Pass: Undo the operations of transactions that have no COMMIT or PREPARE record, or that have an ABORT record.
Identify In-Doubt Transactions: For any transaction TID that has a (TID, 'PREPARE') log record but no subsequent (TID, 'COMMIT') or (TID, 'ABORT') record:
The LRM must not unilaterally commit or abort TID.
The LRM informs the local SC about TID's in-doubt status.
The SC then initiates the termination protocol for TID (by contacting its Coordinator TM) to determine the final global outcome. Locks for in-doubt transactions are typically reacquired based on log information until their fate is known.
This integrated approach ensures that local operations adhere to strict 2PL, the commit protocol ensures atomicity via 2PC with LRM logging, and the LRM's recovery procedure correctly handles distributed transactions, especially those in an "in-doubt" state.
[CẦN DỊCH]: (*) Write the detailed algorithms for the no-fix/no- flush local recovery
manager.
[CẦN DỊCH]: Create UPDATE log record (with OldValue for undo, NewValue for redo). Add
to log buffer.
[CẦN DỊCH]: Modify data page in memory buffer. (Page is now dirty; NO-STEAL prevents it
from being written to disk if transaction is uncommitted).
[CẦN DỊCH]: AŁer log force, acknowledge commit. (Modified pages are now “committed
dirty” and can be flushed later due to NO-FORCE).
[CẦN DỊCH]: Append ABORT log record to log buffer; force log records for this transaction
(up to ABORT) to stable log.
[CẦN DỊCH]: Perform UNDO: Scan transaction’s log records backward. For each UPDATE:
[CẦN DỊCH]: AŁer UNDO, append END log record; force CLRs and this END record to stable
log.
[CẦN DỊCH]: Write a CHECKPOINT_END log record containing snapshots of the Active
Transaction Table (ATT) and Dirty Page Table (DPT); force to stable log.
[CẦN DỊCH]: (No data pages are forced to disk by the checkpoint itself).
Find the last checkpoint. Initialize ATT and DPT from it.
Reconstruct ATT (transaction statuses: running, committed, aborted) and DPT (dirty pages and their RecoveryLSN - the earliest LSN causing dirtiness) at the point of crash.
Identify “loser” transactions (not committed). For distributed systems, identify “in-doubt” transactions (logged PREPARE but no final outcome).
Start the scan from the earliest RecoveryLSN in the DPT. Scan the log forward.
For each logged UPDATE or CLR: If the change recorded in the log is not yet on the disk page (check the page’s LSN if available, or whether the page is in the DPT with RecoveryLSN ≤ the current log record’s LSN), reapply the change (NewValue) to the page in memory.
For each “loser” transaction (excluding in-doubt transactions, which require distributed resolution):
Scan its log records backward from its last LSN.
For UPDATE records: Restore OldValue to the page in memory; write and force a CLR to the log.
For CLRs: Follow the NextUndoLSN pointer in the CLR to find the previous operation to undo.
After completing UNDO for a loser, write and force its END log record.
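The three restart passes above can be sketched compactly. This is a deliberately simplified illustration: it reapplies every UPDATE during REDO instead of comparing page LSNs, ignores in-doubt transactions, and uses an assumed `(lsn, type, tid, item, old, new)` record format.

```python
def restart(log, disk):
    # Analysis pass: losers are updaters with no COMMIT record
    committed = {r[2] for r in log if r[1] == "COMMIT"}
    losers = {r[2] for r in log if r[1] == "UPDATE"} - committed

    # Redo pass: reapply logged UPDATEs in forward order (under NO-FORCE
    # even committed changes may be missing from disk)
    for _lsn, _typ, _tid, item, _old, new in (r for r in log if r[1] == "UPDATE"):
        disk[item] = new

    # Undo pass: roll back losers in backward order, restoring OldValues
    for _lsn, typ, tid, item, old, _new in reversed(log):
        if typ == "UPDATE" and tid in losers:
            disk[item] = old
    return losers
```

A production restart would also write CLRs during the undo pass and END records for each loser, as described above.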
The LRM implements the no-fix/no-flush protocol. Give detailed algorithms for the transaction manager, scheduler, and local recovery managers.
Requests the appropriate lock for X from the Central Scheduler (SC/CLM).
On lock grant, instructs the Data Processor (DP) at X’s site to perform the operation. (The DP ensures its LRM logs writes.)
Initiates 2PC: Sends Prepare to the SC/CLM (for locks) and to the LRMs at data sites (where writes occurred).
Collects votes. If all COMMIT: Sends Global_Commit. Else: Sends Global_Abort.
Sends Global_Abort to the SC/CLM and relevant LRMs. Logs the abort.
Else: Adds the request to the wait queue. (Performs centralized deadlock detection; aborts a victim if needed.)
On Prepare(TID): Votes COMMIT (as it is always ready from a data perspective).
Local Recovery Manager (LRM) - At each Data Site (No-Fix/No-Flush: STEAL/NO-FORCE)
Log_Update(TID, Page, OldVal, NewVal): (Called by the local DP for writes)
Appends an UPDATE log record to the buffer. (Page becomes dirty in memory; under no-fix/STEAL it may be written to disk before TID commits, so OldVal is kept for undo.)
2PC Participation (for data changes, if the site had writes):
On Prepare(TID): Forces all log records for TID (including UPDATEs) and a new PREPARE_DATA record to stable log. Votes COMMIT.
On Global_Commit(TID): Forces a COMMIT_DATA log record. (Data pages are not forced now, due to NO-FORCE.) Acknowledges.
On Global_Abort(TID): Forces an ABORT_DATA log record. Performs local UNDO for TID (writes & forces Compensation Log Records - CLRs), then forces an END_DATA record. Acknowledges.
Checkpoint(): Fuzzy checkpoint: Forces log records for Active Transaction Table (ATT) & Dirty Page Table (DPT) snapshots to disk. No data pages are forced.
REDO: Reapplies logged changes (UPDATEs/CLRs) for all transactions from a calculated start point if not yet on disk (due to NO-FORCE).
UNDO: For “loser” transactions (not in-doubt): Rolls back changes using OldValues, writing and forcing CLRs and then an END record. (Required because STEAL may have put uncommitted changes on disk.)
In-doubt transactions are reported to their TM coordinator for resolution.
For each of the four replication protocols (eager centralized, eager distributed, lazy centralized, lazy distributed), give a scenario/application where the approach is more suitable than the other approaches. Explain why.
Eager Centralized — Best for: applications demanding financial accuracy (e.g., account balances). Why: Ensures immediate, strong consistency vital for financial accuracy, with a master as the single source of truth for updates.
Eager Distributed — Best for: Real-time Critical State Management (e.g., Telecom Switch Sessions in a small cluster). Why: Guarantees all nodes have an identical, perfectly synchronized state instantly, crucial where any discrepancy is catastrophic.
Lazy Centralized — Best for: E-commerce Product Catalogs or Content Management Systems. Why: The master allows fast updates by administrators; numerous replicas provide high read scalability for users who can tolerate slight data staleness.
Lazy Distributed — Best for: Collaborative Tools with Offline Capabilities (e.g., Distributed Document Editors). Why: Maximizes write availability and local performance, allowing users to work offline; data syncs later, prioritizing access over immediate consistency.
(a) Customer Service Group Data Replication Strategy:
For the customer service group’s PC cluster, full replication of the ITEM, STOCK, CUSTOMER, CLIENT-ORDER, ORDER, and ORDER-LINE tables to each PC is recommended. This ensures that all data required for queries and processing orders is locally available on each employee’s machine, maximizing query speed.
Given the need for flexibility (any employee, any client), consistent updates (especially for STOCK and ORDER processing), and leveraging local databases, an Eager Distributed protocol (like Read-One/Write-All - ROWA, or Read-One/Write-All-Available - ROWA-A, using two-phase commit or a consensus algorithm) would be most suitable.
Strong Consistency: Updates (orders, payments, stock changes) are applied to all replicas within the same transaction. This ensures all employees see consistent data, preventing issues like overselling stock or incorrect billing.
Fast Local Queries: With full local copies, the 80% query workload is served directly from each PC’s database, ensuring speed.
Flexibility & Write Distribution: Any PC can initiate an update. The protocol ensures this update is consistently applied across the cluster. This fits the requirement that “each employee must be able to work with any of the clients” without a central bottleneck for initiating writes.
Leverages PC Cluster: This approach fully utilizes the “cluster of PCs each equipped with its own database” architecture, as each PC participates in maintaining the consistent global state.
Alternatively, under an Eager Centralized (primary copy) model, different PCs in the cluster could be designated as “masters” for different subsets of data (e.g., specific warehouses or customer segments). Writes would be directed to the relevant master, which then updates the slaves. Reads remain local. However, pure Eager Distributed better aligns with the “any employee, any client, any PC” flexibility.
For management’s analytical purposes, data from the operational systems (specifically ITEM, STOCK, CUSTOMER, CLIENT-ORDER, ORDER, and ORDER-LINE) should be replicated into a centralized data warehouse or analytical database using a lazy centralized (single-master) approach. This database would be specifically structured and optimized for complex queries and reporting.
Analytical Workload: The primary goal is analysis of historical trends and consumer behavior, not real-time transaction processing. Data does not need to be millisecond-current; a delay (e.g., data refreshed nightly or several times a day) is usually acceptable for quarterly strategic decisions.
Batch Updates (ETL): Updates to the data warehouse would typically occur in batches (e.g., via Extract, Transform, Load - ETL processes). These “refresh transactions” are propagated from the operational “master” systems to the “slave” analytical database after operational transactions have committed.
Performance Isolation: This approach isolates the heavy analytical queries from the operational systems, preventing performance degradation of the customer-facing applications.
Data Transformation: Replication into a data warehouse often involves transforming and aggregating data to better suit analytical needs, which is a hallmark of lazy, batch-oriented processes.
The consistency guarantee for the data warehouse is that it accurately reflects the state of the operational data as of the last refresh time. Strong consistency is vital during each batch update to the warehouse, but the overall relationship with the live data is eventually consistent.
(*) An alternative to ensuring that the refresh transactions can be applied at all of the slaves in the same order in lazy single master protocols with limited transparency is the use of a replication graph as discussed in Sect. 6.3.3. Develop a method for distributed management of the replication graph.
A distributed method for managing the replication graph, as used in lazy primary copy (or single master with multiple data item types) protocols to ensure 1SR by ordering refresh transactions, can be developed as follows. The goal is to allow a new transaction Tk to proceed only if its execution and subsequent refresh operations won’t create a serialization anomaly with already committed transactions whose refresh operations are still propagating.
Core Idea & Graph Elements
Graph: The conceptual global replication graph consists of:
Nodes: Transactions (original update transactions and, implicitly, their refresh transactions).
Edges: Directed edges (Ta → Tb) represent a precedence, meaning Ta must be serialized before Tb in the global history. These precedences arise from conflicting accesses to common data items, both at the primary sites and from the order in which refresh transactions are applied at slaves.
Distributed Nature: No single site holds the entire graph. Each site (PMs and Slaves) maintains a partial view relevant to its operations.
When a PM (say PMa) commits an original update transaction Ti:
It records Ti and its serialization dependencies with other transactions previously committed at PMa (e.g., Tp → Ti).
It disseminates information about Ti (its unique ID, the data items written, its direct local precedences) to other PMs and relevant Slave sites. This can be done optimistically after commit or as part of a validation step for future transactions.
It sends refresh transactions (RTi) for Ti to all relevant Slave sites.
When a Slave site (Sj) receives and processes RTi:
It uses its local concurrency control to determine the serialization order of RTi with respect to other refresh transactions it is handling (e.g., establishing a local order RTp → RTi or RTi → RTq).
Sj records these locally established precedences. This information contributes to the global graph view.
Sj informs other relevant sites (e.g., involved PMs or a set of graph validators) about significant new precedences it has enforced, especially if they involve refresh transactions from different PMs.
Distributed Management & Cycle Detection for a New Transaction Tk
This process is typically initiated by the PM (say PMx) where a new transaction Tk is submitted, before Tk is committed.
Step 1: Identify Tk’s Access Set (at PMx)
PMx identifies the data items Tk will access and the operations (read/write).
Step 2: Distributed Information Gathering (by PMx as coordinator for Tk’s validation)
PMx queries other PMs and relevant Slave sites. The query asks for:
Information about recently committed transactions (Tc) that conflict with Tk (i.e., access common data items with at least one write).
The currently known precedences involving these Tc transactions, including their status (e.g., Tc committed at PMy, RTc sent to slaves S1, S2, RTc processed at S1 establishing RTc → RTd).
The status of refresh-transaction propagation for these conflicting Tcs (i.e., which slaves are still pending).
Step 3: Local Graph Construction & Cycle Check (at PMx)
PMx constructs a temporary, relevant portion of the global replication graph using:
Information about Tk and its potential local precedences at PMx.
Information gathered from other sites about conflicting Tcs and their established/potential precedences at various slaves.
PMx tentatively adds Tk and its implied dependencies to this partial graph. For example, if Tk writes item X and Tc (already committed) also wrote item X, any slave receiving both RTk and RTc will serialize them in some order. PMx needs to consider these potential future slave-side serializations.
PMx checks this constructed graph portion for cycles. A cycle indicates that Tk’s execution could lead to a non-1SR history.
No Cycle: Tk can proceed to execute and commit at PMx. Upon Tk’s commit, PMx disseminates information about Tk and its established precedences, allowing other sites to update their partial views of the replication graph.
Cycle: Tk is delayed or aborted (and may be resubmitted later), since committing it could violate 1SR.
Sites continuously update their partial views based on:
Processing of refresh transactions at slaves and the new precedences established there (disseminated by slaves or inferred).
Completion of refresh-transaction propagation for a given transaction (once all slaves acknowledge, that transaction’s role in forming cycles due to “in-flight” refreshes diminishes).
Timestamps: Logical clocks (Lamport or vector timestamps) assigned by PMs to transactions can help in default ordering, reducing ambiguity, and potentially simplifying cycle detection or resolution.
Validator Nodes: A designated set of validator nodes could be responsible for maintaining a more comprehensive (though still potentially slightly delayed) view of the graph and assisting in cycle-detection queries, rather than PMx querying all sites.
This distributed method avoids a single point of failure for graph management and allows for concurrent transaction processing, with checks performed dynamically as new transactions contend for resources and serialization order.
There are many possible vote assignments. We will discuss a very simple case where we assign one vote to each of the sites. Then for both x and y, V = 3, and we can set, for example,
Vr = Vw = 2
Note: It is important to note that when votes are being assigned to sites, they are actually being assigned to replicas. Assigning them directly to sites would give, in this case, V = 4. Although this works in this particular case, it does not work in general.
For example, consider the case where there are 100 sites and 10 replicas. Setting V = 100, with Vr = 50 and Vw = 51, is wrong because there is no way to obtain 50 or 51 votes from 10 replicas (assuming no additional weight is given to some replicas).
Also, it is dangerous (in this example) to set Vr = Vw = 3. Again, this would work in this case, but it would severely restrict the cases in which one can terminate transactions. To demonstrate this, prepare a table like the one above for the case where Vr = Vw = 3 and compare it with what is given above. It will be clear that there are far fewer cases in which it is possible to terminate transactions. The objective in setting Vr and Vw is to maximize the number of cases where transactions can be terminated.