15
Concurrency Control
This chapter describes how to control concurrent execution in a database, in
order to ensure the isolation properties of transactions. A variety of protocols are
described for this purpose. If time is short, some of the protocols may be omitted.
We recommend covering, at the least, two-phase locking (Sections 15.1.1
through 15.1.3), deadlock detection and recovery (Section 15.2, omitting Section 15.2.1),
and the phantom phenomenon (Section 15.8.3). The most widely used techniques
would thereby be covered.
It is worthwhile pointing out how the graph-based locking protocols gener-
alize simple protocols, such as ordered acquisition of locks, which students may
have studied in an operating system course. Although the timestamp protocols by
themselves are not widely used, multiversion two-phase locking (Section 15.6.2)
is of increasing importance since it allows long read-only transactions to run
concurrently with updates.
The phantom phenomenon is often misunderstood by students as showing
that two-phase locking is incorrect. It is worth stressing that transactions that scan
a relation must read some data to find out what tuples are in the relation; as long
as this data is itself locked in a two-phase manner, the phantom phenomenon will
not arise.
Exercises
15.20 What benefit does strict two-phase locking provide? What disadvantages
result?
Answer: Because it produces only cascadeless schedules, recovery is very
easy. But the set of schedules obtainable is a subset of those obtainable
from plain two-phase locking, so concurrency is reduced.
15.21 Most implementations of database systems use strict two-phase locking.
Suggest three reasons for the popularity of this protocol.
Answer: It is relatively simple to implement, imposes low rollback over-
head because of cascadeless schedules, and usually allows an acceptable
level of concurrency.
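The behavior described above can be sketched with a toy lock table. The sketch below is illustrative only, assuming a single-threaded scheduler and exclusive locks; the LockManager class and its methods are not an actual DBMS interface. The key point of strict 2PL appears in release_all: locks are released only at commit or abort, so no transaction ever sees uncommitted data.

```python
# Minimal sketch of strict two-phase locking (X-locks only); illustrative
# names, not a real DBMS interface.
class LockManager:
    def __init__(self):
        self.owner = {}          # item -> transaction holding the X-lock

    def lock_x(self, txn, item):
        holder = self.owner.get(item)
        if holder is not None and holder != txn:
            return False         # conflict: caller must wait (or abort)
        self.owner[item] = txn
        return True

    def release_all(self, txn):
        # Under *strict* 2PL this runs only at commit/abort time, so no
        # other transaction can ever have read uncommitted data.
        for item in [i for i, t in self.owner.items() if t == txn]:
            del self.owner[item]

lm = LockManager()
assert lm.lock_x("T1", "A")      # T1 acquires A
assert not lm.lock_x("T2", "A")  # T2 must wait: A is held until T1 commits
lm.release_all("T1")             # T1 commits, releasing all its locks
assert lm.lock_x("T2", "A")      # now T2 may proceed
```

Because T2 blocks until T1 commits, a rollback of T1 never cascades to T2, at the cost of the reduced concurrency noted above.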
15.22 Consider a variant of the tree protocol called the forest protocol. The
database is organized as a forest of rooted trees. Each transaction Ti must
follow the following rules:
a. The first lock in each tree may be on any data item.
b. The second, and all subsequent, locks in a tree may be requested
only if the parent of the requested node is currently locked.
c. Data items may be unlocked at any time.
d. A data item may not be relocked by Ti after it has been unlocked
by Ti.
Show that the forest protocol does not ensure serializability.
Answer: Consider a forest of two trees: n1 is the root of one tree, with
children n3 and n4; n2 is the root of the other, with children n5 and n6.
The following schedule obeys the forest protocol (each transaction's first
lock in each tree may be on any item), yet it is not serializable: T1 writes
n3 before T2 reads it, so T1 must precede T2, while T2 writes n5 before T1
reads it, so T2 must precede T1.

T1              T2
lock(n1)
lock(n3)
write(n3)
unlock(n3)
                lock(n2)
                lock(n5)
                write(n5)
                unlock(n5)
lock(n5)
read(n5)
unlock(n5)
unlock(n1)
                lock(n3)
                read(n3)
                unlock(n3)
                unlock(n2)
15.29 Show that there are schedules that are possible under the two-phase lock-
ing protocol, but are not possible under the timestamp protocol, and vice
versa.
Answer: A schedule that is possible under the two-phase locking protocol
but not under the timestamp protocol is one in which an older transaction
accesses a data item after a younger transaction has written it. For
example, with TS(T1) < TS(T2):

step  T1         T2
1                write(A)
2     write(A)

Under two-phase locking this is legal: T2 locks A, writes it, and unlocks
it, after which T1 locks A and writes it. Under the timestamp protocol,
T1's write is rejected because TS(T1) < W-timestamp(A) = TS(T2).
Conversely, a schedule that is possible under the timestamp protocol but
not under two-phase locking is (with TS(T0) < TS(T1) < TS(T2)):

step  T0         T1         T2
1     write(A)
2                write(A)
3                           write(A)
4     write(B)
5                write(B)

Every write occurs in timestamp order on its data item, so the timestamp
protocol accepts the schedule. But lock instructions cannot be added to
make it legal under the two-phase locking protocol, because T1 must unlock
(A) between steps 2 and 3, and must lock (B) between steps 4 and 5.
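The timestamp-ordering write test used in this argument can be simulated directly. The sketch below is a minimal illustration, assuming write-only schedules, the basic protocol without Thomas' write rule, and timestamps TS(T0)=0 < TS(T1)=1 < TS(T2)=2:

```python
# Basic timestamp-ordering write test (no Thomas' write rule); since the
# schedules here contain only writes, only W-timestamps need checking.
def run_schedule(schedule, ts):
    """schedule: list of (txn, item) writes in order; returns True if every
    write satisfies TS(Ti) >= W-timestamp(Q)."""
    w_ts = {}
    for txn, item in schedule:
        if ts[txn] < w_ts.get(item, float("-inf")):
            return False          # write rejected: Ti is too old for Q
        w_ts[item] = ts[txn]
    return True

ts = {"T0": 0, "T1": 1, "T2": 2}
# Allowed under 2PL but rejected by timestamp ordering: the younger T2
# writes A, then the older T1 tries to write A.
assert not run_schedule([("T2", "A"), ("T1", "A")], ts)
# The five-step schedule: every write is in timestamp order per item, so
# the timestamp protocol accepts it (while 2PL cannot, as argued above).
five_step = [("T0", "A"), ("T1", "A"), ("T2", "A"), ("T0", "B"), ("T1", "B")]
assert run_schedule(five_step, ts)
```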
15.30 Under a modified version of the timestamp protocol, we require that a
commit bit be tested to see whether a read request must wait. Explain how
the commit bit can prevent cascading abort. Why is this test not necessary
for write requests?
Answer: Using the commit bit, a read request is made to wait if the
transaction which wrote the data item has not yet committed. Therefore,
if the writing transaction fails before commit, we can abort that transaction
alone. The waiting read will then access the earlier version in case of a
multiversion system, or the restored value of the data item after abort in
case of a single-version system. For writes, this commit bit checking is
unnecessary. That is because either the write is a “blind” write and thus
independent of the old value of the data item or there was a prior read, in
which case the test was already applied.
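The commit-bit test for reads can be sketched in a few lines. This is an illustrative assumption-laden fragment, not a real recovery subsystem: a per-transaction commit bit plus a record of each item's last writer.

```python
# Sketch of the commit-bit test for read requests; the dictionaries below
# stand in for transaction state and are illustrative only.
committed = {"T1": False}      # commit bit per transaction
last_writer = {"A": "T1"}      # transaction that last wrote each item

def read_must_wait(item):
    """A read waits if the item's last writer has not yet committed."""
    writer = last_writer.get(item)
    return writer is not None and not committed[writer]

assert read_must_wait("A")     # T1 uncommitted: the read waits, so no
                               # reader can be cascaded into an abort
committed["T1"] = True
assert not read_must_wait("A") # after T1 commits, the read proceeds
```

No analogous test is run for writes, for the reason given above: a blind write does not depend on the uncommitted value, and a read-then-write has already passed the read test.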
15.31 As discussed in Exercise 15.19, snapshot isolation can be implemented
using a form of timestamp validation. However, unlike the multiversion
timestamp-ordering scheme, which guarantees serializability, snapshot
isolation does not guarantee serializability. Explain what is the key differ-
ence between the protocols that results in this difference.
Answer:
The timestamp validation step for the snapshot isolation level checks for
the presence of common written data items between the transactions.
However, write skew can occur, where a transaction T1 updates an item
A whose old version is read by T2, while T2 updates an item B whose old
version is read by T1, resulting in a non-serializable execution. There is no
validation of reads against writes in the snapshot isolation protocol.
The multiversion timestamp-ordering protocol on the other hand avoids
the write skew problem by rolling back a transaction that writes a data item
which has been already read by a transaction with a higher timestamp.
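The contrast between the two validation rules can be made concrete on the write-skew schedule above (T1 writes A, which T2 read; T2 writes B, which T1 read). The functions and timestamps below are illustrative assumptions, not a full implementation of either protocol:

```python
# Snapshot isolation validates writes against writes only.
def si_first_committer_wins(t1_writes, t2_writes):
    return len(t1_writes & t2_writes) == 0   # True = both may commit

# Multiversion timestamp ordering rejects a write of an item already read
# by a transaction with a higher timestamp.
def mvto_validate(read_ts, writer_ts):
    return writer_ts >= read_ts              # False = writer rolled back

# Write skew: the write sets {A} and {B} are disjoint, so snapshot
# isolation lets both transactions commit...
assert si_first_committer_wins({"A"}, {"B"})
# ...but MVTO aborts T1 (timestamp 1) writing A after T2 (timestamp 2)
# has read A's old version.
assert not mvto_validate(read_ts=2, writer_ts=1)
```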
15.32 Outline the key similarities and differences between the timestamp based
implementation of the first-committer-wins version of snapshot isola-
tion, described in Exercise 15.19, and the optimistic-concurrency-control-
without-read-validation scheme, described in Section 15.9.3.
Answer: Neither scheme ensures serializability. The version-number
check in optimistic concurrency control without read validation
implements the first-committer-wins rule used in snapshot isolation.
Unlike snapshot isolation, however, the reads performed by a transaction
under optimistic concurrency control without read validation may not
correspond to a single snapshot of the database: different reads by the
same transaction may return data values corresponding to different
snapshots of the database.
15.33 Explain the phantom phenomenon. Why may this phenomenon lead to
an incorrect concurrent execution despite the use of the two-phase locking
protocol?
Answer: The phantom phenomenon arises when, due to an insertion or
deletion, two transactions logically conflict despite not locking any data
items in common. The insertion case is described in the book. Deletion can
also lead to this phenomenon. Suppose Ti deletes a tuple from a relation
while Tj scans the relation. If Ti deletes the tuple and then Tj reads the
relation, Ti should be serialized before Tj . Yet there is no tuple that both
Ti and Tj conflict on.
An interpretation of 2PL as locking only the tuples accessed in a relation
is incorrect. There is also index or relation metadata that records which
tuples are in the relation. This information is read by any transaction
that scans the relation, and modified by transactions that insert into,
delete from, or update the relation. Hence locking must also be performed
on this index or relation data, and doing so avoids the phantom
phenomenon.
15.34 Explain the reason for the use of degree-two consistency. What disadvan-
tages does this approach have?
Answer: Degree-two consistency avoids cascading aborts and offers
increased concurrency. The disadvantage is that it does not guarantee
serializability, so the programmer must ensure the correctness of
concurrent executions at the application level.
15.35 Give example schedules to show that with key-value locking, if any of
lookup, insert, or delete do not lock the next-key value, the phantom
phenomenon could go undetected.
Answer: In the next-key locking technique, every index lookup, insert, or
delete must lock not only the keys found within the range (or the single
key, in the case of a point lookup) but also the next-key value, that is,
the smallest key value greater than the last key value within the range.
Thus, if a transaction attempts to insert a value that lies within the
range of another transaction's index lookup, the two transactions conflict
on the key value next to the inserted key value. If any of lookup, insert,
or delete failed to lock the next-key value, such a conflict with a
subsequent range lookup would go undetected, and the phantom phenomenon
could occur.
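The next-key rule can be sketched over a sorted set of index keys. The helper names below are illustrative, assuming integer keys and a single index; the point is that the scan's lock set and the insert's lock set intersect exactly at the next-key value.

```python
import bisect

def range_lock_keys(index_keys, lo, hi):
    """Keys a range lookup [lo, hi] must lock: all keys in the range, plus
    the next key greater than hi (so a later insert into the range
    conflicts with the scan)."""
    keys = sorted(index_keys)
    locked = [k for k in keys if lo <= k <= hi]
    i = bisect.bisect_right(keys, hi)
    if i < len(keys):
        locked.append(keys[i])               # the next-key value
    return set(locked)

def insert_lock_keys(index_keys, new_key):
    """An insert locks the new key and the existing key just above it."""
    keys = sorted(index_keys)
    i = bisect.bisect_right(keys, new_key)
    locked = {new_key}
    if i < len(keys):
        locked.add(keys[i])
    return locked

index = {10, 20, 40}
scan = range_lock_keys(index, 15, 25)    # scan of [15, 25] locks {20, 40}
ins = insert_lock_keys(index, 25)        # insert of 25 locks {25, 40}
assert scan & ins == {40}    # conflict detected on the next-key value 40
```

Without the next-key rule, the scan would lock only {20} and the insert only {25}; the lock sets would be disjoint and the phantom insert of 25 into the scanned range would go undetected.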
15.36 Many transactions update a common item (e.g., the cash balance at a
branch), and private items (e.g., individual account balances). Explain
how you can increase concurrency (and throughput) by ordering the op-
erations of the transaction.
Answer: The private items can be updated by the individual transactions
independently: each transaction can acquire an exclusive lock on its
private items (since no other transaction needs them) and update them.
The exclusive lock on the common item, however, is contended by all the
transactions, so it should be acquired only when a transaction is ready
to update the common item, and while one transaction holds it all other
transactions must wait until it is released. For the common item to be
updated correctly, each transaction should follow a certain pattern: it
can update its private item whenever it requires, but before updating the
private item again, it should update the common item. Essentially, the
private and common items should be accessed alternately; otherwise the
private item's update will not be reflected in the common item. This
ordering ensures:
a. No possibility of deadlock and no starvation, provided the lock for
the common item is granted in order of request time.
b. The schedule is serializable.
15.37 Consider the following locking protocol: All items are numbered, and
once an item is unlocked, only higher-numbered items may be locked.
Locks may be released at any time. Only X-locks are used. Show by an
example that this protocol does not guarantee serializability.
Answer: Assume the items are numbered so that A precedes B. Consider two
transactions T1 and T2 and the following schedule, which is legal under
the protocol (T1 locks B, a higher-numbered item, after unlocking A; T2
unlocks nothing before acquiring its locks):

T1              T2
lock(A)
write(A)
unlock(A)
                lock(A)
                read(A)
                lock(B)
                write(B)
                unlock(B)
lock(B)
read(B)
unlock(B)

T1's write(A) precedes T2's read(A), so T1 must precede T2 in any
equivalent serial schedule; but T2's write(B) precedes T1's read(B), so
T2 must also precede T1. The precedence graph has a cycle, and the
schedule is therefore not serializable.
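The non-serializability argument can be checked mechanically by building the precedence graph from the read/write operations of the schedule. A minimal sketch, assuming only read/write conflict edges:

```python
def precedence_edges(ops):
    """ops: list of (txn, action, item) in schedule order. Adds an edge
    (t1, t2) whenever t1's operation conflicts with a later operation of
    t2 on the same item (at least one of the two is a write)."""
    edges = set()
    for i, (t1, a1, x1) in enumerate(ops):
        for t2, a2, x2 in ops[i + 1:]:
            if x1 == x2 and t1 != t2 and "write" in (a1, a2):
                edges.add((t1, t2))
    return edges

# The schedule from the answer, locks omitted.
schedule = [
    ("T1", "write", "A"), ("T2", "read", "A"),
    ("T2", "write", "B"), ("T1", "read", "B"),
]
edges = precedence_edges(schedule)
# T1 -> T2 (on A) and T2 -> T1 (on B): a cycle, hence not serializable.
assert ("T1", "T2") in edges and ("T2", "T1") in edges
```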