UNIT-IV
Transaction Management
Prepared By:
Sameep N.Sinha
Transactions
• Interaction with the DBMS through
SQL
update Airlines set price = price - price*0.1,
status = 'cheap' where price < 5000
A transaction is a unit of interaction
ACID Properties
• Atomicity
• Consistency
• Isolation
• Durability
Database system must ensure ACID
properties
Atomicity and Consistency
• Single transaction
– Execution of a transaction: “all-or-
nothing”
Either a transaction completes in its
entirety
Or it “does not even start”
– As if the transaction never existed
– No partial effect must be visible
2 outcomes: A transaction COMMITs or ABORTs
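The all-or-nothing behaviour can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table layout (an `id` column, one row priced at 4000) and the helper names `make_db`, `run_discount` and `row` are assumptions made for illustration, and the failure is simulated.

```python
import sqlite3

DISCOUNT_SQL = (
    "UPDATE Airlines SET price = price - price * 0.1, status = 'cheap' "
    "WHERE price < 5000"
)

def make_db():
    """Create a throwaway in-memory database with one (hypothetical) row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Airlines (id INTEGER PRIMARY KEY, price REAL, status TEXT)"
    )
    conn.execute("INSERT INTO Airlines VALUES (1, 4000, 'regular')")
    conn.commit()
    return conn

def run_discount(conn, fail_midway=False):
    """Run the update as one transaction; return 'COMMIT' or 'ABORT'."""
    try:
        conn.execute(DISCOUNT_SQL)   # sqlite3 opens a transaction implicitly
        if fail_midway:
            raise RuntimeError("simulated failure before commit")
        conn.commit()
        return "COMMIT"
    except RuntimeError:
        conn.rollback()              # no partial effect stays visible
        return "ABORT"

def row(conn):
    """Return (price, status) of the single test row."""
    return conn.execute(
        "SELECT price, status FROM Airlines WHERE id = 1"
    ).fetchone()
```

After an aborted run the row is unchanged; after a committed run both the price and the status have changed together.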
Consistency and Isolation
• Multiple transactions
– Concurrent execution can cause an
inconsistent database state
– Each transaction executed as if isolated
from the others
Durability
• If a transaction commits the effects
are permanent
• But, durability has a bigger scope
– Catastrophic failures (floods, fires,
earthquakes)
What we will study…
• Concurrency Control
– Ensuring atomicity, consistency and
isolation when multiple transactions are
executed concurrently
• Recovery
– Ensuring durability and consistency in
case of software/hardware failures
Terminology
• Data item
– A tuple, table, block
• Read (x)
• Write (x, 5)
• Start (T)
• Commit (T)
• Abort (T)
• Active Transaction
– A transaction which has neither committed nor aborted
High level model
[Figure: layered model with Transaction 1, Transaction 2, …, Transaction n at the top, followed by the Transaction Manager, Scheduler, Recovery Manager, Cache Manager and Disk]
Recoverability (1/2)
• Transaction T aborts
– T wrote some data items
– T’ read items that T wrote
• DBMS has to…
– Undo the effects of T
– Undo effects of T’
– But, T’ has already committed

T               T’
Read (x)
Write (x, k)
Read (y)
                Read (x)
                Write (y, k’)
                Commit
Abort
Recoverability (2/2)
• Let T1,…,Tn be a set of transactions
• Ti reads a value written by Tk, k < i
• An execution of transactions is
recoverable if
Ti commits after all Tk commit
Two example schedules are shown side by side. Each has the form below, followed by a single Commit:

T1              T2
Write (x,2)
                Read (x)
                Write (y,2)
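The recoverability condition can be checked mechanically. The sketch below assumes a schedule is a list of `(transaction, action, item)` tuples with actions `"r"`, `"w"`, `"c"`; this encoding and the function names are illustrative choices, and aborts are ignored to keep it short.

```python
def reads_from(schedule):
    """Return {(reader, writer)}: reader read an item last written by writer."""
    last_writer = {}
    pairs = set()
    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                pairs.add((txn, writer))
    return pairs

def is_recoverable(schedule):
    """True if every committed reader commits after the writers it read from."""
    commit_pos = {
        txn: pos
        for pos, (txn, action, _item) in enumerate(schedule)
        if action == "c"
    }
    for reader, writer in reads_from(schedule):
        if reader not in commit_pos:
            continue  # reader has not committed, nothing to violate yet
        if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
            return False
    return True

# T2 reads x from T1 and commits while T1 is still active.
bad = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "y"), ("T2", "c", None)]
# Same operations, but T1 commits before T2 does.
good = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "y"),
        ("T1", "c", None), ("T2", "c", None)]
```

Here `is_recoverable(bad)` is False and `is_recoverable(good)` is True.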
Cascading Aborts (1/2)
• Because T was aborted, T1,…, Tk also
have to be aborted
T               T’              T’’
Read (x)
Write (x, k)
Read (y)
                Read (x)
                Write (y, k’)
Abort
                                Read (y)
Cascading Aborts (2/2)
• Recoverable executions do not
prevent cascading aborts
• How can we prevent them then ?
Schedule 1
T1              T2
Write (x,2)
                Read (x)
                Write (y,2)
Commit
                Commit

Schedule 2
T1              T2
Write (x,2)
Commit
                Read (x)
                Write (y,2)
What we learnt so far…
Reading a value, committing a transaction

Not recoverable
T1              T2
Write (x,2)
                Read (x)
                Write (y,2)
                Commit

Recoverable with cascading aborts
T1              T2
Write (x,2)
                Read (x)
                Write (y,2)
Commit
                Commit

Recoverable without cascading aborts
T1              T2
Write (x,2)
Commit
                Read (x)
                Write (y,2)
                Commit
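What separates the last case from the other two is that T2 never reads a value whose writer is still active. A minimal sketch of that test follows, again assuming schedules are lists of `(transaction, action, item)` tuples and contain no aborts; the encoding and the names are illustrative.

```python
def avoids_cascading_aborts(schedule):
    """True if every read sees only values written by committed transactions."""
    committed = set()
    last_writer = {}
    for txn, action, item in schedule:
        if action == "c":
            committed.add(txn)
        elif action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False  # dirty read: writer could still abort
    return True

NOT_RECOVERABLE = [
    ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "y"), ("T2", "c", None),
]
WITH_CASCADING = [
    ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "y"),
    ("T1", "c", None), ("T2", "c", None),
]
WITHOUT_CASCADING = [
    ("T1", "w", "x"), ("T1", "c", None),
    ("T2", "r", "x"), ("T2", "w", "y"), ("T2", "c", None),
]
```

Only `WITHOUT_CASCADING` passes the check, because T1 has committed before T2 reads x.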
Strict Schedule (1/2)
• “Undo”-ing the effects of a
transaction
– Restore the before image of the data item

T1              T2
Write (x,1)
Write (y,3)
                Write (y,1)
Commit
                Read (x)
                Abort

Equivalent to

T1              T2
Write (x,1)
Write (y,3)
Commit

Final value of y: 3
Strict Schedule (2/2)
Initial value of x: 1

Schedule 1
T1              T2
Write (x,2)
                Write (x,3)
Abort
Should x be restored to 1 or 3?

Schedule 2
T1              T2
Write (x,2)
                Write (x,3)
Abort
                Abort
T1 restores x to 3? T2 restores x to 2?

Schedule 3
T1              T2
Write (x,2)
Abort
                Write (x,3)
Do not read or write a value which has been
written by an active transaction until that
transaction has committed or aborted
The Lost Update Problem
T1                      T2
Read (x)
                        Read (x)
Write (x, 200,000)
Commit
                        Write (x, 200)
                        Commit
Assume x is your account balance
Serializable Schedules
• Serial schedule
– Simply execute transactions one after
the other
• A serializable schedule is one which is
equivalent to some serial schedule
SERIALIZABILITY THEORY
Serializable Schedules
T1: op11, op12, op13
T2: op21, op22, op23, op24
• Serial schedule
– Simply execute transactions one after the other
op11, op12, op13
op21, op22, op23, op24
• Serializable schedule
– Interleave operations
op21, op22, op23,
op11, op12, op13
op24
– Ensure end result is equivalent to some serial schedule
Notation
r1[x] = Transaction 1, Read (x)
w1[x] = Transaction 1, Write (x)
c1 = Transaction 1, Commit
a1= Transaction 1, Abort
r1[x], r1[y], w2[x], r2[y], c1, c2
Histories (1/3)
• Operations of transaction T can be
represented by a partial order.
[Figure: partial order in which r1[x] and r1[y] both precede w1[z], which precedes c1]
Histories (2/3)
• Conflicting operations
– Of two ops operating on the same data
item, if one of them is a write, then the
ops conflict
– An order has to be specified for
conflicting operations
Histories (3/3)
• Complete History
Serializable Histories
• The goal: Ensure that the
interleaving operations guarantee a
serializable history.
• The method
– When are two histories equivalent?
– When is a history serial?
Equivalence of Histories (1/2)
H ≅ H’ if
1. they are defined over the same set of
transactions and they have the same
operations
2. they order conflicting operations the
same way
Equivalence of Histories (2/2)
Source: Concurrency Control and Recovery in Database Systems: Bernstein,
Serial History
• A complete history is serial if for
every pair of transactions Ti and Tk,
– all operations of Ti occur before Tk OR
– all operations of Tk occur before Ti
• A history is serializable if its
committed projection is equivalent to
a serial history.
Serialization Graph
T1 T3 T2
Serializability Theorem
A history H is serializable if and only if its
serialization graph SG(H) is acyclic
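The theorem suggests a direct test: build SG(H) from the conflicting operations of committed transactions and look for a cycle. The sketch below assumes histories are written as strings in the r1[x] / w1[x] / c1 notation used earlier; the function names are illustrative, and "serializable" here means equivalent to a serial history under the conflict-based equivalence defined above.

```python
import re

OP = re.compile(r"([rw])(\d+)\[(\w+)\]")

def serialization_graph(history):
    """Edges (Ti, Tk): an op of Ti precedes and conflicts with an op of Tk."""
    committed = set(re.findall(r"\bc(\d+)\b", history))
    # Only the committed projection of the history is considered.
    ops = [(kind, t, x) for kind, t, x in OP.findall(history) if t in committed]
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if x1 == x2 and t1 != t2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)

def is_serializable(history):
    return not has_cycle(serialization_graph(history))
```

For example, `is_serializable("r1[x], r1[y], w2[x], r2[y], c1, c2")` holds because the only edge is T1 → T2, while `"r1[x] w2[x] w2[y] c2 w1[y] c1"` yields the cycle T1 → T2 → T1.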
On your own
How do recoverability, strict
schedules, cascading aborts fit into
the big picture?
LOCKING
Transaction Management
Transaction Manager
• Receives transactions (Transaction 1, Transaction 2, Transaction 3, …, Transaction n)
• Sends operations to scheduler, e.g. Read1(x), Write2(y,k), Read2(x), Commit1
Scheduler
• Execute op
• Reject op
• Delay op
Disk
Locking
• Each data item x has a lock
associated with it
• If T wants to access x
– Scheduler first acquires a lock on x
– Only one transaction can hold a lock on
x
• T releases the lock after processing
Locking is used by the scheduler to
ensure serializability
Notation
• Read lock and write lock
rl[x], wl[x]
• Obtaining read and write locks
rli[x], wli[x]
• Lock table
– Entries of the form [x, r, Ti]
• Conflicting locks
– pli[x], qlk[y], x = y and p,q conflict
• Unlock
rui[x], wui[x]
Basic 2-Phase Locking (2PL)
RULE 1
Receive pi[x]. Is qlk[x] set such that p and q conflict?
– YES: pi[x] delayed
– NO: Acquire pli[x]; pi[x] scheduled
RULE 2
pli[x] cannot be released until pi[x] is completed
RULE 3 (2 Phase Rule)
Once a lock is released no other locks may be obtained.
The 2-phase rule
Once a lock is released no other locks
may be obtained.
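T1: r1[x] w1[y] c1
T2: w2[x] w2[y] c2
H = rl1[x] r1[x] ru1[x] wl2[x] w2[x] wl2[y] w2[y] wu2[x] wu2[y] c2 wl1[y] w1[y] wu1[y] c1
T1 releases its lock on x and later acquires a lock on y, which violates the rule. SG(H) then contains the cycle T1 → T2 → T1.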
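Whether a history obeys the two-phase rule can be checked by scanning its lock and unlock operations. The sketch below assumes the rl/wl/ru/wu notation from these slides, written as a string; the function name is an illustrative choice.

```python
import re

# Matches rl1[x], wl2[y], ru1[x], wu2[y]; plain r1[x] / w1[x] are skipped.
LOCK_OP = re.compile(r"\b([rw])([lu])(\d+)\[(\w+)\]")

def two_phase_violators(history):
    """Return the transactions that acquire a lock after releasing one."""
    released = set()    # transactions already in their shrinking phase
    violators = set()
    for _mode, kind, txn, _item in LOCK_OP.findall(history):
        if kind == "u":
            released.add(txn)
        elif txn in released:
            violators.add(txn)
    return violators

H = ("rl1[x] r1[x] ru1[x] wl2[x] w2[x] wl2[y] w2[y] wu2[x] wu2[y] c2 "
     "wl1[y] w1[y] wu1[y] c1")
```

For the history H above the result is `{"1"}`: T1 unlocks x and then locks y, while T2 acquires both of its locks before releasing either.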
Correctness of 2PL
2PL always produces serializable
histories
Proof outline
STEP 1: Characterize properties of
the scheduler
STEP 2: Prove that any history with
these properties is serializable
(That is, SG(H) is acyclic)
Deadlocks (1/2)
T1: r1[x] w1[y] c1
T2: w2[y] w2[x] c2
Scheduler
rl1[x] wl2[y] r1[x] w2[y] <cannot proceed>
Deadlocks (2/2)
Strategies to deal with deadlocks
• Timeouts
– Leads to inefficiency
• Detecting deadlocks
– Maintain a wait-for graph, cycle
indicates deadlock
– Once a deadlock is detected, break the
cycle by aborting a transaction
• New problem: Starvation
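Deadlock detection with a wait-for graph comes down to finding a cycle. The sketch below assumes the graph is a dictionary that maps each transaction to the set of transactions it is waiting for; the representation and the function name are illustrative, and victim selection is left out.

```python
def find_cycle(wait_for):
    """Return one cycle as a list of transactions, or None if there is none."""
    path = []       # transactions on the current depth-first path
    done = set()    # transactions already known not to lead to a cycle

    def dfs(txn):
        if txn in path:
            return path[path.index(txn):]
        if txn in done:
            return None
        path.append(txn)
        for other in wait_for.get(txn, ()):
            cycle = dfs(other)
            if cycle:
                return cycle
        path.pop()
        done.add(txn)
        return None

    for txn in list(wait_for):
        cycle = dfs(txn)
        if cycle:
            return cycle
    return None

# The deadlock from the previous slide: T1 holds x and waits for y,
# T2 holds y and waits for x.
deadlocked = {"T1": {"T2"}, "T2": {"T1"}}
# After aborting T2, T1 no longer waits for anyone.
after_abort = {"T1": set()}
```

`find_cycle(deadlocked)` returns a cycle containing T1 and T2, and `find_cycle(after_abort)` returns None. If the same transaction is chosen as the victim every time, it starves.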
Conservative 2PL
• Avoids deadlocks altogether
– T declares its readset and writeset
– Scheduler tries to acquire all required locks
– If not all locks can be acquired, T waits in a queue
• T never “starts” until all locks are acquired
– Therefore, it can never be involved in a deadlock
On your own
Strict 2PL (2PL which ensures only strict
schedules)
Extra Information
• Assumption: Data items are
organized in a tree
Can we come up with a better (more
efficient) protocol?
Tree Locking Protocol (1/3)
RULE 1
Receive ai[x]. Is alk[x] set?
– YES: ai[x] delayed
– NO: go to RULE 2; ai[x] scheduled
RULE 2
If x is an intermediate node, and y is a parent of x, then ali[x] is possible only if Ti holds ali[y]
RULE 3
ali[x] cannot be released until ai[x] is completed
RULE 4
Once a lock is released the same lock may not be re-obtained.
Tree Locking Protocol (2/3)
• Proposition: If Ti locks x before Tk,
then for every v which is a
descendant of x, if both Ti and Tk lock
v, then Ti locks v before Tk.
• Theorem: Tree Locking Protocol
always produces Serializable
Schedules
Tree Locking Protocol (3/3)
• Tree Locking Protocol avoids
deadlock
• Releases locks earlier than 2PL
BUT
• Needs to know the access pattern to
be effective
• Transactions should access nodes
from root-to-leaf
Multi-granularity Locking (1/3)
• Granularity
– Refers to the relative size of the data
item
– Attribute, tuple, table, page, file, etc.
• Efficiency depends on granularity of
locking
• Allow transactions to lock at different
granularities
Multi-granularity Locking (2/3)
• Lock Instance Graph
• Explicit and Implicit
Locks
• Intention read and
intention write locks
• Intention write locks conflict
with explicit read and write
locks; intention read locks
conflict only with explicit
write locks
• Intention locks do not
conflict with other
intention locks
Source: Concurrency Control and Recovery in Database Systems: Bernstein,
Multi-granularity Locking (3/3)
• To set rli[x] or irli[x], first hold irli[y] or
iwli[y], such that y is the parent of x.
• To set wli[x] or iwli[x], first hold
iwli[y], such that y is the parent of x.
• To schedule ri[x] (or wi[x]), Ti must
hold rli[y] (or wli[y]) where y = x, or y
is an ancestor of x.
• To release irli[x] (or iwli[x]) no child of
x can be locked by Ti
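The locking rules can be sketched as a small helper that lists the locks a transaction needs along a path in the lock instance graph. The compatibility table follows the usual convention, stated here as an assumption, in which an intention read lock is compatible with an explicit read lock; the node names and function names are made up for illustration.

```python
# For each held lock type, the lock types another transaction may still
# obtain on the same node.
COMPATIBLE = {
    "ir": {"ir", "iw", "r"},
    "iw": {"ir", "iw"},
    "r": {"ir", "r"},
    "w": set(),
}

def compatible(held, requested):
    """True if `requested` can be granted while another txn holds `held`."""
    return requested in COMPATIBLE[held]

def locks_needed(path, mode):
    """Locks to set, root first, to read ('r') or write ('w') the last node.

    `path` lists node names from the root down to the target, e.g.
    ["database", "table", "tuple"]. Ancestors get the matching intention lock.
    """
    intention = "i" + mode
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]
```

For example, writing a tuple needs `iw` on the database and the table and `w` on the tuple. A second transaction may still take `ir` or `iw` on the table, but not an explicit `r` on it.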
The Phantom Problem
• How to lock a tuple, which (currently)
does not exist?
T1: r1[x1], r1[x2], r1[X], c1
T2: w2[x3], w2[X], c2
rl1[x1], r1[x1], rl1[x2], r1[x2], wl2[x3], wl2[X], w2[x3], wu2[x3,X], c2, rl1[X], ru1[x1,x2,X], c1
NON-LOCK-BASED SCHEDULERS
Timestamp Ordering (1/3)
• Each transaction is associated with a
timestamp
– Ti indicates Transaction T with
timestamp i.
• Each operation in the transaction has
the same timestamp
Timestamp Ordering (2/3)
TO Rule
If pi[x] and qk[x] are conflicting
operations, then pi[x] is processed
before qk[x] iff i < k
Theorem: If H is a history representing
an execution produced by a TO
scheduler, then H is serializable.
Timestamp Ordering (3/3)
• For each data item x, maintain: max-rt(x), max-wt(x), c(x)
– The largest timestamps that have read and written x, and a flag that is true when the last writer of x has committed
• Request ri[x]
– Grant request if TS(i) >= max-wt(x) and c(x); update max-rt(x)
– Delay if TS(i) > max-wt(x) and !c(x)
– Else abort and restart Ti
• Request wi[x]
– Grant request if TS(i) >= max-wt(x) and TS(i) >= max-rt(x); update max-wt(x), set c(x) = false
– Else abort and restart Ti
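The request rules translate almost line for line into code. The sketch below follows them literally. The `Item` class, the string results and the `commit` helper (which sets c(x) back to true once the writer commits) are assumptions added for illustration.

```python
class Item:
    """Per-item bookkeeping: max-rt(x), max-wt(x) and the commit flag c(x)."""
    def __init__(self):
        self.max_rt = 0
        self.max_wt = 0
        self.committed = True

def request_read(item, ts):
    """Decide on ri[x] for a transaction with timestamp ts."""
    if ts >= item.max_wt and item.committed:
        item.max_rt = max(item.max_rt, ts)
        return "grant"
    if ts > item.max_wt and not item.committed:
        return "delay"   # wait until the last writer commits or aborts
    return "abort"       # the read arrived too late

def request_write(item, ts):
    """Decide on wi[x] for a transaction with timestamp ts."""
    if ts >= item.max_wt and ts >= item.max_rt:
        item.max_wt = ts
        item.committed = False
        return "grant"
    return "abort"

def commit(item):
    """Hypothetical helper: mark the last write of the item as committed."""
    item.committed = True
```

A short trace: on a fresh item, a write with timestamp 2 is granted. A read with timestamp 3 is then delayed and a read with timestamp 1 is aborted. Once the writer commits, the read with timestamp 3 is granted, after which a write with timestamp 2 is aborted because a younger transaction has already read x.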
Validation
• Aggressively schedule all operations
• Do not commit until the transaction
is “validated”
Summary
• Lock-based Schedulers
– 2-Phase Locking
– Tree Locking Protocol
– Multi-granularity Locking
– Locking in the presence of updates
• Non-lock-based Schedulers
– Timestamp Ordering
– Validation-based Concurrency Control
(on your own)
RECOVERY
Logging
• Log the operations in the
transaction(s)
• Believe the log
– Does the log say transaction T has
committed?
– Or does it say aborted?
– Or has only a partial trace (implicit
abort)?
• In case of failures, reconstruct the DB
from its log
The basic setup
[Figure: transactions T1, T2, T3, …, Tk, each with its own buffer space; a buffer space for data and log; the disk, which holds the LOG]
Terminology
• Data item: an element which can be
read or written
– tuple, relation, B+-tree index, etc
Input x: fetch x from the disk to buffer
Read x,t: read x into local variable t
Write x,t: write value of t into x
Output x: write x to disk
Example
update Airlines set price = price - price*0.1,
status = 'cheap' where price < 5000
Read P, x
x -= x * 0.1
Write P, x
System fails here
Read S, y
y = "CHEAP"
Write S, y
Output P
System fails here
Output S
System fails here
Logs
• Sequence of log records
• Need to keep track of
– Start of transaction
– Update operations (Write operations)
– End of transaction (COMMIT or ABORT)
• “Believe” the log, use the log to
reconstruct a consistent DB state
Types of logs
• Undo logs
– Ensure that uncommitted transactions are
rolled back (or undone)
• Redo logs
– Ensure that committed transactions are
redone
• Undo/Redo logs
– Both of the above
All 3 logging styles ensure atomicity and
durability
Undo Logging (1/3)
• <START T>: Start of transaction T
• <COMMIT T>
• <ABORT T>
• <T, A, x>: Transaction T modified A
whose before-image is x.
Undo Logging (2/3)
<START T>
Read P, x
x -= x * 0.1
Write P, x
<T, P, x>
Read S, y
y = "CHEAP"
Write S, y
<T, S, y>
FLUSH LOG
Output P
Output S
<COMMIT T>
FLUSH LOG

U1: <T, X, v> should be flushed before Output X
U2: <COMMIT T> should be flushed after all OUTPUTs
Undo Logging (3/3)
• Recovery with Undo log
1. If T has a <COMMIT T> entry, do
nothing
2. If T has a <START T> entry, but no
<COMMIT T>
• T is incomplete and needs to be undone
• Restore old values from <T,X,v> records
• There may be multiple transactions
– Start scanning from the end of the log
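Recovery from an undo log can be sketched in a few lines. The sketch assumes log records are tuples such as `("START", "T")`, `("COMMIT", "T")` and `("UPDATE", "T", item, before_image)`, and that the disk is a plain dictionary; both are stand-ins chosen for illustration.

```python
def undo_recover(log, disk):
    """Restore before-images of every transaction without a COMMIT record."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    for rec in reversed(log):               # scan from the end of the log
        if rec[0] == "UPDATE" and rec[1] not in committed:
            _kind, _txn, item, before_image = rec
            disk[item] = before_image
    return disk

# Crash after "Output P" but before <COMMIT T>: P is new on disk, S is old.
log = [
    ("START", "T"),
    ("UPDATE", "T", "P", 4000),
    ("UPDATE", "T", "S", "regular"),
]
disk_after_crash = {"P": 3600, "S": "regular"}
```

`undo_recover(log, disk_after_crash)` puts P back to 4000. Scanning backwards matters when one transaction updates the same item more than once, since the oldest before-image must be the last one restored.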
Redo Logging (1/3)
• All incomplete transactions can be
ignored
• Redo all completed transactions
• <T, A, x>: Transaction T modified A
whose after-image is x.
Redo Logging (2/3)
<START T>
Read P, x
x -= x * 0.1
Write P, x
<T, P, x>
Read S, y
y = "CHEAP"
Write S, y
<T, S, y>
<COMMIT T>
FLUSH LOG
Output P
Output S

R1 (Write-ahead Logging): <T, X, v> and <COMMIT T> should be flushed before Output X
Redo Logging (3/3)
• Recovery with Redo Logging
– If T has a <COMMIT T> entry, redo T
– If T is incomplete, do nothing (add
<ABORT T>)
• For multiple transactions
– Scan from the beginning of the log
Undo/Redo Logging (1/3)
• Undo logging: Cannot COMMIT T
unless all updates are written to disk
• Redo logging: Cannot release
memory unless transaction commits
• Undo/Redo logs attempt to strike a
balance
Undo/Redo Logging (2/3)
<START T>
Read P, x
x -= x * 0.1
Write P, x
<T, P, x, a>
Read S, y
y = "CHEAP"
Write S, y
<T, S, y, b>
FLUSH LOG
Output P
<COMMIT T>
Output S

UR1: <T, X, a, b> should be flushed before Output X

Rules of the previous schemes, for comparison:
U1: <T, X, v> should be flushed before Output X
U2: <COMMIT T> should be flushed after all OUTPUTs
R1: <T, X, v> and <COMMIT T> should be flushed before Output X
Undo/Redo Logging (3/3)
• Recovery with Undo/Redo Logging
– Redo all committed transactions
(earliest-first)
– Undo all uncommitted transactions
(latest-first)
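The two passes can be sketched as follows, assuming update records of the form `("UPDATE", "T", item, before_image, after_image)` and a dictionary standing in for the disk. Both representations are illustrative, and the sketch assumes a committed and an uncommitted transaction never wrote the same item.

```python
def undo_redo_recover(log, disk):
    """Redo committed transactions forwards, then undo the rest backwards."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    for rec in log:                          # redo pass, earliest first
        if rec[0] == "UPDATE" and rec[1] in committed:
            disk[rec[2]] = rec[4]            # install the after-image
    for rec in reversed(log):                # undo pass, latest first
        if rec[0] == "UPDATE" and rec[1] not in committed:
            disk[rec[2]] = rec[3]            # restore the before-image
    return disk

log = [
    ("START", "T1"),
    ("UPDATE", "T1", "P", 4000, 3600),
    ("COMMIT", "T1"),
    ("START", "T2"),
    ("UPDATE", "T2", "S", "regular", "cheap"),
]
# T1 committed but P never reached the disk; T2 is incomplete but S did.
disk_after_crash = {"P": 4000, "S": "cheap"}
```

`undo_redo_recover(log, disk_after_crash)` yields P = 3600 and S = "regular". Running it a second time changes nothing.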
What happens if there is a crash when
you are writing a log? What happens if
there is a crash during recovery?
Checkpointing
• Logs can be huge…can we throw
away portions of it?
• Can we avoid processing all of it
when there is a crash?