Solved Question Paper Questions
What do you understand by the term “TRANSACTION” in a database?
A transaction is defined as a unit of work (a unit of data processing) in a database system. It
accesses and possibly modifies various data objects such as tuples and relations. Database systems
that deal with a large number of transactions are called transaction processing systems.
A transaction involves manipulation of one or more values in the database. Some
transactions can write data without reading a data value.
Example 1: pseudo code for transaction to withdraw amount from an account ‘X’ :
; Assume that we are doing this transaction for person whose account number is X.
TRANSACTION WITHDRAWAL (withdrawal_amount)
Begin transaction
IF X exist then
    READ X.balance
    IF X.balance >= withdrawal_amount then
        X.balance=X.balance-withdrawal_amount
        WRITE X.balance
        COMMIT
    ELSE
        DISPLAY “TRANSACTION CANNOT BE PROCESSED”
ELSE DISPLAY “ACCOUNT X DOES NOT EXIST”
End transaction
Example 2: Pseudo code for transaction to transfer amount from one account to another
account
; transfers transfer_amount from x’s account to y’s account
; assuming both accounts exist
TRANSACTION (x, y, transfer_amount)
Begin transaction
If X AND Y exist then
    READ x.balance
    IF x.balance >= transfer_amount then
        x.balance=x.balance-transfer_amount
        WRITE x.balance
        READ y.balance
        y.balance=y.balance+transfer_amount
        WRITE y.balance
        COMMIT
    ELSE
        DISPLAY (“INSUFFICIENT BALANCE IN X”)
        ROLLBACK
ELSE DISPLAY (“ACCOUNT X OR Y DOES NOT EXIST”)
End transaction
COMMIT makes sure that all the changes made by transactions are made permanent.
ROLLBACK terminates the transactions and rejects any changes made by the transaction.
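The transfer transaction can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the table name account, its columns id and balance, and the helper names transfer and balances are assumptions made for illustration, and the starting balances are sample values.

```python
import sqlite3

def transfer(conn, x, y, amount):
    """Move amount from account x to account y, all or nothing."""
    try:
        # Debit x only if it exists and has enough balance.
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ? AND balance >= ?",
            (amount, x, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("account x missing or balance too low")
        # Credit y; if y does not exist the debit above must be undone.
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, y),
        )
        if cur.rowcount != 1:
            raise ValueError("account y does not exist")
        conn.commit()      # COMMIT: the changes become permanent
        return True
    except ValueError:
        conn.rollback()    # ROLLBACK: the changes made so far are rejected
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 1000), ("Y", 500)])
conn.commit()

ok = transfer(conn, "X", "Y", 100)      # both accounts exist
failed = transfer(conn, "X", "Z", 100)  # Z does not exist, so the debit is rolled back
print(ok, failed, balances(conn))
```

In the second call the debit of X has already been executed when the credit fails, so the rollback is what keeps X at its committed balance.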
Discuss the properties of the transactions.
• Atomicity
• Consistency
• Isolation or Independence
• Durability or permanence
Atomicity : It defines a transaction as an atomic unit of work that has to be done completely or
not at all. It is maintained by the transaction management component of the DBMS.
Before: X: 1000 Y: 500
Transaction T
T1              T2
Read(X)         Read(Y)
X:=X-100        Y:=Y+100
Write(X)        Write(Y)
If the transaction fails after completion of T1 but before completion of T2 (say, after write(X)
but before write(Y)), then the amount has been deducted from X but not added to Y. This results
in an inconsistent database state. Therefore, the transaction must be executed in its entirety in
order to ensure correctness of the database state.
1. To enforce isolation (one transaction should not interleave with another transaction)
Let us consider Transactions T3 and T4. T3 reads the balance of account X and subtracts a
withdrawal amount of Rs. 5000. T4 reads the balance of account X and adds an amount of
Rs.3000.
Problems of Concurrent transactions :
1. Lost Updates: An update of one transaction is overwritten by another transaction. Suppose the two
transactions T3 and T4 run concurrently and they happen to be interleaved in the following way.
Assume the initial value of X to be 10000.
After the execution of both the transactions the value X is 13000 while the semantically correct
value should be 8000. The problem occurred as the update made by T3 has been overwritten by
T4. The root cause of the problem is that one of the updates has been lost and we say that lost
update has occurred.
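The lost update can be reproduced with a small simulation. This is a sketch under assumptions: the exact interleaving below (both transactions read X before either writes, and T4 writes last) is one order that yields the stated result, and the helper run_schedule is hypothetical.

```python
def run_schedule(schedule, initial_x):
    """Run (transaction, action, amount) steps against one shared item X."""
    x = initial_x      # the value stored in the database
    local = {}         # each transaction's private copy of X
    for txn, action, amount in schedule:
        if action == "read":
            local[txn] = x
        elif action == "add":
            local[txn] += amount
        elif action == "write":
            x = local[txn]
    return x

# T3 withdraws 5000, T4 deposits 3000.
serial = [
    ("T3", "read", None), ("T3", "add", -5000), ("T3", "write", None),
    ("T4", "read", None), ("T4", "add", 3000), ("T4", "write", None),
]
interleaved = [
    ("T3", "read", None), ("T4", "read", None),
    ("T3", "add", -5000), ("T4", "add", 3000),
    ("T3", "write", None), ("T4", "write", None),
]

serial_result = run_schedule(serial, 10000)
interleaved_result = run_schedule(interleaved, 10000)
print(serial_result, interleaved_result)  # 8000 13000
```

T4 computed its new value from the balance it read before T3 wrote, so its write silently discards the withdrawal made by T3.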
Another way in which lost updates occur is:
Here T5 and T6 update the same item X. Thereafter T5 decides to undo its action and rolls back,
restoring the value of X to 2000. In this case the update performed by T6 is lost and a lost
update is said to have occurred.
2. Unrepeatable reads : If transaction T1 reads an item twice and the item is changed by
another transaction T2 between the two reads, T1 finds two different values on its two
reads. Suppose T7 reads X twice during its execution. If it did not update X itself, it could be very
disturbing to see a different value of X in its next read. But this occurs if there is an update
operation by another transaction between the two read operations.
Thus, inconsistent values are read and the results of the transaction may be in error.
3. Dirty Reads : A transaction reads a value which has been updated by another transaction.
This update has not been committed and the updating transaction later aborts.
For example, T10 reads a value which has been updated by T9. This update has not been
committed and T9 aborts.
Here T10 reads a value that has been updated by transaction T9, which has then aborted. Thus
T10 has read a value that would never exist in the database, and hence the problem. Here the
problem is isolation of transactions.
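A dirty read can be sketched as follows. The class Item and the values 200 and 500 are hypothetical; the sketch assumes a data item whose uncommitted value is visible to every reader, i.e. no isolation at all.

```python
class Item:
    def __init__(self, value):
        self.committed = value   # last committed value
        self.current = value     # may include an uncommitted write

    def write(self, value):
        self.current = value

    def read(self):
        # No isolation: uncommitted data is returned too.
        return self.current

    def abort(self):
        # Undo the uncommitted write.
        self.current = self.committed

x = Item(200)
x.write(500)             # T9 updates X but does not commit
seen_by_t10 = x.read()   # T10 reads the uncommitted value
x.abort()                # T9 aborts
print(seen_by_t10, x.read())  # 500 200
```

T10 continues to work with 500 although that value was never committed.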
4. Inconsistent Analysis : The problem as shown with transactions T1 and T2, where two
transactions interleave to produce an incorrect result during an analysis by Audit, is an example of
such a problem. This problem occurs when more than one data item is being used for
analysis, while another transaction has modified some of those values and some are yet to be
modified. Thus an analysis transaction reads values from an inconsistent state of the database,
which results in inconsistent analysis.
From the above problems, we can conclude that the prime reason for the problems of
concurrent transactions is that a transaction reads an inconsistent state of the database that has
been created by other transactions.
These problems cannot occur if the transactions do not read the same data values. A conflict
occurs only if one transaction updates a data value while another is reading or writing the same
data value.
A common technique to overcome this is to restrict access to data items that are being read or
written by one transaction and are being written by another transaction. This technique is known
as locking.
Explain serializable scheduling.
Serialisability theory was developed to determine if concurrency problems have occurred when
two operations conflict with each other. Serialisability theory attempts to determine the
correctness of the schedules.
A serial schedule is a schedule in which the transactions are executed one after the other, i.e.
after one transaction is completed, another transaction begins its execution. Its operations are
executed without being interleaved (mixed with each other). For two transactions, either
transaction T1 is completely done before T2 or transaction T2 is completely done before T1.
As the schedule is executed in a serial manner, no inconsistency is possible. However, serial order
leads to lower throughput.
Example :
T1->T2 T2->T1
An interleaved execution of T1 and T2
Schedule T1 T2
Read (A) Read(A)
Read(B) Read(B)
Write(B) Write(B)
Write(A) Write(A)
Now, we have to find whether this interleaved schedule performs reads and writes in
the same order as that of a serial schedule. If it does, then it is equivalent to a serial
schedule; otherwise it is not. If it is not equivalent to a serial schedule, then it may result in
problems due to concurrent transactions.
Serialisability : Any schedule that produces the same results as a serial schedule is called a
serializable schedule.
3. If there is any cycle in the graph, the schedule is not serializable, otherwise, find the
equivalent serial schedule of the transaction by traversing the transaction nodes starting
from the node that has no input edge.
Example to construct precedence graph :
Step 1 : We draw the nodes for transactions T1 and T2. Number of nodes = Number of
transactions.
T1 T2
Step 2 : In the above schedule, transaction T2 reads data item X, which is subsequently written
by T1, so there is an edge from T2 to T1.
T1 <---- T2
Step 3 : Also, T2 reads a data item Y, which is subsequently written by T1, thus there is an edge
from T2 to T1. This edge exists, so there is no need to redo it.
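The construction of the precedence graph and the cycle test can be sketched as below. Assumptions: a schedule is given as a list of (transaction, operation, item) tuples; an edge Ti->Tj is drawn whenever an operation of Ti is followed by an operation of another transaction Tj on the same item and at least one of the two is a write (the usual conflict rule); the function names and both sample schedules are hypothetical.

```python
from collections import defaultdict

def precedence_graph(schedule):
    """schedule: list of (transaction, 'R' or 'W', item) in execution order."""
    edges = defaultdict(set)
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges[ti].add(tj)   # Ti must come before Tj
    return edges

def serial_order(schedule):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    nodes = sorted({t for t, _, _ in schedule})
    edges = precedence_graph(schedule)
    indegree = {n: 0 for n in nodes}
    for src in nodes:
        for dst in edges[src]:
            indegree[dst] += 1
    # Start from nodes that have no input edge and peel the graph off.
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for dst in sorted(edges[n]):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    return order if len(order) == len(nodes) else None

# T2 reads X and Y, each of which is subsequently written by T1.
acyclic = [("T2", "R", "X"), ("T1", "W", "X"), ("T2", "R", "Y"), ("T1", "W", "Y")]
# T1 -> T2 on X and T2 -> T1 on Y: a cycle.
cyclic = [("T1", "R", "X"), ("T2", "W", "X"), ("T2", "R", "Y"), ("T1", "W", "Y")]

print(serial_order(acyclic))  # ['T2', 'T1']
print(serial_order(cyclic))   # None
```

The first schedule has the single edge T2->T1, so it is equivalent to the serial schedule T2 followed by T1; the second is not serializable.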
Serial schedule : less resource utilization and low throughput.
Interleaved (non-serial) schedule : high throughput and resources are utilised properly.
Example for serial schedule :
Explain locks.
Locks ensure that interleaved concurrent transactions do not have any concurrency related
problem. A lock is a variable that is associated with a data item in the database. A lock can be
placed by a transaction on a shared resource that it desires to use. When this is done, the data
item is available for the exclusive use of that transaction, i.e. other transactions are locked out
of that data item. When a transaction that has locked a data item no longer needs it,
it should unlock the data item so that other transactions can use it. If a transaction tries
to lock a data item that is already locked by some other transaction, it cannot do so and waits
for the data item to be unlocked. The component of the DBMS that controls and stores lock
information is called the Lock Manager.
• Binary lock : This locking mechanism has two states for a data item : locked or unlocked.
• Multiple-mode locks : In this locking type each data item can be in one of three states: read
locked (shared locked), write locked (exclusive locked) or unlocked.
Binary locks :
Multiple mode locks : It offers two locks: shared locks and exclusive locks.
▪ Shared lock :
• It is requested by a transaction that wants to just read the value of data item.
• A shared lock on a data item does not allow an exclusive lock to be placed but permits any
number of shared locks to be placed on that item.
▪ Exclusive lock :
• No other transaction can place either a shared lock or an exclusive lock on a data item that
has been locked in an exclusive mode.
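The shared/exclusive rules can be sketched as a tiny lock table. This is a simplified illustration, not how a real lock manager is built: the class LockTable is hypothetical, a refused request simply returns False instead of queueing the transaction, and upgrading a shared lock to an exclusive one is not handled.

```python
class LockTable:
    def __init__(self):
        self.holders = {}   # item -> (mode, set of transactions holding it)

    def request(self, txn, item, mode):
        """mode is 'S' (shared) or 'X' (exclusive). True if granted, False if txn must wait."""
        held = self.holders.get(item)
        if held is None:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = held
        if mode == "S" and held_mode == "S":
            txns.add(txn)      # any number of shared locks may coexist
            return True
        return False           # every other combination conflicts

    def release(self, txn, item):
        mode, txns = self.holders[item]
        txns.discard(txn)
        if not txns:
            del self.holders[item]

locks = LockTable()
r1 = locks.request("T1", "A", "S")   # granted
r2 = locks.request("T2", "A", "S")   # granted, shared with T1
r3 = locks.request("T3", "A", "X")   # refused while shared locks exist
locks.release("T1", "A")
locks.release("T2", "A")
r4 = locks.request("T3", "A", "X")   # granted now
r5 = locks.request("T1", "A", "S")   # refused, A is exclusively locked
print(r1, r2, r3, r4, r5)
```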
Explain 2-phase locking protocol.
It’s a concurrency control method in which locking and unlocking is done in two phases. The
two phases of two-phase locking protocol are :
Phase 1 : The lock acquisition phase or growing phase : If a transaction T wants to read an
object, it needs to obtain the S (shared) lock. If T wants to modify an object, it needs to obtain X
(exclusive) lock. No conflicting locks are granted to a transaction. New locks on items can be
acquired but no lock can be released till all the locks required by the transaction are obtained.
Phase 2 : Lock Release Phase or shrinking phase: The existing locks can be released in any order
but no new lock can be acquired after a lock has been released. The locks are held only till they
are required.
Two-phase locking always results in a serializable schedule. The 2PL protocol has been proved
correct.
▪ There are two types of 2PL :
1. Basic 2PL
2. Strict 2PL
The basic 2PL allows release of lock at any time after all the locks have been acquired.
The basic 2PL suffers from the problem that it can result in loss of the atomicity / isolation
property of a transaction since, theoretically speaking, once a lock is released on a data item it
can be modified by another transaction before the first transaction aborts or commits.
To avoid the above problem we use strict 2PL, in which a transaction holds its exclusive locks
until it commits or aborts. The strict 2PL is graphically depicted below.
Basic disadvantage of strict 2PL is that it restricts concurrency as it locks the item beyond the
time it is needed by a transaction.
2PL solves the problem of concurrency and atomicity, but introduces another problem known
as deadlock.
Serializability is mainly an issue of handling write operations, because any inconsistency can
only be created by a write operation. Multiple reads on a database item can happen in parallel.
The 2-phase locking protocol restricts unwanted read/write interleaving by applying an exclusive
lock. Moreover, an exclusive lock on an item is released only in the shrinking phase.
Due to this restriction there is no chance of getting any inconsistent state.
TRANSACTIONS T1 AND T2 THAT OBEY 2PL
T1 T2
read_lock(Y); read_lock(X);
read_item(Y); read_item(X);
write_lock(X); write_lock(Y);
unlock(Y); unlock(X);
read_item(X); read_item(Y);
X:=X+Y; Y:=X+Y;
write_item(X); write_item(Y);
unlock(X); unlock(Y);
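Whether a transaction obeys 2PL can be checked mechanically: no lock may be requested after the first unlock. The sketch below applies this test to T1 and T2 above; the function obeys_2pl and the counter-example not_2pl are hypothetical and only look at the operation names.

```python
def obeys_2pl(operations):
    """True if every lock request comes before the first unlock."""
    shrinking = False
    for op in operations:
        name = op.split("(")[0]
        if name == "unlock":
            shrinking = True              # the shrinking phase has started
        elif name in ("read_lock", "write_lock") and shrinking:
            return False                  # a lock acquired after a release
    return True

t1 = ["read_lock(Y)", "read_item(Y)", "write_lock(X)", "unlock(Y)",
      "read_item(X)", "X:=X+Y", "write_item(X)", "unlock(X)"]
t2 = ["read_lock(X)", "read_item(X)", "write_lock(Y)", "unlock(X)",
      "read_item(Y)", "Y:=X+Y", "write_item(Y)", "unlock(Y)"]
# Same work as T1, but Y is unlocked before X is locked.
not_2pl = ["read_lock(Y)", "read_item(Y)", "unlock(Y)", "write_lock(X)",
           "read_item(X)", "X:=X+Y", "write_item(X)", "unlock(X)"]

print(obeys_2pl(t1), obeys_2pl(t2), obeys_2pl(not_2pl))  # True True False
```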
A deadlock is a condition that occurs when two or more different database tasks are waiting for
each other and none of the tasks is willing to give up the resources that the other tasks need.
Consider two transactions and a schedule involving these transactions :
T1 T2 Schedule
X_lock A X_lock A T1: X_lock A
X_lock B X_lock B T2: X_lock B
: : T1: X_lock B
: : T2: X_lock A
Unlock A Unlock A
Unlock B Unlock B
After T1 has locked A, T2 locks B. T1 then tries to lock B but is unable to do so and waits for T2 to
unlock B. Similarly, T2 tries to lock A but finds that it is held by T1, which has not yet unlocked it, and
thus waits for T1 to unlock A. At this stage, neither T1 nor T2 can proceed since both of these
transactions are waiting for the other to unlock the locked resource. In such a situation, we say that a
deadlock has occurred, since the two transactions are waiting for a condition that will never occur.
Deadlock detection :
A simple way to detect a state of deadlock is to draw a directed graph called a “wait for” graph. The
wait-for graph is maintained by the lock manager of the DBMS. This graph G is defined by the pair (V,E)
where V is a set of vertices/nodes and E is a set of edges/arcs. Each transaction is represented by a node,
and an arc Ti->Tj means that Ti is waiting for a data item/resource that is held by Tj.
When transaction Ti requests a data item that is held by Tj, the edge Ti->Tj is inserted in the
“wait for” graph. This edge is removed when transaction Tj is no longer holding the data item needed
by transaction Ti. A deadlock in the system of transactions occurs if and only if the wait-for graph
contains a cycle. Deadlock can be detected by doing a periodic check for cycles in the graph.
T1->T2, T2->T1
In the above figure, T1 and T2 are the two transactions. T1 and T2 are waiting for each other to
unlock a resource held by the other, forming a cycle, causing a deadlock problem.
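The periodic check for cycles can be sketched with a depth-first search over the wait-for graph. The representation is an assumption: the graph is a dictionary that maps each transaction Ti to the set of transactions it is waiting for, and has_deadlock is a hypothetical helper.

```python
def has_deadlock(wait_for):
    """wait_for: dict mapping Ti to the set of transactions Ti is waiting for."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:
                return True           # reached a node on the current path: cycle
            if state == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))

two_cycle = {"T1": {"T2"}, "T2": {"T1"}}                   # T1 and T2 wait for each other
no_cycle = {"T1": {"T2"}, "T2": set()}                     # T1 waits, T2 can finish
three_cycle = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}

print(has_deadlock(two_cycle), has_deadlock(no_cycle), has_deadlock(three_cycle))
```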
Deadlock conditions :
A deadlock occurs because of the following conditions :
• Mutual exclusion : At least one resource cannot be used by more than one process at a
time, i.e. the resource cannot be shared between processes. A resource can be locked in exclusive
mode by only one transaction at a time.
• Non-preemption : A data item can only be unlocked by the transaction that locked it. No other
transaction can unlock it.
• Partial allocation : A transaction can acquire locks on the database in a piecemeal fashion.
• Circular waiting : Transactions lock part of the data resources they need and then wait indefinitely
to lock a resource currently locked by other transactions. One process waits for a resource held by a
second process, the second process waits for a third process and so on, and the last process waits
for the first process, forming a circular chain of waiting.
To prevent deadlock, one has to ensure that at least one of these conditions does not occur.
Deadlock Prevention :
The basic logic of deadlock prevention is not to allow circular wait to occur. In this
approach, some of the transactions are rolled back instead of letting them wait.
• “Wait-die” scheme : It is based on a non-preemptive technique. It is based on a simple rule :
If Ti requests a data item that is held by Tj then
if Ti has a smaller timestamp than that of Tj
it is allowed to wait;
else Ti aborts
A timestamp may be defined as a sequence number that is unique for each transaction.
Therefore, a smaller timestamp means an older transaction.
For example, assume that three transactions T1, T2 and T3 were generated in that sequence.
If T1 requests a data item which is currently held by transaction T2, it is allowed to
wait as it has a smaller timestamp than that of T2. However, if T3 requests a data item
which is currently held by transaction T2, then T3 is rolled back (dies).
T1 --wait--> T2 <--die-- T3
• “Wound-wait” scheme : It is based on a preemptive technique. It is based on a simple rule :
If Ti requests a data item that is held by Tj then
if Ti has a larger timestamp than that of Tj
it is allowed to wait;
else Tj is wounded (rolled back) by Ti
For example, assume that three transactions T1, T2 and T3 were generated in that
sequence. If T1 requests a data item which is held by transaction T2, then T2 is rolled
back and the data item is allotted to T1, as T1 has a smaller timestamp than that of T2. However,
if T3 requests a data item which is currently held by transaction T2, then T3 is allowed to
wait.
T1 --wound--> T2 <--wait-- T3
Whenever a transaction is rolled back, this should not create a starvation condition, i.e. no
transaction should be rolled back repeatedly and never be allowed to make progress. Both the
“wait-die” and “wound-wait” schemes avoid starvation.
The number of aborts and rollbacks will be higher in the wait-die scheme than in the wound-wait
scheme.
The problem with both schemes is that they may result in unnecessary rollbacks.
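The two rules differ only in who is rolled back when an older transaction meets a younger one. A minimal sketch, assuming integer timestamps where a smaller value means an older transaction; the function names and the returned strings are hypothetical:

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: a younger requester waits, an older requester wounds the holder."""
    return "wait" if ts_requester > ts_holder else "abort holder"

# T1, T2 and T3 were generated in that sequence; T2 holds the data item.
ts = {"T1": 1, "T2": 2, "T3": 3}

print(wait_die(ts["T1"], ts["T2"]))    # wait
print(wait_die(ts["T3"], ts["T2"]))    # abort requester
print(wound_wait(ts["T1"], ts["T2"]))  # abort holder
print(wound_wait(ts["T3"], ts["T2"]))  # wait
```

In both schemes a wait is only ever allowed in one direction of age, which is why a cycle of waiting transactions cannot form.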
Differentiate between Wait die and wound wait protocols.
Wait die :
• It’s a non-preemptive technique.
• When transaction Ti requests a data item that is held by Tj, Ti is allowed to wait only if it has a
timestamp smaller than that of Tj (i.e. Ti is older than Tj), otherwise Ti is rolled back (dies).
• Number of aborts and rollbacks is more.
• Example : assume that three transactions T1, T2 and T3 were generated in that sequence. If T1
requests a data item which is currently held by transaction T2, it is allowed to wait as it has a
smaller timestamp than that of T2. However, if T3 requests a data item which is currently held by
transaction T2, then T3 is rolled back (dies).
Wound wait :
• It’s a preemptive technique.
• When transaction Ti requests a data item that is held by transaction Tj, Ti is allowed to wait
only if it has a timestamp larger than that of Tj, otherwise Tj is wounded (rolled back) by Ti.
• Number of aborts and rollbacks is less.
• Example : assume that three transactions T1, T2 and T3 were generated in that sequence. If T1
requests a data item which is held by transaction T2, then T2 is rolled back and the data item is
allotted to T1, as T1 has a smaller timestamp than that of T2. However, if T3 requests a data item
which is currently held by transaction T2, then T3 is allowed to wait.
What is optimistic scheduling ? Explain the three phases of optimistic scheduling.
1. Read phase : A transaction T reads the data items from the database into its private
workspace. All the updates of the transaction can only change the local copies of the data in
the private workspace.
2. Validate phase : Checking is done to confirm whether the read values have changed during
the time the transaction was updating the local values. This is performed by comparing the
current database values to the values that were read into the private workspace. If the
values have changed, the local copies are thrown away and the transaction aborts.
3. Write phase : If validation phase is successful the transaction is committed and updates are
applied to the database, otherwise the transaction is rolled back.
• Timestamps : for each transaction T, the start time and the end time are kept for all the
three phases.
Example : In this case both T1 and T2 get committed. The read sets of T1 and T2 are
disjoint, and their write sets are also disjoint, and thus no concurrency conflict occurs.
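The three phases can be sketched as follows. This is an illustration under assumptions: the database is a plain dictionary, validation compares current values with the values that were read (as described in the validate phase above), timestamps are left out, and the class OptimisticTxn and the sample data are hypothetical.

```python
class OptimisticTxn:
    def __init__(self, db):
        self.db = db
        self.read_values = {}   # values seen during the read phase
        self.workspace = {}     # private local copies

    def read(self, key):
        # Read phase: copy the item into the private workspace on first access.
        if key not in self.workspace:
            self.read_values[key] = self.db[key]
            self.workspace[key] = self.db[key]
        return self.workspace[key]

    def write(self, key, value):
        # Updates only change the local copy.
        self.workspace[key] = value

    def commit(self):
        # Validate phase: has anything that was read changed in the database?
        for key, seen in self.read_values.items():
            if self.db[key] != seen:
                self.workspace.clear()   # throw the local copies away
                return False             # the transaction aborts
        # Write phase: apply the local updates to the database.
        self.db.update(self.workspace)
        return True

db = {"A": 10, "B": 20}

# Disjoint read and write sets: both transactions commit.
t1, t2 = OptimisticTxn(db), OptimisticTxn(db)
t1.write("A", t1.read("A") + 1)
t2.write("B", t2.read("B") + 1)
both = (t1.commit(), t2.commit())

# Overlapping sets: t4 commits first, so t3 fails validation.
t3, t4 = OptimisticTxn(db), OptimisticTxn(db)
t3.write("A", t3.read("A") + 100)
t4.write("A", t4.read("A") + 1)
overlap = (t4.commit(), t3.commit())

print(both, overlap, db)
```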
Discuss the multiversion technique for concurrency control.
A drawback of this is that more storage is needed to maintain multiple versions of the database
items.