Database Management Systems
(ITEC-212)
UNIT 3 – TRANSACTION PROCESSING
Unit Outline
◦ Introduction to Transaction Processing
◦ Transaction and System Concepts
◦ Desirable Properties of Transactions
Transaction
◦ In DBMS, a transaction refers to a logical unit of work that consists of one or more operations. It is
a sequence of operations that must be executed as a whole, ensuring data integrity and
consistency.
◦ A transaction can also be defined as a group of tasks, where a single task is the minimum
processing unit which cannot be divided further.
◦ Example: Suppose a bank employee transfers $500 from A's account to B's account. This very
simple and small transaction involves several low-level tasks.
A’s Account                        B’s Account
Open_Account(A)                    Open_Account(B)
Old_Balance = A.balance            Old_Balance = B.balance
New_Balance = Old_Balance - 500    New_Balance = Old_Balance + 500
A.balance = New_Balance            B.balance = New_Balance
Close_Account(A)                   Close_Account(B)
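The low-level tasks in this example can be sketched in a few lines of Python. This is only an illustration: the `accounts` dictionary, the starting balances, and the `transfer` function are assumptions made for the sketch, and a real DBMS would perform these steps on disk-resident data.

```python
def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst using the low-level steps from the example."""
    # Debit side (A's account)
    old_balance = accounts[src]          # Old_Balance = A.balance
    new_balance = old_balance - amount   # New_Balance = Old_Balance - 500
    accounts[src] = new_balance          # A.balance = New_Balance

    # Credit side (B's account)
    old_balance = accounts[dst]          # Old_Balance = B.balance
    new_balance = old_balance + amount   # New_Balance = Old_Balance + 500
    accounts[dst] = new_balance          # B.balance = New_Balance


# Hypothetical starting balances, chosen only for the illustration.
accounts = {"A": 1000, "B": 200}
transfer(accounts, "A", "B", 500)
print(accounts)
```

If the program stopped between the debit and the credit, $500 would disappear from the system, which is why both halves must be treated as one transaction.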
Access Operations in a Transaction
A logical unit of database processing may include one or more of the following access operations. To
simplify our notation, we assume that the program variable is also named X.
read_item(X): Reads a database item named X into a program variable. It includes the following steps:
◦ Find the address of the disk block that contains item X.
◦ Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
◦ Copy item X from the buffer to the program variable named X.
write_item(X): Writes the value of program variable X into the database item named X. Its steps include:
◦ Find the address of the disk block that contains item X.
◦ Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
◦ Copy item X from the program variable named X into its correct location in the buffer.
◦ Store the updated block from the buffer back to disk (either
Two sample transactions
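The read_item and write_item steps can be imitated with three dictionaries standing in for the disk, the main-memory buffer, and the program variables. This is a minimal sketch under stated assumptions: each item occupies its own "block", write_item flushes to disk immediately, and the transactions `t1` and `t2` with their values are hypothetical examples.

```python
disk = {"X": 80, "Y": 20}   # hypothetical database items, one item per block
buffer = {}                 # main-memory buffer
program = {}                # program variables, named like the items


def read_item(name):
    if name not in buffer:            # copy the block only if it is not buffered yet
        buffer[name] = disk[name]
    program[name] = buffer[name]      # copy the item into the program variable


def write_item(name):
    if name not in buffer:            # make sure the block is in the buffer
        buffer[name] = disk[name]
    buffer[name] = program[name]      # copy the program variable into the buffer
    disk[name] = buffer[name]         # assumption: flush to disk right away


def t1(n):
    """Hypothetical transaction: move n units from X to Y."""
    read_item("X")
    program["X"] -= n
    write_item("X")
    read_item("Y")
    program["Y"] += n
    write_item("Y")


def t2(m):
    """Hypothetical transaction: add m units to X."""
    read_item("X")
    program["X"] += m
    write_item("X")


t1(5)
t2(4)
print(disk)
```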
Introduction to Transaction Processing
Single-User System
◦ In a single-user system, at most one user at a time can use the system.
Multi-User System
◦ In a multi-user system, many users can access the system concurrently.
Concurrency in DBMS
Concurrency
◦ Concurrency is the ability of a DBMS to handle multiple transactions
simultaneously.
◦ It ensures that multiple users can access and modify the database concurrently
without interfering with each other's operations.
Concurrency control
◦ Mechanisms, such as locking and timestamp-based protocols, are implemented
to maintain data integrity and prevent conflicts between transactions.
◦ These mechanisms ensure that transactions are executed in a consistent and
isolated manner, allowing for efficient and reliable concurrent access.
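As a rough illustration of the locking idea, the sketch below uses a Python `threading.Lock` so that only one thread at a time can run the read-modify-write step on a shared balance. The `account` dictionary, the thread count, and the `deposit` function are assumptions for the example. A DBMS lock manager works on data items and transactions rather than threads, so this shows the principle only.

```python
import threading

account = {"balance": 0}    # hypothetical shared data item
lock = threading.Lock()     # stands in for a lock on that item


def deposit(times):
    for _ in range(times):
        with lock:                       # acquire the lock before touching the item
            current = account["balance"]        # read
            account["balance"] = current + 1    # modify and write
        # the lock is released here, so another thread may proceed


threads = [threading.Thread(target=deposit, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account["balance"])
```

Because every read-modify-write happens while the lock is held, no update can be overwritten by another thread working from a stale value.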
Database State
In DBMS, the database state refers to the collection of all data stored in the database at a
particular point in time. It represents the current snapshot of the database, including all
records, tables, and relationships.
The state of the database can change as transactions are executed. When a transaction
modifies data, the database state is updated accordingly. The state of the database at any
given time is determined by the sequence of transactions that have been executed on it.
States of Transactions
A transaction in a database can be in one of the following states:
Active: In this state, the transaction is being executed. This is the initial state of every transaction.
Partially Committed: When a transaction executes its final operation, it is said to be in a partially committed
state.
Failed: A transaction is said to be in a failed state if any of the checks made by the database recovery
system fails. A failed transaction can no longer proceed further.
Aborted: If any of the checks fails and the transaction has reached a failed state, then the recovery
manager rolls back all its write operations on the database to bring the database back to its original state
where it was prior to the execution of the transaction. Transactions in this state are called aborted. The
database recovery module can select one of the two operations after a transaction aborts:
◦ Re-start the transaction
◦ Kill the transaction
Committed: If a transaction executes all its operations successfully, it is said to be committed. All its effects
are now permanently established on the database system.
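The states above can be modelled as a small state machine. The transition table below is an assumption based on the usual textbook state diagram (for example, that a partially committed transaction can still fail); the names `State`, `TRANSITIONS`, and `advance` are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Assumed legal transitions between states.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),      # a restart would begin a new transaction in ACTIVE
    State.COMMITTED: set(),
}


def advance(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = State.ACTIVE
state = advance(state, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
print(state.value)
```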
ACID
A transaction in a database system must maintain Atomicity, Consistency, Isolation, and
Durability, commonly known as the ACID properties, in order to ensure accuracy, completeness, and
data integrity.
Atomicity: This property states that a transaction must be treated as an atomic unit. It means
that it is either executed completely or not at all. There must be no state in a database where a
transaction is left partially completed.
Consistency: The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the database
was in a consistent state before the execution of a transaction, it must remain consistent after
the execution of the transaction as well.
Isolation: In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each transaction will be
carried out and executed as if it were the only transaction in the system. No transaction will affect
the execution of any other transaction.
Durability: The database should be durable enough to hold all its latest updates even if the
system fails or restarts. If a transaction updates a chunk of data in a database and commits, then
the database will hold the modified data. If a transaction commits but the system fails before
the data could be written to disk, then the committed changes will be restored once the system
comes back into action.
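Atomicity and consistency can be observed with Python's built-in `sqlite3` module. In the sketch below, the table layout, the `CHECK (balance >= 0)` rule, and the balances are assumptions chosen for the illustration. The transfer credits B first and then tries to debit A by more than A holds; the constraint violation causes the whole transaction to be rolled back, so the credit to B disappears as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"   # assumed consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 300), ("B", 200)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Return True if the transfer committed, False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 500)   # A only holds 300, so the debit fails
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, balances)
```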
Interleaved Processing vs Parallel Processing
Interleaved processing
◦ Interleaved processing refers to the execution of multiple transactions or processes in
an alternating or interleaved manner.
◦ Instead of executing one transaction in its entirety before moving on to the next,
interleaved processing lets a single CPU alternate between the operations of several
transactions, thus creating the illusion of simultaneous execution.
Parallel processing
◦ Parallel processing involves executing multiple tasks or operations simultaneously on
multiple processors or computing resources.
◦ It divides tasks into smaller subtasks that can be processed concurrently on multiple
CPUs, leading to improved performance and scalability.
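Interleaved processing on a single CPU can be imitated with Python generators: each transaction yields one operation at a time, and a simple round-robin loop switches between them. The `transaction` and `interleave` helpers and the operation names are hypothetical, and real schedulers usually switch on I/O waits rather than after every operation.

```python
def transaction(name, steps):
    """Yield the operations of one transaction, one at a time."""
    for step in steps:
        yield f"{name}:{step}"


def interleave(*txns):
    """Take one operation from each unfinished transaction in turn."""
    order = []
    pending = list(txns)
    while pending:
        for txn in list(pending):      # iterate over a copy so removal is safe
            try:
                order.append(next(txn))
            except StopIteration:
                pending.remove(txn)    # this transaction has no operations left
    return order


schedule = interleave(
    transaction("T1", ["read(X)", "write(X)"]),
    transaction("T2", ["read(Y)", "write(Y)", "commit"]),
)
print(schedule)
```

Only one operation runs at any instant, yet both transactions make progress, which is the illusion of simultaneity described above.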
Serial Schedule vs Parallel schedule
Schedule: A chronological execution sequence of the operations of one or more transactions is
called a schedule. A schedule can have many transactions in it, each comprising a number of
instructions/tasks.
Serial Schedule: It is a schedule in which transactions are aligned in such a way that one
transaction is executed first. When the first transaction completes its cycle, then the
next transaction is executed. Transactions are ordered one after the other. This type of
schedule is called a serial schedule, as transactions are executed in a serial manner.
Parallel schedule: It involves executing multiple transactions simultaneously, with their
operations being interleaved or executed in parallel. Parallel schedules can be achieved
using parallel processing by dividing the workload among multiple CPUs or computing
resources.
Serializability
In DBMS, serializability refers to a property that ensures that the execution of concurrent
transactions produces the same result as if they were executed serially, one after another.
It guarantees that the final outcome of concurrent transactions is equivalent to some sequential
order of executing those transactions.
Advantages of Serializability
◦ Serializability is important for maintaining data consistency and integrity in a multi-user
environment.
◦ It prevents conflicts and ensures that the database remains in a valid state throughout
concurrent transaction execution.
Problems in Parallel Execution of Transactions
In a multi-transaction environment, serial schedules are considered a benchmark.
The execution order of the instructions within a single transaction cannot be changed, but the
instructions of two transactions can be interleaved in an arbitrary order. This execution does no
harm if two transactions are mutually independent and working on different segments of data;
but if these two transactions are working on the same data, then the results may vary.
This varying result may bring the database to an inconsistent state.
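The effect of interleaving on shared data can be reproduced with a tiny schedule interpreter. In this sketch the step format `(transaction, action, argument)`, the `run` function, and the amounts are assumptions for the illustration. Both transactions update the same item X: the serial schedule applies both changes, while the interleaved one loses T1's update because T2 writes a value computed from a stale read.

```python
def run(schedule, initial):
    """Execute (txn, action, arg) steps against a single item X."""
    db = {"X": initial}
    local = {}                       # each transaction's private copy of X
    for txn, action, arg in schedule:
        if action == "read":
            local[txn] = db["X"]
        elif action == "add":
            local[txn] += arg
        elif action == "write":
            db["X"] = local[txn]
    return db["X"]


# Hypothetical transactions: T1 withdraws 50, T2 deposits 30.
T1 = [("T1", "read", None), ("T1", "add", -50), ("T1", "write", None)]
T2 = [("T2", "read", None), ("T2", "add", 30), ("T2", "write", None)]

serial = T1 + T2
interleaved = [T1[0], T2[0], T1[1], T2[1], T1[2], T2[2]]

serial_result = run(serial, 100)             # 100 - 50 + 30
interleaved_result = run(interleaved, 100)   # T2 overwrites T1's write
print(serial_result, interleaved_result)
```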
To resolve this problem, we allow parallel execution of a transaction schedule, if its transactions
are either serializable or have some equivalence relation among them.
Equivalence of Schedules
Two schedules can be equivalent in the following ways:
Result Equivalence
View Equivalence
Conflict Equivalence
Result Equivalence
If two schedules produce the same result after execution, they are said to be result equivalent.
They may yield the same result for some value and different results for another set of values.
That's why this equivalence is not generally considered significant.
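A small numeric example shows why result equivalence is fragile. The two functions below stand in for two hypothetical schedules: one adds 10 to X, the other increases X by 10 percent (using integer arithmetic). They agree when X starts at 100 but disagree for other starting values.

```python
def schedule_1(x):
    """Hypothetical schedule whose net effect is X = X + 10."""
    return x + 10


def schedule_2(x):
    """Hypothetical schedule whose net effect is X = X * 1.1 (integer math)."""
    return x * 11 // 10


same_for_100 = schedule_1(100) == schedule_2(100)
same_for_200 = schedule_1(200) == schedule_2(200)
print(same_for_100, same_for_200)
```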
View Equivalence
Two schedules are view equivalent if the transactions in both schedules perform
similar actions in a similar manner.
For example:
◦ If T reads the initial data in S1, then it also reads the initial data in S2.
◦ If T reads the value written by J in S1, then it also reads the value written by J in S2.
◦ If T performs the final write on the data value in S1, then it also performs the final write on the data
value in S2.
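The three conditions can be checked mechanically. The sketch below represents a schedule as a list of `(transaction, operation, item)` tuples, with `"r"` for read and `"w"` for write. This representation, the helper names, and the sample schedules are assumptions. It is also simplified: it records which transaction a read takes its value from, not which individual write, so it is only exact when each transaction writes an item at most once.

```python
def view_profile(schedule):
    """Return (reads-from list, final writer per item) for a schedule."""
    last_writer = {}
    reads_from = []
    for txn, op, item in schedule:
        if op == "r":
            # None means the read sees the initial value of the item.
            reads_from.append((txn, item, last_writer.get(item)))
        elif op == "w":
            last_writer[item] = txn
    return sorted(reads_from, key=repr), last_writer


def view_equivalent(s1, s2):
    same_operations = sorted(s1) == sorted(s2)
    return same_operations and view_profile(s1) == view_profile(s2)


serial = [("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"),
          ("T1", "w", "Y"), ("T2", "r", "X"), ("T2", "w", "X")]
interleaved = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"),
               ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]
stale_read = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
              ("T1", "r", "Y"), ("T1", "w", "Y"), ("T2", "w", "X")]

print(view_equivalent(serial, interleaved))
print(view_equivalent(serial, stale_read))
```

In `stale_read`, T2 reads the initial value of X instead of the value written by T1, so the second condition fails.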
Conflict Equivalence
Two operations are said to conflict if they have the following properties:
◦ They belong to separate transactions.
◦ They access the same data item.
◦ At least one of them is a "write" operation.
Two schedules having multiple transactions with conflicting operations are said to be conflict
equivalent if and only if:
◦ Both schedules contain the same set of transactions.
◦ The order of conflicting pairs of operations is maintained in both schedules.
Note: View equivalent schedules are view serializable and conflict equivalent schedules are
conflict serializable. All conflict serializable schedules are view serializable too.
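Conflict equivalence can be tested by collecting every ordered pair of conflicting operations in each schedule and comparing the two sets. As before, the `(transaction, operation, item)` tuple format, the helper names, and the sample schedules are assumptions for the sketch, and it assumes that no transaction repeats the same operation on the same item.

```python
def conflicts(a, b):
    """Two operations conflict if they are from different transactions,
    touch the same item, and at least one of them is a write."""
    return a[0] != b[0] and a[2] == b[2] and "w" in (a[1], b[1])


def conflict_pairs(schedule):
    """Ordered pairs (earlier, later) of conflicting operations."""
    return {
        (a, b)
        for i, a in enumerate(schedule)
        for b in schedule[i + 1:]
        if conflicts(a, b)
    }


def conflict_equivalent(s1, s2):
    same_operations = sorted(s1) == sorted(s2)
    return same_operations and conflict_pairs(s1) == conflict_pairs(s2)


serial = [("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"),
          ("T1", "w", "Y"), ("T2", "r", "X"), ("T2", "w", "X")]
interleaved = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"),
               ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]
reordered = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
             ("T1", "r", "Y"), ("T1", "w", "Y"), ("T2", "w", "X")]

print(conflict_equivalent(serial, interleaved))
print(conflict_equivalent(serial, reordered))
```

The `interleaved` schedule is conflict equivalent to the serial schedule T1 then T2, so it is conflict serializable. In `reordered`, T2's read of X moves ahead of T1's write of X, which reverses a conflicting pair.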