UNIT THREE DBMS NOTES-1
UNIT – III
Structured Query Language: Basic structure of SQL queries, Examples of basic SQL queries, set
operations, Aggregate functions, Logical connectives (AND, OR, NOT), comparison operators,
comparison using NULL values, Integrity constraints over relations, enforcing integrity constraints,
disallowing null values, introduction to nested queries, outer joins, creating, altering and destroying
views, triggers.
COURSE OBJECTIVES:
• To get familiar with fundamental concepts of database management such as database design,
database languages, and database-system implementation
COURSE OUTCOMES:
• Develop the knowledge of fundamental concepts of database management systems.
SQL is a database language designed for managing data in relational database
management systems (RDBMS), and was originally based upon relational algebra. Its scope includes
data query and update, schema creation and modification, and data access control. SQL was one of
the first languages for Edgar F. Codd's relational model, described in his influential 1970 paper
"A Relational Model of Data for Large Shared Data Banks", and became the most widely used
language for relational databases.
• IBM developed SQL in the mid-1970s.
• Oracle incorporated SQL in 1979.
• SQL is used by IBM DB2 and DS Database Systems.
• SQL was adopted as a standard language for RDBMS by ANSI in 1986 and revised in 1989.
The basic use of SQL for data professionals and SQL users is to insert, update, and delete data
in the relational database.
UNIT-3-Lecture Notes for BE CSE(DS) III SEM DBMS
• It allows SQL users to create, drop, and manipulate the database and its tables.
• It also helps in creating the view, stored procedure, and functions in the relational database.
• It allows you to define the data and modify that stored data in the relational database.
• It also allows SQL users to set the permissions or constraints on table columns, views, and
stored procedures.
CREATE TABLE
CREATE TABLE creates a new table inside a database. The terms int and varchar(255)
in this example specify the datatypes of the columns we're creating.
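A minimal sketch of such a statement, assuming a hypothetical customers table with id, name, and age columns (these names are illustrative and are reused in the later examples):

```sql
-- Hypothetical table; the column names are assumptions made for illustration.
CREATE TABLE customers (
    id   INT,
    name VARCHAR(255),
    age  INT
);
```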
CREATE INDEX
CREATE INDEX generates an index for a table. Indexes are used to retrieve data from a database faster.
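As a sketch, an index on the name column of the assumed customers table could be created as follows (idx_name is an illustrative index name):

```sql
-- Speeds up lookups that filter or sort on customers.name.
CREATE INDEX idx_name ON customers (name);
```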
DROP
DROP statements can be used to delete entire databases, tables or indexes.
It goes without saying that the DROP command should only be used where absolutely necessary.
DROP TABLE
DROP TABLE deletes a table as well as the data within it.
DROP INDEX
DROP INDEX deletes an index within a database.
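A short sketch of both statements, reusing the assumed customers table and idx_name index. The exact DROP INDEX form varies between database systems, so treat this as one possible spelling:

```sql
-- PostgreSQL and Oracle style; MySQL and SQL Server expect
-- DROP INDEX idx_name ON customers;
DROP INDEX idx_name;

-- Removes the table definition together with all of its rows.
DROP TABLE customers;
```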
UPDATE
UPDATE modifies existing rows in a table; the WHERE clause selects which rows are changed.
UPDATE customers
SET age = 56
WHERE name = 'Bob';
DELETE
DELETE can remove all rows from a table (when used without a WHERE clause), or can be combined
with a WHERE clause to delete only the rows that meet a specific condition.
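Both forms can be sketched against the assumed customers table:

```sql
-- Delete only the rows that match the condition.
DELETE FROM customers WHERE name = 'Bob';

-- No WHERE clause: every row is removed, but the table itself remains.
DELETE FROM customers;
```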
INSERT INTO
INSERT INTO adds new rows to a table. If you are adding values for all the columns of the table,
you do not need to specify the column names in the SQL query. However, make sure the values are
listed in the same order as the columns in the table.
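The two forms of INSERT INTO might look like this, assuming the customers table has the columns id, name, and age in that order (the sample values are invented):

```sql
-- Column names listed explicitly.
INSERT INTO customers (id, name, age) VALUES (1, 'Bob', 56);

-- Column names omitted: values must follow the table's column order.
INSERT INTO customers VALUES (2, 'Alice', 34);
```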
SELECT
SELECT is probably the most commonly-used SQL statement. You'll use it pretty much every
time you query data with SQL. It allows you to define what data you want your query to return.
For example, in the code below, we’re selecting a column called name from a table called customers.
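A one-line sketch of that query:

```sql
SELECT name FROM customers;
```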
SELECT *
SELECT used with an asterisk (*) will return all of the columns in the table we're querying.
SELECT DISTINCT
SELECT DISTINCT only returns data that is distinct — in other words, if there are duplicate records,
it will return only one copy of each.
The code below would return only rows with a unique name from the customers table.
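Such a query could be written as:

```sql
SELECT DISTINCT name FROM customers;
```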
SELECT TOP
SELECT TOP returns only a fixed number of rows from the top of a result set.
The code below would return the top 50 results from the customers table:
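The keyword used to limit the number of rows depends on the database system, so the following is a sketch of two common spellings rather than a single portable statement:

```sql
-- SQL Server syntax.
SELECT TOP 50 * FROM customers;

-- MySQL, PostgreSQL, and SQLite equivalent.
SELECT * FROM customers LIMIT 50;
```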
AS
AS renames a column or table with an alias that we can choose. For example, in the code
below, we’re renaming the name column as first_name:
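A sketch of the alias in use:

```sql
-- The result column is labelled first_name instead of name.
SELECT name AS first_name FROM customers;
```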
FROM
FROM specifies the table we're pulling our data from:
WHERE
WHERE filters your query to only return results that match a set condition. We can use this
together with conditional operators like =, >, <, >=, <=, etc.
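For instance, assuming the customers table has an age column, a filter might look like this:

```sql
-- Only customers aged 18 or over are returned.
SELECT name FROM customers WHERE age >= 18;
```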
AND
AND combines two or more conditions in a single query. All of the conditions must be
met for the result to be returned.
OR
OR combines two or more conditions in a single query. Only one of the conditions must be
met for a result to be returned.
BETWEEN
BETWEEN filters your query to return only results that fit a specified range.
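A sketch using the assumed age column; note that both end points of the range are included:

```sql
SELECT name FROM customers WHERE age BETWEEN 25 AND 35;
```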
LIKE
LIKE searches for a specified pattern in a column. In the example code below, any row
with a name that included the characters Bob would be returned.
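One way to express that search is with the % wildcard, which matches any sequence of characters:

```sql
SELECT name FROM customers WHERE name LIKE '%Bob%';
```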
IN
IN allows us to specify multiple values we want to select for when using the WHERE command.
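A sketch with three illustrative names:

```sql
SELECT name FROM customers WHERE name IN ('Bob', 'Fred', 'Harry');
```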
IS NULL
IS NULL will return only rows with a NULL value.
SELECT name FROM customers WHERE name IS NULL;
IS NOT NULL
IS NOT NULL does the opposite — it will return only rows without a NULL value.
SET OPERATIONS
Set operators are a special type of operator used to combine the results of two queries.
Operators covered under SET operators are:
• UNION
• UNION ALL
• INTERSECT
• MINUS
There are certain rules which must be followed to perform operations using SET operators in SQL.
Rules are as follows:
• The number and order of columns must be the same.
• Data types must be compatible.
UNION
Union is used to combine the results of two or more SELECT statements. However it will eliminate
duplicate rows from its resultset. In case of union, number of columns and datatype must be same in both
the tables, on which UNION operation is being applied.
Example of UNION
The First table,
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
The result of the UNION,
ID NAME
1 abhi
2 adam
3 Chester
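The result above could be produced by a query along these lines, assuming the two tables are named first_table and second_table (hypothetical names):

```sql
-- The duplicate row (2, adam) appears only once in the result.
SELECT ID, NAME FROM first_table
UNION
SELECT ID, NAME FROM second_table;
```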
UNION ALL
This operation is similar to UNION, but it also shows the duplicate rows.
Example of UNION ALL
The First table,
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
The result of the UNION ALL,
ID NAME
1 abhi
2 adam
2 adam
3 Chester
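A sketch of the corresponding query, again assuming the hypothetical table names first_table and second_table:

```sql
-- Duplicates are kept, so (2, adam) appears twice.
SELECT ID, NAME FROM first_table
UNION ALL
SELECT ID, NAME FROM second_table;
```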
INTERSECT
The INTERSECT operation combines two SELECT statements, but it returns only the records that are
common to both SELECT statements. For INTERSECT, the number of columns and their datatypes must
be the same.
Example of INTERSECT
The First table,
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
The result of the INTERSECT,
ID NAME
2 adam
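A sketch of the query, with first_table and second_table as assumed table names. Not every database system supports INTERSECT, so this depends on the product in use:

```sql
-- Only rows present in both tables are returned.
SELECT ID, NAME FROM first_table
INTERSECT
SELECT ID, NAME FROM second_table;
```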
MINUS
The MINUS operation combines the results of two SELECT statements and returns only those rows
that belong to the first result but not to the second.
Example of MINUS
The First table,
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
The result of the MINUS,
ID NAME
1 abhi
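A sketch of the query with the assumed table names first_table and second_table. MINUS is the Oracle keyword; standard SQL and systems such as PostgreSQL and SQL Server spell the same operation EXCEPT:

```sql
-- Oracle syntax.
SELECT ID, NAME FROM first_table
MINUS
SELECT ID, NAME FROM second_table;

-- Standard SQL equivalent.
SELECT ID, NAME FROM first_table
EXCEPT
SELECT ID, NAME FROM second_table;
```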
AGGREGATE FUNCTIONS
SQL aggregate functions take the values of multiple rows, grouped on certain criteria, as input
and return a single, more meaningful value.
SQL Aggregate functions are mostly used with the GROUP BY clause of the SELECT statement.
1. Count()
2. Sum()
3. Avg()
4. Min()
5. Max()
Example:
--Count the number of employees
SELECT COUNT(*) AS TotalEmployees FROM Employee;
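The remaining functions follow the same pattern. The sketch below assumes the Employee table has Salary and Department columns, which the notes do not state:

```sql
-- Total and average salary over all employees.
SELECT SUM(Salary) AS TotalSalary FROM Employee;
SELECT AVG(Salary) AS AverageSalary FROM Employee;

-- Lowest and highest salary.
SELECT MIN(Salary) AS LowestSalary, MAX(Salary) AS HighestSalary FROM Employee;

-- With GROUP BY, each function is evaluated once per department.
SELECT Department, COUNT(*) AS Headcount, AVG(Salary) AS AverageSalary
FROM Employee
GROUP BY Department;
```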
LOGICAL CONNECTIVES (AND, OR, NOT)
SQL logical operators are used to test whether a condition is true. A logical operator, like a
comparison operator, returns a Boolean value of TRUE, FALSE, or UNKNOWN.
Logical operators are used to combine or manipulate the conditions given in a query to retrieve or
manipulate data. The logical operators in SQL include AND, OR, and NOT.
AND operator
When multiple conditions are combined using the AND operator, all rows which meet all of the given
conditions will be returned.
Example: SELECT * FROM members WHERE Age < 50 AND Location = 'Los Angeles';
OR operator
When multiple conditions are combined using the OR operator, all rows which meet any of the given
conditions will be returned.
Example: SELECT * FROM members WHERE Location = 'Los Angeles' OR LastName = 'Hanks';
NOT operator
When a condition is preceded by the NOT operator, all rows which do not meet that condition
will be returned.
Example: SELECT * FROM members WHERE NOT Location = 'Los Angeles';
COMPARISON OPERATORS
The comparison operators in SQL fall into six categories: equal to (=), not equal to (<>),
greater than (>), less than (<), greater than or equal to (>=), and less than or equal to (<=).
Examples:
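As an illustration, the six operators can be applied to the Age column of the members table used in the logical-operator examples (the value 50 is arbitrary):

```sql
SELECT * FROM members WHERE Age = 50;   -- equal to
SELECT * FROM members WHERE Age <> 50;  -- not equal to
SELECT * FROM members WHERE Age > 50;   -- greater than
SELECT * FROM members WHERE Age < 50;   -- less than
SELECT * FROM members WHERE Age >= 50;  -- greater than or equal to
SELECT * FROM members WHERE Age <= 50;  -- less than or equal to
```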
COMPARISON USING NULL VALUES
NULL in SQL represents an unknown or missing value. Almost every database contains such values
to represent fields that have been left empty. The comparison operators (=, <>, <, >) are widely
used in SQL queries.
These comparison operators cannot be used directly with NULL values, because the result of such a
comparison is UNKNOWN. SQL provides two special operators for handling NULL values:
IS NULL and IS NOT NULL.
IS NULL: The IS NULL operator is used to check if the values are NULL.
IS NOT NULL: The IS NOT NULL operator is used to check if the values are not NULL.
Example:
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(50),
Department VARCHAR(50),
Salary INT
);
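Continuing with the Employees table, the following sketch inserts two invented rows and contrasts an ordinary comparison against NULL with the IS NULL and IS NOT NULL operators. It assumes standard SQL null semantics:

```sql
-- Sample rows (values are invented); employee 2 has no department.
INSERT INTO Employees VALUES (1, 'John', 'IT', 50000);
INSERT INTO Employees VALUES (2, 'Sara', NULL, 45000);

-- Returns no rows: Department = NULL evaluates to UNKNOWN, never TRUE.
SELECT * FROM Employees WHERE Department = NULL;

-- Returns the row for employee 2.
SELECT * FROM Employees WHERE Department IS NULL;

-- Returns the row for employee 1.
SELECT * FROM Employees WHERE Department IS NOT NULL;
```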
INTEGRITY CONSTRAINTS
Domain Constraints
Domain constraints define the valid set of values for an attribute. The data types of a
domain include string, character, integer, time, date, currency, etc. The value of an attribute
must belong to its domain.
Not-Null Constraints
A not-null constraint specifies that, within a tuple, the attributes over which it is declared
must not contain a null value.
Key Constraints
A key is a set of attributes used to identify an entity uniquely within its entity set. An entity
set can have multiple keys, but one of them is chosen as the primary key. A primary key is
always unique and never contains a null value.
Primary Key Constraints
It states that the primary key attributes are required to be unique and not null. That is, the
primary key attributes of a relation must not have null values, and the primary key attributes of
two tuples must never be the same. This constraint is specified in the database schema on the
primary key attributes to ensure that no two tuples are the same.
Example:
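A minimal sketch of a primary key constraint, using a hypothetical department table (the building column and the sample values are assumptions). The last two statements are expected to be rejected by a standards-conforming DBMS:

```sql
CREATE TABLE department (
    dept_name VARCHAR(20) PRIMARY KEY,
    building  VARCHAR(15)
);

INSERT INTO department VALUES ('Comp. Sci.', 'Taylor');

-- Rejected: duplicates an existing primary key value.
INSERT INTO department VALUES ('Comp. Sci.', 'Watson');

-- Rejected: a primary key attribute cannot be NULL.
INSERT INTO department VALUES (NULL, 'Painter');
```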
5. Default: SQL allows a default value to be specified for an attribute, as illustrated by the
following create table statement:
create table student (ID number (5), name varchar (20) not null, deptname varchar (20),
totcred number (3) default 0, primary key (ID));
The default value of the totcred attribute is declared to be 0. As a result, when a tuple is
inserted into the student relation, if no value is provided for the totcred attribute, its value
is set to 0. The following insert statement illustrates how an insertion can omit the value for
the totcred attribute.
insert into student (ID, name, deptname) values (12789, 'Newman', 'Comp. Sci.');
6. Referential Integrity: The referential integrity rule in a DBMS is based on primary and foreign
keys. The rule states that every foreign key value must have a matching primary key value, so that
a reference from one table to another is always valid. For example, each dept_name in the Course
table must have a matching dept_name in the Department table.
Ex: create table Course (course_id varchar (8), course_name varchar (25), dept_name varchar (20),
credits number (3) check (credits > 0), primary key (course_id), foreign key (dept_name)
references department);
DISALLOWING NULL VALUES
There may be cases where you need to make a column non-nullable when it already contains
NULL values. If we try to make a column non-nullable while it still holds NULL values, we get
an error message.
A NOT NULL column ensures that no row is allowed to have a NULL value in that column.
Therefore, all existing NULL values within the column must be updated to a non-null value
before the ALTER command can be used successfully to make the column NOT NULL. Any attempt
to set the column to NOT NULL while NULL data remains in the column will result in an error,
and no change will occur.
Example:
UPDATE clients SET phone = '0-000-000-0000' WHERE phone IS NULL;
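Once no NULL values remain, the column can be altered. The ALTER syntax differs between database systems, so the sketch below shows two common forms; VARCHAR(20) is an assumed column type:

```sql
-- PostgreSQL syntax.
ALTER TABLE clients ALTER COLUMN phone SET NOT NULL;

-- MySQL syntax: the column type must be restated (VARCHAR(20) is an assumption).
ALTER TABLE clients MODIFY phone VARCHAR(20) NOT NULL;
```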
NESTED QUERIES
Nested queries, also known as subqueries, are a powerful feature in SQL that allows you to
place one query inside another query. This can help you perform more complex data retrieval
tasks that might be difficult or impossible with a single query.
Types of Nested Queries
1. Independent Nested Queries: These are executed from the innermost query to the outermost
query. The inner query runs independently of the outer query, but its result is used by the outer
query. For example:
2. Correlated Nested Queries: These depend on the outer query for their values. The inner query
is executed once for each row processed by the outer query. For example:
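Both kinds of nested query can be sketched against the Employees table defined earlier (EmployeeID, Name, Department, Salary). The queries are illustrative:

```sql
-- Independent nested query: the inner query runs on its own and
-- yields one value (the overall average salary) for the outer query.
SELECT Name
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);

-- Correlated nested query: the inner query refers to e.Department,
-- so it is evaluated for each row of the outer query.
SELECT e.Name
FROM Employees e
WHERE e.Salary > (SELECT AVG(e2.Salary)
                  FROM Employees e2
                  WHERE e2.Department = e.Department);
```

The first query lists employees who earn more than the overall average; the second lists those who earn more than the average of their own department.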
Nested queries offer the following benefits:
Simplify Complex Queries: Break down complex problems into simpler sub-tasks.
Enhanced Data Filtering: Perform operations that require multiple steps of filtering and
aggregation.
Reusability: Use the results of one query as the input for another, making your SQL code more
modular and easier to maintain.
OUTER JOINS
In a relational DBMS, we follow the principles of normalization, which split large tables into
smaller ones. By using joins in a SELECT statement, we can reassemble the larger table.
Outer joins are of the following three types.
1. Left Outer Join: The left join operation returns all records from the left table and the
matching records from the right table. Where no match is found in the right table, NULL is
returned for the right table's columns.
Syntax:
SELECT column_name(s) FROM table1 LEFT JOIN table2 ON
table1.column_name = table2.column_name;
2. Right Outer Join: The right join operation returns all records from the right table and the
matching records from the left table. Where no match is found in the left table, NULL is
returned for the left table's columns.
Syntax:
SELECT column_name(s) FROM table1
RIGHT JOIN table2 ON table1.column_name = table2.column_name;
3. Full Outer Join: The full outer join returns all records from both tables, matching rows
where possible. Where a row has no match in the other table, NULL is returned for the other
table's columns.
Syntax:
SELECT column_name FROM table1
FULL OUTER JOIN table2 ON table1.columnName = table2.columnName
WHERE condition;
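To make the syntax concrete, here is a sketch of a left outer join over two hypothetical tables, students(id, name) and enrollments(student_id, course). Both tables and their columns are assumptions:

```sql
-- Every student appears in the result; a student with no enrollment
-- row is listed once with course set to NULL.
SELECT s.name, e.course
FROM students s
LEFT JOIN enrollments e ON s.id = e.student_id;
```

Replacing LEFT JOIN with RIGHT JOIN would instead keep every enrollment row, and FULL OUTER JOIN would keep unmatched rows from both sides where the database system supports it.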
VIEWS
Views in SQL are a kind of virtual table. A view also has rows and columns like tables, but a
view doesn't store data on the disk like a table. A view defines a customized query that retrieves
data from one or more tables, and represents the data as if it were coming from a single source.
We can create a view by selecting fields from one or more tables present in the database. A View
can either have all the rows of a table or specific rows based on certain conditions.
We can create a view using CREATE VIEW statement. A View can be created from
a single table or multiple tables.
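A minimal sketch of a view over the Employees table defined earlier. The view name myView matches the name used in the DROP VIEW section; the salary threshold is arbitrary:

```sql
-- The view stores only the query, not a copy of the data.
CREATE VIEW myView AS
SELECT Name, Department
FROM Employees
WHERE Salary > 40000;

-- A view is queried like a table.
SELECT * FROM myView;
```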
The ALTER VIEW statement in MySQL is used to modify the definition of an existing view.
It allows us to change the query or structure of a view without dropping and recreating it.
Users must have the CREATE VIEW and DROP privileges for the view to use this statement.
When we use ALTER VIEW, the existing view definition is replaced entirely; we cannot modify
individual columns of a view. ALTER VIEW also cannot be used to change the view's name.
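As a sketch in MySQL syntax, the definition of the assumed view myView could be replaced so that it also exposes the Salary column:

```sql
-- The whole SELECT is restated; the old definition is discarded.
ALTER VIEW myView AS
SELECT Name, Department, Salary
FROM Employees
WHERE Salary > 40000;
```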
DROP VIEW
We can also DROP the view (myView) when not in use anymore using drop command:
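The statement takes the following form:

```sql
-- Removes the view definition; the underlying tables are not affected.
DROP VIEW myView;
```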
TRIGGERS
A trigger is a statement that the system executes automatically when the database is
modified. In a trigger, we first specify when the trigger is to be executed and then the
action to be performed when the trigger executes. Triggers are used to enforce certain
integrity constraints and referential constraints that cannot be specified using the constraint
mechanism of SQL.
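1. create table employ (employee_id int, salary decimal(8,2), primary key (employee_id));
2. create or replace trigger salary_change -- Oracle PL/SQL sketch; the trigger name and header are assumed
before update of salary on employ
for each row
declare
sal_diff number; osal number; nsal number; id number;
BEGIN
sal_diff := :NEW.salary - :OLD.salary; -- change in salary
osal := :OLD.salary; -- salary before the update
nsal := :NEW.salary; -- salary after the update
id := :NEW.employee_id; -- employee whose row is updated
END;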
FUNCTIONAL DEPENDENCIES
A functional dependency occurs when one attribute uniquely determines another attribute
within a relation. It is a constraint that describes how attributes in a table relate to each other.
If attribute A functionally determines attribute B, we write this as A → B.
Functional dependencies are used to mathematically express relations among database entities
and are very important to understanding advanced concepts in Relational Database Systems.
Example
roll_no name dept_name dept_building
42 abc CO A4
43 pqr IT A3
44 xyz CO A4
45 xyz IT A3
46 mno EC B2
47 jkl ME B2
From the above table we can draw some conclusions about functional dependencies:
roll_no → { name, dept_name, dept_building }: Here, roll_no can determine the values of the
fields name, dept_name and dept_building, hence this is a valid functional dependency.
name → dept_name: Students with the same name can have different dept_name values, hence
this is not a valid functional dependency.
3.Transitivity: If X → Y and Y → Z are both valid dependencies, then X→Z is also valid
by the Transitivity rule.
Example, roll_no → dept_name & dept_name → dept_building, then roll_no →
dept_building is also valid.
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a
subset of determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial
functional dependency, since age is not a subset of {roll_no, name}
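3. Multivalued Functional Dependency
In a multivalued functional dependency, the attributes of the dependent set do not depend on each
other, i.e. if a → {b, c} and there exists no functional dependency between b and c, then it is
called a multivalued functional dependency.
For example,
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
45 abc 19
Here, roll_no → {name, age} is a multivalued functional dependency, since name and age do not
depend on each other.
4. Transitive Functional Dependency
enrol_no name dept building_no
42 abc CO 4
43 pqr EC 2
44 xyz IT 1
45 abc EC 2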
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an indirect
functional dependency, hence called Transitive functional dependency.
5. Fully Functional Dependency
In a full functional dependency, an attribute or a set of attributes uniquely determines another
attribute or set of attributes, and no proper subset of the determinant does so. For example, if a
relation R has attributes X, Y, Z with the dependencies X → Y and X → Z, where X is a single
attribute, then those dependencies are fully functional.
6. Partial Functional Dependency
In a partial functional dependency, a non-key attribute depends on a part of the composite key,
rather than on the whole key. Suppose a relation R has attributes X, Y, Z, where X and Y together
form the composite key and Z is a non-key attribute. Then X → Z is a partial functional dependency.
Inference Rules
There are 6 inference rules, which are defined below:
• Reflexive Rule: According to this rule, if B is a subset of A then A logically determines B.
Formally, if B ⊆ A then A → B.
o Example: Consider the Address (A) of a house, which is made up of parts such as House no.,
Street no., City, etc. Each of these is a subset of A. Thus, Address (A) → House no. (B).
• Augmentation Rule: According to this rule, if A logically determines B, then adding the same
extra attribute to both sides preserves the functional dependency.
o Example: If A → B, then adding an extra attribute C gives AC → BC.
• Transitive Rule: The transitive rule states that if A determines B and B determines C, then it
can be said that A indirectly determines C.
o Example: If A → B and B → C then A → C.
• Union Rule: The union rule states that if A determines B and A determines C, then A determines BC.
o Example: If A → B and A → C then A → BC.
• Decomposition Rule: It is the exact reverse of the Union rule. According to this rule, if
A determines BC then it can be decomposed as A → B and A → C.
o Example: If A → BC then A → B and A → C.
• Pseudo Transitive Rule: According to this rule, if A determines B and BC determines D, then
AC determines D.
o Example: If A → B and BC → D then AC → D.
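The pseudo transitive rule is not independent of the others: as a short worked derivation, it follows from the augmentation and transitive rules stated above.

```latex
\begin{align*}
A \to B &\implies AC \to BC && \text{(augmentation with } C\text{)}\\
AC \to BC \ \text{and}\ BC \to D &\implies AC \to D && \text{(transitivity)}
\end{align*}
```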
ANOMALIES
An anomaly is a deviation from the normal, expected pattern. In a Database Management
System (DBMS), an anomaly is an inconsistency that arises in a relational table while
operations are performed on it.
There can be various reasons for anomalies to occur in the database. For example, if a lot of
redundant data is present in the database, then anomalies can occur. If a table is poorly
designed, there is also a chance of anomalies. Database anomalies damage the integrity of the
database.
Example 1:
Worker_id Worker_name Worker_dept Worker_address
NORMALIZATION
Database normalization is a systematic process of organizing a database schema so that data
redundancy is eliminated and few or no anomalies arise when data is updated. In other words, it
means dividing a large table into smaller tables so that redundant data is removed. The
normalization procedure depends on the functional dependencies among the attributes of a table
and uses several normal forms to guide the design process.
A relation is in first normal form (1NF) if it does not contain any composite or multi-valued
attribute; that is, every attribute in the relation is a single-valued attribute. A relation that
contains a composite or multi-valued attribute violates first normal form.
• Example 1 – Relation STUDENT in table 1 is not in 1NF because of the multi-valued attribute
STUD_PHONE. Its decomposition into 1NF is shown in table 2.
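The idea can be sketched in SQL. The column names STUD_NO and STUD_NAME and all sample values are assumptions made for illustration; only STUDENT and STUD_PHONE come from the text above:

```sql
-- 1NF version: each row holds exactly one phone number, so a student
-- with two numbers appears in two rows.
CREATE TABLE STUDENT (
    STUD_NO    INT,
    STUD_NAME  VARCHAR(50),
    STUD_PHONE VARCHAR(15),
    PRIMARY KEY (STUD_NO, STUD_PHONE)
);

INSERT INTO STUDENT VALUES (1, 'RAM', '555-0101');
INSERT INTO STUDENT VALUES (1, 'RAM', '555-0102');
INSERT INTO STUDENT VALUES (2, 'SURESH', '555-0103');
```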
A relation is in 2NF if it is in 1NF and no non-prime attribute (an attribute that is not part of
any candidate key) is partially dependent on a proper subset of any candidate key of the table. In
other words, every non-prime attribute must be fully dependent on each candidate key.
A functional dependency X → Y (where X and Y are sets of attributes) is a partial dependency
if Y can be determined by a proper subset of X.
In 2NF it is still possible for a prime attribute to be partially dependent on a candidate key,
but every non-prime attribute must be fully dependent on each candidate key of the table.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
In the given table, non-prime attribute TEACHER_AGE is dependent on TEACHER_ID which is a
proper subset of a candidate key. That's why it violates the rule for 2NF.
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
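The decomposition above can be sketched as table definitions. The column types and the foreign key are assumptions; the join at the end shows that the original TEACHER table can be recovered:

```sql
CREATE TABLE TEACHER_DETAIL (
    TEACHER_ID  INT PRIMARY KEY,
    TEACHER_AGE INT
);

CREATE TABLE TEACHER_SUBJECT (
    TEACHER_ID INT,
    SUBJECT    VARCHAR(30),
    PRIMARY KEY (TEACHER_ID, SUBJECT),
    FOREIGN KEY (TEACHER_ID) REFERENCES TEACHER_DETAIL (TEACHER_ID)
);

-- Joining the two tables on TEACHER_ID rebuilds the original rows.
SELECT s.TEACHER_ID, s.SUBJECT, d.TEACHER_AGE
FROM TEACHER_SUBJECT s
JOIN TEACHER_DETAIL d ON d.TEACHER_ID = s.TEACHER_ID;
```

Each teacher's age is now stored once, so changing it requires updating a single row.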
To determine the highest normal form of a given relation R with functional dependencies, the first
step is to check whether the BCNF condition holds. If R is found to be in BCNF, it can be safely
deduced that the relation is also in 3NF, 2NF, and 1NF as the hierarchy shows. The 1NF has the
least restrictive constraint – it only requires a relation R to have atomic values in each tuple. The
2NF has a slightly more restrictive constraint.
The 3NF has a more restrictive constraint than the first two normal forms but is less restrictive than
the BCNF. In this manner, the restriction increases as we traverse down the hierarchy.
Example 1
102  Electronics & Communication Engineering  VLSI Technology       B_003  401
102  Electronics & Communication Engineering  Mobile Communication  B_003  402
Stu_Branch Table
Stu_ID Stu_Branch
101 201
101 202
102 401
102 402
The Fourth Normal Form (4NF) is a level of database normalization where there are no non-trivial
multivalued dependencies other than a candidate key. It builds on the first three normal forms (1NF,
2NF, and 3NF) and the Boyce-Codd Normal Form (BCNF). It states that, in addition to a database
meeting the requirements of BCNF, it must not contain more than one multivalued dependency.
Properties
A relation R is in 4NF if and only if the following conditions are satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any Multi-valued Dependency.
A table with a multivalued dependency violates the normalization standard of the Fourth Normal
Form (4NF) because it creates unnecessary redundancies and can contribute to inconsistent data. To
bring this up to 4NF, it is necessary to break this information into two tables.
Example: Consider a class database with two relations: R1 contains the student ID (SID)
and student name (SNAME), and R2 contains the course ID (CID) and course name (CNAME).
Table R1
SID SNAME
S1 A
S2 B
Table R2
CID CNAME
C1 C
C2 D
Table R1 × R2 (cross product of R1 and R2)
SID SNAME CID CNAME
S1 A C1 C
S1 A C2 D
S2 B C1 C
S2 B C2 D
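Assuming R1 and R2 exist as tables with the columns shown, the four combined rows correspond to a cross join, which pairs every student with every course:

```sql
-- 2 rows in R1 x 2 rows in R2 = 4 rows in the result.
SELECT R1.SID, R1.SNAME, R2.CID, R2.CNAME
FROM R1
CROSS JOIN R2;
```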
A relation R is in Fifth Normal Form (5NF) if and only if every join dependency in R is implied
by the candidate keys of R. A relation decomposed into two relations must have the lossless join
property, which ensures that no spurious or extra tuples are generated when the relations are
reunited through a natural join.
Properties
A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should already be in 4NF.
2. It cannot be losslessly decomposed any further (join dependency).
Example – Consider the above schema, with a case as “if a company makes a product and an agent
is an agent for that company, then he always sells that product for the company”. Under these
circumstances, the ACP table is shown as:
Table ACP
Agent Company Product
A1 PQR Nut
A1 PQR Bolt
A1 XYZ Nut
A1 XYZ Bolt
A2 PQR Nut
The relation ACP is decomposed into the following three relations, whose natural join is then
taken:
Table R1
Agent Company
A1 PQR
A1 XYZ
A2 PQR
Table R2
Agent Product
A1 Nut
A1 Bolt
A2 Nut
Table R3
Company Product
PQR Nut
PQR Bolt
XYZ Nut
XYZ Bolt
The natural join of R1 and R3 over 'Company', followed by the natural join of the result (R13)
with R2 over 'Agent' and 'Product', yields Table ACP.
Hence, in this example, all the redundancies are eliminated, and the decomposition of ACP is a
lossless join decomposition. Therefore, the relations are in 5NF, as the decomposition does not
violate the lossless join property.
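The two joins described above can be sketched as a single query, assuming R1, R2, and R3 exist as tables with the columns shown:

```sql
-- R1 joined with R3 over Company gives six rows, including the spurious
-- tuple (A2, PQR, Bolt); joining with R2 over Agent and Product removes
-- it, leaving the five rows of ACP.
SELECT R1.Agent, R1.Company, R3.Product
FROM R1
JOIN R3 ON R3.Company = R1.Company
JOIN R2 ON R2.Agent = R1.Agent AND R2.Product = R3.Product;
```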
DENORMALIZATION
Denormalization is a database optimization technique in which we add redundant data to one or more
tables. This can help us avoid costly joins in a relational database. Note that denormalization does not mean
‘reversing normalization’ or ‘not to normalize’. It is an optimization technique that is applied after
normalization.
The process of taking a normalized schema and making it non-normalized is called
denormalization; designers use it to tune the performance of systems to support time-critical
operations.
In a traditional normalized database, we store data in separate logical tables and attempt to minimize
redundant data. We may strive to have only one copy of each piece of data in a database.
For example, in a normalized database, we might have a Courses table and a Teachers table. Each
entry in Courses would store the teacherID for a Course but not the teacherName. When we need to
retrieve a list of all Courses with the Teacher’s name, we would do a join between these two tables.
In some ways, this is great; if a teacher changes his or her name, we only have to update the name in
one place.
The drawback is that if tables are large, we may spend an unnecessarily long time doing joins on
tables.
Denormalization, then, strikes a different compromise. Under denormalization, we decide that we’re
okay with some redundancy and some extra effort to update the database in order to get the
efficiency advantages of fewer joins.
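The trade-off can be sketched with the Courses and Teachers tables from the example. The courseName column and the redundant teacherName column in Courses are assumptions made for illustration:

```sql
-- Normalized: the teacher's name is stored only in Teachers,
-- so listing courses with teacher names needs a join.
SELECT c.courseName, t.teacherName
FROM Courses c
JOIN Teachers t ON t.teacherID = c.teacherID;

-- Denormalized: Courses also carries a redundant teacherName column,
-- so the same list needs no join.
SELECT courseName, teacherName
FROM Courses;
```

In the denormalized design, a change of name must be applied to Teachers and to every matching row in Courses.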
Pros of Denormalization:
1. Retrieving data is faster since we do fewer joins.
2. Queries to retrieve data can be simpler (and therefore less likely to have bugs),
since we need to look at fewer tables.
Cons of Denormalization:
1. Updates and inserts are more expensive.
2. Denormalization can make update and insert code harder to write.
3. Data may be inconsistent.
4. Data redundancy necessitates more storage.