Quality Content for Outcome based Learning
Normalization
Unit-5
Ver. No.: 1.1 Copyright © 2021, ABES Engineering College
Introduction
We know
ER model helps the database designer identify entity types, their
attributes and relationship between entity types
This leads to a natural and logical grouping of attributes into relations
Each relation schema consists of several attributes
A relational database schema consists of several relation schemas
We need to know
• Some formal way of analysing why one grouping of attributes into a
relation schema may be better than another
• Measure of appropriateness or goodness to measure the quality of the
design, other than the designer's intuition
Good Quality Database Design Goals
The implicit goals of the design activity are information preservation
and redundancy minimization
Information preservation: maintaining all concepts, including entity types, attribute
types, relationship types, and generalization/specialization relationships, which are
described using the ER model
Redundancy minimization: minimizing redundancy implies minimizing redundant
storage of the same data and reducing the need for multiple updates to maintain
consistency across multiple copies of the same information in response to real-world
events requiring an update
Data redundancy and associated issues
Data redundancy occurs when the same piece of data is stored in two or
more separate places.
Suppose we create a Relation to store sales records, and in the records for
each sale, we enter the customer address as one of the attributes. Now we
have multiple sales to the same customer, so the same address is entered
multiple times. The address that is repeatedly entered is redundant data.
Data redundancy normally happens when we try to combine attributes from
multiple entity types and relationship types into a single Relation
Relation schema design issues
To understand the design issues and the problems associated with them, let's take the example of a Relation
FACULTY_DETAIL that stores all the faculty attributes along with the department each faculty member works for. The
department data is not stored separately.
In the FACULTY_DETAIL Relation, the column dept_location holds redundant data: the dept_location value is
repeated for every faculty member of a department. This Relation suffers from insertion, update, and
deletion anomalies.
Relation schema design issues…
Insertion Anomaly: The college starts a new department (CSE-DS at Bhabha Block) that does not yet have any faculty.
We cannot store the data of this new department in the FACULTY_DETAIL Relation, because faculty_Id, being the
primary key, cannot be NULL for a record/tuple.
Update Anomaly: Suppose the location of a department changes. The new location must be updated in every
row/tuple where this department appears. If we miss any such row/tuple during the update, the department's
data becomes inconsistent in the Relation.
Deletion Anomaly: Suppose faculty_Id 2765 (Girish) leaves the college and his record is deleted from the database.
He is the only faculty member in the ME department, so the moment we delete the faculty_Id 2765 record/tuple
from the Relation, the ME department information is also lost.
Relation schema design issues…
If we decompose the above FACULTY_DETAIL Relation into two separate relations, say faculty and department, we
will eliminate the design issues and related anomalies discussed earlier.
This process of eliminating the relation design issues and
related anomalies is called Normalization
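The decomposition can be sketched in Python as follows. This is only an illustration: the rows, the faculty_name and dept_name columns, and the block names other than Bhabha Block are hypothetical, since the original table is not reproduced here. Only faculty 2765 (Girish, ME) and the CSE-DS department come from the text.

```python
# Hypothetical FACULTY_DETAIL rows; only faculty 2765 (Girish, ME) comes from the text.
faculty_detail = [
    {"faculty_id": 1001, "faculty_name": "Asha", "dept_name": "CSE", "dept_location": "Block A"},
    {"faculty_id": 1002, "faculty_name": "Ravi", "dept_name": "CSE", "dept_location": "Block A"},
    {"faculty_id": 2765, "faculty_name": "Girish", "dept_name": "ME", "dept_location": "Block B"},
]


def decompose(rows):
    """Split the combined rows into a faculty relation and a department relation."""
    faculty = [
        {"faculty_id": r["faculty_id"], "faculty_name": r["faculty_name"], "dept_name": r["dept_name"]}
        for r in rows
    ]
    # Each department is stored exactly once, keyed by dept_name.
    department = {r["dept_name"]: r["dept_location"] for r in rows}
    return faculty, department


faculty, department = decompose(faculty_detail)

# Insertion: a department with no faculty can now be stored.
department["CSE-DS"] = "Bhabha Block"

# Update: a location change touches exactly one entry.
department["CSE"] = "Block C"

# Deletion: removing the only ME faculty member keeps the ME department.
faculty = [f for f in faculty if f["faculty_id"] != 2765]
```

After the split, each of the three anomalies described earlier is handled by a single, local change.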
Normalization
As discussed, the process of eliminating the relation design issues
(mainly data redundancy) and related anomalies is called Normalization
When we convert the ER model into a relational model, in most cases,
substantial normalization is already achieved by virtue of implicit and
explicit constraints discussed in earlier units. However, we will discuss all
the normal forms in detail to understand the normalization process.
First Normal Form (1NF)
The first normal form (1NF), imposes a fundamental requirement on
relations.
We say that a relation schema R is in first normal form (1NF) if the domains of all
attributes of R are atomic.
A domain of an attribute is atomic if elements of the domain are considered to be
indivisible units.
It means that multivalued attributes, composite attributes, and their combinations
are not allowed in a Relation that is in first normal form
First Normal Form (1NF)…
Multivalued attribute: A multivalued attribute may have one or more values for a particular entity.
Example – Phone Number. In our SMS case study, the phone number attribute in the STUDENT entity
type is a multivalued attribute. It means that a student can have multiple phone numbers. If you
remember, this also comes from the implicit constraint applied to relational databases.
Composite attribute: Composite attributes are not atomic because they are assembled using some
other atomic attributes. A typical example of a composite attribute is a person's address, composed of
atomic attributes, such as House No., Street, City, State, Pincode.
In the case of a composite attribute, we can still store it in the database without violating any database
constraint; however, it is not a good database design. Storing a composite attribute in the database will make
data querying and analysis on its constituent atomic attributes very complex. It can also result in the
redundancy of data.
First Normal Form (1NF)…
To handle a composite attribute, we create a separate column for each part of the
composite attribute, since the number of parts in a composite attribute is fixed in most cases.
First Normal Form (1NF)…
For handling a multivalued attribute, we have three options:
Option 1:
Expand the Key of this Relation to include phone_no with roll_no. The Relation will now
have a composite primary key consisting of roll_no & phone_no. This arrangement achieves
the first normal form (1NF); however, it is not a good design as it introduces data
redundancy into the Relation. For each additional phone number of a student, the data in
other columns is repeated.
First Normal Form (1NF)…
Option 2:
If the maximum number of values is known for phone_no, that many columns can be
added to the existing Relation.
Let's assume a student can have a maximum of two phone_no. We can create the below
relation design, with two separate columns to store two possible student phone numbers to
achieve the first normal form (1NF). This is not a good design as it limits the phone numbers
a student can have. If we want to allow more phone numbers, the relation design would
need a change, which is not a good design practice.
First Normal Form (1NF)…
Option 3:
Decompose this Relation into two relations – STUDENT & STUDENT_PHONE_NO. They
are linked to each other with the Primary Key (PK) - Foreign Key (FK) relationship. This is a
good design as it takes care of data redundancy and does not limit the number of phone
numbers a student can have.
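Option 3 can be sketched in Python as below. The sample rows, the names, and the helper to_1nf are hypothetical and only illustrate the split; the column names roll_no and phone_no follow the text.

```python
# Hypothetical un-normalized student rows: phone_no holds a list (multivalued).
students_raw = [
    {"roll_no": 1, "first_name": "Anu", "phone_no": ["9000000001", "9000000002"]},
    {"roll_no": 2, "first_name": "Bala", "phone_no": ["9000000003"]},
]


def to_1nf(rows):
    """Return (STUDENT, STUDENT_PHONE_NO) with only atomic values in each column."""
    # STUDENT keeps every column except the multivalued one.
    student = [{k: v for k, v in r.items() if k != "phone_no"} for r in rows]
    # STUDENT_PHONE_NO gets one (roll_no, phone_no) row per phone number;
    # roll_no acts as the foreign key back to STUDENT.
    student_phone_no = [
        {"roll_no": r["roll_no"], "phone_no": p} for r in rows for p in r["phone_no"]
    ]
    return student, student_phone_no


student, student_phone_no = to_1nf(students_raw)
```

A third phone number for a student becomes one more row in STUDENT_PHONE_NO, with no change to the relation design and no repetition of the other student columns.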
Second Normal Form (2NF)
The Second Normal Form (2NF) is based on the concept of full functional
dependency.
The Second Normal Form applies to relations with composite keys, that
is, relations with a primary key composed of two or more attributes.
A Relation with a single-attribute primary key is automatically in at least
2NF. A Relation not in 2NF may suffer from inconsistency problems
arising during insert, delete and update operations.
Second Normal Form (2NF)…
Definition:
For a Relation to be in 2NF, it should fulfill the below two conditions:
The Relation should be in 1NF
The Relation should have No Partial Dependency, i.e., no non-prime attribute (attributes that are not part
of any Primary/candidate key) is dependent on any proper subset of any candidate key of the Relation.
How to check:
2NF applies to relations with composite candidate keys. A Relation with only single-attribute candidate keys is
automatically in at least 2NF.
Proper Subset (CK/PK) → any non-prime attribute should not hold.
How to convert 1NF to 2NF:
The normalization of 1NF relations to 2NF involves the removal of partial functional dependencies. If a partial
dependency exists, we remove partially dependent attribute(s) (along with their dependents, if any) from the
Relation by placing them in a new Relation along with a copy of their determinant. The remaining attributes of
the Relation along with the determinant above remain part of the base Relation.
Second Normal Form (2NF)…
Example 1: Let's assume a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
The FD in the Relation, teacher_id → teacher_age, can be depicted as: Relation R(A,B,C) with FD = A→C
Let's find the candidate key of the above Relation.
Candidate Key is (AB). Prime Attributes – A, B. Non-prime Attributes – C
We have a composite candidate key (AB), and its proper subset (A) can determine a non-prime
attribute (C), FD (A→C). So this is a case of partial dependency. Therefore the Relation is not in 2NF.
To convert this Relation into 2NF, we need to remove the partially dependent attribute(s) from the
Relation by placing them in a new Relation along with a copy of their determinant.
Second Normal Form (2NF)…
Example 2: In the previous section, we converted the Relation into 1NF using option 1, where (roll_no
& phone_no) is the composite primary key.
Now let's analyze this Relation from a functional dependency point of view and find out if it is in 2NF
or not. We can re-write the above as Relation R(ABCDEFGHIJKL) with FDs = A→BCDEFGHIJK, I→J
Candidate Key is (AL). Prime Attributes – A, L. Non-prime Attributes – B, C, D, E, F, G, H, I, J, K
We have a composite candidate key (AL), and its proper subset (A) can determine non-prime attributes
(B, C, D, E, F, G, H, I, J, K), FD (A→BCDEFGHIJK). So this is a case of partial dependency. Therefore
the Relation is not in 2NF.
Second Normal Form (2NF)…
Example 2 (contd.): To convert this Relation into 2NF, we need to remove the partially dependent
attribute(s) from the Relation by placing them in a new relation along with a copy of their determinant.
Second Normal Form (2NF)…
Example 3:
Let's take Relation R(A,B,C,D,E) with FD set = (A→B, B→C, C→D, D→E). Let's find if this Relation
is in 2NF or not.
The candidate key of the above Relation is (A). As the candidate key is not composite, the case of
partial dependency does not arise. Therefore the Relation is in 2NF.
Example 4:
Let’s take Relation R(A,B,C,D) with FD set = (AB→CD, C→A, D→B). Let's find if this Relation is in
2NF or not.
The candidate keys of the above Relation are (AB), (BC), (CD), (AD).
Prime Attributes – A, B, C, D. Non-prime Attributes – None
In this case, although we have composite candidate keys, there is no non-prime attribute. So the case of
partial dependency does not arise. Therefore the Relation is in 2NF.
Second Normal Form (2NF)…
Example 5:
Let’s take Relation R(A,B,C,D) with FD set = (A→B, B→D). Let's find if this Relation is in 2NF or not.
The candidate key of the above Relation is (AC).
Prime Attributes – A, C
Non-prime Attributes – B, D
In this case, we have a composite candidate key (AC), and its proper subset (A) can determine a
non-prime attribute (B), FD (A→B). So this is a case of partial dependency. Therefore the Relation is
not in 2NF.
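The checks used in Examples 4 and 5 can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function names closure, candidate_keys and partial_dependencies are hypothetical, attributes are assumed to be single letters, and each FD is written as a (left side, right side) pair of strings.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """All minimal attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # combo contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys


def partial_dependencies(relation, fds):
    """Pairs (X, a): X is a proper subset of a candidate key and determines non-prime a."""
    keys = candidate_keys(relation, fds)
    prime = set("".join(keys))
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                for attr in sorted(closure(subset, fds) - set(subset) - prime):
                    pair = ("".join(subset), attr)
                    if pair not in found:
                        found.append(pair)
    return found


# Example 4: composite keys but no non-prime attribute, so no partial dependency.
ex4 = [("AB", "CD"), ("C", "A"), ("D", "B")]
print(candidate_keys("ABCD", ex4), partial_dependencies("ABCD", ex4))

# Example 5: A is a proper subset of the key AC and determines B (and, through B, D).
ex5 = [("A", "B"), ("B", "D")]
print(candidate_keys("ABCD", ex5), partial_dependencies("ABCD", ex5))
```

A Relation is in 2NF exactly when partial_dependencies returns an empty list. The subset enumeration is exponential in the number of attributes, which is acceptable for classroom-sized examples.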
Third Normal Form (3NF)
Although Second Normal Form (2NF) relations have less redundancy than
those in 1NF, they may still suffer from inconsistency problems arising during
insert, delete and update operations.
A transitive dependency causes these inconsistency problems. Transitive
dependency causes redundancy in the Relation. We need to remove
such dependencies by progressing to the Third Normal Form (3NF).
Third Normal Form (3NF)…
Definition:
For a Relation to be in 3NF, it should fulfill both of the conditions below:
The Relation should be in 2NF
There should be no non-prime attribute that is transitively dependent on the primary key
or any candidate key
or
A non-prime attribute should not functionally depend on another non-prime attribute.
For example, take a Relation R(A,B,C,D) with FDs = A→BD, B→C. In this Relation, (A) is
the candidate key, and we have a transitive dependency A→B, B→C.
The non-prime attribute (C) is transitively dependent on the candidate key (A), so this Relation
is not in 3NF. Put another way, the non-prime attribute (C) depends on another non-prime
attribute (B); hence the Relation violates the 3NF condition.
Third Normal Form (3NF)…
How to check:-
A Relation is in 3NF if at least one of the following conditions holds for every non-trivial
functional dependency X→Y:
X is a super key
Y is a prime attribute
How to convert 2NF to 3NF:-
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies.
If a transitive dependency exists, we remove transitively dependent attribute(s) from the
Relation by placing the attribute(s) in a new Relation along with a copy of the determinant.
The remaining attributes of the Relation along with the determinant above remain part of the
base Relation.
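The 3NF check can be sketched in Python as below, using R(A,B,C,D) with FDs A→BD, B→C and candidate key (A) from the definition. The function names and the single-letter string encoding of attributes are assumptions made for this illustration, and the candidate keys are supplied by the caller rather than computed.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations_3nf(relation, fds, candidate_keys):
    """Return the FDs X→a (one right-side attribute at a time) that fail the 3NF test."""
    prime = set("".join(candidate_keys))
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(relation)
        for attr in rhs:
            if attr in lhs:
                continue  # trivial part of the dependency
            # The FD passes if X is a super key or the attribute is prime.
            if not is_superkey and attr not in prime:
                violations.append((lhs, attr))
    return violations


# R(A,B,C,D) with A→BD, B→C and candidate key (A).
fds = [("A", "BD"), ("B", "C")]
print(violations_3nf("ABCD", fds, ["A"]))
```

An empty result means every FD satisfies at least one of the two conditions. Each right-hand side is split into single attributes so that an FD such as X→YZ is reported only for the attributes that actually cause the violation.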
Third Normal Form (3NF)…
Example 1: In the previous section, in example 2, we converted the STUDENT Relation from 1NF to
2NF by decomposing it into two separate relations STUDENT_DETAIL and STUDENT_PHONE_NO.
Now let's analyze the STUDENT_DETAIL Relation, which is already in 2NF.
FDs in the above Relation are:
roll_no → first_name, middle_name, last_name, dob, gender, house_no, street_name, city, state, pincode
city → state
The candidate key of the Relation is roll_no. In this Relation, we have a transitive dependency roll_no
→ city, city → state. This transitive dependency is causing data redundancy in the Relation. Therefore
this Relation is not in 3NF.
Third Normal Form (3NF)…
Example 1 (contd.): The normalization of this Relation to 3NF will involve the removal of transitive
dependencies. We need to remove the transitively dependent attribute(s) from the Relation by placing
the attribute(s) in a new Relation (CITY_STATE_MASTER) along with a copy of the determinant.
Third Normal Form (3NF)…
Example 2:
Let's take Relation R(A,B,C,D) with FD set = (A→B, B→C, C→D). Let's find if this Relation is in 3NF
or not.
The candidate key of the above Relation is (A).
Prime attributes – A. Non-prime attributes – B, C, D
Now let's analyze each FD for the 3NF condition:
A relation is in 3NF if at least one of the following conditions holds for every non-trivial functional dependency
X→Y:
• X is a super key
• Y is a prime attribute
A→B, A is a super key (we know all candidate keys are super keys) – 3NF condition met
B→C, B is not a super key, and C is not a prime attribute – 3NF condition failed
C→D, C is not a super key, and D is not a prime attribute – 3NF condition failed
Therefore we can conclude that the above Relation is not in 3NF.
Third Normal Form (3NF)…
Example 3:
Let’s take Relation R(A,B,C,D,E,F) with FD set = (AB→CDEF, BD→F). Let's find if this Relation is in
3NF or not.
The candidate key of the above Relation is (AB).
Prime attributes – A, B. Non-prime attributes – C, D, E, F
Now let's analyze each FD for the 3NF condition:
A relation is in 3NF if at least one of the following conditions holds for every non-trivial functional dependency
X→Y:
• X is a super key
• Y is a prime attribute
AB→CDEF, AB is a super key (we know all candidate keys are super keys) – 3NF condition met
BD→F, BD is not a super key, and F is not a prime attribute – 3NF condition failed
Therefore we can conclude that the above Relation is not in 3NF.
Third Normal Form (3NF)…
Example 4:
Let's take Relation R(A,B,C,D,E) with FD set = (A→B, B→C, C→D, D→A). Let's find if this Relation is
in 3NF or not.
The candidate keys of the above Relation are (AE), (BE), (CE), (DE).
Prime attributes – A, B, C, D, E. Non-prime attributes – None
Now let's analyze each FD for the 3NF condition:
A relation is in 3NF if at least one of the following conditions holds for every non-trivial functional
dependency X→Y:
• X is a super key
• Y is a prime attribute
A→B, A is not a super key, but B is a prime attribute – 3NF condition met.
B→C, B is not a super key, but C is a prime attribute – 3NF condition met.
C→D, C is not a super key, but D is a prime attribute – 3NF condition met.
D→A, D is not a super key, but A is a prime attribute – 3NF condition met.
Therefore we can conclude that the above Relation is in 3NF.
Boyce Codd Normal Form (BCNF)
Boyce-Codd Normal Form or BCNF is an extension to the 3NF and is also known as the 3.5
Normal Form. Some redundancies might still remain even after a Relation is in 3NF.
Definition:
For a Relation to be in BCNF, it should fulfill both of the conditions below:
The Relation should be in 3NF
For each non-trivial functional dependency X→Y, X should be a Super Key
or
The Relation has no non-trivial functional dependency i.e. the Relation is an all-key Relation
(all attributes make the only candidate key)
How to convert 3NF to BCNF:
The normalization of 3NF relations to BCNF involves creating a new Relation for each
dependency that violates the BCNF condition. The remaining attributes of the Relation, along
with the determinant of the FD violating the BCNF condition, remain part of the base
Relation.
Boyce Codd Normal Form (BCNF)…
Example 1:
Relation R(A,B,C) with FD set = (A→B, B→C, C→A).
The candidate keys of the above Relation are (A), (B), (C).
Prime attributes – A, B, C
Non-prime attributes – None
This Relation is in 3NF (use the concepts learned in the previous section). Now let's analyze
each FD for BCNF condition:
A→B, A is a super key – BCNF condition met.
B→C, B is a super key – BCNF condition met.
C→A, C is a super key – BCNF condition met.
All FDs are meeting the BCNF condition; therefore, we can conclude that the above Relation is
in BCNF.
Boyce Codd Normal Form (BCNF)…
Example 2:
Relation R(A,B,C) with FD set = (AB→C, C→B).
The candidate keys of the above Relation are (AB), (AC).
Prime attributes – A, B, C
Non-prime attributes – None
This Relation is in 3NF (use the concepts learned in the previous section). Now let's analyze
each FD for BCNF condition:
AB→C, AB is a super key – BCNF condition met.
C→B, C is not a super key – BCNF condition not met.
Not all FDs meet the BCNF condition; therefore, we can conclude that the above
Relation is not in BCNF.
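The BCNF test applied in Examples 1 and 2 can be sketched in Python. The function names and the single-letter string encoding of attributes are assumptions for this illustration; each FD is a (left side, right side) pair.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Non-trivial FDs whose left-hand side is not a super key of the relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(relation)
    ]


# Example 1: every left-hand side is a super key.
print(bcnf_violations("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))

# Example 2: C→B is reported because C alone is not a super key.
print(bcnf_violations("ABC", [("AB", "C"), ("C", "B")]))
```

A left-hand side X is a super key exactly when its closure contains every attribute of the Relation, which is what the comparison against set(relation) checks.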
Boyce Codd Normal Form (BCNF)…
Example 3: Below we have a STUDENT_SUBJECT_PROFESSOR Relation with columns student_id,
subject, and professor.
In the above Relation:
One student can enroll in multiple subjects. For example, a student with student_id 101 has opted
for subjects - Java & C++
For each subject, a professor is assigned to the student.
There can be multiple professors teaching one subject as we have for Java.
One professor teaches only one subject
Boyce Codd Normal Form (BCNF)…
Example 3 (contd.):
FDs for this Relation:
student_id, subject → professor
professor → subject
Candidate keys for the Relation – (student_id, subject) and (student_id, professor)
This Relation satisfies the 1st Normal Form because all the values are atomic, column names
are unique, and all the values stored in a particular column are of the same domain.
This Relation also satisfies the 2nd Normal Form as there is no Partial Dependency.
There is no Transitive Dependency either, since every attribute is a prime attribute; hence the
Relation also satisfies the 3rd Normal Form.
But this Relation is not in Boyce-Codd Normal Form, as the FD professor → subject does not
meet the BCNF condition. Here the LHS (professor) is not a super key.
Boyce Codd Normal Form (BCNF)…
Example 3 (contd.):
To make this Relation satisfy BCNF, we will decompose this Relation into two relations
STUDENT_PROFESSOR and PROFESSOR_SUBJECT.
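This decomposition step can be sketched in Python. The helper split_on_violation is a hypothetical name; it assumes the usual rule of splitting on a violating FD X→Y into one relation holding X and Y, and a base relation that keeps X but drops the attributes determined by it.

```python
def split_on_violation(relation, lhs, rhs):
    """Split relation on a BCNF-violating FD lhs→rhs.

    Returns (lhs ∪ rhs, relation − (rhs − lhs)): the new relation for the FD,
    and the base relation that keeps the determinant lhs.
    """
    relation, lhs, rhs = set(relation), set(lhs), set(rhs)
    new_relation = lhs | rhs
    base_relation = relation - (rhs - lhs)
    return new_relation, base_relation


relation = {"student_id", "subject", "professor"}

# The violating FD is professor → subject.
professor_subject, student_professor = split_on_violation(
    relation, {"professor"}, {"subject"}
)
print(sorted(professor_subject), sorted(student_professor))
```

The two relations share the professor column, which is the key of PROFESSOR_SUBJECT, so joining them on professor reconstructs the original rows.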
Finding the highest normal form of a relation
Steps to find the highest normal form of a Relation:
Find all possible candidate keys of the Relation.
Divide all attributes into two categories: prime attributes and non-prime attributes.
Check for BCNF first, then 3NF, and so on. By definition (implicit constraints), a
Relation will always be in 1NF.
Summary of definition of Normal forms:
2NF: No non-prime attribute should be partially dependent on a Candidate Key (CK),
i.e., Proper Subset (CK/PK) → any non-prime attribute should not hold.
3NF: First, it should be in 2NF, and at least one of the following conditions holds for every
non-trivial functional dependency X→Y:
X is a super key
Y is a prime attribute
BCNF: First, it should be in 3NF, and for every non-trivial functional dependency X→Y between
two sets of attributes X and Y, X is a Super Key
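The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, attributes are single letters, each FD is a (left side, right side) pair of strings, and the Relation is assumed to already be in 1NF. The sample inputs are relations analyzed in the earlier sections.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Step 1: all minimal attribute sets whose closure covers the relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # combo contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys


def highest_normal_form(relation, fds):
    """Return 'BCNF', '3NF', '2NF' or '1NF' for a relation assumed to be in 1NF."""
    all_attrs = set(relation)
    keys = candidate_keys(relation, fds)
    prime = set("".join(keys))  # Step 2: prime attributes; the rest are non-prime

    def is_superkey(lhs):
        return closure(lhs, fds) == all_attrs

    # Split every FD into single-attribute, non-trivial right-hand sides.
    simple = [(lhs, a) for lhs, rhs in fds for a in rhs if a not in lhs]

    # Step 3: check the strictest form first.
    if all(is_superkey(lhs) for lhs, _ in simple):
        return "BCNF"
    if all(is_superkey(lhs) or a in prime for lhs, a in simple):
        return "3NF"
    # 2NF: no proper subset of a candidate key may determine a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                if closure(subset, fds) - set(subset) - prime:
                    return "1NF"
    return "2NF"


print(highest_normal_form("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))
print(highest_normal_form("ABC", [("AB", "C"), ("C", "B")]))
print(highest_normal_form("ABCD", [("A", "B"), ("B", "C"), ("C", "D")]))
print(highest_normal_form("ABCD", [("A", "B"), ("B", "D")]))
```

Checking BCNF first and stopping at the first form that holds mirrors the order recommended in the steps, since a Relation in a higher normal form is automatically in all lower ones.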
Finding the highest normal form of a relation…
The below Venn diagram shows the relationship between various normal forms. If a
Relation is in BCNF, it is already in 3NF, 2NF & 1NF. That's why we start checking a
Relation for BCNF and then move to 3NF and so on.
Finding the highest normal form of a relation…
Example 1:
Relation R(ABCDEFGH) with FDs = {ABC→DE, E→GH, H→G, G→H, ABCD→EF}
Step 1:
Candidate key of this Relation is (ABC)
Step 2:
Prime attributes: A, B, C
Non-prime attributes: D, E, F, G, H
Finding the highest normal form of a relation…
Example 1 (contd.):
Step 3:
Check for BCNF
ABC→DE, ABC is a super key – BCNF condition met.
E→GH, E is not a super key – BCNF condition not met.
H→G, H is not a super key – BCNF condition not met.
G→H, G is not a super key – BCNF condition not met.
ABCD→EF, ABCD is a super key – BCNF condition met.
As not all FDs meet the BCNF condition, this Relation is not in BCNF.
Finding the highest normal form of a relation…
Example 1 (contd.):
Check for 3NF
ABC→DE, ABC is a super key – 3NF condition met.
E→GH, E is not a super key, and G and H are non-prime attributes – 3NF condition not met.
H→G, H is not a super key, and G is a non-prime attribute – 3NF condition not met.
G→H, G is not a super key, and H is a non-prime attribute – 3NF condition not met.
ABCD→EF, ABCD is a super key – 3NF condition met.
As not all FDs meet the 3NF condition, this Relation is not in 3NF.
Finding the highest normal form of a relation…
Example 1 (contd.):
Check for 2NF
ABC→DE, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
E→GH, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
H→G, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
G→H, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
ABCD→EF, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
As all FDs are meeting 2NF conditions, this Relation is in 2NF.
Finding the highest normal form of a relation…
Example 2:
Relation R(A,B,C,D) with FDs = {A→BCD, BC→AD, D→B}
Step 1:
Candidate keys of this Relation are (A), (BC), (CD).
Step 2:
Prime attributes: A, B, C, D
Non-prime attributes: None
Step 3:
Check for BCNF
A→BCD, A is a super key – BCNF condition met.
BC→AD, BC is a super key – BCNF condition met.
D→B, D is not a super key – BCNF condition not met.
As not all FDs meet the BCNF condition, this Relation is not in BCNF.
Finding the highest normal form of a relation…
Example 2 (contd.):
Check for 3NF
A→BCD, A is a super key – 3NF condition met.
BC→AD, BC is a super key – 3NF condition met.
D→B, D is not a super key, but B is a prime attribute – 3NF condition met.
As all FDs are meeting 3NF conditions, this Relation is in 3NF.
There is no need to check for 2NF, as all 3NF relations are also in 2NF.
Finding the highest normal form of a relation…
Example 3:
Relation R(A,B,C,D) with FDs = {AB→C, ABD→C, ABC→D, AC→D}
Step 1:
Candidate key of this Relation is (AB)
Step 2:
Prime attributes: A, B. Non-prime attributes: C, D
Step 3:
Check for BCNF
AB→C, AB is a super key – BCNF condition met.
ABD→C, ABD is a super key – BCNF condition met.
ABC→D, ABC is a super key – BCNF condition met.
AC→D, AC is not a super key – BCNF condition not met.
As not all FDs meet the BCNF condition, this Relation is not in BCNF.
Finding the highest normal form of a relation…
Example 3 (contd.):
Check for 3NF
AB→C, AB is a super key – 3NF condition met.
ABD→C, ABD is a super key – 3NF condition met.
ABC→D, ABC is a super key – 3NF condition met.
AC→D, AC is not a super key, and D is not a prime attribute – 3NF condition not met.
As not all FDs meet the 3NF condition, this Relation is not in 3NF.
Check for 2NF
AB→C, LHS not a proper subset of candidate key (AB) – 2NF condition met.
ABD→C, LHS not a proper subset of candidate key (AB) – 2NF condition met.
ABC→D, LHS not a proper subset of candidate key (AB) – 2NF condition met.
AC→D, LHS not a proper subset of candidate key (AB) – 2NF condition met.
As all FDs are meeting 2NF conditions, this Relation is in 2NF.
Finding the highest normal form of a relation…
Example 4:
Relation R(A,B,C,D,E) with FDs = {AB→CDE, D→BE}
Step 1:
Candidate keys of this Relation are (AB), (AD)
Step 2:
Prime attributes: A, B, D
Non-prime attributes: C, E
Step 3:
Check for BCNF
AB→CDE, AB is a super key – BCNF condition met.
D→BE, D is not a super key – BCNF condition not met.
As not all FDs meet the BCNF condition, this Relation is not in BCNF.
Finding the highest normal form of a relation…
Example 4 (contd.):
Check for 3NF
AB→CDE, AB is a super key – 3NF condition met.
D→BE can be written as D→B, D→E
D→B, D is not a super key, but B is a prime attribute – 3NF condition met.
D→E, D is not a super key, and E is not a prime attribute – 3NF condition not met.
As not all FDs meet the 3NF condition, this Relation is not in 3NF.
Check for 2NF
AB→CDE, LHS not a proper subset of candidate key (AB) – 2NF condition met.
D→B, LHS is a proper subset of candidate key (AD), but B is not a non-prime attribute –
2NF condition met.
D→E, LHS is a proper subset of candidate key (AD), and E is a non-prime attribute – 2NF
condition not met.
As not all FDs meet the 2NF condition, this Relation is not in 2NF.
So the highest normal form of this Relation is 1NF.
Decomposition of relations to convert them into
higher normal form
So far we have understood:
The concepts of 1NF, 2NF, 3NF & BCNF.
How to find the highest normal form of a given Relation.
Let's use this knowledge to convert a given Relation into a higher normal form.
We will do this with a set of examples to bring more clarity.
Decomposition of relations to convert them into
higher normal form…
Example 1:
Given Relation R(A,B,C,D,E) with FDs = {A→B, B→E, C→D}
Step 1 – Find the current normal form of the Relation
Candidate Key – (AC)
Prime attributes – A, C
Non-prime attributes – B, D, E
Using the process learned in the section above, we can find that R is in 1NF.
Step 2 – Find the FDs that are creating a problem
A→B (This is a partial dependency, as (A), being a proper subset of candidate key (AC),
is determining a non-prime attribute (B)) – Thus violating 2NF
C→D (This is a partial dependency, as (C), being a proper subset of candidate key (AC),
is determining a non-prime attribute (D)) – Thus violating 2NF
Decomposition of relations to convert them into
higher normal form…
Example 1 (contd.):
Step 3 – Decompose the Relation to remove the anomalies identified above
We have identified two partial dependencies in the above Relation, which violate
2NF. From previous sections, we know:
How to convert 1NF to 2NF:
The normalization of 1NF relations to 2NF involves the removal of partial functional dependencies. If a
partial dependency exists, we remove the partially dependent attribute(s) (along with their dependents,
if any) from the relation by placing them in a new relation along with a copy of their determinant. The
remaining attributes of the relation along with the determinant above remain part of the base relation.
We will create two separate relations to handle the two partial dependencies A→B
(including B→E, as E is dependent on B) & C→D,
i.e., R1(A,B,E), R2(C,D). After removing the partially dependent attributes (and their
dependents), the base Relation will be reduced to R3(A,C).
Decomposition of relations to convert them into
higher normal form…
Example 1 (contd.):
Step 4 – Check again if the above-decomposed relations have achieved the highest normal form.
Relation R1(A,B,E)
Candidate Key – (A). Prime attributes – A. Non-prime attributes – B, E
We see there is transitive dependency here B→E; therefore, this Relation is not in 3NF
R2(C,D) & R3(A,C) are both in BCNF (you can check by concepts learned in the earlier sections).
Step 5 – Decompose the Relation R1(A,B,E) to remove the anomalies identified above.
We have identified a transitive dependency in the above Relation, thus violating 3NF. We know:
How to convert 2NF to 3NF:
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies. If a transitive dependency
exists, we remove the transitively dependent attribute(s) from the relation by placing the attribute(s) in a new relation
along with a copy of the determinant. The remaining attributes of the relation along with the determinant above remain
part of the base relation.
We will create a separate Relation to handle the transitive dependency B→E
i.e., R12(B,E). After removing the transitive dependent attribute, the base Relation will be reduced to
R11(A,B).
Decomposition of relations to convert them into
higher normal form…
Example 1 (contd.):
Step 6 – Check again if the above-decomposed relations have achieved the highest normal form.
R11(A,B) & R12(B,E) are both in BCNF (you can check by concepts learned in the earlier
sections).
Step 7 – After carrying out the decomposition, we need to make sure that one of the decomposed relations
contains the candidate key of the Relation R(A,B,C,D,E) i.e (AC). Here R3(A,C) meets the
condition.
Conclusion: Relation R(A,B,C,D,E) with FDs = {A→B, B→E, C→D} is in 1NF. It can be decomposed into
4 separate relations - R11(A,B), R12(B,E), R2(C,D) & R3(A,C) to achieve the highest normal form of
BCNF.
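The final checks in this example can be sketched in Python. The function names project and is_bcnf are hypothetical, attributes are single letters, and each FD is a (left side, right side) pair of strings. The sketch projects the original FDs onto each decomposed relation and then applies the BCNF test to that relation.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project(fragment, fds):
    """FDs X→Y that hold inside fragment, found by closing every proper non-empty subset X."""
    projected = []
    for size in range(1, len(fragment)):
        for subset in combinations(sorted(fragment), size):
            rhs = (closure(subset, fds) & set(fragment)) - set(subset)
            if rhs:
                projected.append(("".join(subset), "".join(sorted(rhs))))
    return projected


def is_bcnf(fragment, fds):
    """True when every projected non-trivial FD has a super key of the fragment on its left."""
    local = project(fragment, fds)
    return all(closure(lhs, local) >= set(fragment) for lhs, _ in local)


fds = [("A", "B"), ("B", "E"), ("C", "D")]
fragments = ["AB", "BE", "CD", "AC"]  # R11, R12, R2, R3

print({f: is_bcnf(f, fds) for f in fragments})

# Step 7: at least one decomposed relation must contain the candidate key (AC) of R.
print(any(set("AC") <= set(f) for f in fragments))
```

Running the same is_bcnf test on the intermediate relation R1(A,B,E) reports that it is not in BCNF, which matches the need for the further split into R11 and R12.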
Decomposition of relations to convert them into
higher normal form…
Example 2:
Given Relation R(A,B,C,D) with FDs = {A→B, B→C, C→D}
Step 1 – Find the current normal form of the Relation
Candidate Key – (A)
Prime attributes – A
Non-prime attributes – B, C, D
Using the process learned in the section above, we can find that Relation R is in 2NF.
Step 2 – Find the FDs that are creating a problem
B→C, transitive dependency– Thus violating 3NF
C→D, transitive dependency – Thus violating 3NF
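The check in Steps 1 and 2 can be sketched in Python. This is a minimal illustration and not part of the original slides: the helper names `closure`, `candidate_keys` and `third_nf_violations` are assumptions made for this sketch, and the brute-force key search is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    attrs = sorted(attributes)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


def third_nf_violations(attributes, fds):
    """FDs X -> A where X is not a super key and A is a non-prime attribute."""
    prime = set().union(*candidate_keys(attributes, fds))
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue  # determinant is a super key, no violation
        for attr in sorted(set(rhs) - set(lhs)):
            if attr not in prime:
                violations.append((lhs, attr))
    return violations


# Example 2: R(A,B,C,D) with FDs A->B, B->C, C->D
fds = [("A", "B"), ("B", "C"), ("C", "D")]
print(candidate_keys("ABCD", fds))       # candidate keys of R
print(third_nf_violations("ABCD", fds))  # FDs that break 3NF
```

For Example 2 the sketch reports (A) as the only candidate key and flags B→C and C→D, which matches the manual analysis.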
Decomposition of relations to convert them into
higher normal form…
Example 2 (contd.):
Step 3 – Decompose the Relation to remove the anomalies identified above
We have identified two transitive dependencies in the above Relation, both violating 3NF.
From previous sections, we know:
How to convert 2NF to 3NF:
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies. If a transitive
dependency exists, we remove the transitively dependent attribute(s) (along with their dependents, if any)
from the relation by placing the attribute(s) in a new relation along with a copy of the determinant. The
remaining attributes of the relation along with the determinant above remain part of the base relation.
We will create two separate relations to handle the two transitive dependencies B→C and C→D,
i.e., R1(BC) & R2(CD). After removing the transitively dependent attributes, the base Relation will be
reduced to R3(AB).
Decomposition of relations to convert them into
higher normal form…
Example 2 (contd.):
Step 4 – Check again if the above-decomposed relations have achieved the highest normal
form.
R1(BC), R2(CD) & R3(AB) are all in BCNF (you can check by concepts learned in the
earlier sections).
Step 5 – After carrying out the decomposition, we need to make sure that one of the
decomposed relations contains the candidate key of the Relation R(A,B,C,D), i.e., (A). Here
R3(A,B) meets the condition.
Conclusion: Relation R(A,B,C,D) with FDs = {A→B, B→C, C→D} is in 2NF. It can be
decomposed into 3 separate relations - R1(BC), R2(CD) & R3(AB) to achieve the highest
normal form of BCNF.
Decomposition of relations to convert them into
higher normal form…
Example 3:
Relation R(A,B,C,D) with FDs = {A→BCD, BC→AD, D→B}
Step 1 – Find the current normal form of the Relation
Candidate Keys – (A), (BC), (CD)
Prime attributes – A, B, C, D
Non-prime attributes – Nil
Using the process learned in the section above, we can find that Relation R is in 3NF
and not in BCNF
Step 2 – Find the FDs that are creating a problem
D→B, D is not a super key – Thus violating BCNF
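The BCNF test used in Step 2 (every determinant must be a super key) can be sketched as follows. This is an illustrative sketch only; the function names `closure` and `bcnf_violations` are assumptions introduced here, not part of the original material.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose determinant is not a super key of the relation."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != all_attrs
    ]


# Example 3: R(A,B,C,D) with FDs A->BCD, BC->AD, D->B
fds = [("A", "BCD"), ("BC", "AD"), ("D", "B")]
print(bcnf_violations("ABCD", fds))  # only D->B should be reported
```

The closures of A and BC cover all four attributes, so only D→B is reported, in line with the analysis above.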
Decomposition of relations to convert them into
higher normal form…
Example 3 (contd.):
Step 3 – Decompose the Relation to remove the anomalies identified above
We have identified one dependency violating the BCNF condition. From previous
sections, we know:
How to convert 3NF to BCNF:-
The normalization of 3NF relations to BCNF involves creating a new relation for every
dependency which is violating the BCNF condition. The remaining attributes of the relation along
with the determinant (of the FD violating the BCNF condition) above remain part of the base
relation.
We will create one separate Relation to handle the dependency D→B
i.e., R1(D,B). After removing the dependent attributes of the above dependency from
the base Relation, it will be reduced to R2(A,C,D).
Decomposition of relations to convert them into
higher normal form…
Example 3 (contd.):
Step 4 – Check again if the above-decomposed relations have achieved the highest normal
form.
R1(D,B) & R2(A,C,D) are both in BCNF (you can check by concepts learned in the
earlier sections).
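Step 5 – BCNF decompositions are not always dependency preserving; therefore, we do not require
every candidate key of the base Relation to appear in the decomposed relations. Here (BC) appears in
neither R1 nor R2, so BC→AD is not preserved, but the decomposition is still lossless because the
common attribute D is a key of R1(D,B).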
Conclusion: Relation R(A,B,C,D) with FDs = {A→BCD, BC→AD, D→B} is in 3NF. It can be
decomposed into two separate relations - R1(D,B) & R2(A,C,D) to achieve the highest normal
form of BCNF.
Fourth Normal Form (4NF)
The fourth Normal Form comes into the picture when non-trivial Multivalued Dependency
(MVD) occurs in any Relation. These relations need to be identified and decomposed further
into a 4NF decomposition to improve database design.
Definition:
For a Relation to be in 4NF, it should fulfill the below two conditions:
The Relation should be in BCNF
The Relation should not have any non-trivial Multivalued Dependency (MVD), other than those whose determinant is a super key.
Multivalued Dependency (MVD):
Multivalued dependencies are a consequence of 1NF, which disallows multivalued
attributes in a tuple and the accompanying process of converting an un-normalized
Relation into 1NF.
Suppose we have two or more independent multivalued attributes in the same Relation.
In that case, we have to repeat every value of one attribute with every value of
the other attribute to keep the relation state consistent and maintain the independence
among the attributes involved.
A non-trivial multivalued dependency specifies this constraint.
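The "repeat every value with every value" constraint can be checked on a relation instance. The sketch below uses the standard tuple-based test for an MVD X →→ Y: for any two tuples that agree on X, the tuple that takes its Y values from the first and its remaining values from the second must also be present. The relation, its attribute names (`student`, `subject`, `hobby`) and the function name `mvd_holds` are hypothetical and chosen only for illustration.

```python
def mvd_holds(rows, x, y):
    """Check whether the MVD x ->> y holds in the given relation instance.

    rows: list of dicts sharing the same keys; x, y: lists of attribute names.
    """
    attrs = list(rows[0].keys())
    z = [a for a in attrs if a not in x and a not in y]
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Build the tuple with X and Y from t1 and the rest from t2.
                t3 = {a: t1[a] for a in x + y}
                t3.update({a: t2[a] for a in z})
                if tuple(t3[a] for a in attrs) not in present:
                    return False
    return True


# Hypothetical data: subjects and hobbies of a student are independent.
complete = [
    {"student": "s1", "subject": sub, "hobby": hob}
    for sub in ("Math", "Physics")
    for hob in ("Chess", "Music")
]
# Dropping one combination breaks the independence.
incomplete = [r for r in complete
              if not (r["subject"] == "Physics" and r["hobby"] == "Music")]

print(mvd_holds(complete, ["student"], ["subject"]))
print(mvd_holds(incomplete, ["student"], ["subject"]))
```

With all four subject/hobby combinations stored the MVD holds; removing a single combination makes the check fail.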
Fourth Normal Form (4NF)…
Example:
Fourth Normal Form (4NF)…
4NF Normalization Process:
Fifth Normal Form (5NF)
Fifth Normal Form in Database Normalization is generally not implemented in real-life database
design; however, we should know what it is. It is also known as Project Join Normal Form
(PJNF).
Definition:
A Relation R is in 5NF if and only if it satisfies the following conditions:
R should be already in 4NF.
It should not have any non-trivial join dependency (other than those implied by its candidate keys)
Join dependency – If the join of R1 and R2 over C is equal to relation R, then
we can say that a join dependency (JD) exists, where R1 and R2 are the
decomposition R1(A, B, C) and R2(C, D) of a given relation R(A, B, C, D).
Otherwise, R1 and R2 are a lossy decomposition of R.
A JD ⋈ {R1, R2, …, Rn} is said to hold over a relation R if R1, R2, ….., Rn is
a lossless-join decomposition.
A Relation R is in 5NF if and only if it cannot be decomposed further into two or more relations
with a loss-less join Property, ensuring that no spurious or extra tuples are generated when
relations are reunited through a natural join.
Fifth Normal Form (5NF)…
Example:
In the above 4NF Relation:
One student can enroll in multiple subjects. For example, the
student with student_id 101 has opted for subjects – Java, C++ &
C#
Multiple professors can teach each subject. For example, Java is
taught by Amit, Mohit & Payal.
Each professor can teach multiple subjects. For example, Amit can
teach Java & C++.
From the ER modelling perspective, the above Relation is the outcome of
a ternary relationship type between student, subject, and professor
Fifth Normal Form (5NF)…
Example (contd.):
If we decompose the above Relation into three separate binary relations, we can see that there is
a loss of information:
Student 101 is studying subjects – Java, C++ & C#.
Student 101 is being taught by two professors – Amit & Rajan.
Amit can teach – Java & C++, and Rajan can teach – C# & C++.
From the above information, it is impossible to decipher who is teaching C++ to student 101. Hence
there is a loss of information; therefore, this decomposition is not lossless. There is no join
dependency between the base Relation and the decomposed relations.
Hence we can conclude that the base Relation student_subject_professor is in 5NF as it
cannot be further non-loss decomposed.
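The loss of information can be reproduced with a small script. The tuples below are hypothetical: they are chosen to be consistent with the statements above (student 101 studies Java, C++ and C#, is taught by Amit and Rajan, Amit teaches Java and C++, Rajan teaches C# and C++), but they are not the original table from the slides.

```python
# Hypothetical instance of student_subject_professor(student_id, subject, professor).
base = {
    (101, "Java", "Amit"),
    (101, "C++", "Rajan"),
    (101, "C#", "Rajan"),
    (102, "C++", "Amit"),
}

# Three binary projections of the ternary relation.
student_subject = {(s, sub) for s, sub, p in base}
student_professor = {(s, p) for s, sub, p in base}
subject_professor = {(sub, p) for s, sub, p in base}

# Natural join of the three projections.
rejoined = {
    (s, sub, p)
    for (s, sub) in student_subject
    for (s2, p) in student_professor
    if s == s2 and (sub, p) in subject_professor
}

spurious = rejoined - base
print(sorted(spurious))  # tuples produced by the join that were never stored
```

In this instance the join produces the extra tuple (101, C++, Amit): the projections cannot tell which professor teaches C++ to student 101, so the three-way decomposition is not lossless.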
Conditions for relation decomposition
One thing common across the normalization process is the
decomposition of base relations into two or more relations to achieve a
higher normal form.
When we decompose a Relation into two or more relations to achieve a higher
normal form, we need to make sure that the decomposition is:
Lossless (non-additive) join decomposition
Dependency preserving decomposition (optional in case of BCNF decomposition)
Lossless (Non-additive) join decomposition
Lossless (non-additive) join decomposition ensures that:
No spurious tuples are generated when a natural join operation is applied to the relations
resulting from the decomposition.
The condition of no spurious tuples should hold on every legal relation state. The lossless
join property is always defined for a specific set F of functional dependencies.
The word loss in lossless refers to loss of information, not to the loss of tuples.
If we decompose a Relation r(R) into r1(R1) and r2(R2) such that R1 ∪ R2 = R (attribute
preservation condition), then it is said to be lossless if it satisfies r1 ⋈ r2 = r with no new
tuples added and no tuples eliminated.
If we decompose a Relation r(R) into r1(R1), r2(R2), …, rk(Rk) such that R1 ∪ R2 ∪ … ∪ Rk = R
(attribute preservation condition), then it is said to be lossless if it satisfies r1 ⋈ r2 ⋈ … ⋈ rk = r
with no new tuples added and no tuples eliminated.
Lossless (Non-additive) join decomposition…
Example 1:
Case 1:
r(R):          r1(R1):     r2(R2):     r1(R1) ⋈ r2(R2):
A   B   C      A   B       A   C       A   B   C
a1  b1  c1     a1  b1      a1  c1      a1  b1  c1
a2  b2  c1     a2  b2      a2  c1      a1  b1  c2
a1  b1  c2     a3  b2      a1  c2      a1  b1  c3
a3  b2  c3                 a3  c3      a2  b2  c1
a1  b1  c3                 a1  c3      a2  b2  c4
a2  b2  c4                 a2  c4      a3  b2  c3
In case 1, we can see that R1 U R2 = R and r1 ⋈ r2 = r. It is a lossless join decomposition.
Lossless (Non-additive) join decomposition…
Example 1 (contd.):
Case 2:
r(R):          r1(R1):     r2(R2):     r1(R1) ⋈ r2(R2):
A   B   C      A   B       A   C       A   B   C
a1  b1  c1     a1  b1      a1  c1      a1  b1  c1   √
a2  b2  c1     a2  b2      a2  c1      a1  b1  c2   X
a1  b2  c2     a1  b2      a1  c2      a1  b1  c3   √
a3  b2  c3     a3  b2      a3  c3      a2  b2  c1   √
a1  b1  c3     a2  b1      a1  c3      a2  b2  c4   X
a2  b1  c4                 a2  c4      a1  b2  c1   X
                                       a1  b2  c2   √
                                       a1  b2  c3   X
                                       a3  b2  c3   √
                                       a2  b1  c1   X
                                       a2  b1  c4   √
√ = Correct tuple (present in r)
X = Spurious tuple (not present in r)
In case 2, we can see that R1 U R2 = R and r1 ⋈ r2 ≠ r. It is not a lossless join decomposition.
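Both cases can be reproduced by projecting r onto (A, B) and (A, C) and joining the projections back. The sketch below is illustrative; the helper names `project`, `join_on_a` and `spurious_tuples` are assumptions made for this example.

```python
def project(rows, indexes):
    """Project a set of tuples onto the given column positions."""
    return {tuple(row[i] for i in indexes) for row in rows}


def join_on_a(r1, r2):
    """Natural join of r1(A, B) and r2(A, C) on the shared attribute A."""
    return {(a, b, c) for (a, b) in r1 for (a2, c) in r2 if a == a2}


def spurious_tuples(r):
    """Tuples that appear in r1 join r2 but not in the original relation r."""
    r1 = project(r, (0, 1))  # r1(A, B)
    r2 = project(r, (0, 2))  # r2(A, C)
    return join_on_a(r1, r2) - r


case1 = {
    ("a1", "b1", "c1"), ("a2", "b2", "c1"), ("a1", "b1", "c2"),
    ("a3", "b2", "c3"), ("a1", "b1", "c3"), ("a2", "b2", "c4"),
}
case2 = {
    ("a1", "b1", "c1"), ("a2", "b2", "c1"), ("a1", "b2", "c2"),
    ("a3", "b2", "c3"), ("a1", "b1", "c3"), ("a2", "b1", "c4"),
}

print(sorted(spurious_tuples(case1)))  # empty: lossless for this state
print(sorted(spurious_tuples(case2)))  # the five tuples marked X above
```

Note that an empty result for one relation state does not prove the decomposition is lossless in general; the property must hold for every legal state, which is why the FD-based conditions are used.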
Lossless (Non-additive) join decomposition…
For lossless join decomposition using FD set, the following conditions must hold:
Union of Attributes of R1 and R2 must be equal to attribute of R. Each attribute of R must
be either in R1 or in R2.
Att(R1) U Att(R2) = Att(R)
The intersection of Attributes of R1 and R2 must not be empty.
Att(R1) ∩ Att(R2) ≠ Φ
The common attribute must be a key for at least one Relation (R1 or R2)
Att(R1) ∩ Att(R2) → Att(R1) or Att(R1) ∩ Att(R2) → Att(R2)
Lossless (Non-additive) join decomposition…
Example 2:
A Relation R(A,B,C,D,E,F) with FD set {AB→C, C→D, D→EF, F→A, D→B} is decomposed into R1(ABC),
R2(CDE), R3(EF)
Condition 1:- Att(R1) U Att(R2) U Att(R3) = (A,B,C,D,E,F) = R(A,B,C,D,E,F) – condition met
As Join (⋈) is a binary operation so we will take 2 relations at a time
Lossless (Non-additive) join decomposition…
Example 2 (contd.):
Att(R1) ∩ Att(R2) = (C) ≠ Φ – condition met
Let's check if (C) is a Key in either R1 or R2.
Find C+ = {C,D,E,F,A,B}, so we can see (C) can determine all attributes of both R1 & R2, hence it is a Key
in both R1 & R2 - condition met
So, R1(A,B,C) ⋈ R2(C,D,E) = R12(A,B,C,D,E) is a lossless join
Att(R12) ∩ Att(R3) = (E) ≠ Φ – condition met
Let's check if (E) is a Key in either R12 or R3.
Find E+ = {E}, so we can see (E) cannot determine all attributes of either R12 or R3 – condition not met
So, R12(A,B,C,D,E) ⋈ R3(E,F) = R(A,B,C,D,E,F) is not a lossless join
Therefore we can conclude that the whole decomposition R1 (ABC), R2 (CDE) & R3 (EF) is not a
lossless join
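The pairwise check used in this example (common attributes must be non-empty and must determine all attributes of at least one of the two relations) can be sketched as below. The function names `closure` and `lossless_binary` are assumptions introduced for this sketch.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """Binary lossless-join test: the common attributes must be a key of r1 or r2."""
    common = set(r1) & set(r2)
    if not common:
        return False
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined


# Example 2: FD set {AB->C, C->D, D->EF, F->A, D->B}
fds = [("AB", "C"), ("C", "D"), ("D", "EF"), ("F", "A"), ("D", "B")]
print(lossless_binary("ABC", "CDE", fds))   # R1 with R2, common attribute C
print(lossless_binary("ABCDE", "EF", fds))  # R12 with R3, common attribute E
```

The first call succeeds because C+ covers R1 and R2, and the second fails because E+ = {E}, matching the two steps worked out above.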
Lossless (Non-additive) join decomposition…
Algorithm to test for lossless (Non-additive) Join Property
Input: A universal Relation R, a decomposition D = {R1, R2, …, Rm} of R, and a set F of functional dependencies.
Output: A decision whether decomposition is lossless or not.
1. Create an initial matrix S with one row i for each Relation Ri in D, and one column j for each attribute Aj in
R.
2. For each row i representing Relation schema Ri
{For each column j representing attribute Aj
{If Relation Ri includes attribute Aj:
Put the symbol aj i.e. S(i, j): = aj
Otherwise
Put the symbol bij i.e. S(i, j): = bij
}}
Lossless (Non-additive) join decomposition…
Algorithm to test for lossless (Non-additive) Join Property (contd.)
3. Repeat the following loop until a complete loop execution results in no changes to S {For each
functional dependency X→Y in F
{For all rows in S that have the same symbols in the columns corresponding to attributes in X
{Make the symbols in each column that correspond to an attribute in Y be the same in all
these rows as follows:
If any of the rows have an 'a' symbol for the column, set the other rows to that same 'a'
symbol in the column.
If no 'a' symbol exists for the attribute in any of the rows, choose one of the 'b' symbols that
appears in one of the rows for the attribute and set the other rows to that same 'b' symbol in
the column ;};
}}}
4. If a row is made up entirely of 'a' symbols, then the decomposition has the non-additive join
property; otherwise, it does not.
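The four steps of the algorithm can be sketched in Python as follows. This is one possible rendering, not a definitive implementation: the function name `is_lossless`, the tuple encoding of the symbols (`("a", j)` and `("b", i, j)`) and the rule of picking the smallest 'b' symbol are choices made for this sketch.

```python
def is_lossless(attributes, decomposition, fds):
    """Matrix test for the non-additive (lossless) join property."""
    attrs = list(attributes)
    col = {a: j for j, a in enumerate(attrs)}

    # Steps 1 and 2: S(i, j) = a_j if Ri contains Aj, otherwise b_ij.
    S = [
        [("a", j) if a in rel else ("b", i, j) for j, a in enumerate(attrs)]
        for i, rel in enumerate(decomposition)
    ]

    # Step 3: apply the FDs until a full pass makes no change.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            groups = {}
            for row in S:
                key = tuple(row[col[x]] for x in lhs)
                groups.setdefault(key, []).append(row)
            for rows in groups.values():
                if len(rows) < 2:
                    continue
                for y in rhs:
                    j = col[y]
                    symbols = [row[j] for row in rows]
                    a_symbols = [s for s in symbols if s[0] == "a"]
                    # Prefer the 'a' symbol; otherwise pick one 'b' symbol.
                    target = a_symbols[0] if a_symbols else min(symbols)
                    for row in rows:
                        if row[j] != target:
                            row[j] = target
                            changed = True

    # Step 4: lossless if some row consists only of 'a' symbols.
    return any(all(s[0] == "a" for s in row) for row in S)


fds = [("A", "C"), ("B", "C"), ("C", "D"), ("DE", "C"), ("CE", "A")]
print(is_lossless("ABCDE", ["AD", "AB", "BE", "CDE", "AE"], fds))
```

The call shown uses R(A,B,C,D,E) with the decomposition {AD, AB, BE, CDE, AE}; the row for BE ends up with only 'a' symbols, so the function returns True.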
Lossless (Non-additive) join decomposition…
Example 3:
R(A,B,C,D,E)
Decomposition is: R1(AD) ; R2(AB) ; R3(BE) ; R4(CDE) ; R5(AE)
Set of functional dependencies FD = {A→C, B→C, C→D, DE→C, CE→A}. Verify whether this
decomposition is lossless or lossy.
Solution: Initialization of matrix:
             1     2     3     4     5
             A     B     C     D     E
1  AD        a1    b12   b13   a4    b15
2  AB        a1    a2    b23   b24   b25
3  BE        b31   a2    b33   b34   a5
4  CDE       b41   b42   a3    a4    a5
5  AE        a1    b52   b53   b54   a5
Now consider a set of functional dependencies F= {A→C, B→C, C→D, DE→C, CE→A}
Lossless (Non-additive) join decomposition…
Example 3 (contd.):
1. A → C                                  2. B → C
       A    B    C    D    E                     A    B    C    D    E
AD     a1   b12  b13  a4   b15            AD     a1   b12  b13  a4   b15
AB     a1   a2   b13  b24  b25            AB     a1   a2   b13  b24  b25
BE     b31  a2   b33  b34  a5             BE     b31  a2   b13  b34  a5
CDE    b41  b42  a3   a4   a5             CDE    b41  b42  a3   a4   a5
AE     a1   b52  b13  b54  a5             AE     a1   b52  b13  b54  a5
3. C → D                                  4. DE → C
       A    B    C    D    E                     A    B    C    D    E
AD     a1   b12  b13  a4   b15            AD     a1   b12  b13  a4   b15
AB     a1   a2   b13  a4   b25            AB     a1   a2   b13  a4   b25
BE     b31  a2   b13  a4   a5             BE     b31  a2   a3   a4   a5
CDE    b41  b42  a3   a4   a5             CDE    b41  b42  a3   a4   a5
AE     a1   b52  b13  a4   a5             AE     a1   b52  a3   a4   a5
5. CE → A                                 6. A → C
       A    B    C    D    E                     A    B    C    D    E
AD     a1   b12  b13  a4   b15            AD     a1   b12  a3   a4   b15
AB     a1   a2   b13  a4   b25            AB     a1   a2   a3   a4   b25
BE     a1   a2   a3   a4   a5             BE     a1   a2   a3   a4   a5
CDE    a1   b42  a3   a4   a5             CDE    a1   b42  a3   a4   a5
AE     a1   b52  a3   a4   a5             AE     a1   b52  a3   a4   a5
Final matrix after applying A → C, B → C, C → D, DE → C, CE → A, A → C:
       A    B    C    D    E
AD     a1   b12  a3   a4   b15
AB     a1   a2   a3   a4   b25
BE     a1   a2   a3   a4   a5      All 'a' symbols are in this row
CDE    a1   b42  a3   a4   a5
AE     a1   b52  a3   a4   a5
Thus, decomposition of R(A,B,C,D,E) in to R1(AD) ; R2(AB) ; R3(BE) ; R4(CDE) ; R5(AE) is a lossless decomposition.
Dependency preserving decomposition
Dependency preserving or preserving functional dependencies
For a Relation R to be recoverable, its decomposition must be lossless as explained in
earlier section. In addition to this, the decomposition must satisfy another property known as
dependency preservation.
It states, if a Relation R is decomposed into relations R1 and R2, then all functional
dependencies of R either must be a part of R1 or R2 or must be derivable from the
combination of FD’s of R1 and R2.
Need of dependency preservation:
The set of FDs on the original Relation defines the integrity constraints that the Relation needs to
meet. If a decomposition does not preserve the dependencies of the original Relation, it imposes
an unnecessary burden on the RDBMS, which must join the decomposed relations to check
that the constraints are not violated whenever any of the decomposed relations is updated.
Dependency preservation is optional for BCNF decomposition.
Dependency preserving decomposition…
Definition:
A Decomposition D = {R1, R2, R3….,Rn} of R is dependency preserving w.r.t a set F of
Functional dependency if (F1 U F2 U … U Fn)+ = F+.
How to check:
Consider a Relation R with a functional dependency set F. R is decomposed
into R1 with FD set F1 and R2 with FD set F2; then there can be three cases:
{F1 U F2}+ = F+ -----> Decomposition is dependency preserving.
{F1 U F2}+ is a proper subset of F+ -----> Decomposition is not dependency preserving.
{F1 U F2}+ is a proper superset of F+ -----> This case is not possible.
Dependency preserving decomposition…
Example 1:
Let a Relation R (ABCD) and functional dependency set F= {AB→C, C→D, D→A}. Relation R is
decomposed into R1(ABC) and R2(CD). Check whether decomposition is dependency
preserving or not.
Solution:
Step 1: For decomposed Relation R1(A, B, C) and R2(C, D), let’s find the functional
dependency of each sub Relation as F1 and F2 using closure property.
To find the FDs for Relation R1, i.e., F1, we will consider all combinations of attributes that belong to
Relation R1(ABC), i.e., find the closure of A, B, C, AB, BC, and AC using the original FD set F (Note:
ABC is not considered as its closure is always ABC due to triviality), and then eliminate every
attribute that is not part of Relation R1. There is no need to add trivial functional
dependencies.
(A)+ = {A} // Trivial hence ignore
(B)+ = {B} // Trivial hence ignore
Dependency preserving decomposition…
Example 1 (contd.):
(C)+ = {C,D,A}, but D is dropped because it is not present in R1.
Therefore, restricted to R1, (C)+ = {C,A}, which gives the FD C→CA. C on the RHS is a trivial attribute,
so we remove it. Finally, the FD from (C)+ is C→A ………………………………………….(1)
(AB)+ = {A,B,C,D}, but D is dropped because it is not present in R1.
= {A,B,C}. Therefore the FD will be AB→C // Removing trivial attributes (AB) from RHS…..(2)
(BC)+ = {B,C,D,A}, but D is dropped because it is not present in R1.
= {A,B,C}. Therefore the FD will be BC→A // Removing trivial attributes (BC) from RHS...…(3)
(AC)+ = {A,C,D}, but D is dropped because it is not present in R1.
= {A,C}. Ignoring AC (trivial). Therefore no new FD is derived using AC.
Therefore F1 = {C→A, AB→C, BC→A} using (1), (2) & (3)
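The projection of F onto a sub-relation, as carried out by hand above, can be sketched in Python. The function name `project_fds` and the dictionary output format (determinant mapped to its non-trivial dependents) are assumptions made for this sketch; it enumerates every proper subset, so it is suitable only for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(sub_attrs, fds):
    """Non-trivial FDs of F that hold inside the sub-relation `sub_attrs`."""
    sub = sorted(sub_attrs)
    projected = {}
    for size in range(1, len(sub)):  # proper subsets only
        for combo in combinations(sub, size):
            # Keep attributes of the sub-relation, drop the trivial ones.
            rhs = (closure(combo, fds) & set(sub)) - set(combo)
            if rhs:
                projected["".join(combo)] = "".join(sorted(rhs))
    return projected


# Example 1: R(ABCD) with F = {AB->C, C->D, D->A}
fds = [("AB", "C"), ("C", "D"), ("D", "A")]
print(project_fds("ABC", fds))  # F1 for R1(ABC)
print(project_fds("CD", fds))   # F2 for R2(CD)
```

For R1(ABC) the sketch yields C→A, AB→C and BC→A, the same set F1 derived above.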
Dependency preserving decomposition…
Example 1 (contd.):
To find the FDs for Relation R2, i.e., F2, we will consider all combinations of attributes that
belong to Relation R2(CD), i.e., C and D (Note: CD is not considered as its closure is always CD due to
triviality).
Proceeding as above, we derive F2 = {C→D}
Step 2: Test whether each original functional dependency {AB→C, C→D, D→A} is present in {F1
U F2} or can be derived from it.
{F1 U F2} = {C→A, AB→C, BC→A, C→D}
AB→C is present in {F1 U F2}.
C→D is present in {F1 U F2}.
D→A is not present in F1 or F2, nor in {F1 U F2}+. Hence this dependency is not
preserved, i.e., {F1 U F2}+ is a proper subset of F+.
So the given decomposition is not dependency preserving.
Dependency preserving decomposition…
Example 2:
Let a Relation R(A,B,C,D,E) and functional dependency set F = {A→B, B→C, C→D, D→A}. Relation R is
decomposed into R1(ABC) and R2(CDE). Check whether decomposition is dependency preserving or not.
Solution:
Step 1: For decomposed Relation R1(ABC) and R2(CDE), let’s find the functional dependency of each sub
Relation as F1 and F2 using closure property.
To find FD’s for Relation R1, i.e., F1, we will consider all combination of attributes that belongs to
Relation R1(ABC), i.e., find closure of A, B, C, AB, BC and AC using original FD set F
(A)+ = {A,B,C,D} but we will ignore A (trivial) & D (D is not part of R1). Therefore, {A} + = {B,C}.
We can write Functional dependency derived from A as A→BC …………..……………… (1)
(B)+ = {B,C,D,A}. Ignoring B (trivial) & D (D not the part of R1). Therefore, {B} + = {C,A}.
We can write Functional dependency derived from B as B→CA ……………..…………… (2)
Dependency preserving decomposition…
Example 2 (contd.):
(C)+ = {C,D,A,B}. Ignoring C (trivial) & D (D not the part of R1). Therefore, {C} + = {B,A}.
We can write Functional dependency derived from C as C→BA ..………………………… (3)
(AB)+ = {A,B,C,D}. Ignoring AB (trivial) & D (D not the part of R1). Hence {AB}+ = {C}.
We can write the functional dependency derived from AB as AB→C. But please note that this
is a duplicate FD because attribute A alone can derive C in equation (1) above. In other words, we need not
check any combination of attributes that contains attribute(s) already capable of acting as a key of the
Relation R1. Hence we will ignore this FD as part of the F1 set.
Similarly, (A)+, (B)+, (C)+ each derive all attributes of Relation R1; hence testing combinations like AC & BC will
not add any new functional dependency to the set F1.
Therefore final F1 = {A→BC, B→CA, C→BA}
Dependency preserving decomposition…
Example 2 (contd.):
To find the FDs for Relation R2, i.e., F2, we will consider all combinations of attributes of R2(CDE), i.e., C, D, E, CD,
CE, DE, using the original functional dependency set F = {A→B, B→C, C→D, D→A}.
(C)+ = {C,D,A,B}. Ignoring C (trivial) & AB (AB not the part of R2). Therefore, {C}+ = {D}.
We can write Functional dependency derived from C as C→D ……………………………… (1)
(D)+ = {D,A,B,C}. Ignoring D (trivial) & AB (AB not the part of R2). Therefore, {D}+ = {C}.
We can write Functional dependency derived from D as D→C ……………………………… (2)
(E)+ = {E}. Ignoring trivial attribute E, therefore no FD using E.
(CD) + = {C,D,A,B}. Ignoring CD (trivial) & AB (AB not the part of R2). Therefore no new FD is derived using CD.
(DE)+ = {D,E,A,B,C}. Ignoring DE (trivial) & AB (AB not part of R2). Hence {DE}+ = {C}.
We can write Functional dependency derived from DE as DE→C. But please note this is duplicate FD because D
alone can derive C in equation (2). Hence we will ignore this FD.
Dependency preserving decomposition…
Example 2 (contd.):
(CE)+ = {C,E,D,A,B}. Ignoring CE (trivial) & AB (AB not part of R2). Hence {CE} + = {D}.
We can write Functional dependency derived from CE as CE→D. But please note this is duplicate FD
because C alone can derive D in equation (1). Hence we will ignore this FD.
Therefore final F2 = {C→D, D→C}
Step 2: Test whether each original functional dependency in F = {A→B, B→C, C→D, D→A} is present in {F1 U F2}
or can be derived from it.
F1 = {A→BC, B→CA, C→BA}
F2 = {C→D, D→C}
{F1 U F2} = {A→BC, B→CA, C→BA, C→D, D→C}
A→B, is present in {F1 U F2} (applying the decomposing rule on A→BC)
B→C, is present in {F1 U F2} (applying the decomposing rule on B→CA)
Dependency preserving decomposition…
Example 2 (contd.):
C→D, is present in {F1 U F2}
D→A can be derived using axioms on {F1 U F2}, i.e., using D→C & C→BA, we can derive D→BA (using the
transitivity rule) & then applying the decomposition rule, we can infer D→B & D→A. Hence, D→A is present in {F1
U F2}+. This means F+ = {F1 U F2}+.
So given decomposition of Relation R is dependency preserving.
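Dependency preservation can also be tested without listing F1 and F2 explicitly. The sketch below uses a well-known alternative to the projection method shown in the slides: for each FD X→Y in F, grow X by repeatedly taking closures restricted to each decomposed relation, then check whether Y is reached. The function names `closure` and `lost_dependencies` are assumptions made for this sketch.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lost_dependencies(decomposition, fds):
    """Return the FDs of F that cannot be derived from the projected FDs."""
    lost = []
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for rel in decomposition:
                rel_attrs = set(rel)
                # Attributes derivable while staying inside this relation.
                gained = closure(reached & rel_attrs, fds) & rel_attrs
                if not gained <= reached:
                    reached |= gained
                    changed = True
        if not set(rhs) <= reached:
            lost.append((lhs, rhs))
    return lost


# Example 1: R(ABCD), F = {AB->C, C->D, D->A}, decomposed into R1(ABC), R2(CD)
fds1 = [("AB", "C"), ("C", "D"), ("D", "A")]
print(lost_dependencies(["ABC", "CD"], fds1))

# Example 2: R(ABCDE), F = {A->B, B->C, C->D, D->A}, decomposed into R1(ABC), R2(CDE)
fds2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(lost_dependencies(["ABC", "CDE"], fds2))
```

An empty list means the decomposition is dependency preserving. Example 1 reports D→A as lost, while Example 2 reports nothing, matching both hand-worked results.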
Question
Q. Relation R with an associated set of functional dependencies, F, is
decomposed into BCNF. The redundancy (arising out of functional
dependencies) in the resulting set of relations is: (GATE 2002)
A. Zero
B. More than zero but less than that of an equivalent 3NF decomposition
C. Proportional to the size of F+
D. Indeterminate
Question
Q. Which normal form is based on the concept of ‘full functional dependency’?
(ISRO 2011)
A. First Normal Form
B. Second Normal Form
C. Third Normal Form
D. Fourth Normal Form
Second Normal Form (2NF) is defined in terms of full functional dependency.
The schema should meet the requirements of First Normal Form (1NF), all
non-key attributes should be fully functionally dependent on the primary key, and
no partial dependency on a candidate key should exist.
So, Option (B) is correct.
Question
Q. If every non-key attribute is functionally dependent on the primary key, then the
relation is in __________ . (UGC NET 2017)
A. First normal form
B. Second normal form
C. Third normal form
D. Fourth normal form
Conditions for various normal forms:
1NF – A relation R is in first normal form (1NF) if and only if all
underlying domains contain atomic values only.
2NF – A relation R is in second normal form (2NF) if and only if it is in
1NF and every non-key attribute is fully dependent on the primary key.
3NF – A relation R is in third normal form (3NF) if and only if it is in 2NF
and every non-key attribute is non-transitively dependent on the primary key.
BCNF – A relation R is in Boyce-Codd normal form (BCNF) if and only if every
determinant is a candidate key.
Example:
Relation R(XYZ) with functional dependencies {X -> Y, Y -> Z, X -> Z}.
Notice Y -> Z here: the question does not say that a non-prime attribute
depends only on the primary key, so this FD is perfectly valid.
This relation is in 2NF but not in 3NF because the non-key attribute Z
is transitively dependent on the primary key (X -> Y, Y -> Z). Here {X} is the candidate key.
So, option (B) is correct.
Question
Q. Consider the following dependencies and the BOOK table in a relational
database design. Determine the normal form of the given relation. (ISRO 2013)
ISBN → Title
ISBN → Publisher
Publisher → Address
A. First Normal Form
B. Second Normal Form
C. Third Normal Form
D. BCNF
ISBN is the candidate Key.
BCNF is ruled out as Publisher is not a Key.
3NF is ruled out as there is transitive dependence Publisher -> Address. Also neither
Publisher is a key nor Address is a prime attribute.
The relation is in 2NF as there is no partial dependency.
Question
Q. For a database relation R(a,b,c,d), where the domains a, b, c, d include only atomic
values, only the following functional dependencies and those that can be inferred from
them hold: (GATE 1997 & UGC NET 2017)
{a → c, b → d}
This relation is:-
A. In first normal form but not in second normal form
B. In second normal form but not in first normal form
C. In third normal form
D. None of the above
Candidate key of the above relation is ab.
a and b are each a proper subset of the candidate key, so the given FDs a → c and b → d
are partial dependencies.
2NF does not allow partial dependencies, and we know that
every relation is already in 1NF. Hence, this relation is in first
normal form but not in second normal form.
Option (A) is correct.
Question
Q. Consider the following database relation containing the attributes:- (GATE 1998)
Book_id
Subject_Category_of_book
Name_of_Author
Nationality_of_Author
with Book_id as the Primary Key.
(a) What is the highest normal form satisfied by this relation?
(b) Suppose the attributes Book_title and Author_address are added to the relation, and the
primary key is changed to (Name_of_Author, Book_Title), what will be the highest normal form
satisfied by the relation?
Answers: (a) BCNF (b) 1NF
(a) R (Book_id, Subject_Category_of_book, Name_of_Author, Nationality_of_Author) = R (A, B, C, D)
Given that Book_id is the Primary Key, therefore {A → B, C, D}.
Hence the given relation is in BCNF.
(b) Two attributes Book_title and Author_address are added to the relation.
Then, R (Book_id, Subject_Category_of_book, Name_of_Author, Nationality_of_Author,
Book_title, Author_address) = R (A, B, C, D, E, F). Given (Name_of_Author, Book_Title)
is now the primary key, therefore {C, E → A, B, D, F} & {A → B, C, D}.
(A, E) is also a candidate key of this relation, and A → B, C, D is a partial dependency on it,
so the relation is in 1NF.
Question (GATE 2004)
The relation scheme Student Performance (name, courseNo, rollNo, grade) has the
following functional dependencies:
name, courseNo → grade
rollNo, courseNo → grade
name → rollNo
rollNo → name
The highest normal form of this relation scheme is:-
1. 2NF
2. 3NF
3. BCNF
4. 4NF
For easy understanding let's say attributes (name, courseNo, rollNo, grade) be (A,B,C,D).
Then the given FDs are as follows:
AB->D, CB->D, A->C, C->A
Here there are two candidate keys, AB and CB. Now AB->D and CB->D satisfy
BCNF as the LHS is a super key in both. But
A->C and C->A do not satisfy BCNF.
Hence we check for 3NF for these 2
Question
Q. A table has fields, F1,F2,F3,F4,F5 with the following functional dependencies: (GATE 2005)
F1→F3
F2→F4
(F1.F2)→F5
In terms of Normalization, this table is in:-
A. 1NF
B. 2NF
C. 3NF
D. None of these
First Normal Form - A relation is in first normal form if every attribute in that
relation is a single-valued attribute.
Second Normal Form - A relation is in 2NF if it has no partial dependency, i.e., no non-prime
attribute (attributes which are not part of any candidate key) is dependent on any proper
subset of any candidate key of the table.
This table has partial dependencies F1→F3 and F2→F4, given that (F1,F2) is the key. So the answer is A.
Question
Q. Let R (A, B, C, D, E, P, G) be a relational schema in which the following functional
dependencies are known to hold: AB → CD, DE → P, C → E, P → C and B → G. The
relational schema R is (GATE 2008)
A. In BCNF
B. In 3NF, but not in BCNF
C. In 2NF, but not in 3NF
D. Not in 2NF
Candidate key = AB
B->G is a partial dependency
So, not in 2NF
Question
Q. Consider the following relational schemes for a library database: (GATE 2008)
Book (Title, Author, Catalog_no, Publisher, Year, Price)
Collection (Title, Author, Catalog_no) with the following functional dependencies:
Title, Author --> Catalog_no
Catalog_no --> Title, Author, Publisher, Year
Publisher, Title, Year --> Price
Assume {Author, Title} is the key for both schemes. Which of the following statements is
true?
A. Both Book and Collection are in BCNF
B. Both Book and Collection are in 3NF only
C. Book is in 2NF and Collection is in 3NF
D. Both Book and Collection are in 2NF only
Question
Q. Relation R has eight attributes ABCDEFGH. Fields of R contain only
atomic values. F = {CH -> G, A -> BC, B -> CFH, E -> A, F -> EG} is a set of
functional dependencies (FDs) so that F+ is exactly the set of FDs that hold
for R.
The Relation R is:- (GATE 2013)
A. In 1NF, but not in 2NF
B. In 2NF, but not in 3NF
C. In 3NF, but not in BCNF
D. In BCNF
The table is not in 2nd Normal Form as the non-prime attributes are dependent on
subsets of candidate keys. The candidate keys are AD, BD, ED and FD. In all of the
following FDs, the non-prime attributes are dependent on a partial candidate key:
A -> BC, B -> CFH, F -> EG
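The candidate keys and partial dependencies quoted in this answer can be cross-checked with a short brute-force script. This is an illustrative sketch; the helper names `closure`, `candidate_keys` and `partial_dependencies` are assumptions introduced here, and the exhaustive subset search is only reasonable for small attribute sets such as this one.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(attributes)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


def partial_dependencies(attributes, fds):
    """Pairs (part of a key, non-prime attribute it determines)."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in closure(part, fds) - prime:
                    found.add(("".join(part), attr))
    return sorted(found)


fds = [("CH", "G"), ("A", "BC"), ("B", "CFH"), ("E", "A"), ("F", "EG")]
keys = candidate_keys("ABCDEFGH", fds)
print(sorted("".join(sorted(k)) for k in keys))
print(partial_dependencies("ABCDEFGH", fds))
```

Any non-empty result from `partial_dependencies` means the relation is not in 2NF, which is the reasoning behind option (A).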
Question
Q. The best normal form of relation scheme R(A, B, C, D) along with the set of functional
dependencies F = {AB → C, AB → D, C → A, D → B} is (UGC NET 2014)
A. Boyce-Codd Normal form
B. Third Normal form
C. Second Normal form
D. First Normal form
AB is a candidate key. {C -> A} & {D -> B} violate
BCNF as (C) & (D) are not super keys. The relation is in 3NF as
(AB) is a key in {AB → C, AB → D} and (A) & (B) are prime
attributes in {C → A, D → B}.
Question
Q. Consider the following four relational schemas. For each schema, all non-trivial functional dependencies are listed.
The underlined attributes are the respective primary keys.
Schema I: Registration (rollno, courses) Field ‘courses’ is a set-valued attribute containing the set of courses a
student has registered for. Non-trivial functional dependency: {rollno → courses}
Schema II: Registration (rollno, courseid, email) Non-trivial functional dependencies: {rollno, courseid →
email}, {email → rollno}
Schema III: Registration (rollno, courseid, marks, grade) Non-trivial functional dependencies: {rollno, courseid →
marks, grade}, {marks → grade}
Schema IV: Registration (rollno, courseid, credit) Non-trivial functional dependencies: {rollno, courseid →
credit}, {courseid → credit}
Which one of the relational schemas above is in 3NF but not in BCNF? (GATE 2018)
A. Schema I
B. Schema II
C. Schema III
D. Schema IV
Question (GATE 2016)
A database of research articles in a journal uses the following schema.
(VOLUME, NUMBER, STARTPAGE, ENDPAGE, TITLE, YEAR, PRICE)
The primary key is (VOLUME, NUMBER, STARTPAGE, ENDPAGE)
and the following functional dependencies exist in the schema.
(VOLUME, NUMBER, STARTPAGE, ENDPAGE) → TITLE
(VOLUME, NUMBER) → YEAR
(VOLUME, NUMBER, STARTPAGE, ENDPAGE) → PRICE
The database is redesigned to use the following schemas.
(VOLUME, NUMBER, STARTPAGE, ENDPAGE, TITLE, PRICE)
(VOLUME, NUMBER, YEAR)
Which is the weakest normal form that the new database satisfies, but the old one does not?
A. 1NF
B. 2NF
C. 3NF
D. BCNF
Question (ISRO 2017)
Consider the following table: Faculty (facName, dept, office, rank, dateHired)
(Assume that no two faculty members within a single department have the same name. Each faculty
member has only one office, identified in office.) 3NF refers to third normal form and BCNF
refers to Boyce-Codd Normal Form.
FACNAME    DEPT  OFFICE  RANK        DATEHIRED
Ravi       Art   A101    Professor   1975
Murali     Math  M201    Assistant   2000
Narayanan  Art   A101    Associate   1992
Lakshmi    Math  M201    Professor   1982
Mohan      CSC   C101    Professor   1980
Sreeni     Math  M203    Associate   1990
Tanuja     CSC   C101    Instructor  2001
Then Faculty is:
A. Not in 3NF, in BCNF
B. In 3NF, not in BCNF
C. In 3NF, in BCNF
D. Not in 3NF, not in BCNF
FDs:
1. facName, dept -> office, rank, dateHired
2. office -> dept
The FD facName, dept → office, rank, dateHired satisfies both 3NF and BCNF
because (facName, dept) is the primary key. The FD office → dept violates BCNF because
office is not a superkey, but it satisfies 3NF because dept is a prime attribute.
So, overall, relation Faculty is in 3NF but not in BCNF.
Question (GATE 2003)
Consider the following functional dependencies in a database:
Date_of_Birth → Age
Age → Eligibility
Name → Roll_number
Roll_number → Name
Course_number → Course_name
Course_number → Instructor
(Roll_number, Course_number) → Grade
The relation (Roll_number, Name, Date_of_birth, Age) is:
A. In 2NF but not in 3NF
B. In 3NF but not in BCNF
C. In BCNF
D. None of the above
Question (UGC NET 2016)
Which of the following statements is TRUE?
D1 : The decomposition of the schema R(A, B, C) into R1(A, B) and R2 (A, C)
is always lossless.
D2 : The decomposition of the schema R(A, B, C, D, E) having AD → B, C →
DE, B → AE and AE → C, into R1 (A, B, D) and R2 (A, C, D, E) is lossless.
A. Both D1 and D2
B. Neither D1 nor D2
C. Only D1
D. Only D2
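Only D2 is true: the common attributes AD form a key (AD+ = ABCDE) and are present in both
tables.
D1 is not always true because no FDs are given. For example, if only B -> A and C -> A hold,
the common attribute A determines neither R1 nor R2, so the decomposition is lossy.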
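The test used in this answer (the common attributes must determine all of R1 or all of R2) is easy to sketch in code. The function name is_lossless_binary and the one-letter attribute encoding are assumptions made for illustration only.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """Binary test: the common attributes must functionally determine R1 or R2."""
    common = set(r1) & set(r2)
    common_closure = closure(common, fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

# D2: R(A, B, C, D, E) split into R1(A, B, D) and R2(A, C, D, E).
fds_d2 = [("AD", "B"), ("C", "DE"), ("B", "AE"), ("AE", "C")]
print(is_lossless_binary("ABD", "ACDE", fds_d2))  # True

# D1 with the FDs B -> A and C -> A from the explanation.
fds_d1 = [("B", "A"), ("C", "A")]
print(is_lossless_binary("AB", "AC", fds_d1))  # False
```

This check applies to a decomposition into exactly two relations; with three or more relations a chase-style test is needed instead.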
Question (UGC NET 2017)
Consider a schema R(A, B, C, D) and following functional dependencies.
A → B
B → C
C → D
D → B
Then decomposition of R into R1 (A, B), R2(B, C) and R3(B, D) is __________ .
A. Dependency preserving and lossless join.
B. Lossless join but not dependency preserving.
C. Dependency preserving but not lossless join.
D. Not dependency preserving and not lossless join.
Schema R(A, B, C, D) is decomposed into three relations:
R1 (A, B), R2(B, C) and R3(B, D)
Dependencies derived from R1 (A, B) are: A → B
Dependencies derived from R2 (B, C) are: B → C, C → B
Dependencies derived from R3 (B, D) are: D → B, B → D
C → D follows from C → B and B → D, so all the dependencies are preserved. The
decomposition is also lossless because the common attribute B is a key of R2 and of R3.
So, option (A) is correct.
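For a decomposition into three relations, losslessness can be confirmed with the chase (tableau) test. The sketch below is one possible implementation under assumptions: attributes are single letters, the function name is_lossless is illustrative, and the string "a" stands for a distinguished symbol.

```python
def is_lossless(relation, decomposition, fds):
    """Chase (tableau) test for a lossless join over any number of schemas."""
    cols = list(relation)
    # One row per schema: "a" marks a distinguished symbol,
    # (i, col) is a unique non-distinguished symbol for row i.
    table = [{c: "a" if c in schema else (i, c) for c in cols}
             for i, schema in enumerate(decomposition)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in table:
                for r2 in table:
                    if r1 is r2 or any(r1[c] != r2[c] for c in lhs):
                        continue
                    # Rows agree on lhs, so they are forced to agree on rhs.
                    for c in rhs:
                        if r1[c] != r2[c]:
                            # Prefer the distinguished symbol when equating.
                            new, old = (r1[c], r2[c]) if r1[c] == "a" else (r2[c], r1[c])
                            for row in table:
                                if row[c] == old:
                                    row[c] = new
                            changed = True
    # Lossless exactly when some row consists only of distinguished symbols.
    return any(all(row[c] == "a" for c in cols) for row in table)

# UGC NET 2017: R(A, B, C, D) into R1(A, B), R2(B, C), R3(B, D).
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(is_lossless("ABCD", ["AB", "BC", "BD"], fds))  # True
```

Here the row for R1 starts with distinguished symbols in A and B; B → C and then C → D fill in its C and D columns, giving a row of all distinguished symbols.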
Question (UGC NET 2017)
Consider a schema R(MNPQ) and functional dependencies M → N, P → Q.
Then the decomposition of R into R1 (MN) and R2(PQ) is________.
A. Dependency preserving but not lossless join
B. Dependency preserving and lossless join
C. Lossless join but not dependency preserving
D. Neither dependency preserving nor lossless join.
Question (GATE 1999)
Consider the schema R= ( S, T, U, V ) and the dependencies S→T, T→U, U→V
and V→S. Let R (R1 and R2) be a decomposition such that R1∩R2 ≠ Ø. The
decomposition is:
A. Not in 2NF
B. In 2NF but not in 3NF
C. In 3NF but not in 2NF
D. In both 2NF and 3NF
R1∩R2 ≠ Ø means there is a common attribute in R1 and R2. Suppose we
choose a decomposition such as R1(S, T, U) and R2(U, V). It is lossless
because the common attribute U is a key, and the LHS of every
FD is a candidate key; therefore it is
in 2NF as well as 3NF. Option (D) is
correct.
Question (GATE 2001 & ISRO 2014)
Consider a schema R(A,B,C,D) and functional dependencies A->B and C->D.
Then the decomposition of R into R1(AB) and R2(CD) is:-
A. Dependency preserving and lossless join
B. Lossless join but not dependency preserving
C. Dependency preserving but not lossless join
D. Not dependency preserving and not lossless join
Dependency Preserving Decomposition:
Decomposition of R into R1 and R2 is a dependency preserving decomposition if the closure of the
functional dependencies after decomposition is the same as the closure of the FDs before
decomposition.
A simple way is to check whether we can derive all the original FDs from the FDs present
after decomposition. In the above question R(A, B, C, D) is decomposed into R1 (A, B) and
R2(C, D) and there are only two FDs, A -> B and C -> D. Each FD lies entirely within one of
the two relations, so the decomposition is dependency preserving.
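The "derive every original FD from the decomposed relations" check can be sketched without computing the projected FD sets explicitly. The version below is an illustration under assumptions (one-letter attributes, hypothetical function name preserves_dependencies): for each FD X -> Y it grows X using only closures restricted to one schema at a time.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserves_dependencies(decomposition, fds):
    """True if every FD in fds follows from the FDs projected onto the schemas."""
    for lhs, rhs in fds:
        reachable = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in decomposition:
                s = set(schema)
                # Attributes of this schema determined by what we already hold in it.
                gained = closure(reachable & s, fds) & s
                if not gained <= reachable:
                    reachable |= gained
                    changed = True
        if not set(rhs) <= reachable:
            return False
    return True

# GATE 2001 / ISRO 2014: R(A, B, C, D) into R1(A, B) and R2(C, D).
print(preserves_dependencies(["AB", "CD"], [("A", "B"), ("C", "D")]))  # True

# Illustrative contrast (not from the slides): R(A, B, C) with A -> B, B -> C
# split into (A, B) and (A, C) loses B -> C.
print(preserves_dependencies(["AB", "AC"], [("A", "B"), ("B", "C")]))  # False
```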
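Lossless-Join Decomposition:
For R1(AB) and R2(CD), R1 ∩ R2 = Ø, so the common attributes cannot be a key of R1 or R2
and the join is not lossless. So, option (C) is correct.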
Question (GATE 2001)
R(A,B,C,D) is a relation. Which of the following does not have a lossless
join, dependency preserving BCNF decomposition?
A. A->B, B->CD
B. A->B, B->C, C->D
C. AB->C, C->AD
D. A->BCD
We know that for a lossless decomposition the common attribute
should be a candidate key in one of the relations.
A) A->B, B->CD
R1(AB) and R2(BCD)
B is the key of the second relation and hence the decomposition is lossless.
B) A->B, B->C, C->D
R1(AB), R2(BC), R3(CD)
B is the key of the second relation and C is the key of the third, hence
lossless.
C) AB->C, C->AD
R1(ABC), R2(CD)
C is the key of the second relation, but C->A violates the BCNF condition in ABC as
C is not a key. We cannot decompose ABC further as the AB->C
dependency would be lost.
D) A->BCD
Already in BCNF.
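The claim for option C, that C->A still violates BCNF inside R1(ABC), can be checked by projecting closures onto the sub-schema. This is a sketch under assumptions: one-letter attributes and an illustrative helper name bcnf_violations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """List (X, Y) where X -> Y holds inside schema, is non-trivial,
    and X is not a superkey of schema."""
    s = set(schema)
    found = []
    for size in range(1, len(s)):
        for combo in combinations(sorted(s), size):
            x = set(combo)
            determined = closure(x, fds) & s  # projection of X+ onto the schema
            if determined != x and determined != s:
                found.append(("".join(combo), "".join(sorted(determined - x))))
    return found

# Option C: AB -> C, C -> AD, decomposed into R1(ABC) and R2(CD).
fds = [("AB", "C"), ("C", "AD")]
print(bcnf_violations("ABC", fds))  # [('C', 'A')]
print(bcnf_violations("CD", fds))   # []
```

Enumerating every subset of the sub-schema is exponential, which is acceptable for exam-sized schemas but not meant as an efficient general method.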
Question
Q. Relation R is decomposed using a set of functional dependencies, F and relation S is
decomposed using another set of functional dependencies G. One decomposition is
definitely BCNF, the other is definitely 3NF, but it is not known which is which. To make
a guaranteed identification, which one of the following tests should be used on the
decompositions? (Assume that the closures of F and G are available). (GATE 2002)
A. Dependency-preservation
B. Lossless-join
C. BCNF definition
D. 3NF definition
Answer is (C), since to identify BCNF we need the BCNF
definition. The decomposition that satisfies it is in BCNF
and the other is in 3NF.
Option A is wrong because dependencies may be preserved by both
3NF and BCNF decompositions.
Option B is wrong because both 3NF and BCNF decompositions can
be lossless.
Option D is wrong because 3NF and BCNF decompositions both satisfy 3NF.
Question
Q. Let the set of functional dependencies
F = {QR → S, R → P, S → Q}
hold on a relation schema X = (PQRS). X is not in BCNF. Suppose X is decomposed into two schemas Y and Z, where
Y = (PR) and Z = (QRS).
Consider the two statements given below.
I. Both Y and Z are in BCNF
II. Decomposition of X into Y and Z is dependency preserving and lossless
Which of the above statements is/are correct? (GATE 2019)
A. I only
B. Neither I nor II
C. II only
D. Both I and II
Thank You