Normalization

The document discusses different forms of database normalization: - Normalization is a technique that reduces redundancy and eliminates anomalies when inserting, updating, and deleting data. - The goal of normalization is to store data logically and break large tables into smaller tables linked by relationships. - Normalization was first proposed by Codd and involves testing relations against normal forms like 1NF, 2NF, 3NF, BCNF, 4NF and 5NF.

Uploaded by

Kevin Koshy
Copyright
© All Rights Reserved

SRM Institute of Science and Technology
18CSC303J
Database Management System
Unit IV
Normalization
Normalization Process

• Normalization is a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update, and deletion anomalies.
• Normalization rules divide larger tables into smaller tables and link them using relationships.
• The purpose of normalization in SQL is to eliminate redundant (repetitive) data and ensure data is stored logically.
Normalization Process
• First proposed by Codd (1972a).
• Takes a relation schema through a series of tests to certify whether it satisfies a certain normal form.
• The process proceeds in a top-down fashion by
– evaluating each relation against the criteria for normal forms and
– decomposing relations as necessary; it can thus be considered relational design by analysis.
• Initially, Codd proposed three normal forms, which he called first, second, and third normal form.
Normalization Process
• A stronger definition of 3NF, called Boyce-Codd normal form (BCNF), was proposed later by Boyce and Codd.
• All these normal forms are based on a single analytical tool:
– the functional dependencies among the attributes of a relation.
• A fourth normal form (4NF) and a fifth normal form (5NF) were proposed,
– based on the concepts of multivalued dependencies and join dependencies, respectively.
Normalization
• A process of analyzing the given relation schemas based on their
– FDs and primary keys
• To achieve the desirable properties of
– minimizing redundancy and
– minimizing the insertion, deletion, and update anomalies
Types of Normal Forms
Functional Dependency
• A relationship that exists between two attributes (or sets of attributes).
• Typically exists between the primary key and a non-key attribute within a table.
• The left side of an FD is known as the determinant.
• The right side of the FD is known as the dependent.
Example
• Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.
• The Emp_Id attribute uniquely identifies the Emp_Name attribute of the employee table: if we know the Emp_Id, we can tell the employee name associated with it. This is written Emp_Id → Emp_Name.
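The idea of a determinant fixing its dependent can be checked mechanically on sample data. The following is a minimal sketch: the helper name `fd_holds` and the three sample rows are hypothetical and are not taken from the slides.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the employee table.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Chennai"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Chennai"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
]

print(fd_holds(employees, ["Emp_Id"], ["Emp_Name"]))       # True
print(fd_holds(employees, ["Emp_Name"], ["Emp_Address"]))  # False
```

Note that a check like this can only refute a dependency on the data at hand; whether an FD truly holds is a statement about the schema, not about one sample.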
Trivial functional dependency
• A → B is a trivial functional dependency if B is a subset of A.
• The following dependencies are also trivial: A → A, B → B
Non-trivial functional dependency
• A → B is a non-trivial functional dependency if B is not a subset of A.
• When the intersection of A and B is empty, A → B is called completely non-trivial.
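The three cases (trivial, non-trivial, completely non-trivial) reduce to simple set tests on the two sides of the dependency. A small sketch, with the function name `classify_fd` chosen here only for illustration:

```python
def classify_fd(lhs, rhs):
    """Classify the dependency lhs -> rhs by comparing the attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # B is a subset of A
    if lhs.isdisjoint(rhs):
        return "completely non-trivial"  # A and B share no attribute
    return "non-trivial"                 # B is not a subset of A, but they overlap


print(classify_fd({"A", "B"}, {"A"}))       # trivial
print(classify_fd({"A"}, {"B"}))            # completely non-trivial
print(classify_fd({"A", "B"}, {"B", "C"}))  # non-trivial
```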
First Normal Form (1NF)
• A relation is in 1NF if every attribute contains only atomic values.
– An attribute of a table cannot hold multiple values.
– It must hold only single-valued attributes.
• Disallows multi-valued attributes, composite attributes, and their combinations.
• Example:
– Relation EMPLOYEE is not in 1NF because of the multi-valued attribute EMP_PHONE
EMPLOYEE table
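One common way to bring such a relation into 1NF is to repeat the row once per value of the multi-valued attribute. The sketch below assumes this approach; the helper name, the EMP_ID and EMP_NAME columns, and all sample values are hypothetical, since the original table figure is not available.

```python
def to_first_normal_form(rows, multi_valued_attr):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_valued_attr]:
            new_row = dict(row)                 # copy the single-valued attributes
            new_row[multi_valued_attr] = value  # keep exactly one atomic value
            flat.append(new_row)
    return flat


# Hypothetical EMPLOYEE rows; EMP_PHONE holds several values, so this is not 1NF.
employee = [
    {"EMP_ID": 14, "EMP_NAME": "John", "EMP_PHONE": ["7272826385", "9064738238"]},
    {"EMP_ID": 20, "EMP_NAME": "Harry", "EMP_PHONE": ["8574783832"]},
]

employee_1nf = to_first_normal_form(employee, "EMP_PHONE")
for row in employee_1nf:
    print(row)
```

After the conversion EMP_ID alone no longer identifies a row; the key becomes the pair (EMP_ID, EMP_PHONE).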
Second Normal Form (2NF)
• For 2NF, the relation must be in 1NF.
• All non-key attributes are fully functionally dependent on the primary key.

Teachers Table
Second Normal Form (2NF)
• In the given table, the non-prime attribute TEACHER_AGE is dependent on TEACHER_ID, which is a proper subset of a candidate key. This is a partial dependency, so the table is not in 2NF.
• To convert the given table into 2NF, we decompose it into two tables:

Teachers Table
TEACHER_DETAIL table
TEACHER_SUBJECT table
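The decomposition amounts to two projections of the original table. The sketch below assumes the candidate key is {TEACHER_ID, SUBJECT}; the SUBJECT column name and the sample rows are assumptions, because the slide's table figure is not available.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical TEACHER rows: TEACHER_AGE repeats for every subject a teacher has.
teacher = [
    {"TEACHER_ID": 25, "SUBJECT": "Chemistry", "TEACHER_AGE": 30},
    {"TEACHER_ID": 25, "SUBJECT": "Biology", "TEACHER_AGE": 30},
    {"TEACHER_ID": 47, "SUBJECT": "English", "TEACHER_AGE": 35},
]

# TEACHER_AGE depends only on TEACHER_ID, so it moves to its own table.
teacher_detail = project(teacher, ["TEACHER_ID", "TEACHER_AGE"])
teacher_subject = project(teacher, ["TEACHER_ID", "SUBJECT"])

print(teacher_detail)   # one row per teacher
print(teacher_subject)  # one row per (teacher, subject) pair
```

In this sample the age of teacher 25 is stored once instead of twice, which is the redundancy that 2NF removes.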
Third Normal Form (3NF)
• A relation is in 3NF if
– it is in 2NF and
– it does not contain any transitive dependency of a non-prime attribute on a key.
• Used to reduce data duplication and to achieve data integrity.
• If a relation in 2NF has no transitive dependency for non-prime attributes, then it is in third normal form.
• A relation is in third normal form if it satisfies at least one of the following conditions for every non-trivial functional dependency X → Y:
Third Normal Form (3NF)
• X is a super key.
• Y is a prime attribute, i.e., each element of Y is part of some candidate key.

EMPLOYEE_DETAIL table
Third Normal Form (3NF)
• Super keys are {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.
• Candidate key is {EMP_ID}.
• Non-prime attributes: all attributes except EMP_ID are non-prime.
• EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on EMP_ID.
• The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on the super key (EMP_ID) through EMP_ZIP.
Third Normal Form (3NF)
• The EMPLOYEE_DETAIL table therefore violates the rule of third normal form.
• We need to move EMP_CITY and EMP_STATE to a new <EMPLOYEE_ZIP> table, with EMP_ZIP as its primary key.
EMPLOYEE table

EMPLOYEE ZIP table
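The 3NF test (X is a super key, or Y is prime) can be sketched with an attribute-closure helper. This is an illustrative sketch, not a full normalization tool: the function names are hypothetical, the prime attributes are supplied by hand rather than computed, and the dependency EMP_ID → EMP_NAME is assumed from the fact that {EMP_ID} is the candidate key.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(all_attrs, fds, prime_attrs):
    """List FDs X -> Y where X is not a super key and Y is not all prime."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        x_is_super_key = closure(lhs, fds) >= set(all_attrs)
        y_is_prime = (rhs - lhs) <= set(prime_attrs)
        if not x_is_super_key and not y_is_prime:
            violations.append((sorted(lhs), sorted(rhs)))
    return violations


attrs = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
fds = [
    (frozenset({"EMP_ID"}), frozenset({"EMP_NAME", "EMP_ZIP"})),
    (frozenset({"EMP_ZIP"}), frozenset({"EMP_STATE", "EMP_CITY"})),
]

# EMPLOYEE_DETAIL: the only candidate key is {EMP_ID}.
print(third_nf_violations(attrs, fds, {"EMP_ID"}))
# [(['EMP_ZIP'], ['EMP_CITY', 'EMP_STATE'])]

# After the split, each table is checked against its own dependencies.
employee_ok = third_nf_violations({"EMP_ID", "EMP_NAME", "EMP_ZIP"}, fds[:1], {"EMP_ID"})
employee_zip_ok = third_nf_violations({"EMP_ZIP", "EMP_STATE", "EMP_CITY"}, fds[1:], {"EMP_ZIP"})
print(employee_ok, employee_zip_ok)  # [] []
```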


Boyce Codd normal form (BCNF)
• BCNF is an advanced version of 3NF.
• It is stricter than 3NF.
• A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the table.
• For BCNF, the table should be in 3NF, and for every FD, the LHS is a super key.
Rules for BCNF
• It should satisfy the following two conditions:
1. It should be in the Third Normal Form.
2. For any dependency A → B, A should be a super key.
• In simple words, it means that for a dependency A → B,
– A cannot be a non-prime attribute
– if B is a prime attribute.
Functional dependencies are
Candidate Keys are
BCNF
• The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
• To convert the given table into BCNF, decompose it into three tables:
– Employee Country Table
– Employee Department Table
– Employee Department Mapping Table
Example 02
• Consider a college enrolment table with columns student_id, subject and professor.
• {student_id, subject} together form the primary key.
• Using student_id and subject, we can find all the columns of the table.
• One professor teaches only one subject, but one subject may have two different professors.
• There is a dependency between subject and professor, where subject depends on the professor name.
• This table satisfies
– the 1st Normal Form: all the values are atomic and column names are unique
– the 2nd Normal Form, as there is no partial dependency
– the 3rd Normal Form, as there is no transitive dependency.
• However, it is not in Boyce-Codd Normal Form.


Why is this table not in BCNF?
• {student_id, subject} form the primary key, so subject is a prime attribute.
• There is one more dependency, professor → subject.
• While subject is a prime attribute, professor is a non-prime attribute and not a super key, which is not allowed by BCNF.
• To make this relation (table) satisfy BCNF,
– decompose this table into two tables:
• student table and professor table.
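The BCNF condition for this example can be checked with the same closure idea: every non-trivial dependency must have a super key on its left side. In the sketch below the function names are hypothetical, and the column split of the two resulting tables, (student_id, professor) and (professor, subject), is an assumption, since the slides name the tables but not their columns.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List non-trivial FDs whose left side is not a super key."""
    return [
        (sorted(lhs), sorted(rhs))
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


enrolment = {"student_id", "subject", "professor"}
fds = [
    (frozenset({"student_id", "subject"}), frozenset({"professor"})),
    (frozenset({"professor"}), frozenset({"subject"})),
]

print(bcnf_violations(enrolment, fds))
# [(['professor'], ['subject'])]

# Assumed decomposition: student(student_id, professor), professor(professor, subject).
student_table = bcnf_violations({"student_id", "professor"}, [])
professor_table = bcnf_violations({"professor", "subject"}, fds[1:])
print(student_table, professor_table)  # [] []
```

One trade-off worth noting: after this split, the dependency {student_id, subject} → professor can no longer be checked within a single table.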
Fourth normal form (4NF)
• A relation is in 4NF if
– it is in Boyce-Codd normal form and
– it has no multi-valued dependency.
• For a dependency A → B,
– if multiple values of B exist for a single value of A, then
– the dependency is a multi-valued dependency.
Rules for 4th Normal Form
• For a table to satisfy the Fourth Normal Form, it should satisfy the following two conditions:
1. It should be in the Boyce-Codd Normal Form.
2. The table should not have any multi-valued dependency.
What is Multi-valued Dependency?
• A table is said to have a multi-valued dependency if the following conditions are true:
1. For a dependency A → B, if multiple values of B exist for a single value of A, then the table may have a multi-valued dependency.
2. A table should have at least 3 columns for it to have a multi-valued dependency.
3. For a relation R(A,B,C), if there is a multi-valued dependency between A and B, then B and C should be independent of each other.
Example 01

• To bring the table into 4NF, decompose it into two tables:
– Student_Course
– Student_Hobby
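The decomposition into Student_Course and Student_Hobby can be sketched as two projections whose natural join gives back the original rows. The column order (STU_ID, COURSE, HOBBY) and the sample rows below are assumptions, since the slide's table figure is not available.

```python
# Hypothetical STUDENT rows as (STU_ID, COURSE, HOBBY).
# Courses and hobbies are independent, so every combination per student appears.
student = {
    (21, "Computer", "Dancing"),
    (21, "Computer", "Singing"),
    (21, "Math", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
}

# Project onto the two independent multi-valued facts.
student_course = {(stu, course) for stu, course, _ in student}
student_hobby = {(stu, hobby) for stu, _, hobby in student}

# Natural join on STU_ID reconstructs the original relation.
rejoined = {
    (stu, course, hobby)
    for stu, course in student_course
    for stu2, hobby in student_hobby
    if stu == stu2
}

print(len(student), len(student_course), len(student_hobby))  # 5 3 3
print(rejoined == student)                                    # True
```

In this sample the two smaller tables hold six rows in total and no longer force a new row for every course and hobby combination.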
Example 01
• A table may have a functional dependency along with a multi-valued dependency.
• The functionally dependent columns are moved to a separate table.
• The multi-valued dependent columns are moved to separate tables.
Fifth normal form (5NF)
• A relation is in 5NF if
– it is in 4NF,
– it does not contain any join dependency, and
– joining is lossless.
• 5NF is satisfied when all the tables are broken into as many tables as possible in order to avoid redundancy.
• 5NF is also known as project-join normal form (PJ/NF).


Fifth normal form (5NF)
• John takes both Computer and Math classes for Semester 1, but he doesn't take the Math class for Semester 2.
• The combination of all these fields is required to identify valid data.
• Suppose we add a new semester, Semester 3.
• We do not know the subject or who will be taking it, so we would leave Lecturer and Subject as NULL.
• But all three columns together act as the primary key, so we can't leave the other two columns blank.
• So to bring the above table into 5NF, we can decompose it into three relations P1, P2 and P3:
Fifth normal form (5NF)
• To bring the above table into 5NF, we can decompose it into three relations.
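A three-way decomposition of this kind can be sketched as three pairwise projections that only reproduce the original table when all three are joined. The rows below are hypothetical (only John's classes follow the slide text), and the assignment of column pairs to P1, P2 and P3 is an assumption.

```python
# Hypothetical rows as (SUBJECT, LECTURER, SEMESTER).
r = {
    ("Computer", "Anshika", "Semester 1"),
    ("Computer", "John", "Semester 1"),
    ("Math", "John", "Semester 1"),
    ("Math", "Akash", "Semester 2"),
    ("Chemistry", "Praveen", "Semester 1"),
}

p1 = {(sem, sub) for sub, lec, sem in r}  # assumed P1(SEMESTER, SUBJECT)
p2 = {(sub, lec) for sub, lec, sem in r}  # assumed P2(SUBJECT, LECTURER)
p3 = {(sem, lec) for sub, lec, sem in r}  # assumed P3(SEMESTER, LECTURER)

# Joining only P1 and P2 produces spurious rows.
two_way = {
    (sub, lec, sem)
    for sem, sub in p1
    for sub2, lec in p2
    if sub == sub2
}

# Adding P3 filters the spurious rows out again.
three_way = {row for row in two_way if (row[2], row[1]) in p3}

print(len(two_way))    # 7
print(three_way == r)  # True
```

The two spurious rows in `two_way` pair Akash with Math in Semester 1 and John with Math in Semester 2, which is exactly the kind of invalid combination the slide describes.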
Thank You
