What Is Functional Dependency and Normalization - Final Updated 28 Oct 2020

Dbms notes

What is functional dependency?

• A functional dependency is a relationship between attributes of a
relation in which one set of attributes determines another.
• If P is a relation with attributes A and B, a functional
dependency between these two attributes is written {A → B}.
It specifies that:
A is the determinant set.
B is the dependent attribute.
A functionally determines B.
B is functionally dependent on A.

• Each value of A is associated with exactly one value of B. A
functional dependency is trivial if B is a subset of A.
• 'A' functionally determines 'B', written {A → B}: the left-hand-side
attributes determine the values of the right-hand-side attributes.
Employee

• In the <Employee> table, EmpName (employee name) is functionally
dependent on EmpId (employee id), because each EmpId value belongs to
exactly one employee and therefore to exactly one name.
• EmpId identifies an employee uniquely, but EmpName does not determine
EmpId, because more than one employee could have the same name.
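The check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the helper name fd_holds and the sample rows are assumptions made up for the example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; a different one breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical Employee rows: two employees share a name.
employees = [
    {"EmpId": 1, "EmpName": "Asha"},
    {"EmpId": 2, "EmpName": "Ravi"},
    {"EmpId": 3, "EmpName": "Asha"},
]

emp_id_determines_name = fd_holds(employees, ["EmpId"], ["EmpName"])
name_determines_emp_id = fd_holds(employees, ["EmpName"], ["EmpId"])
print(emp_id_determines_name, name_determines_emp_id)  # True False
```

Note that a single table instance can only refute a dependency; whether EmpId → EmpName holds in general is a design decision about the data.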

Advantages of Functional Dependency


• Functional dependencies help avoid data redundancy, so that the same
data is not repeated at multiple locations in the same database.
• They help maintain the quality of data in the database.
• They give the database clearly defined meanings and constraints.
• They help in identifying bad designs.
• They express facts about the database design.
• Introduction to Armstrong's Axioms
• Armstrong's axioms are a set of inference rules.
• They provide a simple technique for reasoning about functional
dependencies.
• They were developed by William W. Armstrong in 1974.
• They are used to infer all the functional dependencies that hold on a relational database.
• The Axiom Rules

• A. Primary Rules
• Rule 1: Reflexivity
If A is a set of attributes and B is a subset of A, then A holds B. {A → B}
• Rule 2: Augmentation
If A holds B and C is a set of attributes, then AC holds BC. {AC → BC}
It means that adding the same attributes to both sides of a dependency
does not change the basic dependency.
• Rule 3: Transitivity
If A holds B and B holds C, then A holds C.
If {A → B} and {B → C}, then {A → C}
"A holds B" {A → B} means that A functionally determines B.
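The three primary rules can be written as small Python functions. This is only a sketch under the assumption that a dependency is modelled as a pair (lhs, rhs) of frozensets; the function names are illustrative and not taken from the notes.

```python
def reflexivity(a, b):
    """Rule 1: if B is a subset of A, then A -> B."""
    a, b = frozenset(a), frozenset(b)
    if not b <= a:
        raise ValueError("B must be a subset of A")
    return (a, b)


def augmentation(fd, c):
    """Rule 2: from A -> B derive AC -> BC."""
    lhs, rhs = fd
    c = frozenset(c)
    return (lhs | c, rhs | c)


def transitivity(fd1, fd2):
    """Rule 3: from A -> B and B -> C derive A -> C."""
    if fd1[1] != fd2[0]:
        raise ValueError("right side of the first FD must equal left side of the second")
    return (fd1[0], fd2[1])


a_to_b = (frozenset("A"), frozenset("B"))
b_to_c = (frozenset("B"), frozenset("C"))

a_to_c = transitivity(a_to_b, b_to_c)   # A -> C
ac_to_bc = augmentation(a_to_b, "C")    # AC -> BC
ab_to_a = reflexivity("AB", "A")        # AB -> A
```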
B. Secondary Rules
• Trivial Functional Dependency
• Trivial − If a functional dependency (FD) X → Y holds, where Y is a
subset of X, then it is called a trivial FD. Trivial FDs always hold.
• Non-trivial − If an FD X → Y holds, where Y is not a subset of X,
then it is called a non-trivial FD.
• Completely non-trivial − If an FD X → Y holds, where X ∩ Y = ∅,
it is said to be a completely non-trivial FD.
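The three cases above depend only on how X and Y overlap, which a short Python sketch can make concrete. The function name classify_fd is an assumption for illustration; attributes are single letters so that a string can stand for an attribute set.

```python
def classify_fd(lhs, rhs):
    """Classify the dependency lhs -> rhs by the overlap of its two sides."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # Y is a subset of X
    if lhs & rhs:
        return "non-trivial"             # Y is not a subset of X, but they overlap
    return "completely non-trivial"      # X and Y share no attribute


print(classify_fd("AB", "A"))   # trivial
print(classify_fd("AB", "BC"))  # non-trivial
print(classify_fd("A", "B"))    # completely non-trivial
```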
Basics of Normalization

• Database normalization is the procedure of making the database
consistent by reducing redundancy and ensuring data integrity
through lossless decomposition.
• Normalization is done through normal forms.
• Advantages of Normalization:
• Greater overall database organization is gained.
• The amount of unnecessary redundant data is reduced.
• Data integrity is easily maintained within the database.
• The database and application design processes are much more flexible.
• Security is easier to maintain or manage.

• Disadvantages of Normalization:
• Normalization produces a lot of tables with a relatively small
number of columns. These tables then have to be joined using their
primary/foreign key relationships.
• This has two disadvantages.
– Performance: all the joins required to merge data slow
processing and place additional stress on your hardware.
– Complex queries: developers have to code complex
queries in order to merge data from different tables.
ANOMALIES

• Reducing redundant values in tuples saves storage space and avoids
anomalies:
- Insertion anomalies.
- Deletion anomalies.
- Modification anomalies.

• Insertion anomalies
– Inserting a department that has no employees yet requires null values
for the employee attributes, which creates problems.
– Inserting a new tuple can introduce inconsistency with existing tuples.
• Deletion anomalies
– If we delete the last employee of a department, the department information is deleted as well.
• Modification anomalies
– If we change the manager of department 5, we must update all the tuples
that refer to that department.
• 1. First Normal Form (1NF)
• A relation is in First Normal Form (1NF) if each cell of the table
contains only an atomic value.
• OR
• A relation is in First Normal Form (1NF) if every attribute of every
tuple is either single-valued or null.
A table with multi-valued cells can be brought into 1NF by rewriting it
so that each cell of the table has only one value in it.

By default, every relation is in 1NF, because the formal definition of a
relation states that the values of all attributes must be atomic.
• (Fname, Lastname)-- student name
• Fname- stduent name x
Second Normal Form (2NF)

• For a table to be in the Second Normal Form:
• It should be in the First Normal Form.
• And it should not have any partial dependency.
• FD1- A,B-C- FULLY FUNCTIONAL DEPENDENCY
A B C D E
• FD2- A,B-D FULLY FUNCTIONAL DEPENDENCY
• FD3-A-E PARTIAL FUNCTIONAL DEPENDENCY
A B C D

A E
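A partial dependency is one whose left side is a proper subset of the key and whose right side contains a non-key attribute. The sketch below checks that condition for the dependencies FD1 to FD3; it assumes a single candidate key {A, B} and only inspects the dependencies it is given, so it is an illustration rather than a complete 2NF test. The function name is an assumption.

```python
def partial_dependencies(candidate_key, fds):
    """Return the FDs whose left side is a proper subset of the key
    and whose right side contains at least one non-key attribute."""
    key = frozenset(candidate_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        non_key_rhs = rhs - key
        if lhs < key and non_key_rhs:    # '<' is the proper-subset test
            found.append((lhs, non_key_rhs))
    return found


fds = [("AB", "C"), ("AB", "D"), ("A", "E")]   # FD1, FD2, FD3
violations = partial_dependencies("AB", fds)
print(violations)  # only A -> E is reported
```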
Bookorder table
C-001 C 220 10
C-002 C++ 250 05
C-004 Dbms 300 08
C-005 C 220 12
C-006 C++ 250 07
C-007 C 200 20

FD1: Order table
FD2: Book table (BOOKNAME, Bprice)

Insertion anomaly: book information cannot be inserted without an order.
Deletion anomaly: if an order is cancelled, the book details are deleted as well.
Update anomaly: if a book price changes, it has to be updated in every row where the book appears.
How to remove the anomalies by converting to 2NF
• Decompose the Bookorder table into two tables, an order table and a
book table.
• Book table

• Order table

Insertion anomaly (book information cannot be inserted without an order): removed.
Deletion anomaly (cancelling an order also deletes the book details): removed.
Update anomaly (a changed book price must be updated in every row): removed.
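The decomposition can be sketched in Python. The rows are modelled on the Bookorder data above; BookName and Bprice follow the Book table, while the column names OrderId and Qty are assumptions, since the original column headers are not available.

```python
# Rows modelled on the Bookorder table; OrderId and Qty are assumed column names.
book_order = [
    {"OrderId": "C-001", "BookName": "C", "Bprice": 220, "Qty": 10},
    {"OrderId": "C-002", "BookName": "C++", "Bprice": 250, "Qty": 5},
    {"OrderId": "C-004", "BookName": "Dbms", "Bprice": 300, "Qty": 8},
    {"OrderId": "C-005", "BookName": "C", "Bprice": 220, "Qty": 12},
]

# Book table: each book's price is stored exactly once.
book_table = {row["BookName"]: row["Bprice"] for row in book_order}

# Order table: everything that depends on the whole order.
order_table = [
    {"OrderId": row["OrderId"], "BookName": row["BookName"], "Qty": row["Qty"]}
    for row in book_order
]

# Joining the two tables on BookName gives back the original rows.
rejoined = [dict(order, Bprice=book_table[order["BookName"]]) for order in order_table]
print(rejoined == book_order)  # True
```

With this layout a price change touches one entry of book_table, and a book can be listed before any order refers to it.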
3NF (A->B, B->C== AC)
• A relation is in third normal form if it is in 2NF
and no non key attribute is transitively
dependent on the primary key.
• Studentdetail table, with dependencies FD1, FD2 and FD3:
SID SNAME TCID TCNAME TCQual
Insertion anomaly: teacher information cannot be inserted without a student enrollment.
Deletion anomaly: if a student leaves, the corresponding teacher details are deleted as well.
Update anomaly: if a teacher's qualification changes, e.g. from M.Tech to Ph.D., it has
to be updated in every row where the teacher appears.
How to remove the anomalies by converting to 3NF
Decompose the Studentdetail table into two tables, a Student table and a
Teacher table. The Student table keeps TCID as a foreign key so that the
two tables can be joined back.
Student-Table

SID SNAME TCID

Teacher-Table

TCID TCNAME TCQual

Insertion anomaly (teacher information cannot be inserted without a student enrollment): removed.
Deletion anomaly (a student leaving also deletes the teacher details): removed.
Update anomaly (a changed teacher qualification must be updated in every row): removed.
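A transitive dependency can be detected mechanically: look for a dependency whose left side is not a super key and whose right side contains a non-key attribute. The sketch below does this for the Studentdetail attributes. The two dependencies listed are an assumption about what the table implies (SID determines the student's name and teacher, TCID determines the teacher's details), and the function names are illustrative.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(attributes, key, fds):
    """FDs whose left side is not a super key and that determine a non-key attribute."""
    violations = []
    for lhs, rhs in fds:
        lhs_is_superkey = closure(lhs, fds) >= set(attributes)
        non_key_rhs = set(rhs) - set(key) - set(lhs)
        if not lhs_is_superkey and non_key_rhs:
            violations.append((tuple(lhs), tuple(sorted(non_key_rhs))))
    return violations


attributes = ["SID", "SNAME", "TCID", "TCNAME", "TCQual"]
fds = [  # assumed dependencies for the Studentdetail table
    (["SID"], ["SNAME", "TCID"]),
    (["TCID"], ["TCNAME", "TCQual"]),
]
violations = third_nf_violations(attributes, ["SID"], fds)
print(violations)  # [(('TCID',), ('TCNAME', 'TCQual'))]
```

The reported dependency is exactly the one that the Teacher table isolates.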
Boyce-Codd Normal Form (BCNF)
• Rules for BCNF
• For a table to satisfy the Boyce-Codd Normal
Form, it should satisfy the following two
conditions:
• It should be in the Third Normal Form.
• And, for any non-trivial dependency A → B, A should be a
super key.
• In simple words, it means, that for a
dependency A → B, A cannot be a non-prime
attribute, if B is a prime attribute. …
ROLLNO NAME VOTERID AGE
1 A 1111 23
2 B 2222 24
3 C 3333 25
4 D 4444 21

Candidate keys: Rollno, Voterid
F.D's:
Rollno → Name
Rollno → Voterid
Voterid → Age
Voterid → Rollno
• Let us see an example − <SportsClub>
Ground  Begin_Time  End_Time  Package
G01     07:00       09:00     Gold
G01     10:00       12:00     Gold
G01     10:30       11:00     Bronze
G02     10:15       11:15     Silver
G02     08:00       09:00     Silver

The above relation is in 1NF, 2NF and 3NF, but not in BCNF. Here is the reason −
Functional dependency {Package → Ground}
The determinant attribute Package, on which Ground depends, is neither a
candidate key nor a superset of a candidate key.
<Package>
Package  Ground
Gold     G01
Silver   G02
Bronze   G01

<TomorrowBookings>
Ground  Begin_Time  End_Time
G01     07:00       09:00
G01     10:00       12:00
G01     10:30       11:00
G02     10:15       11:15
G02     08:00       09:00


• Now the above tables are in BCNF.
• The candidate key for the <Package> table is Package (Ground repeats
across packages, so it is not a key).
• The candidate keys for the <TomorrowBookings> table are
{Ground, Begin_Time} and {Ground, End_Time}.
• The anomaly is eliminated because Package is used as a key in
the <Package> relation.
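The BCNF condition (every non-trivial dependency has a super key on its left side) can be checked with attribute closures. In this sketch, the first two dependencies are assumptions derived from the candidate keys {Ground, Begin_Time} and {Ground, End_Time}; only Package → Ground is stated explicitly in the notes. The function names are illustrative.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue                      # trivial FDs never violate BCNF
        if closure(lhs, fds) != set(attributes):
            violations.append((tuple(lhs), tuple(rhs)))
    return violations


sports_club = ["Ground", "Begin_Time", "End_Time", "Package"]
fds = [
    (["Ground", "Begin_Time"], ["End_Time", "Package"]),   # assumed from the candidate key
    (["Ground", "End_Time"], ["Begin_Time", "Package"]),   # assumed from the candidate key
    (["Package"], ["Ground"]),                             # stated in the notes
]
violations = bcnf_violations(sports_club, fds)
print(violations)  # [(('Package',), ('Ground',))]
```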
# Multivalued Dependencies
(a) A multivalued dependency exists when there are at least
3 attributes (say X, Y and Z) in a relation and, for each value of X,
there is a well-defined set of values of Y and a well-defined
set of values of Z. However, the set of values of Y
is independent of the set of values of Z, and vice versa. Primary key: (X, Y, Z)
What is 4NF?
• To be in 4NF, a relation should be in Boyce-Codd Normal Form
and must not contain more than one multi-valued attribute.
• Example: <Movie>
Movie_Name  Shooting_Location  Listing
MovieOne    UK                 Comedy
MovieOne    UK                 Thriller
MovieTwo    Australia          Action
MovieTwo    Australia          Crime
MovieThree  India              Drama

• The above is not in 4NF, since
• More than one movie can have the same listing
• The same movie can have many shooting locations
• Shooting_Location and Listing each depend on Movie_Name, independently of each other
• Let us convert the above table into 4NF −
<Movie_Shooting>
Movie_Name  Shooting_Location
MovieOne    UK
MovieTwo    Australia
MovieThree  India

<Movie_Listing>
Movie_Name  Listing
MovieOne    Comedy
MovieOne    Thriller
MovieTwo    Action
MovieTwo    Crime
MovieThree  Drama

Now the violation is removed and the tables are in 4NF.
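The 4NF decomposition of <Movie> can be reproduced with Python sets. This is a small sketch using the rows of the example; because a relation is a set of tuples, the projection onto (Movie_Name, Shooting_Location) keeps each pair only once.

```python
movie = {
    ("MovieOne", "UK", "Comedy"),
    ("MovieOne", "UK", "Thriller"),
    ("MovieTwo", "Australia", "Action"),
    ("MovieTwo", "Australia", "Crime"),
    ("MovieThree", "India", "Drama"),
}

# Project onto the two independent facts about a movie.
movie_shooting = {(name, location) for name, location, _ in movie}
movie_listing = {(name, listing) for name, _, listing in movie}

# Natural join on Movie_Name.
rejoined = {
    (name, location, listing)
    for name, location in movie_shooting
    for other_name, listing in movie_listing
    if name == other_name
}

print(len(movie_shooting), len(movie_listing))  # 3 5
print(rejoined == movie)                        # True: the decomposition is lossless
```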
5NF or PJNF
• The 5NF (Fifth Normal Form) is also known as project-join normal form (PJNF). A
relation is in Fifth Normal Form (5NF) if it is in 4NF and cannot be
losslessly decomposed into smaller tables.
• Suppose we want to add semester 7: we cannot, because primary-key attributes cannot be null.
Subject Professor Semester
NULL Nil 7
# 4th Normal Form (4NF)

For a table to satisfy the Fourth Normal Form, it should satisfy the
following two conditions:

1. It should be in the Boyce-Codd Normal Form.
2. The table should not have any non-trivial multi-valued dependency.
A multi-valued dependency can only arise in a relation with at least 3 attributes.

# Fourth Normal Form (4NF)
(a) The EMP relation with 2 MVDs: ENAME —>> PNAME and
ENAME —>> DNAME.
(b) Decomposing the EMP relation into two 4NF relations
EMP_PROJECTS and EMP_DEPENDENTS.

Multivalued Dependencies and 4th Normal Form
Decomposing a relation state of EMP that is not in 4NF:
(a)EMP relation with additional tuples.
(b)2 corresponding 4NF relations EMP_PROJECTS and
EMP_DEPENDENTS.

Multivalued Dependencies and 4th Normal Form
Decomposing the COURSE relation, which is not in 4NF:
(a) The COURSE relation with 2 MVDs: SUBJECT —>> LECTURER and
SUBJECT —>> BOOKS.
(b) Decomposing the COURSE relation into two 4NF relations.

# Join Dependencies and Fifth Normal Form (5NF)
Join Dependency Definition:

A relation R is subject to a join dependency if R can always be
recreated by joining multiple tables, each having a subset of the
attributes of R. If one of the tables in the join has all the
attributes of R, the join dependency is called trivial.

In other words, if a relation can be recreated by joining tables R1,
R2, R3, …, Rn, each of which has a subset of the attributes of the
relation, then the relation has a join dependency.

# Join Dependencies and Fifth Normal Form (5NF)
Join Dependency Definition:

Let 'R' be a relation schema and R1, R2, …, Rn be a
decomposition of R. Then R is said to satisfy the join
dependency JD(R1, R2, …, Rn) if and only if:

∏R1(R) ⋈ ∏R2(R) ⋈ … ⋈ ∏Rn(R) = R

Join Dependency Rule:

A JD holds only if the relation can be reconstructed
without any loss of information from the join of
certain specified projections (sub-relations) of it.
Equivalently, a JD holds for a relation only if the join of those
projections has no extra, missing or false (spurious) tuples.
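The definition can be tested directly on a table instance: project onto each component, join the projections, and compare with the original. The sketch below does that. The attribute names S, P, J and the four rows are hypothetical sample data, and the helper names are assumptions made for the example.

```python
def project(rows, attrs):
    """Project dict rows onto attrs; each row becomes a frozenset of (attr, value) pairs."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of rows in the frozenset representation."""
    joined = set()
    for l in left:
        for r in right:
            merged = dict(l)
            # Rows combine only if they agree on every shared attribute.
            if all(merged.get(a, v) == v for a, v in r):
                merged.update(r)
                joined.add(frozenset(merged.items()))
    return joined


def satisfies_jd(rows, components):
    """True if joining the projections onto `components` gives back exactly `rows`."""
    result = project(rows, components[0])
    for attrs in components[1:]:
        result = natural_join(result, project(rows, attrs))
    return result == {frozenset(row.items()) for row in rows}


# Hypothetical instance over attributes S, P, J.
rows = [
    {"S": "s1", "P": "p1", "J": "j2"},
    {"S": "s1", "P": "p2", "J": "j1"},
    {"S": "s2", "P": "p1", "J": "j1"},
    {"S": "s1", "P": "p1", "J": "j1"},
]
three_way = [["S", "P"], ["P", "J"], ["J", "S"]]

print(satisfies_jd(rows, three_way))                 # True
print(satisfies_jd(rows, [["S", "P"], ["P", "J"]]))  # False: a spurious tuple appears
print(satisfies_jd(rows[:3], three_way))             # False: the join adds the fourth row
```

Like any instance-level check, this can refute a join dependency but cannot prove that it holds for every legal state of the relation.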
# Fifth Normal Form (5NF)
It is also known as Project-Join Normal Form (PJNF). A relation R is in 5NF if:
R is already in 4NF.
It cannot be losslessly decomposed any further, that is, it has no
non-trivial join dependency (more precisely, every non-trivial join
dependency is implied by the candidate keys).

• Join dependency relates to 5NF as follows: a relation R is in 5NF if and
only if it is already in 4NF and has no such join dependency, so it
cannot be decomposed further without loss.

• If a relation is in 4NF but has a non-trivial join dependency, it can be
decomposed further and is therefore not in 5NF. After decomposition, the
resulting sub-relations will be in 5NF.

# Fifth Normal Form (5NF)
(c) The relation SUPPLY, with no MVDs, is in 4NF but not in 5NF, as it has the join
dependency JD(R1, R2, R3).
(d) Decomposing the relation SUPPLY into the 5NF relations R1, R2 and R3.

The relation SUPPLY has the join dependency JD(R1, R2, R3),
as R1 ⋈ R2 ⋈ R3 = R.

So table SUPPLY is not in 5NF.

5NF relations: R1, R2 and R3

• Closure of Attribute Sets
• Given a set α of attributes of R and a set of functional
dependencies F, we need a way to find all of the attributes
of R that are functionally determined by α. This set of
attributes is called the closure of α under F and is denoted
α+. Finding α+ is useful because:
• if α+ = R, then α is a superkey for R
• if we find α+ for all α ⊆ R, we've computed F+ (except that
we'd need to use decomposition to get all of it).
• An algorithm for computing α+:
result := α
repeat
    temp := result
    for each functional dependency β → γ in F do
        if β ⊆ result then
            result := result ∪ γ
until temp = result
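The closure algorithm translates almost line for line into Python. This is a minimal sketch: attributes are single letters so that a string can stand for an attribute set, and the function name attribute_closure is an assumption. The sample dependencies are A → BC, CD → E, B → D and E → A.

```python
def attribute_closure(attrs, fds):
    """Closure of the attribute set attrs under the dependencies fds,
    where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    while True:
        before = set(result)
        for lhs, rhs in fds:
            if set(lhs) <= result:       # the left side is already determined
                result |= set(rhs)       # so the right side is determined too
        if result == before:             # nothing changed in a full pass
            return result


fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]

print(sorted(attribute_closure("A", fds)))  # ['A', 'B', 'C', 'D', 'E']
print(sorted(attribute_closure("B", fds)))  # ['B', 'D']
```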
• Problem:

• Compute the attribute closures for the relational schema
R = {A, B, C, D, E} with the functional dependencies
A → BC
CD → E
B → D
E → A
and list the candidate keys of R.
• Solution:
• R = {A, B, C, D, E}
• F, the set of functional dependencies: A → BC, CD → E, B → D, E → A
• Compute the closure of each left-hand side α of a dependency α → β in F, and of BC
• Closure of A
A+ = ABCDE, hence A is a super key
• Closure of CD
CD+ = ABCDE, hence CD is a super key
• Closure of B
B+ = BD, hence B is NOT a super key
• Closure of BC
BC+ = ABCDE, hence BC is a super key
• Closure of E
E+ = ABCDE, hence E is a super key
• A and E are minimal super keys, since each consists of a single attribute.
• To see whether CD is a minimal super key, check whether
its subsets are super keys.
• C+ = C
• D+ = D
• Since C and D are not super keys, CD is a minimal super key.

• To see whether BC is a minimal super key, check whether
its subsets are super keys.
• B+ = BD
• C+ = C
• Since B and C are not super keys, BC is a minimal super key.

• Since A, BC, CD and E are minimal super keys, they are the
candidate keys.
• Candidate keys: A, BC, CD, E
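The search for minimal super keys can be automated by trying attribute subsets in order of increasing size and skipping any subset that already contains a key. This brute-force sketch is exponential in the number of attributes, so it is only meant for small textbook schemas; the function names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal super keys, found by checking subsets from smallest to largest."""
    attributes = sorted(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue                  # contains a smaller key, so not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = ["".join(sorted(key)) for key in candidate_keys("ABCDE", fds)]
print(keys)  # ['A', 'E', 'BC', 'CD']
```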
• Let R = (A, B, C, D, E, F) be a relation
scheme with the following dependencies:
• C → F
• E → A
• EC → D
• A → B
• Which of the following is a key for R?
1. CD
2. EC
3. AE
4. AC
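One way to work through the options is to compute the closure of each one and see which covers all six attributes. The sketch below does this; the helper name closure is an assumption, and attributes are written as single letters.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


attributes = set("ABCDEF")
fds = [("C", "F"), ("E", "A"), ("EC", "D"), ("A", "B")]
options = ["CD", "EC", "AE", "AC"]

is_super_key = {option: closure(option, fds) == attributes for option in options}
print(is_super_key)  # only 'EC' maps to True

# EC is also minimal: neither E nor C alone determines every attribute.
print(sorted(closure("E", fds)), sorted(closure("C", fds)))  # ['A', 'B', 'E'] ['C', 'F']
```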
• Consider the relation scheme R(A, B, C, D, E, H) and the set of functional
dependencies:
• A → B
• BC → D
• E → C
• D → A
• What are the candidate keys of R?
1. AE, BE
2. AE, BE, DE
3. AEH, BEH, BCH
4. AEH, BEH, DEH
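A useful shortcut for this kind of question is that any attribute that never appears on the right side of a dependency must belong to every candidate key. The sketch below applies that observation and then tries adding one more attribute; a single extra attribute happens to be enough for this schema, which is not true in general. The helper name closure is an assumption.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


attributes = set("ABCDEH")
fds = [("A", "B"), ("BC", "D"), ("E", "C"), ("D", "A")]

# Attributes that no dependency can derive must be part of every key.
on_some_rhs = set().union(*(set(rhs) for _, rhs in fds))
required = attributes - on_some_rhs

keys = []
if closure(required, fds) != attributes:
    # Try extending the required attributes by one attribute not yet determined.
    for extra in sorted(attributes - closure(required, fds)):
        if closure(required | {extra}, fds) == attributes:
            keys.append("".join(sorted(required | {extra})))

print(sorted(required), keys)  # ['E', 'H'] ['AEH', 'BEH', 'DEH']
```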
