Database Design Schema Refinement
Schema Refinement
Topics
Minimizing Redundancy
Requirements Analysis
Conceptual Modeling (ER Model)
Logical Modeling (Relational Model)
Schema Refinement (Normalization)
Redundancy
Wastage of Space
Update Anomalies
Update Anomaly
Insert Anomaly
Delete Anomaly
Update Anomalies
Consider the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
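Update Anomaly: changing the name of a project means updating Pname in every tuple for that project
Insert Anomaly: a new project cannot be recorded until at least one employee is assigned to it, because Emp# is part of the key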
Delete Anomaly: deleting the last employee working on a project also deletes the name of that project
Solution
Decompose the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
Into the following smaller relations:
EMP (Emp#, Ename)
PROJ (Proj#, Pname)
EMP_PROJ ( Emp#, Proj#, No_hours)
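The decomposition above can be sketched on a tiny instance. This is only an illustration: the sample rows and the variable names (`emp_proj`, `emp`, `proj`, `hours`) are made up, and tuples stand in for relation rows.

```python
# Made-up sample rows of EMP_PROJ (Emp#, Proj#, Ename, Pname, No_hours).
emp_proj = [
    (1, 10, "Asha", "Payroll", 12),
    (1, 20, "Asha", "Billing", 8),
    (2, 10, "Ravi", "Payroll", 20),
]

# Project onto the three smaller relations; sets remove duplicate rows.
emp = {(e, ename) for e, p, ename, pname, h in emp_proj}    # EMP (Emp#, Ename)
proj = {(p, pname) for e, p, ename, pname, h in emp_proj}   # PROJ (Proj#, Pname)
hours = {(e, p, h) for e, p, ename, pname, h in emp_proj}   # EMP_PROJ (Emp#, Proj#, No_hours)

# Each name is now stored once per employee or project.
print(len(emp), len(proj), len(hours))

# Joining the three relations on Emp# and Proj# rebuilds the original rows.
emp_name = dict(emp)
proj_name = dict(proj)
rebuilt = {(e, p, emp_name[e], proj_name[p], h) for e, p, h in hours}
print(rebuilt == set(emp_proj))
```

Renaming a project now touches one row of `proj` instead of every row that mentions it.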
Redundancy
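Functional Dependencies
A functional dependency X → Y holds over relation R if, for every allowable instance r of R:
t1 ∈ r, t2 ∈ r, ΠX(t1) = ΠX(t2) implies ΠY(t1) = ΠY(t2)
i.e., given two tuples in r, if the X values agree, then the Y
values must also agree. (X and Y are sets of attributes.)
K is a candidate key for R means that K → R.
However, K → R does not require K to be minimal!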
Prof. Navneet Goyal, BITS, Pilani
Functional Dependencies
Consider r (A, B) with the following instance:
A  B
1  4
1  5
3  7
On this instance, A → B does NOT hold, but B → A does hold.
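Whether a dependency holds on one given instance can be checked mechanically. The sketch below is a minimal illustration, not part of the slides; the function name `fd_holds` and the list-of-dicts representation are assumptions. Note that passing this check on one instance does not prove the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds on this instance (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows with the same lhs values must have the same rhs values.
        if seen.setdefault(key, val) != val:
            return False
    return True

# The instance with A = 1, 1, 3 and B = 4, 5, 7.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]
print(fd_holds(r, ["A"], ["B"]))  # False: A = 1 appears with B = 4 and B = 5
print(fd_holds(r, ["B"], ["A"]))  # True: every B value appears with one A value
```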
Functional Dependencies
[Figure: A → B. If tuples t and u agree on the As, then they must agree on the Bs.]
Functional Dependencies
K is a candidate key for R if and only if
K → R, and
for no α ⊂ K, α → R
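Functional Dependencies
A functional dependency is trivial if it is satisfied by all instances of a relation.
Example:
customer_name, loan_number → customer_name
customer_name → customer_name
In general, α → β is trivial if β ⊆ α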
Functional Dependencies
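[Figure: relation PLOTS (Prop#, State, Plot#, Area, Price, Tax_rate) with its dependencies.
FD1: Prop# is the primary key (PK) and determines all other attributes.
FD2: {State, Plot#} is a candidate key (CK) and determines all other attributes.
FD3: State → Tax_rate
FD4: Area → Price]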
Functional Dependencies
[Figure: PLOTS split by dependency.
PLOTS (Prop#, State, Plot#, Area), with FD1 (PK) and FD2 (CK)
(State, Tax_rate), with FD3
(Area, Price), with FD4]
Normal Forms
2 NF
3 NF
Boyce-Codd Normal Form (BCNF)
4 NF (Multivalued Dependencies)
5 NF (Join Dependencies)
Deal with very rare practical situations
2 NF
Remove all Partial Dependencies
3 NF
[Figure: PLOTS (Prop#, State, Plot#, Area, Price, Tax_rate) with FD1 (PK), FD2 (CK), FD3 and FD4.]
Remove all Transitive Dependencies
BCNF
Problem 1
Problem 2
Example
R = (A, B, C, G, H, I)
F = { A → B,
      A → C,
      CG → H,
      CG → I,
      B → H }
some members of F+
A → H
  by transitivity from A → B and B → H
AG → I
  by augmenting A → C with G, to get AG → CG
  and then transitivity with CG → I
CG → HI
  by union rule
Algorithm to compute α+, the closure of α under F:
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
Try to find out why this algorithm works!
Complexity of this algorithm
Can you do any better?
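The loop above translates almost line for line into Python. This is a minimal sketch rather than an optimised version: the name `closure` and the encoding of a dependency as a pair of strings of single-letter attributes are assumptions made for the example.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names, e.g. ("CG", "H") for CG -> H.
    """
    result = set(attrs)
    changed = True
    while changed:                  # repeat until a full pass adds nothing
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print("".join(sorted(closure("AG", F))))  # ABCGHI
print("".join(sorted(closure("A", F))))   # ABCH
```

Each pass scans every dependency, and a pass is repeated only if it added an attribute, so the sketch does at most one pass per attribute plus one final pass. That is the quadratic behaviour the questions above invite you to improve on.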
R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
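Is AG a candidate key?
1. Is AG a super key?
   1. Does AG → R? == Is (AG)+ ⊇ R
2. Is any proper subset of AG a super key?
   1. Does A → R? == Is (A)+ ⊇ R
   2. Does G → R? == Is (G)+ ⊇ R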
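The two questions above reduce to closure computations. A minimal sketch follows, with hypothetical helper names (`closure`, `is_superkey`, `is_candidate_key`) and dependencies encoded as pairs of strings of single-letter attributes.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    """attrs is a super key if its closure contains every attribute of R."""
    return closure(attrs, fds) >= set(schema)

def is_candidate_key(attrs, schema, fds):
    """A candidate key is a super key with no proper subset that is one."""
    if not is_superkey(attrs, schema, fds):
        return False
    # Dropping one attribute at a time is enough: any smaller super key
    # is contained in one of these subsets, which would then be a super key too.
    return not any(is_superkey(set(attrs) - {a}, schema, fds) for a in attrs)

R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(is_candidate_key("AG", R, F))   # True
print(is_candidate_key("ABG", R, F))  # False: super key, but not minimal
```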
Computing closure of F
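Canonical Cover
Parts of a functional dependency may be redundant.
On the right side:
{A → B, B → C, A → CD} can be simplified to
{A → B, B → C, A → D}
On the left side:
{A → B, B → C, AC → D} can be simplified to
{A → B, B → C, A → D}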
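Extraneous Attributes
Consider a set F of functional dependencies and a dependency α → β in F.
Attribute A is extraneous in α if A ∈ α
and F logically implies (F − {α → β}) ∪ {(α − A) → β}.
Attribute A is extraneous in β if A ∈ β
and the set of functional dependencies
(F − {α → β}) ∪ {α → (β − A)} logically implies F.
Example: given F = {A → C, AB → C},
B is extraneous in AB → C because {A → C, AB → C}
logically implies A → C (i.e. the result of dropping B from
AB → C).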
Testing if an Attribute is Extraneous
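One common way to run this test uses attribute closures instead of reasoning about logical implication directly. The sketch below assumes that approach; the function names and the string encoding of dependencies are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_in_lhs(attr, lhs, rhs, fds):
    """attr is extraneous on the left if (lhs - attr)+ under F contains rhs."""
    return set(rhs) <= closure(set(lhs) - {attr}, fds)

def extraneous_in_rhs(attr, lhs, rhs, fds):
    """attr is extraneous on the right if lhs+ still contains attr once
    lhs -> rhs is replaced by lhs -> (rhs - attr)."""
    reduced_rhs = "".join(a for a in rhs if a != attr)
    others = [fd for fd in fds if fd != (lhs, rhs)]
    return attr in closure(lhs, others + [(lhs, reduced_rhs)])

F1 = [("A", "C"), ("AB", "C")]
print(extraneous_in_lhs("B", "AB", "C", F1))  # True
print(extraneous_in_lhs("A", "AB", "C", F1))  # False

F2 = [("A", "BC"), ("B", "C")]
print(extraneous_in_rhs("C", "A", "BC", F2))  # True: A -> B and B -> C give A -> C
```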
Canonical Cover
R = (A, B, C)
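F = {A → BC, B → C, A → B, AB → C}
Combine A → BC and A → B into A → BC
A is extraneous in AB → C, because B → C is already in F
C is extraneous in A → BC, because A → C follows from A → B and B → C
The canonical cover is: A → B, B → C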
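The steps of this example can be automated: merge dependencies with the same left side, drop one extraneous attribute, and repeat until nothing changes. The code below is a sketch under that reading; the helper names are hypothetical, and because a canonical cover is not unique in general, a different removal order may give a different but equivalent result on other inputs.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (frozenset, frozenset) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def drop_one_extraneous(cover):
    """Return a copy of cover with one extraneous attribute removed, or None."""
    for i, (lhs, rhs) in enumerate(cover):
        rest = cover[:i] + cover[i + 1:]
        for a in sorted(lhs):
            # a is extraneous on the left if (lhs - a)+ under F contains rhs.
            if rhs <= closure(lhs - {a}, cover):
                return rest + [(lhs - {a}, rhs)]
        for a in sorted(rhs):
            # a is extraneous on the right if lhs+ still reaches a without it.
            if a in closure(lhs, rest + [(lhs, rhs - {a})]):
                return rest + [(lhs, rhs - {a})]
    return None

def canonical_cover(fds):
    cover = [(frozenset(l), frozenset(r)) for l, r in fds]
    while True:
        # Union rule: merge dependencies that share a left-hand side.
        merged = {}
        for lhs, rhs in cover:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        cover = [(l, r) for l, r in merged.items() if r]
        reduced = drop_one_extraneous(cover)
        if reduced is None:
            return cover
        cover = reduced

def show(cover):
    return sorted("".join(sorted(l)) + " -> " + "".join(sorted(r)) for l, r in cover)

F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
print(show(canonical_cover(F)))  # ['A -> B', 'B -> C']
```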
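Problems with Decompositions
There are three potential problems to consider:
Some queries become more expensive
e.g., What is the price of prop# 1?
Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation
Checking some dependencies may require joining the instances of the decomposed relations
Tradeoff: these issues must be weighed against redundancy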
Lossy Decomposition
Original relation:
A  B  C
1  2  3
4  5  6
7  2  8

Decomposed into (A, B) and (B, C):
A  B        B  C
1  2        2  3
4  5        5  6
7  2        2  8

JOIN of the two projections:
A  B  C
1  2  3
4  5  6
7  2  8
1  2  8    (spurious tuple)
7  2  3    (spurious tuple)

Note that we can never get anything less than the original relation.
Since we don't know which tuples are spurious and which are genuine, we have indeed lost information.
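The spurious tuples can be reproduced in a few lines. This is a small sketch using Python sets of tuples as relations; the variable names are chosen for the example.

```python
# The instance over (A, B, C) with tuples (1,2,3), (4,5,6), (7,2,8).
r = {(1, 2, 3), (4, 5, 6), (7, 2, 8)}

# Project onto (A, B) and (B, C).
ab = {(a, b) for a, b, c in r}
bc = {(b, c) for a, b, c in r}

# Natural join of the two projections on B.
joined = {(a, b, c) for a, b in ab for b2, c in bc if b == b2}

print(r <= joined)          # True: the join never loses an original tuple
print(sorted(joined - r))   # [(1, 2, 8), (7, 2, 3)]: the spurious tuples
```

B = 2 appears with two different A values and two different C values, so the join pairs them in every combination.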
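Lossy Decomposition
S#   Status  City
S3   30      Paris
S5   30      Athens

(a) Decomposition into (S#, Status) and (S#, City):
S#   Status      S#   City
S3   30          S3   Paris
S5   30          S5   Athens

(b) Decomposition into (S#, Status) and (Status, City):
S#   Status      Status  City
S3   30          30      Paris
S5   30          30      Athens

Joining the relations in (a) on S# gives back the original relation.
Joining the relations in (b) on Status pairs each supplier with both cities, so (b) is lossy.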
Lossless Decomposition
A decomposition of R into R1, R2, …, Rn is lossless if, for every legal instance r of R:
r = ΠR1(r) ⋈ ΠR2(r) ⋈ … ⋈ ΠRn(r)
Lossless Decomposition
Theorem
A decomposition of R into R1 and R2 is
lossless join wrt FDs F, if and only if at
least one of the following dependencies is
in F+:
R1 ∩ R2 → R1
R1 ∩ R2 → R2
In other words, R1 ∩ R2 forms a superkey
of either R1 or R2
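The theorem gives a direct test for a two-way decomposition: compute the closure of the common attributes and see whether it covers R1 or R2. The sketch below assumes single-letter attribute names and uses hypothetical function names.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 is in F+."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

F = [("A", "B"), ("B", "C")]
print(is_lossless("AB", "BC", F))  # True: B -> BC, so B is a key of R2
print(is_lossless("AC", "BC", F))  # False: C determines nothing else
```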
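Dependency Preservation
To check whether a dependency α → β is preserved in a decomposition of R into R1, R2, …, Rn:
result = α
while (changes to result) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
If result contains all attributes in β, then α → β is preserved.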
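The test above can be sketched in Python as follows. The function names and the string encoding of attributes are assumptions for the example; the closure inside the loop is taken with respect to the full set F, as in the pseudocode.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_preserved(lhs, rhs, decomposition, fds):
    """Check lhs -> rhs using only what each Ri can enforce locally."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            t = closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result

F = [("A", "B"), ("B", "C")]
print(all(is_preserved(l, r, ["AB", "BC"], F) for l, r in F))  # True
print(is_preserved("B", "C", ["AB", "AC"], F))                 # False
```

With the decomposition (A, B), (A, C), the dependency B → C can only be checked by joining the two relations, which is exactly what dependency preservation is meant to avoid.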
Example
R = (A, B, C )
F = {A → B, B → C}
Key = {A}
R is not in BCNF
Decomposition R1 = (A, B), R2 = (B, C)
R1 and R2 in BCNF
Lossless-join decomposition
Dependency preserving
4 NF
Course   Teacher               Texts
DBS      N Goyal, J P Misra    Garcia, Raghu
ADBS     J P Misra             Connolly, Garcia
4 NF
1 NF Version
CTX
COURSE   TEACHER     TEXTS
DBS      N GOYAL     GARCIA
DBS      N GOYAL     RAGHU R
DBS      J P MISRA   GARCIA
DBS      J P MISRA   RAGHU R
ADBS     J P MISRA   GARCIA
ADBS     J P MISRA   CONNOLLY
4 NF
Is CTX in BCNF?
[Same CTX instance as in the 1 NF version.]
4 NF
Anomalies in CTX? MANY!!
[Same CTX instance as in the 1 NF version.]
4 NF
Anomalies
New Teacher for DBS
New Text for ADBS
Teacher teaching DBS leaves
4 NF
Points to note:
4 NF
Decompose CTX into CT & TX
CT
COURSE   TEACHER
DBS      N GOYAL
DBS      J P MISRA
ADBS     J P MISRA

TX
COURSE   TEXT
DBS      GARCIA
DBS      RAGHU R
ADBS     GARCIA
ADBS     CONNOLLY
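A quick check that this decomposition loses nothing: project the CTX instance onto CT and TX, join them back on COURSE, and compare. The sketch uses Python sets of tuples as relations; the variable names are chosen for the example.

```python
# CTX (COURSE, TEACHER, TEXTS), the 1 NF instance from the slides.
ctx = {
    ("DBS", "N GOYAL", "GARCIA"), ("DBS", "N GOYAL", "RAGHU R"),
    ("DBS", "J P MISRA", "GARCIA"), ("DBS", "J P MISRA", "RAGHU R"),
    ("ADBS", "J P MISRA", "GARCIA"), ("ADBS", "J P MISRA", "CONNOLLY"),
}

ct = {(c, t) for c, t, x in ctx}   # CT (COURSE, TEACHER)
tx = {(c, x) for c, t, x in ctx}   # TX (COURSE, TEXT)

# Natural join of CT and TX on COURSE.
rejoined = {(c, t, x) for c, t in ct for c2, x in tx if c == c2}

print(len(ctx), len(ct), len(tx))  # 6 3 4
print(rejoined == ctx)             # True: no spurious tuples
```

Adding a new teacher for DBS is now one new row in `ct`, instead of one row per DBS text in CTX.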
4 NF
course →→ teacher
course →→ text
4 NF
Interpretation of course →→ teacher
4 NF
Formal Definition
Let R be a relation and A,B,C be subsets of attributes
of R, then we say that
A →→ B
iff, in every possible legal value of R, the set of B
values matching a given (A,C) pair depends only on
the value of A and is independent of the C value.
It can be easily shown that for R(A,B,C), the MVD A →→ B
holds iff the MVD A →→ C also holds.
MVDs always go together in pairs and we write them
as
A →→ B | C
course →→ teacher | text
4 NF
Fagin Theorem
4 NF
An MVD A →→ B is trivial if
(a) B ⊆ A or
(b) A ∪ B = R
A relation that is in BCNF & contains no
non-trivial MVDs is said to be in 4NF
CTX is not in 4NF because course →→ teacher
is a non-trivial MVD
Multi-Valued Dependencies
The MVD
A1A2…An →→ B1B2…Bm
holds for a relation R if
for each pair of tuples t & u that
agree on As, we can find a tuple v
that agrees
1. With t & u on As
2. With t on Bs
3. With u on all attributes of R that are
not among As & Bs
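The definition above can be checked on a given instance by building the required tuple v for every pair (t, u) and looking it up. The sketch below is illustrative only: `mvd_holds` and its parameters are hypothetical names, and a relation is a set of tuples with a list of column names.

```python
def mvd_holds(rows, attrs, a_cols, b_cols):
    """Check the MVD a_cols ->> b_cols on one instance.

    rows: set of tuples; attrs: column names in tuple order.
    """
    idx = {name: i for i, name in enumerate(attrs)}
    for t in rows:
        for u in rows:
            if all(t[idx[a]] == u[idx[a]] for a in a_cols):
                # v takes the As and Bs from t and all other attributes from u.
                v = tuple(
                    t[idx[n]] if n in a_cols or n in b_cols else u[idx[n]]
                    for n in attrs
                )
                if v not in rows:
                    return False
    return True

attrs = ["COURSE", "TEACHER", "TEXTS"]
ctx = {
    ("DBS", "N GOYAL", "GARCIA"), ("DBS", "N GOYAL", "RAGHU R"),
    ("DBS", "J P MISRA", "GARCIA"), ("DBS", "J P MISRA", "RAGHU R"),
    ("ADBS", "J P MISRA", "GARCIA"), ("ADBS", "J P MISRA", "CONNOLLY"),
}
print(mvd_holds(ctx, attrs, ["COURSE"], ["TEACHER"]))  # True
print(mvd_holds(ctx, attrs, ["TEACHER"], ["COURSE"]))  # False
```

As with functional dependencies, a check on one instance can only refute an MVD for the schema; it cannot establish it.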
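MVD
[Figure: A →→ B. Tuples t, v and u over the columns As, Bs and Others; all three agree on the As, v agrees with t on the Bs and with u on the Others.]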
Problem Solving
Q&A
Thank You