Unit 2 notes DBMS FINAL
The main objective of database design is to produce the logical and physical design models of
the proposed database system.
The logical model concentrates on the data requirements and the data to be stored independent
of physical considerations. It does not concern itself with how the data will be stored or where it
will be stored physically.
The physical data design model involves translating the logical design of the database onto
physical media using hardware resources and software systems such as database management
systems (DBMS).
Why is Database Design Important?
Database design is crucial to a high-performance database system.
Apart from improving performance, a properly designed database is easy to maintain,
improves data consistency, and is cost-effective in terms of disk storage space.
Note: the genius of a database is in its design. Data operations using SQL are relatively simple.
What are ER Diagrams?
An entity relationship (ER) diagram displays the relationships among the entity sets stored in a
database. In other words, ER diagrams help you explain the logical structure of databases. At
first look, an ER diagram looks very similar to a flowchart. However, an ER diagram includes
many specialized symbols, and their meanings make this model unique.
Entity
An entity is a real-world object that is represented in the database. It can be any object, place,
person, event, or concept. Data are stored about such entities.
Examples of entities:
Person: Employee, Student, Patient
Place: Store, Building
Object: Machine, Product, Car
Event: Sale, Registration, Renewal
Concept: Account, Course
Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.
Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).
If an attribute is composite, it is further divided in a tree-like structure. Each component
attribute is drawn as its own ellipse and connected to the ellipse of the composite attribute it
belongs to. That is, composite attributes are represented by ellipses that are connected to another
ellipse.
Relationship
A relationship is an association among two or more entities. E.g., Tom works in the
Chemistry department.
Example-
'Enrolled in' is a relationship that exists between the entities Student and Course.
Entities take part in relationships. We can often identify relationships with verbs or verb phrases.
Relationships are represented by a diamond-shaped box. The name of the relationship is written
inside the diamond. All the entities (rectangles) participating in a relationship are connected to it
by a line.
Relationship Set-
Example-
The number of entity sets that participate in a relationship set is termed the degree of that
relationship set. On the basis of its degree, a relationship set can be classified into the following
types-
1. Unary relationship set
2. Binary relationship set
3. Ternary relationship set
4. N-ary relationship set
A unary relationship set is a relationship set in which only one entity set participates.
Example-
One person is married to only one other person (both entities belong to the same entity set).
A binary relationship set is a relationship set in which two entity sets participate.
Example-
Student is enrolled in a Course
A ternary relationship set is a relationship set in which three entity sets participate.
Example-
4. N-ary Relationship Set-
N-ary relationship set is a relationship set where ‘n’ entity sets participate in a relationship set.
Many-to-one − When more than one instance of the entity on the left and only one instance of
the entity on the right can be associated with the relationship, it is marked as 'N:1'. The
following image depicts a many-to-one relationship.
Many-to-many − The following image reflects that more than one instance of the entity
on the left and more than one instance of the entity on the right can be associated with the
relationship. It depicts a many-to-many relationship.
Participation Constraints
Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.
Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.
The main difference between stored and derived attributes in a DBMS is that the value of a
stored attribute cannot be found from other attributes, while the value of a derived attribute
can be found from other attributes.
A Database Management System (DBMS) is software that allows storing and managing data
efficiently. A relational DBMS stores data in tables, and each table typically represents an entity
set. Each table has attributes. The attributes define the characteristics or properties of an entity.
For example, a student table can have attributes such as id, name, age, location, etc. There are
various types of attributes. Two of them are stored and derived attributes.
Stored and derived attributes
Stored attributes:
Stored attributes are attributes whose values are stored directly in the database and from which
the values of other attributes can be derived. For example, the age of a person can be calculated
from the person's date of birth and the present date. The difference between these two dates gives
the value of age. In this case, date of birth is a stored attribute and the age of the person is the
derived attribute.
Derived attributes:
Derived attributes are attributes whose values are derived or calculated from stored
attributes. For example, the date of birth of an employee is a stored attribute but the age is a
derived attribute. Derived attributes are usually created by a formula or by a summary operation
on other attributes. As another example, to calculate the interest on some principal amount for a
given time and a particular rate of interest, we can simply use the simple interest formula
Interest = (N*P*R)/100
In this case, interest is the derived attribute whereas principal amount (P), time (N) and rate of
interest (R) are all stored attributes.
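The two examples above can be sketched in Python. This is only an illustration: the `Person` class, the sample name, and the sample values are assumptions, not part of these notes. Date of birth, principal, time, and rate are held as stored values, while age and interest are computed on demand.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    date_of_birth: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth and the present date."""
        years = today.year - self.date_of_birth.year
        # Subtract one year if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years


def simple_interest(principal: float, time: float, rate: float) -> float:
    """Derived attribute: Interest = (N * P * R) / 100."""
    return (time * principal * rate) / 100


# Hypothetical sample data.
person = Person("Asha", date(2000, 6, 15))
age_before_birthday = person.age(date(2024, 6, 14))
age_on_birthday = person.age(date(2024, 6, 15))
interest = simple_interest(principal=1000, time=2, rate=5)
```

Because age changes with the present date, storing it would make the data stale, which is why it is usually derived instead.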
Continuing our previous example, Professor is a strong entity here, and the primary key is
Professor_ID.
Weak Entity
A weak entity in a DBMS does not have a primary key of its own and depends on a parent entity
for its identification.
A weak entity is represented by a double rectangle:
Continuing our previous example, Professor is a strong entity, and the primary key is
Professor_ID. However, another entity is Professor_Dependents, which is our Weak Entity.
<Professor_Dependents>
Name DOB Relation
This is a weak entity since its existence is dependent on another entity Professor, which we saw
above. A Professor has Dependents.
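One common way to map such a weak entity to tables is sketched below with Python's built-in sqlite3 module. The sketch rests on assumptions: Name is treated as the partial key of Professor_Dependents, the Professor_Name column and all sample rows are hypothetical, and ON DELETE CASCADE is one possible design choice for expressing existence dependence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Professor (
    Professor_ID   INTEGER PRIMARY KEY,
    Professor_Name TEXT                 -- hypothetical column
);
CREATE TABLE Professor_Dependents (
    Professor_ID INTEGER NOT NULL
        REFERENCES Professor(Professor_ID) ON DELETE CASCADE,
    Name         TEXT NOT NULL,         -- assumed partial key
    DOB          TEXT,
    Relation     TEXT,
    PRIMARY KEY (Professor_ID, Name)    -- owner key + partial key
);
""")

conn.execute("INSERT INTO Professor VALUES (1, 'Rao')")
conn.execute("INSERT INTO Professor_Dependents VALUES (1, 'Meera', '2010-04-02', 'Daughter')")

# A dependent cannot exist without its owning professor.
try:
    conn.execute("INSERT INTO Professor_Dependents VALUES (99, 'Ghost', NULL, 'Son')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the professor removes the dependents as well.
conn.execute("DELETE FROM Professor WHERE Professor_ID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Professor_Dependents").fetchone()[0]
```

The composite primary key shows why the weak entity has no key of its own: a dependent's name identifies a row only together with the owning Professor_ID.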
Example of Strong and Weak Entity
The example of strong and weak entities can be understood from the figure below.
A member of a strong entity set is called a dominant entity, and a member of a weak entity set
is called a subordinate entity.
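Patient
Attribute            Description
----------------------------------------------------------------------------------
Pat-id               Primary key
PName
PAddress
PDiagnosis
Record-id            Foreign key, references Record-id of the Medical Record table
Hosp-id              Foreign key, references Hosp-id of the Hospital table
Medical Record
Attribute            Description
----------------------------------------------------------------------------------
Record-id            Primary key
Problem
Date_of_examination
Pat-id               Foreign key, references Pat-id of the Patient table
Doctor
Attribute            Description
----------------------------------------------------------------------------------
Doc-id               Primary key
DName
Qualification
Salary
Hosp-id              Foreign key, references Hosp-id of the Hospital table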
Composite Attributes
Composite attributes are attributes that can be divided into subparts.
Example: Patient Name, Doctor Name
Hosp-id Patient table makes foreign key references to Hosp-id of Hospital table
Hosp_Doctor
Hosp-id Doctor table makes foreign key references to Hosp-id of Hospital table
Doc-id Hospital table makes foreign key references to Doc-id of Doctor table
Patient_MedicalRecord
Pat-id Medical Record table makes foreign key references to Pat-id of Patient table
Record-id Patient table makes foreign key references to Record-id of Medical Record table
Step 5: Identifying the relationships
a. Hospital has a set of patients.
Therefore the relationship is 1:N.
b. Hospital has a set of doctors.
Therefore the relationship is 1:N.
c. Doctors are associated with each patient.
Therefore the relationship is N:1.
d. Each patient has a record of the various tests and examinations conducted.
Therefore the relationship is 1:N.
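A 1:N relationship is usually realised by placing the key of the "one" side as a foreign key on the "many" side. The sketch below, using Python's sqlite3 module, shows this for relationships a and b only. It makes several assumptions: hyphens in attribute names are written as underscores so that they are valid SQL identifiers, the HName column and all sample rows are hypothetical, and the Record-id column and the Medical Record table are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Hospital (
    Hosp_id INTEGER PRIMARY KEY,
    HName   TEXT                        -- hypothetical column
);
CREATE TABLE Doctor (
    Doc_id        INTEGER PRIMARY KEY,
    DName         TEXT,
    Qualification TEXT,
    Salary        REAL,
    Hosp_id       INTEGER REFERENCES Hospital(Hosp_id)   -- "many" side of 1:N
);
CREATE TABLE Patient (
    Pat_id     INTEGER PRIMARY KEY,
    PName      TEXT,
    PAddress   TEXT,
    PDiagnosis TEXT,
    Hosp_id    INTEGER REFERENCES Hospital(Hosp_id)      -- "many" side of 1:N
);
""")

# Hypothetical sample rows: one hospital, two doctors, three patients.
conn.execute("INSERT INTO Hospital VALUES (1, 'City Hospital')")
conn.executemany(
    "INSERT INTO Doctor VALUES (?, ?, ?, ?, ?)",
    [(10, "Dr. A", "MD", 9000.0, 1), (11, "Dr. B", "MS", 8500.0, 1)],
)
conn.executemany(
    "INSERT INTO Patient VALUES (?, ?, ?, ?, ?)",
    [(100, "P1", "Addr 1", "Flu", 1), (101, "P2", "Addr 2", "Cold", 1), (102, "P3", "Addr 3", "Fever", 1)],
)

doctors_per_hospital = conn.execute(
    "SELECT Hosp_id, COUNT(*) FROM Doctor GROUP BY Hosp_id"
).fetchall()
patients_per_hospital = conn.execute(
    "SELECT Hosp_id, COUNT(*) FROM Patient GROUP BY Hosp_id"
).fetchall()
```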
Relational Data Model
The relational model is the most popular and most extensively used data model. The
relational model represents the database as a collection of relations (tables). Every row in a
table represents a collection of related data values. These rows denote a real-world
entity or relationship.
The table name and column names are helpful to interpret the meaning of values in each row.
The data are represented as a set of relations. In the relational model, data are stored as tables.
However, the physical storage of the data is independent of the way the data are logically
organized.
INTRODUCTION
Database integrity refers to the validity and consistency of stored data. Integrity is usually
expressed in terms of constraints, which are consistency rules that the database is not permitted
to violate.
Constraints may apply to each attribute or they may apply to relationships between tables.
Integrity constraints ensure that changes (update, deletion, insertion) made to the database by
authorized users do not result in a loss of data consistency. Thus, integrity constraints guard
against accidental damage to the database.
EXAMPLE- A blood group must be 'A' or 'B' or 'AB' or 'O' only (it cannot take any other
value).
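The blood group rule can be expressed as a CHECK constraint. The sketch below uses Python's sqlite3 module; the Donor table and its columns are hypothetical names chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Donor (                     -- hypothetical table
    Donor_id    INTEGER PRIMARY KEY,
    Blood_group TEXT NOT NULL
        CHECK (Blood_group IN ('A', 'B', 'AB', 'O'))
)
""")

conn.execute("INSERT INTO Donor VALUES (1, 'AB')")  # allowed value, accepted

try:
    conn.execute("INSERT INTO Donor VALUES (2, 'X')")  # violates the CHECK constraint
    invalid_rejected = False
except sqlite3.IntegrityError:
    invalid_rejected = True

stored_groups = [row[0] for row in conn.execute("SELECT Blood_group FROM Donor")]
```

The database itself refuses the invalid row, so the rule holds no matter which application performs the insert.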
3. Referential Integrity Constraint- It states that if a foreign key exists in a relation, then either
the foreign key value must match a primary key value of some tuple in its home relation or the
foreign key value must be null.
1. You can't delete a record from a primary table if matching records exist in a related table.
2. You can't change a primary key value in the primary table if that record has related records.
3. You can't enter a value in the foreign key field of the related table that doesn't exist in the
primary key of the primary table.
4. However, you can enter a Null value in the foreign key, specifying that the records are
unrelated.
Rule 1. You can't delete any of the visible rows in the "stu" relation, since all the "Stu_id"
values are in use in the "stu_1" relation.
Rule 2. You can't change any of the "Stu_id" values in the "stu" relation, since all of them are in
use in the "stu_1" relation.
Rule 3. The values that you can enter in the "Stu_id" field of the "stu_1" relation must exist in
the "Stu_id" field of the "stu" relation.
Rule 4. You can enter a null value in the "Stu_id" field of the "stu_1" relation if the records are
unrelated.
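The four rules can be tried out with Python's sqlite3 module. Only the relation names "stu" and "stu_1" and the "Stu_id" field come from these notes; the other columns and the sample rows are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
CREATE TABLE stu (
    Stu_id INTEGER PRIMARY KEY,
    Name   TEXT                              -- hypothetical column
);
CREATE TABLE stu_1 (
    Course TEXT,                             -- hypothetical column
    Stu_id INTEGER REFERENCES stu(Stu_id)    -- nullable foreign key
);
INSERT INTO stu VALUES (1, 'Ravi'), (2, 'Sita');
INSERT INTO stu_1 VALUES ('DBMS', 1), ('DS', 2);
""")


def violates(sql: str) -> bool:
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


rule1 = violates("DELETE FROM stu WHERE Stu_id = 1")            # referenced row
rule2 = violates("UPDATE stu SET Stu_id = 5 WHERE Stu_id = 1")  # referenced key
rule3 = violates("INSERT INTO stu_1 VALUES ('OS', 99)")         # no such Stu_id
rule4 = violates("INSERT INTO stu_1 VALUES ('OS', NULL)")       # null is allowed
```

The first three statements are rejected and the fourth is accepted, which matches Rules 1 to 4.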
4. Key Constraints- A key constraint is a statement that a certain minimal subset of the fields of
a relation is a unique identifier for a tuple.
There are 4 types of key constraints-
1. Candidate key
2. Super key
3. Primary key
4. Foreign key
Candidate Key: A minimal set of attributes that can uniquely identify a tuple is known as a
candidate key. For Example, STUD_NO in STUDENT relation.
The value of a candidate key is unique and non-null for every tuple.
There can be more than one candidate key in a relation. For Example, STUD_NO as well
as STUD_PHONE are both candidate keys for relation STUDENT.
The candidate key can be simple (having only one attribute) or composite as well. For
Example, {STUD_NO, COURSE_NO} is a composite candidate key for relation
STUDENT_COURSE.
Note: In SQL Server, a unique constraint on a nullable column allows the value NULL in
that column only once. That is why the STUD_PHONE attribute is treated as a candidate key
here, whereas a primary key attribute cannot contain NULL values.
Super Key: A set of attributes that can uniquely identify a tuple is known as a super key. For
Example, STUD_NO, (STUD_NO, STUD_NAME), etc.
Adding zero or more attributes to a candidate key generates a super key.
Every candidate key is a super key, but not every super key is a candidate key.
Primary Key: There can be more than one candidate key in a relation out of which one can be
chosen as primary key. For Example, STUD_NO as well as STUD_PHONE both are candidate
keys for relation STUDENT but STUD_NO can be chosen as primary key (only one out of many
candidate keys).
Alternate Key: A candidate key other than the primary key is called an alternate key. For
Example, STUD_NO as well as STUD_PHONE are both candidate keys for relation STUDENT,
but STUD_PHONE will be the alternate key once STUD_NO is chosen as the primary key.
Foreign Key: If an attribute can only take values that are present as values of some other
attribute, it is a foreign key to the attribute to which it refers. The relation being referenced is
called the referenced relation and the corresponding attribute is called the referenced attribute.
The relation that refers to the referenced relation is called the referencing relation and the
corresponding attribute is called the referencing attribute. The referenced attribute of the
referenced relation should be its primary key. For Example, STUD_NO in STUDENT_COURSE
is a foreign key to STUD_NO in STUDENT relation.
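The difference between a super key and a candidate key can be checked mechanically on sample data, as in the Python sketch below. The tuples are hypothetical, and only the attribute names come from these notes. A check on sample rows shows only that a key holds for that data, whereas a real key is a property of the schema that must hold for every valid instance.

```python
from itertools import combinations

# Hypothetical STUDENT tuples; note that two students share the name "Ram".
STUDENT = [
    {"STUD_NO": 1, "STUD_NAME": "Ram", "STUD_PHONE": "9000000001"},
    {"STUD_NO": 2, "STUD_NAME": "Sona", "STUD_PHONE": "9000000002"},
    {"STUD_NO": 3, "STUD_NAME": "Ram", "STUD_PHONE": "9000000003"},
]


def is_super_key(rows, attrs):
    """A set of attributes is a super key if no two rows agree on all of them."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs):
    """A candidate key is a super key with no proper subset that is a super key."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )
```

For example, `is_super_key(STUDENT, ("STUD_NO", "STUD_NAME"))` holds, but `is_candidate_key` rejects the same set because STUD_NO alone already identifies each row.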
Relational Algebra
A query language is a language in which a user requests information from the
database. Query languages are considered higher-level languages than programming
languages. Query languages are of two types,
Procedural Language
Non-Procedural Language
1. In a procedural language, the user has to describe the specific procedure to retrieve the
information from the database.
Example:
The Relational Algebra is a procedural language.
2. In a non-procedural language, the user retrieves the information from the database without
describing the specific procedure to retrieve it.
Example:
The Tuple Relational Calculus and the Domain Relational Calculus are non-procedural
languages.
Relational Algebra
The relational algebra is a procedural query language. It consists of a set of operations that take
one or two relations (tables) as input and produce a new relation, on the request of the user to
retrieve the specific information, as the output.
Union (∪)
It performs binary union between two given relations and is defined as −
r ∪ s = { t | t ∈ r or t ∈ s}
Notation − r ∪ s
Where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold −
r and s must have the same number of attributes.
Attribute domains must be compatible.
∏ author (Books) ∪ ∏ author (Articles)
Output − Projects the names of the authors who have either written a book or an article or both.
Set Difference (−)
The result of set difference operation is tuples, which are present in one relation but are not in the
second relation.
Notation − r − s
Finds all the tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output − Provides the name of authors who have written books but not articles.
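Union and set difference over the Books and Articles relations map directly onto Python sets. In the sketch below the contents of both relations and the `project` helper are hypothetical.

```python
# Hypothetical relations; each tuple is (author, title).
Books = {("Asha", "B1"), ("Bala", "B2"), ("Chen", "B3"), ("Asha", "B4")}
Articles = {("Chen", "A1"), ("Dev", "A2")}


def project(relation, index):
    """Projection on one column; a set drops duplicate values automatically."""
    return {row[index] for row in relation}


book_authors = project(Books, 0)
article_authors = project(Articles, 0)

either = book_authors | article_authors       # union: books, articles, or both
books_only = book_authors - article_authors   # set difference: books but not articles
```

Both operands are sets of single author values, so they have the same number of attributes and compatible domains, as the union conditions require.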
Cartesian Product (×)
Combines information of two different relations into one.
Notation − r × s
Where r and s are relations and their output will be defined as −
r × s = { q t | q ∈ r and t ∈ s}
Selection (σ)
Selection is used to select the required tuples of a relation.
For the above relation R,
σ (c>3) (R)
will select the tuples in which c is greater than 3.
Note: the selection operator only selects the required tuples but does not display them.
For displaying the data, the projection operator is used.
To display the tuples selected above, we need to use projection as well.
π (σ (c>3) (R)) will show the following tuples.
A B C
-------
1 2 4
4 3 4
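Selection and projection can be sketched as two small Python functions. The contents of R below are an assumption, chosen so that the selection c > 3 yields the two tuples shown above.

```python
# Hypothetical relation R(A, B, C).
R = [
    {"A": 1, "B": 2, "C": 4},
    {"A": 2, "B": 2, "C": 3},
    {"A": 3, "B": 2, "C": 3},
    {"A": 4, "B": 3, "C": 4},
]


def select(relation, predicate):
    """Selection: keep the tuples for which the predicate holds."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attrs)
        if projected not in result:
            result.append(projected)
    return result


selected = select(R, lambda row: row["C"] > 3)
shown = project(selected, ["A", "B", "C"])
only_b = project(R, ["B"])  # duplicates of B collapse to one tuple each
```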
Cross Product (X)
The cross product of two relations, say A and B, written A X B, results in all the attributes of A
followed by all the attributes of B. Each record of A is paired with every record of B.
Below is an example.
A B
(Name Age Sex ) (Id Course)
------------------ -------------
Ram 14 M 1 DS
Sona 15 F 2 DBMS
Kim 20 M
A X B
Name Age Sex Id Course
---------------------------------
Ram 14 M 1 DS
Ram 14 M 2 DBMS
Sona 15 F 1 DS
Sona 15 F 2 DBMS
Kim 20 M 1 DS
Kim 20 M 2 DBMS
Note: if A has ‘n’ tuples and B has ‘m’ tuples then A X B will have ‘n*m’ tuples.
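The same cross product can be reproduced with `itertools.product` from the Python standard library, using the tuples of A and B from the example above.

```python
from itertools import product

A = [("Ram", 14, "M"), ("Sona", 15, "F"), ("Kim", 20, "M")]  # (Name, Age, Sex)
B = [(1, "DS"), (2, "DBMS")]                                  # (Id, Course)

# Pair every tuple of A with every tuple of B and concatenate the attributes.
A_x_B = [a + b for a, b in product(A, B)]
```

Here A has 3 tuples and B has 2, so the result has 3*2 = 6 tuples, each with five attributes.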
Emp Dep
(Name Id Dept_name ) (Dept_name Manager)
------------------------ ---------------------
A 120 IT Sale Y
B 125 HR Prod Z
C 110 Sale IT A
D 111 IT
Emp ⋈ Dep
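The result of Emp ⋈ Dep is not shown here. Reading it as a natural join on the shared attribute Dept_name (an assumption), it can be computed with the Python sketch below. The `natural_join` helper is hypothetical and represents each tuple as a dictionary.

```python
Emp = [
    {"Name": "A", "Id": 120, "Dept_name": "IT"},
    {"Name": "B", "Id": 125, "Dept_name": "HR"},
    {"Name": "C", "Id": 110, "Dept_name": "Sale"},
    {"Name": "D", "Id": 111, "Dept_name": "IT"},
]
Dep = [
    {"Dept_name": "Sale", "Manager": "Y"},
    {"Dept_name": "Prod", "Manager": "Z"},
    {"Dept_name": "IT", "Manager": "A"},
]


def natural_join(r, s):
    """Join tuples that agree on every attribute name the two relations share."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [
        {**x, **y}  # the shared attribute appears only once in the result
        for x in r
        for y in s
        if all(x[a] == y[a] for a in common)
    ]


joined = natural_join(Emp, Dep)
summary = [(t["Name"], t["Id"], t["Dept_name"], t["Manager"]) for t in joined]
```

Under this reading, employee B (HR) and department Prod drop out of the result because neither has a matching Dept_name on the other side.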
Conditional Join
Conditional join works similarly to natural join. In a natural join, the default condition is
equality between the common attributes, while in a conditional join we can specify any
condition, such as greater than, less than, or not equal.
Let us see the example below.
R S
(ID Sex Marks) (ID Sex Marks)
------------------ --------------------
1 F 45 10 M 20
2 F 55 11 M 22
3 F 60 12 M 59
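A conditional join over R and S can be sketched as a filtered cross product. The join condition is not given here, so the sketch assumes R.Marks >= S.Marks purely for illustration. The `theta_join` helper is hypothetical.

```python
R = [(1, "F", 45), (2, "F", 55), (3, "F", 60)]     # (ID, Sex, Marks)
S = [(10, "M", 20), (11, "M", 22), (12, "M", 59)]  # (ID, Sex, Marks)


def theta_join(r, s, condition):
    """Keep each pair (x, y) from the cross product that satisfies the condition."""
    return [x + y for x in r for y in s if condition(x, y)]


# Assumed condition for illustration: R.Marks >= S.Marks
result = theta_join(R, S, lambda x, y: x[2] >= y[2])
```

Since both relations use the same attribute names, the result keeps both copies side by side; a real system would qualify them as R.ID, S.ID and so on.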
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language, that is,
it states what to retrieve but does not explain how to retrieve it.
Commercial query languages like SQL and QBE (Query By Example) are influenced by
Relational Calculus.
Relational calculus exists in two forms −
Tuple Relational Calculus (TRC)
The calculus is dependent on the use of tuple variables. A tuple variable is a variable whose only
permitted values are tuples of the relation.
Notation − {T | p(T)}
Where T is a tuple variable and p(T) is a formula (condition) that describes T.
Returns all tuples T that satisfy the condition.
To express the query 'Find the set of all tuples S such that F(S) is true,' we can write:
{S | F(S)}
Here, F is called a formula (well-formed formula, or wff in mathematical logic). For example, to
express the query 'Find the staffNo, fName, lName, position, sex, DOB, salary, and branchNo of
all staff earning more than 10,000', we can write:
{S | Staff(S) AND S.salary > 10000}
Example:
{T | TEACHER(T) AND T.salary > 20000}
- This selects the tuples from TEACHER such that the resulting teacher tuples have a salary
greater than 20000. This is an example of selecting a range of values.
For example −
{ T.name | Author(T) AND T.article = 'database' }
Output − Returns the 'name' values of the tuples in Author whose article is on 'database'.
TRC can be quantified. We can use Existential (∃) and Universal Quantifiers (∀).
Output − The above query will yield the same result as the previous one.
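A TRC expression reads much like a Python set comprehension. The sketch below evaluates { T.name | Author(T) AND T.article = 'database' } and then a version that uses an existential quantifier through `any()`. The contents of the Author relation are hypothetical.

```python
from collections import namedtuple

Author = namedtuple("Author", ["name", "article"])

# Hypothetical contents of the Author relation.
authors = {
    Author("Asha", "database"),
    Author("Bala", "networks"),
    Author("Chen", "database"),
}

# { T.name | Author(T) AND T.article = 'database' }
names = {T.name for T in authors if T.article == "database"}

# Existential form: a name qualifies if there exists a tuple T in Author
# with that name whose article is 'database'.
all_names = {T.name for T in authors}
names_exists = {
    n for n in all_names
    if any(T.name == n and T.article == "database" for T in authors)
}
```

Both forms state only the condition that the result must satisfy, which is the non-procedural character of the calculus.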
Domain Relational Calculus (DRC)
In DRC, the filtering variable uses the domain of attributes instead of entire tuple values (as done
in TRC, mentioned above).
Notation −
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where a1, a2, ..., an are attributes and P stands for a formula built over those attributes.
Output − Yields Article, Page, and Subject from the relation Books, where subject is database.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also
involves relational operators.
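In DRC the variables range over attribute values rather than whole tuples. The Python sketch below mirrors a query of the form { <article, page, subject> | <article, page, subject> ∈ Books AND subject = 'database' }. The exact expression and the contents of Books are assumptions.

```python
# Hypothetical Books relation with attributes (article, page, subject).
Books = {
    ("Indexing", 12, "database"),
    ("Routing", 40, "networks"),
    ("Normalization", 77, "database"),
}

# Each of article, page, and subject is a domain variable bound to one attribute value.
result = {
    (article, page, subject)
    for (article, page, subject) in Books
    if subject == "database"
}
```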
The expressive power of Tuple Relational Calculus and Domain Relational Calculus is equivalent
to that of Relational Algebra.