DBMS PPT Unit I
03 RELATIONAL DATABASE DESIGN
04 TRANSACTIONS AND DATA STORAGE
05 NoSQL DATABASES
Course Outcomes
After completion of the course, the students will be able to
CO1 - Explain the concepts of Database Management System and develop Entity Relationship model
and Relational Models for a given application (K2)
CO2 - Manipulate and build database queries using Structured Query Language and relational algebra (K3)
CO3 - Use data normalization principles to develop a normalized database for a given application. (K3)
CO5 - Apply tools like NoSQL, MongoDB, Cassandra on real time applications (K3)
Syllabus
Unit I INTRODUCTION
Database Systems– Data Models – Database System Architecture - Entity-Relationship Model - ER
Diagram- Extended ER Model –ER into Relational Model - Relational Model: Structure of Relational
Databases, Database Schema, Keys, Tables.
Unit IV TRANSACTIONS
Transaction concepts and states– Concurrent Execution - Serializability - Query Processing -
Concurrency Control: Lock based Protocol - Timestamp based Protocol - Recovery System: – Log-
Based Recovery – Shadow Paging.
● Example
○ NAME
○ MOBILE NUMBER
○ OFFICE
○ EMAIL-ID
What is Data
• Data can be facts related to any object
• Raw facts that can be processed by a computing machine.
• Collection of facts from which conclusions may be drawn.
• Data can be represented in the form of numbers and words, which can be stored in the computer’s language.
What is Information
• Systematic and meaningful form of data.
• Knowledge acquired through study or experience.
• Information helps human beings in their decision making.
Database ?
A collection of inter-related data and a set of programs to access those data efficiently.
Bank Database
Student Database
DBMS
• Facebook uses
• Hive - Data warehouse for Hadoop; supports tables and a variant
of SQL called HiveQL
• Cassandra - Multi-dimensional, distributed key-value store for
Facebook's private messaging.
○ Data
○ data relationships
○ data semantics
○ consistency constraints.
○ Example:
● Components
● Boxes - which correspond to record types
● Lines - which correspond to links
● Each entity has only one parent but can have several
children.
● At the top of the hierarchy there is only one entity, which is called the
root.
Network Model
Database Applications
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
• Database Users
• Query Processor
• Storage Manager
• Disk Storage
Database-Users
Naïve User: Unsophisticated users who interact with the system by invoking one of the application programs
that have been written previously. Examples: bank tellers, reservation clerks.
Sophisticated User: Users who interact with the system without writing programs. Instead, they form their
requests either using a database query language or by using tools such as data analysis software. Analysts who
submit queries to explore data in the database fall in this category.
• Schema definition
• Storage structure and access-method definition
• Schema and physical-organization modification
• Granting of authorization for data access
• Routine maintenance. Examples of the database administrator’s routine
maintenance activities are:
o Periodically backing up the database, either onto tapes or onto remote servers,
to prevent loss of data in case of disasters such as flooding.
o Ensuring that enough free disk space is available for normal operations, and
upgrading disk space as required.
o Monitoring jobs running on the database and ensuring that performance is not
degraded by very expensive tasks submitted by some users.
Query Processor
• It Includes
• DDL interpreter - interprets DDL statements and records the
definitions in the data dictionary.
• DML compiler - translates DML statements in a query language
into an evaluation plan consisting of low-level instructions that the
query evaluation engine understands.
• File Manager, which manages the allocation of space on disk storage and
the data structures used to represent information stored on disk.
• Buffer manager, which is responsible for fetching data from disk storage
into main memory, and deciding what data to cache in main memory. It is
used to handle data sizes that are much larger than the size of main
memory.
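The buffer manager's job can be sketched in a few lines of Python. This is only an illustration: the class name BufferManager, the dictionary standing in for the disk, and the least-recently-used (LRU) eviction policy are all assumptions made for the example, since the slide does not say which caching policy is used.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool: keeps a few disk blocks in memory (LRU policy assumed)."""

    def __init__(self, disk, capacity):
        self.disk = disk            # stand-in for disk storage: block_id -> data
        self.capacity = capacity    # number of blocks that fit in main memory
        self.pool = OrderedDict()   # cached blocks, least recently used first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.pool:
            self.pool.move_to_end(block_id)   # hit: mark as most recently used
            return self.pool[block_id]
        self.disk_reads += 1                  # miss: fetch the block from disk
        data = self.disk[block_id]
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)     # evict the least recently used block
        self.pool[block_id] = data
        return data


disk = {i: f"block-{i}" for i in range(10)}   # 10 blocks on "disk"
bm = BufferManager(disk, capacity=2)          # only 2 blocks fit in memory
for block_id in [0, 1, 0, 2, 1]:
    bm.get_block(block_id)
print(bm.disk_reads, list(bm.pool))
```

The data set (10 blocks) is larger than memory (2 blocks), so some requests are served from the pool and others force a disk read and an eviction.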
Disk Storage
The storage manager implements several data structures as part of the physical
system implementation.
Indices, which provide fast access to data items that hold particular values.
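The idea of an index can be illustrated with a small sketch. The rows, the attribute names stud_no and dept, and the helper find_by_dept are hypothetical; real storage managers typically use structures such as B+-trees or hash indices rather than a Python dictionary.

```python
from collections import defaultdict

# A tiny "table" stored as a list of rows (hypothetical data).
rows = [
    {"stud_no": 1, "dept": "CSE"},
    {"stud_no": 2, "dept": "ECE"},
    {"stud_no": 3, "dept": "CSE"},
]

# Build an index on dept: attribute value -> positions of rows holding that value.
dept_index = defaultdict(list)
for pos, row in enumerate(rows):
    dept_index[row["dept"]].append(pos)


def find_by_dept(dept):
    """Use the index to jump straight to matching rows instead of scanning all rows."""
    return [rows[pos] for pos in dept_index.get(dept, [])]


cse_students = [r["stud_no"] for r in find_by_dept("CSE")]
print(cse_students)
```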
Entity Relationship (ER) Model - Basic Concepts
The ER model defines the conceptual view of a database. It works around real-world entities and the associations
among them. At view level, the ER model is considered a good option for designing databases.
Components of the ER Diagram
This model is based on three basic concepts:
○ Entities
○ Attributes
○ Relationships
Entity
• An entity is a real-world thing, living or non-living, that can be distinguished from other things. It is anything in
the enterprise that is to be represented in our database. It may be a physical thing or simply a fact about the
enterprise or an event that happens in the real world.
• An entity can be a place, person, object, event or concept about which data is stored in the database. Every entity
must have attributes that describe it and a unique key that identifies it.
Examples of entities:
• Person: Employee, Student, Patient
• Place: Store, Building
• Object: Machine, Product, Car
• Event: Sale, Registration, Renewal
• Concept: Account, Course
Entity set:
● Student
● An entity set is a group of entities of the same kind. The entities in a set share the same attributes, and different
entities may hold the same value for some attributes. Entities are represented by their properties, which are also called
attributes. Each attribute has its own value. For example, a student entity may have name, age, and class as attributes.
Attributes
• Entities are represented by means of their properties, called attributes. All attributes have values. For
example, a student entity may have name, class, and age as attributes.
• There exists a domain or range of values that can be assigned to attributes. For example, a student's
name cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.
Types of Attributes
• Simple attribute − Simple attributes are atomic values, which cannot be divided further. For example, a
student's phone number is an atomic value of 10 digits.
• Composite attribute − Composite attributes are made of more than one simple attribute. For example, a
student's complete name may have first_name and last_name.
• Derived attribute − Derived attributes do not exist in the physical database; their values are derived
from other attributes present in the database. For example, average_salary in a department should not be
saved directly in the database; instead, it can be derived. As another example, age can be derived from
date_of_birth.
• Single-value attribute − Single-value attributes contain a single value. For example −
Social_Security_Number.
• Multi-value attribute − Multi-value attributes may contain more than one value. For example, a
person can have more than one phone number, email_address, etc.
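The attribute types above can be pictured with a short Python sketch. The class names FullName and Student, the field names, and the sample values are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FullName:
    """Composite attribute: built from two simple attributes."""
    first_name: str
    last_name: str


@dataclass
class Student:
    social_security_number: str                        # single-value attribute
    name: FullName                                     # composite attribute
    date_of_birth: date                                # simple, stored attribute
    phone_numbers: list = field(default_factory=list)  # multi-value attribute

    def age(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


s = Student(
    "123-45-6789",
    FullName("Asha", "Rao"),
    date(2004, 6, 15),
    ["9876543210", "9123456780"],
)
print(s.age(date(2024, 6, 15)))
```

Storing date_of_birth and computing age on demand avoids keeping a value that would go stale every year.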
Relationship
• The association among entities is called a relationship. For example, an employee works at a department, a
student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set
• A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have
attributes. These attributes are called descriptive attributes.
Degree of Relationship
• The number of participating entities in a relationship defines the degree of the relationship.
• Binary = degree 2
• Ternary = degree 3
• n-ary = degree n
ER Diagrams Symbols & Notations
Following are the main components of ER diagrams and their symbols:
• Rectangles: represent entity types
• Ellipses: represent attributes
• Diamonds: represent relationship types
• Lines: link attributes to entity types and entity types to relationship types
• Primary key: attributes are underlined
• Double Ellipses: represent multi-valued attributes
ER Diagrams
● Entity
● Attributes
Mapping Constraints
• A mapping constraint is a data constraint that expresses the number of entities to which another entity can be related via a
relationship set.
• It is most useful in describing binary relationship sets, although it can also help describe relationship sets that involve
more than two entity sets.
• For a binary relationship set R between entity sets E1 and E2, there are four possible mapping cardinalities. These are as follows:
○ One to one (1:1)
○ One to many (1:M)
○ Many to one (M:1)
○ Many to many (M:M)
One-to-one
• In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2 is associated with at
most one entity in E1.
• Example: Each student is issued one ID card, and each ID card belongs to exactly one student.
One-to-many
• In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an entity in E2 is associated
with at most one entity in E1.
• For example, one class consists of multiple students.
Many-to-one
• In many-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2 is associated with
any number of entities in E1.
• For example, many students belong to the same class.
Many-to-many
• In many-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an entity in E2 is associated
with any number of entities in E1.
• For example, a student can be associated with multiple faculty members, and a faculty member can be associated
with multiple students.
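The four mapping cardinalities can be checked against sample data with a small sketch. The function mapping_cardinality and the sample pairs are hypothetical. Note that it only classifies one particular set of (E1, E2) pairs; the real constraint is a design decision that must hold for every possible state of the database.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (e1, e2) pairs."""
    pairs = set(pairs)
    per_e1 = Counter(e1 for e1, _ in pairs)   # number of E2 entities per E1 entity
    per_e2 = Counter(e2 for _, e2 in pairs)   # number of E1 entities per E2 entity
    e1_has_many = any(n > 1 for n in per_e1.values())
    e2_has_many = any(n > 1 for n in per_e2.values())
    if e1_has_many and e2_has_many:
        return "M:M"
    if e1_has_many:
        return "1:M"   # one E1 entity relates to many E2 entities
    if e2_has_many:
        return "M:1"   # many E1 entities relate to one E2 entity
    return "1:1"


class_has_students = [("ClassA", "s1"), ("ClassA", "s2"), ("ClassB", "s3")]
student_in_class = [(s, c) for c, s in class_has_students]
student_faculty = [("s1", "f1"), ("s1", "f2"), ("s2", "f1")]
student_id_card = [("s1", "card1"), ("s2", "card2")]

print(mapping_cardinality(class_has_students))
print(mapping_cardinality(student_faculty))
```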
Participation Constraints
• Total Participation − Each entity is involved in the relationship. Total participation is represented by double
lines.
• Partial participation − Not all entities are involved in the relationship. Partial participation is represented by
single lines.
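A participation constraint can be tested on sample data as follows. The function participation and the student/course data are assumptions made for the example; as with mapping cardinalities, the check describes one database state rather than the constraint itself.

```python
def participation(entity_set, relationship_pairs, position=0):
    """Return 'total' if every entity appears in the relationship, else 'partial'.

    position selects which side of each pair the entity set sits on.
    """
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entity_set) <= participating else "partial"


students = {"s1", "s2", "s3"}
courses = {"DBMS", "OS"}
enrolls = [("s1", "DBMS"), ("s2", "DBMS"), ("s3", "DBMS")]

print(participation(students, enrolls, position=0))   # every student is enrolled
print(participation(courses, enrolls, position=1))    # OS has no students
```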
Extended ER Features
• Although the basic E-R concepts can model most database features, some aspects of a database may be more aptly expressed by
certain extensions to the basic E-R model. The extended E-R features are:
• Specialization
• Generalization
• Attribute Inheritance
• Constraints on Generalizations
• Condition-defined
• User-defined
• Disjoint
• Overlapping
• Aggregation
● Specialization – The process of designating subgroupings
within an entity set is called specialization. For example, a
“person” can be distinguished as either an “employee” or a
“customer”.
● In an E-R diagram, specialization is depicted by a triangle
component labelled ISA (is a); for example, a customer is a person.
● The ISA relationship is sometimes referred to as a
superclass-subclass relationship. Specialization emphasizes the
creation of distinct lower-level entity sets.
● Generalization – Generalization is a relationship that exists
between a higher-level entity set and one or more lower-level
entity sets. Generalization synthesizes the lower-level entity
sets into a single higher-level entity set.
● Higher-level and lower-level entity sets – These are
created by specialization and generalization. The attributes
of higher-level entity sets are inherited by lower-level entity
sets.
● For example: “customer” and “employee” inherit the
attributes of “person”.
Attribute inheritance
• A crucial property of the higher- and lower-level entities created by specialization and generalization is
attribute inheritance. The attributes of the higher-level entity sets are said to be inherited by the lower-level
entity sets.
• For example, student and employee inherit the attributes of person. Thus, student is described by its ID,
name, and address attributes, and additionally a tot_cred attribute; employee is described by its ID, name,
and address attributes, and additionally a salary attribute.
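Attribute inheritance maps naturally onto class inheritance, which gives a quick way to picture it. The sketch below is only an analogy in Python; the field types are assumptions, and the attribute names follow the person, student and employee example.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:              # higher-level entity set
    ID: str
    name: str
    address: str


@dataclass
class Student(Person):     # lower-level entity set: inherits ID, name, address
    tot_cred: int


@dataclass
class Employee(Person):    # lower-level entity set: inherits ID, name, address
    salary: float


student_attrs = [f.name for f in fields(Student)]
employee_attrs = [f.name for f in fields(Employee)]
print(student_attrs)
print(employee_attrs)
```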
Constraints on Generalizations
• One type of constraint involves determining which entities can be members of a given lower-
level entity set. Such membership may be one of the following:
Condition-defined:
• In condition-defined lower-level entity sets, membership is evaluated on the basis of whether or not an
entity satisfies an explicit condition or predicate.
• For example, assume that the higher-level entity set student has the attribute student_type. All student
entities are evaluated on the defining student_type attribute. Only those entities that satisfy the condition
student_type = “graduate” are allowed to belong to the lower-level entity set graduate student. All entities
that satisfy the condition student_type = “undergraduate” are included in undergraduate student.
● User-defined. User-defined lower-level entity sets are not constrained by a membership condition; rather, the
database user assigns entities to a given entity set.
● For instance, let us assume that, after 3 months of employment, university employees are assigned to one of four
work teams. We therefore represent the teams as four lower-level entity sets of the higher-level employee entity set.
A given employee is not assigned to a specific team entity automatically on the basis of an explicit defining
condition. Instead, the user in charge of this decision makes the team assignment on an individual basis.
● A second type of constraint relates to whether or not entities may belong to more than one lower-level entity set
within a single generalization. The lower level entity sets may be one of the following:
● Disjoint. A disjointness constraint requires that an entity belong to no more than one lower-level entity set. In our
example, a student entity can satisfy only one condition for the student type attribute; an entity can be either a
graduate student or an undergraduate student, but cannot be both.
● Overlapping. In overlapping generalizations, the same entity may belong to more than one lower-level entity set
within a single generalization. For an illustration, consider the employee work-team example, and assume that
certain employees participate in more than one work team. A given employee may therefore appear in more than
one of the team entity sets that are lower-level entity sets of employee. Thus, the generalization is overlapping.
• A final constraint, the completeness constraint on a generalization or specialization, specifies
whether or not an entity in the higher-level entity set must belong to at least one of the lower-level
entity sets within the generalization/specialization. This constraint may be one of the following:
• Total generalization or specialization: Each higher-level entity must belong to a lower-level entity
set.
• Partial generalization or specialization: Some higher-level entities may not belong to any lower-
level entity set.
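The disjointness and completeness constraints can be expressed as simple set checks. The functions is_disjoint and is_total and the sample entity identifiers are hypothetical; the two data sets mirror the student-type and work-team examples.

```python
def is_disjoint(lower_sets):
    """True if no entity belongs to more than one lower-level entity set."""
    seen = set()
    for members in lower_sets.values():
        if seen & members:      # some entity was already in another lower-level set
            return False
        seen |= members
    return True


def is_total(higher_set, lower_sets):
    """True if every higher-level entity belongs to at least one lower-level set."""
    covered = set().union(*lower_sets.values())
    return higher_set <= covered


students = {"s1", "s2", "s3"}
by_type = {"graduate": {"s1"}, "undergraduate": {"s2", "s3"}}

employees = {"e1", "e2", "e3"}
teams = {"team1": {"e1", "e2"}, "team2": {"e2"}}   # e2 is on two teams, e3 on none

print(is_disjoint(by_type), is_total(students, by_type))
print(is_disjoint(teams), is_total(employees, teams))
```

In this data the student generalization is disjoint and total, while the work-team generalization is overlapping and partial.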
ER into Relational Model (Reduction of ER diagram to Table)
• A database design expressed in ER notation can be reduced to a collection of tables.
• Every entity set or relationship set in the database can be represented in tabular form.
Some points for converting the ER diagram to tables:
• Each entity type becomes a table.
• In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual tables.
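The rule that each entity type becomes a table can be sketched with Python's built-in sqlite3 module. The table names come from the ER diagram mentioned above; every column name and type is an assumption, because the diagram's attributes are not reproduced in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# One table per entity type; all columns below are hypothetical placeholders.
conn.executescript("""
CREATE TABLE STUDENT (student_id INTEGER PRIMARY KEY, student_name TEXT);
CREATE TABLE COURSE  (course_id  INTEGER PRIMARY KEY, course_name  TEXT);
CREATE TABLE SUBJECT (subject_id INTEGER PRIMARY KEY, subject_name TEXT);
CREATE TABLE LECTURE (lecture_id INTEGER PRIMARY KEY, lecture_name TEXT);
""")

# sqlite_master is SQLite's catalog; it lists the tables that now exist.
tables = sorted(
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```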
Domain Constraints
Domain constraints are violated if an attribute value does not appear in the corresponding domain or is not of the
appropriate data type.
Domain constraints specify that, within each tuple, the value of each attribute must be an atomic value from that
attribute's domain.
Domains are specified as data types, which include standard data types such as integers, real numbers, characters,
Booleans, and variable-length strings.
Example:
CREATE DOMAIN CustomerName AS VARCHAR(50) CHECK (VALUE IS NOT NULL); -- VARCHAR(50) is an assumed base type
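The effect of a domain constraint can be tried out with Python's sqlite3 module. SQLite has no CREATE DOMAIN statement, so this sketch swaps in column-level NOT NULL and CHECK constraints instead; the CreditLimit column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Customer (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT NOT NULL,                  -- name must not be NULL
    CreditLimit  REAL CHECK (CreditLimit >= 0)   -- hypothetical column: no negative values
)
""")


def try_insert(row):
    """Insert a row and report whether the domain constraints accepted it."""
    try:
        conn.execute("INSERT INTO Customer VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((1, "Google", 1000.0)),   # valid row
    try_insert((2, None, 500.0)),        # violates NOT NULL on CustomerName
    try_insert((3, "Acme", -5.0)),       # violates the CHECK on CreditLimit
]
print(results)
```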
● Key Constraints
● An attribute that can uniquely identify a tuple in a relation is called the key of the table. The
value of the attribute for different tuples in the relation has to be unique.
● Example:
● In the given table, CustomerID is the key attribute of the Customer table. Each key value identifies a
single customer: CustomerID = 1 refers only to the customer with CustomerName = “Google”.
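A key constraint can be demonstrated with sqlite3 by trying to reuse a key value. The second customer name, 'Amazon', is a hypothetical value added only to trigger the violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")
conn.execute("INSERT INTO Customer VALUES (1, 'Google')")

try:
    # Same CustomerID as an existing tuple: the key constraint must reject it.
    conn.execute("INSERT INTO Customer VALUES (1, 'Amazon')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

names = [
    name
    for (name,) in conn.execute("SELECT CustomerName FROM Customer WHERE CustomerID = 1")
]
print(duplicate_accepted, names)
```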
● The complexity & the size of the schema vary as per the size of the project. It helps developers to easily manage and structure
the database before coding it.
Types of Database Schema
The database schema is divided into three types, which are:
● Logical Schema
● Physical Schema
● View Schema
● Uniqueness: Keys ensure that each record in a table is unique, preventing duplicate entries.
● Integrity: They maintain the integrity of the database by establishing and enforcing
relationships between tables.
● Efficiency: Keys help in efficiently retrieving and updating records from the database.
● Referential Integrity: Foreign keys ensure that relationships between tables remain consistent
and that references between tables are valid.
Types of Keys in Database
● Super Key - A set of attributes which can uniquely identify a tuple is known as a super key. For
example, STUD_NO, (STUD_NO, STUD_NAME), etc.
● Candidate Key - A minimal set of attributes which can uniquely identify a tuple is known as a
candidate key. For example, STUD_NO in the STUDENT relation.
● Primary Key - There can be more than one candidate key in relation out of which one can be
chosen as the primary key. For Example, STUD_NO, as well as STUD_PHONE both, are
candidate keys for relation STUDENT but STUD_NO can be chosen as the primary key (only one
out of many candidate keys).
● Alternate Key - A candidate key other than the primary key is called an alternate key. For
example, STUD_NO and STUD_PHONE are both candidate keys for relation STUDENT; if STUD_NO is
chosen as the primary key, STUD_PHONE is an alternate key.
● Foreign Key - A column (or set of columns) in one table that refers to the primary key of another
table, creating a relationship between the two tables. The purpose of foreign keys is to maintain
data integrity and allow navigation between related tables.
● Composite key – Whenever a primary key consists of more than one attribute, it is known as a
composite key. This key is also known as Concatenated Key.
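The foreign key case can be sketched with sqlite3. The COURSE table, its columns, and the sample rows are assumptions added for illustration; STUD_NO and STUD_NAME follow the STUDENT examples above. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # foreign key enforcement is off by default in SQLite

conn.execute("CREATE TABLE COURSE (COURSE_NO INTEGER PRIMARY KEY, COURSE_NAME TEXT)")
conn.execute("""
CREATE TABLE STUDENT (
    STUD_NO   INTEGER PRIMARY KEY,
    STUD_NAME TEXT,
    COURSE_NO INTEGER REFERENCES COURSE (COURSE_NO)   -- foreign key
)
""")

conn.execute("INSERT INTO COURSE VALUES (10, 'DBMS')")
conn.execute("INSERT INTO STUDENT VALUES (1, 'Ram', 10)")   # course 10 exists

try:
    conn.execute("INSERT INTO STUDENT VALUES (2, 'Sita', 99)")   # course 99 does not exist
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

print(dangling_accepted)
```

Rejecting the second insert is what keeps references between the two tables valid.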
Candidate Key Example
● Let's take an example of the table “Employee”. This table has three attributes: Emp_Id, Emp_Number &
Emp_Name.
● Here Emp_Id & Emp_Number have unique values, and Emp_Name can have duplicate values, as
more than one employee can have the same name.

Emp_Id  Emp_Number  Emp_Name
------  ----------  ---------
E01     2264        Steve
E22     2278        Ajeet
E23     2288        Chaitanya
E45     2290        Robert

● How many super keys can the above table have?
1. {Emp_Id}
2. {Emp_Number}
3. {Emp_Id, Emp_Number}
4. {Emp_Id, Emp_Name}
5. {Emp_Id, Emp_Number, Emp_Name}
6. {Emp_Number, Emp_Name}

● Let's select the candidate keys from the above set of super keys.
1. {Emp_Id} - No redundant attributes
2. {Emp_Number} - No redundant attributes
3. {Emp_Id, Emp_Number} - Redundant attribute. Either of those attributes can be a minimal super key, as both of
these columns have unique values.
4. {Emp_Id, Emp_Name} - Redundant attribute Emp_Name.
5. {Emp_Id, Emp_Number, Emp_Name} - Redundant attributes. Emp_Id or Emp_Number alone is sufficient
to uniquely identify a row of the Employee table.
6. {Emp_Number, Emp_Name} - Redundant attribute Emp_Name.

The candidate keys we have selected are:
{Emp_Id}
{Emp_Number}

Note: A primary key is selected from the set of candidate keys. That means we can have either Emp_Id or
Emp_Number as the primary key. The decision is made by the DBA (Database administrator).
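The selection of super keys and candidate keys for the Employee table can be reproduced with a short sketch. Two caveats: the row ("E50", 2295, "Steve") is a hypothetical addition that reflects the statement that names may repeat, and checking uniqueness on sample rows only suggests keys, because a real key must hold for every valid state of the table.

```python
from itertools import combinations

attributes = ("Emp_Id", "Emp_Number", "Emp_Name")
rows = [
    ("E01", 2264, "Steve"),
    ("E22", 2278, "Ajeet"),
    ("E23", 2288, "Chaitanya"),
    ("E45", 2290, "Robert"),
    ("E50", 2295, "Steve"),   # hypothetical row: two employees share a name
]


def is_super_key(attrs):
    """A set of attributes is a super key if no two rows agree on all of them."""
    idx = [attributes.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)


# Try every non-empty combination of attributes.
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_super_key(combo)
]

# A candidate key is a super key with no smaller super key inside it.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]

print(len(super_keys))
print(candidate_keys)
```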
Difference between Super Key and Candidate Key:
2. Super Key: Not all super keys are candidate keys.
   Candidate Key: All candidate keys are super keys.
3. Super Key: Various super keys together make up the criteria to select the candidate keys.
   Candidate Key: Various candidate keys together make up the criteria to select the primary key.
4. Super Key: In a relation, the number of super keys is more than the number of candidate keys.
   Candidate Key: In a relation, the number of candidate keys is less than the number of super keys.
5. Super Key: A super key's attributes can contain NULL values.
   Candidate Key: A candidate key's attributes can also contain NULL values.
END of UNIT I