Chapter 1 - Module 1
UNIVERSITY
• DEPARTMENT OF CSE
MODULE 1 8 Hrs
Introduction:
Purpose of Database System, views of data, data
models, database management system, three-schema
architecture of DBMS, components of
DBMS. E/R Model: conceptual data modeling,
motivation, entities, entity types, attributes,
relationships, relationship types, E/R diagram
notation, examples.
INTRODUCTION TO DBMS
DATABASE MANAGEMENT SYSTEM
A DBMS (database management system) is the software used to create and
manage databases; together with the data it manages, it forms a database system.
A file system manages data as files on a hard disk. Users can create,
delete, and update the files according to their requirements.
For example, in a college, some student data is common to all sections, such as Roll No, Name, Address and
Phone number, while other data is relevant to one section
only, such as the hostel allotment number, which belongs to the hostel office.
Drawbacks with File System:
3. Difficult Data Access: A user must know the exact location of a file to access
its data, so the process is cumbersome and tedious. For example, finding the
hostel allotment number of one student among 10000 unsorted student records
requires scanning the records one by one.
4. Unauthorized Access: A file system may allow unauthorized access to data.
If a student gets access to the file containing his marks, he can change them
without authorization.
6. No Backup and Recovery: A file system does not provide any backup
or recovery of data if a file is lost or corrupted.
• Abstraction is one of the main features of database systems.
Hiding irrelevant details from users and providing them with an abstract view of
the data makes user-database interaction easy and efficient.
• Data abstraction means hiding the complex data structures in order
to simplify the user’s interface to the system.
• It is done because many of the users interacting with the database
system do not have the computer training needed to understand the
complex data structures of the database system.
The views of data in a DBMS describe how the data is visualized at each level of data
abstraction.
• View level: provides the “view of data” to the users and hides
irrelevant details such as data relationships, the database
schema, constraints, security, etc. from the user.
Consider an example of a University Database.
This is how the implementation looks at the different levels:
•Bank Entity: Attributes of the Bank entity are Bank Name, Code and Address.
Code is the primary key of the Bank entity.
•Customer Entity: Attributes of the Customer entity are Customer_id, Name, Phone Number and
Address.
Customer_id is the primary key of the Customer entity.
•Branch Entity: Attributes of the Branch entity are Branch_id, Name and Address.
Branch_id is the primary key of the Branch entity.
•Loan Entity: Attributes of the Loan entity are Loan_id, Loan_Type and Amount.
Loan_id is the primary key of the Loan entity.
Relationships are:
•Bank has Branches => 1 : N
One Bank can have many Branches, but one Branch cannot belong to many Banks, so the
relationship between Bank and Branch is a one-to-many relationship.
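The Bank and Branch entities and their 1 : N relationship can be sketched as two tables. This is only an illustrative sketch using Python's built-in sqlite3 module; the table names, the column names and the foreign key column bank_code are assumptions derived from the attributes listed above, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one table per entity, primary keys as listed above.
conn.executescript("""
CREATE TABLE bank (
    code    TEXT PRIMARY KEY,
    name    TEXT NOT NULL,
    address TEXT
);
CREATE TABLE branch (
    branch_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    address   TEXT,
    -- A single bank_code per branch row models the "N" side of 1 : N.
    bank_code TEXT NOT NULL REFERENCES bank(code)
);
""")

conn.execute("INSERT INTO bank VALUES ('B001', 'Example Bank', 'Main Road')")
conn.executemany(
    "INSERT INTO branch (name, address, bank_code) VALUES (?, ?, ?)",
    [("North Branch", "North Street", "B001"),
     ("South Branch", "South Street", "B001")],
)
branch_count = conn.execute(
    "SELECT COUNT(*) FROM branch WHERE bank_code = 'B001'"
).fetchone()[0]


def insert_orphan_branch():
    """Try to add a branch that refers to a bank that does not exist."""
    try:
        conn.execute(
            "INSERT INTO branch (name, address, bank_code) "
            "VALUES ('Ghost Branch', 'Nowhere', 'NOPE')"
        )
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_orphan_branch()
print(branch_count, orphan_accepted)
```

One bank row is referenced by many branch rows, while each branch row can name only one bank, which is how a one-to-many relationship is usually represented in tables.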
At the view level, users just interact with the system through a GUI
and enter details on the screen; they are not aware of how the
data is stored or what data is stored. Such details are hidden from
them.
Example:
We have to create a database for a college. What entity sets
would be involved? Student, Lecturer, Department, Course and
so on.
The entity sets Student, Lecturer, Department and Course are
stored as consecutive blocks of storage
locations. This is the physical or internal level; it is hidden from
the programmers, but the database administrator is aware of it.
At the logical level, the programmers define the entity sets and
the relationships among them using a language
like SQL. So the programmers work at the logical level, and
the database administrator also operates at this level.
At the view level, the users have a set of applications which they
use to retrieve the data they are interested in.
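The college example can be sketched with Python's built-in sqlite3 module. The table student, the view student_directory and the sample row are hypothetical names and data chosen for illustration; the physical level is left entirely to SQLite, which is the point of the abstraction.

```python
import sqlite3

# Physical level: SQLite decides how rows are laid out in storage blocks.
conn = sqlite3.connect(":memory:")

# Logical level: the programmer defines the entity set as a table.
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT, address TEXT, marks INTEGER)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'Hostel A', 82)")

# View level: an application exposes only the part of the data users need.
conn.execute(
    "CREATE VIEW student_directory AS SELECT roll_no, name FROM student"
)

cursor = conn.execute("SELECT * FROM student_directory")
directory_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
print(directory_columns, directory_rows)
```

A user of student_directory sees roll numbers and names only; the address and marks columns, and the way the rows are stored, stay hidden.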
THREE SCHEMA ARCHITECTURE
• The three-schema architecture is also called the ANSI/SPARC
(American National Standards Institute, Standards Planning And
Requirements Committee) architecture or three-level
architecture.
• Instance –
The three-schema architecture defines the
view of data at three levels:
✔ Physical level (internal level)
✔ Logical level (conceptual level)
✔ View level (external level)
1. Physical level (internal level)
• The physical or the internal level schema describes how the data
is stored in the hardware.
• It also describes how the data can be accessed.
• The physical level shows the data abstraction at the lowest level
and it has complex data structures.
• Only the database administrator operates at this level.
• The internal schema defines the physical storage structure of the
database.
• The internal schema is a very low-level representation of the
entire database.
• It contains multiple occurrences of multiple types of internal
record. In ANSI terminology, an internal record is also called a "stored record".
2. Logical level (conceptual level)
•The user need not deal directly with physical database storage
details.
Data independence defines the extent to which the data schema can
be changed at one level without modifying the data schema at the
next higher level. Data independence is classified into logical data independence and physical data independence.
1. Logical Data Independence:
• Why would the data schema need to change at the
logical or conceptual level?
Changes to the data schema at the logical level are made either
to enlarge or reduce the database by adding or deleting entities or entity sets, or
to change the constraints on the data.
• Due to logical independence, none of the changes below affects the
external layer.
1.Adding, modifying, or deleting an attribute, entity or relationship is possible without
rewriting existing application programs
2.Merging two records into one
3.Breaking an existing record into two or more records
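Logical data independence can be illustrated with a small sketch in Python's sqlite3 module. The table student, the view student_names and the function application_query are hypothetical; the function stands in for an existing application program that reads through an external view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

# External level: the application only ever sees this view.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")


def application_query():
    """Stand-in for an existing application program (hypothetical)."""
    return conn.execute(
        "SELECT roll_no, name FROM student_names ORDER BY roll_no"
    ).fetchall()


before = application_query()

# Logical-level change: a new attribute is added to the entity.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")

after = application_query()
print(before, after)
```

The application code is not rewritten, and it returns the same rows before and after the attribute is added.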
2. Physical Data Independence:
Physical data independence defines the extent to which the data schema can be
changed at the physical or internal level without modifying the data schema at the
logical and view levels.
The physical schema changes when, for example, we add storage to the system or
reorganize some files to improve the retrieval speed of the records.
Due to physical independence, none of the changes below affects the
conceptual layer.
•Using a new storage device such as a hard drive or magnetic tape
•Modifying the file organization technique in the database
•Switching to different data structures
•Changing the access method
•Modifying indexes
•Changing compression techniques or hashing algorithms
•Changing the location of the database, say from the C drive to the D drive
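One of the changes listed above, modifying indexes, can be tried out in a minimal sqlite3 sketch. The table, the index name idx_student_hostel and the generated sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT, hostel_no TEXT)"
)
# 1000 generated rows; hostel numbers cycle through H0..H49.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"student{i}", f"H{i % 50}") for i in range(1, 1001)],
)


def find_by_hostel(hostel_no):
    """Logical-level query; it never mentions how the data is stored."""
    return conn.execute(
        "SELECT roll_no FROM student WHERE hostel_no = ? ORDER BY roll_no",
        (hostel_no,),
    ).fetchall()


before = find_by_hostel("H7")

# Physical-level change: add an index to speed up retrieval.
conn.execute("CREATE INDEX idx_student_hostel ON student (hostel_no)")

after = find_by_hostel("H7")
print(len(before), before == after)
```

The query text and its result are unchanged; only the access path used internally may differ.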
• Mapping is used to transform requests and
responses between the various levels of the database
architecture.
• Mapping adds processing time, so it is not well
suited to small DBMSs.
• In External / Conceptual mapping, a request is
transformed from the external
level to the conceptual schema.
• In Conceptual / Internal mapping, the DBMS
transforms the request from the conceptual to the
internal level.
Data Model
• A data model is a way of modeling the data description, data semantics, and
consistency constraints of the data.
• It provides the conceptual tools for describing the design of a database at each
level of data abstraction.
• The following data models are used for understanding the structure
of the database:
1. Relational Data Model:
• This type of model organizes the data in the form of rows and columns within a table.
• Thus, a relational model uses tables to represent both data and the relationships among the data.
Tables are also called relations.
• The relational data model is the most widely used model and is primarily used by
commercial data processing applications.
2. Entity-Relationship Data Model:
• An ER model is the logical representation of data as objects and
relationships among them.
• These objects are known as entities, and a relationship is an association among
these entities.
• It is widely used in database design. A set of attributes describes each
entity.
For example, student_name and student_id describe the 'student' entity. A set of
entities of the same type is known as an 'entity set', and a set of
relationships of the same type is known as a 'relationship set'.
3. Object-based Data Model:
• An extension of the ER model with the notions of functions, encapsulation,
and object identity.
• This model supports a rich type system that includes structured and
collection types. Here, an object is simply data together with its
properties.
4. Semi-structured Data Model:
• This type of data model is different from the other three data models.
• The semi-structured data model allows data specifications in which
individual data items of the same type may have different
sets of attributes.
• The Extensible Markup Language, also known as XML, is widely used for
representing semi-structured data.
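A short sketch of semi-structured data, parsed with Python's standard xml.etree.ElementTree module. The XML document and its element names are made up for illustration; the point is that two items of the same type carry different attribute sets.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: both items are <student>, but only the second
# one has a <hostel_no> element.
xml_text = """
<students>
  <student>
    <name>Asha</name>
    <roll_no>1</roll_no>
  </student>
  <student>
    <name>Ravi</name>
    <roll_no>2</roll_no>
    <hostel_no>H12</hostel_no>
  </student>
</students>
""".strip()

root = ET.fromstring(xml_text)

# Collect the attribute names that each student actually carries.
fields_per_student = [
    sorted(child.tag for child in student)
    for student in root.findall("student")
]
print(fields_per_student)
```

In a relational table every row would need the same columns; here each student simply lists the attributes it has.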
1. DDL stands for Data Definition Language.
•It is used to define the database structure or pattern.
•It is used to create schemas, tables, indexes, constraints, etc. in the
database.
•Using DDL statements, you can create the skeleton of the
database.
•Data definition language statements update the metadata that the DBMS stores,
such as the number of tables and schemas, their names, indexes, the columns
in each table, constraints, etc.
Here are some tasks that come under DDL:
•Create: It is used to create objects in the database.
•Alter: It is used to alter the structure of the database.
•Drop: It is used to delete objects from the database.
•Truncate: It is used to remove all records from a table.
•Rename: It is used to rename an object.
•Comment: It is used to add comments to the data dictionary.
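Some of the DDL tasks above can be tried out with Python's built-in sqlite3 module. This is a sketch only: the table names are hypothetical, and SQLite has no TRUNCATE or COMMENT statement, so only Create, Alter, Rename and Drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """List the user tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# Create: define a new object.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
# Alter: change the structure by adding a column.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")
# Rename: give the object a new name.
conn.execute("ALTER TABLE student RENAME TO learner")

after_rename = table_names()
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(learner)")]

# Drop: delete the object from the database.
conn.execute("DROP TABLE learner")
after_drop = table_names()
print(after_rename, columns, after_drop)
```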
2. DML stands for Data Manipulation Language.
It is used for accessing and manipulating data in a database. It
handles user requests.
Here are some tasks that come under DML:
•Select: It is used to retrieve data from a database.
•Insert: It is used to insert data into a table.
•Update: It is used to update existing data within a table.
•Delete: It is used to delete records from a table (all records, or only those matching a condition).
•Merge: It performs an UPSERT operation, i.e., an insert or an
update.
•Call: It is used to call a stored subprogram (in Oracle, a PL/SQL or a Java
subprogram).
•Explain Plan: It describes the access path the DBMS will use for a statement.
•Lock Table: It controls concurrency.
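The four basic DML tasks (Insert, Update, Delete, Select) can be sketched with Python's built-in sqlite3 module. The table and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# Insert: add rows to the table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 45)],
)
# Update: change existing data.
conn.execute("UPDATE student SET marks = 70 WHERE roll_no = 2")
# Delete: remove only the rows that match the condition.
conn.execute("DELETE FROM student WHERE marks < 50")
# Select: retrieve what is left.
remaining = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
).fetchall()
print(remaining)
```

Without the WHERE clause, the DELETE statement would remove every row in the table.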
3. DCL stands for Data Control Language.
It is used to control access to the data stored in the database.
•In some database systems, DCL statements run inside a transaction
and can be rolled back.
(In Oracle Database, however, data control
language statements cannot be rolled back.)
Here are some tasks that come under DCL:
•Grant: It is used to give a user access privileges to a
database.
•Revoke: It is used to take back permissions from a
user.
4. Transaction Control Language
TCL is used to manage the changes made by DML
statements. It allows statements to be grouped into logical
transactions.
Here are some tasks that come under TCL:
•Commit: It is used to save the transaction to the
database.
•Rollback: It is used to restore the database to
its state as of the last Commit.
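Commit and Rollback can be sketched with Python's built-in sqlite3 module, where they are exposed as the connection methods commit() and rollback(). The account table and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()  # Commit: the inserted row is now saved.


def balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


# This change is never committed ...
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()  # ... so Rollback restores the state of the last Commit.
balance_after_rollback = balance()

# This change is committed, so a later rollback has nothing to undo.
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.commit()
conn.rollback()
balance_after_commit = balance()
print(balance_after_rollback, balance_after_commit)
```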
Components of DBMS
1. Software
•The main component of a Database management system is the
software. It is the set of programs which is used to manage the
database and to control the overall computerized database.
•The DBMS software provides an easy-to-use interface to store,
retrieve, and update data in the database.
•This software component is capable of understanding the Database
Access Language and converts it into actual database commands to
execute or run them on the database.
2. Hardware
•This component of a DBMS consists of a set of physical electronic
devices such as computers, I/O channels, storage devices, etc. that
provide the interface between the computers and the users.
•This DBMS component is used for keeping and storing the data in
the database.
3. Procedures
•Procedures refer to the general rules and instructions that help to design
the database and to use a database management system.
•Procedures are used to set up and install a new database management
system (DBMS), to log in to and log out of the DBMS software, to manage
the DBMS or application programs, to take backups of the database, to
change the structure of the database, etc.
4. Data
•It is the most important component of the database management
system.
•The main task of a DBMS is to process data. Here, databases are
defined and constructed, and then data is stored in, retrieved from, and updated in
the databases.
•The database contains both the metadata (description of data, or
data about data) and the actual (or operational) data.
5. Users
•The users are the people who control and manage the
databases and perform different types of operations on
the databases in the database management system.
DATABASE USERS:
4. Sophisticated Users :
• Sophisticated users can be engineers, scientists, or business analysts
who are familiar with the database.
• They can develop their own database applications according to
their requirements.
• They don’t write program code; instead, they interact with the database
by writing SQL queries directly through the query processor.
5.Data Base Designers :
Database designers are the users who design the structure of the
database, which includes tables, indexes, views, constraints, triggers, and
stored procedures.
The designer controls what data must be stored and how the data items are
related.
6.Application Programmers :
Application programmers are the back-end programmers who write the
code for the application programs.
They are computer professionals. These programs could be
written in programming languages such as Visual Basic, Developer,
C, FORTRAN, COBOL, etc.
7. Casual Users / Temporary Users :
Casual users are the users who access the
database occasionally, but each time they access it they
require new information, for example, a middle-level or
higher-level manager.
DBMS COMPONENT MODULES
• The diagram here is divided into two halves.
• The top half of the diagram refers to the various users of the database
environment and their interfaces.
• The lower half shows the internals of the DBMS responsible for
storage of data and processing of transactions.
• The database and the DBMS catalogue are usually stored on disk. Access to
the disk is controlled primarily by the operating system (OS),
which schedules disk input/output. A higher-level stored data manager module
of the DBMS controls access to DBMS information that is stored on the disk.
• The top half of the diagram shows interfaces for casual users,
DBA staff, application programmers and parametric users.
1.Query processor: The query processor transforms user queries into a series of low-level
instructions. It interprets the online user’s query and converts it into
an efficient series of operations in a form that can be sent to the run
time data manager for execution.
The query processor uses the data dictionary to find the structure of the
relevant portion of the database and uses this information in modifying the
query and preparing an optimal plan to access the database.
2.Query optimizer: The query optimizer determines an optimal strategy for
query execution.
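The optimizer's choice of strategy can be observed in SQLite through its EXPLAIN QUERY PLAN statement. This is a sketch using Python's built-in sqlite3 module with a hypothetical table and index; the exact wording of the plan text differs between SQLite versions, so the example only checks whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT, hostel_no TEXT)"
)


def plan_for(query):
    """Return the optimizer's plan as text (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT roll_no FROM student WHERE hostel_no = 'H7'"

plan_without_index = plan_for(query)  # expected to be a full table scan
conn.execute("CREATE INDEX idx_student_hostel ON student (hostel_no)")
plan_with_index = plan_for(query)     # expected to search via the index

print(plan_without_index)
print(plan_with_index)
```

The query is the same in both cases; the optimizer picks a different access strategy once the index exists.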
3.DML Compiler –
It translates DML statements into low-level instructions (machine language)
so that they can be executed.
4.DDL Interpreter –
It processes DDL statements into a set of tables containing metadata (data
about data).
5.DDL Compiler –
It processes the schema definitions specified in the DDL and
stores the description of the schema in the
DBMS catalogue.
The catalogue includes information such as the names and sizes of
files, the names and data types of data items,
the storage details of each file, mapping information among
schemas, and constraints.
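The idea that DDL statements end up as metadata in a catalogue can be observed in SQLite, whose catalogue is the built-in table sqlite_master. This is a sketch with Python's sqlite3 module and a hypothetical student table; other DBMSs expose their catalogues differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A DDL statement: the schema definition is processed and recorded.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# The catalogue now holds one entry describing the new table.
catalogue_rows = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# Column names and data types are also available as metadata
# (PRAGMA table_info rows: index 1 is the name, index 2 the declared type).
column_info = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")
]
print(catalogue_rows, column_info)
```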
6 Storage Manager :
Storage Manager is a program that provides an interface between the
data stored in the database and the queries received.
It maintains the consistency and integrity of the database by applying
the constraints and executes the DCL statements.
It is responsible for updating, storing, deleting, and retrieving data in
the database.
It contains the following components –
6.1 Authorization Manager –
It enforces role-based access control, i.e., it checks whether a
particular person has the privilege to perform the requested
operation.
6.2 Integrity Manager –
It checks the integrity constraints when the database is modified.
6.3 Transaction Manager –
It controls concurrent access by scheduling the operations of the
transactions it receives. Thus, it ensures that
the database remains in a consistent state before and after the
execution of a transaction.
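The integrity check and the consistent-state guarantee described above can be sketched with Python's built-in sqlite3 module. The CHECK constraint on marks and the helper try_update are assumptions made for illustration; a real DBMS splits this work between its integrity and transaction managers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, "
    "marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
)
conn.execute("INSERT INTO student VALUES (1, 82)")
conn.commit()


def try_update(new_marks):
    """Run one small transaction; report whether it was accepted."""
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE student SET marks = ? WHERE roll_no = 1", (new_marks,)
            )
        return True
    except sqlite3.IntegrityError:
        # The constraint was violated, so the change was rejected.
        return False


accepted = try_update(90)    # within 0..100
rejected = try_update(150)   # violates the CHECK constraint
final_marks = conn.execute(
    "SELECT marks FROM student WHERE roll_no = 1"
).fetchone()[0]
print(accepted, rejected, final_marks)
```

The invalid update leaves no trace: the stored value is the one written by the last valid transaction.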
7.3 Indices –
They provide faster retrieval of data items.
8. Run Time Database Manager