DBMS NOTE by Aashish Maharjan
Unit 1; Introduction;
# Data;
Data is a collection of raw facts related to elements, objects or
entities.
It includes facts, figures, letters, words, charts, symbols, audio,
video or any combination of these.
The data collected from the user is processed to generate
information.
This process is known as data processing.
# Information:-
Information is processed data that gives some useful facts or
meaning.
Information is useful in the decision-making process.
Information can itself be the data for the next level of processing.
In this way processed data is reused to generate further
information, which is known as information processing.
# Characteristics of information;
As mentioned in the definition, information must give some
meaning, so it must have certain characteristics. They are;
Subjectivity:
The information must be highly subjective, as a wide variety of
data may be collected.
The data of two persons, objects or elements may match, but
the information must be specific to the one concerned.
Relevance:
The information must be relevant, that is, pertinent and
meaningful to the decision maker.
The right information for the right person helps in the
decision-making process.
Timeliness:
The information must be delivered on time.
The right information at the right time helps in decision making.
Page 1 of 79
. Database Management System [DBMS];
Accuracy:
The information provided to make a decision must be accurate.
Wrong information may lead to wrong decision which will be
harmful for the organization.
Correct information format:
The information generated must be in the proper format, as desired.
The wrong format for the right information may misguide the
decision-making process.
Completeness:
The information provided for the decision-making process
must be complete.
Incomplete information may lead towards the wrong
decision.
Accessibility:
The information generated must be accessible for the decision
making process.
The information will be valueless if it is not accessible by the
decision maker in right format at the right time.
Summarizing;
BBA VI Sem; [DBMS]; .
The stored data cannot be linked flexibly between two or more
tables, which results in incomplete data elements when they are
required.
The stored data elements are neither very secure nor consistent,
which may cause problems in constraint handling.
The data elements cannot be distributed to multiple users as
required.
The most important disadvantage is security: because of its weak
security provisions, it is not widely used in financial
transactions.
# Database approach;
A flat filing system has various types of problems and limitations.
All of these problems can be solved easily by using a database
system. It stores all the data contents in a centrally located
storage section, which is accessed through different applications.
Basic tasks such as preparation, insertion, updating, retrieval,
deletion, backup and restore are performed very efficiently. The
data processing approach is presented as below;
# Advantages of Modern Database Management System;
Data redundancy can be removed, as repeated data are stored once
and a link is established to access them.
Data inconsistency is avoided, as similar data elements that may
give different meanings are collected separately.
Sharing of data between users is easily possible.
It maintains the standard format of reports as desired by the
user.
Different levels of security are maintained, which makes database
tasks highly secure even over a network.
The data values collected in the database are accurate and consistent.
(Integrity constraints are satisfied.)
It provides a multi-user environment and interfaces for the
users.
Relationships are easily established among the data entities or
objects within the database or with other databases.
The backup and recovery provision makes it highly secure, as lost
data can be recovered from backups.
# Disadvantages of Modern Database Management System;
The most important disadvantage is the cost factor, as;
High initial investment is required.
Overhead expenses for security, backup and recovery processes.
Extra cost of hardware upgrades for extensive applications and
workspace for execution and storage.
Regular cost of maintaining the software and hardware.
An additional cost of converting from the traditional method to
the integrated method.
Exercises;
1) What is Data and Information? List out the characteristics of
information;
2) What are the processes of converting the data into information?
3) What are the limitations of flat file processing system?
4) What is Database Management System?
5) What are the characteristics of DBMS?
6) What are the advantages of modern DBMS?
7) What are the limitations of DBMS?
Unit 2; Database system, Concept and Architecture;
# Database Architecture;
Centralized DBMS Architecture;
In a centralized DBMS architecture, the data contents are stored on a
centrally located server and accessed by various dumb terminals
known as clients.
These clients access the database through the application
program.
Each client gets access permission from the DBA and has
to work accordingly.
[Figure: Database structure of a single institution with multiple
nodes. Labels: Magnetic Storage Section; Server Application
(Database Server); Database Server; Application Server; Database
System; Storage; Application; Database; User.]
Data abstraction/ View of data;
A database system is a collection of data elements in a set of files
and programs that allow the user to store, access and modify the
data. A major purpose of the database is to give access to the data
through an abstract view of it. Since not all database users are
computer literate or familiar with the complexity of the database
structure, the users are classified into various stages, which are
termed the levels of the user. Each level is defined as an abstract
view of the data.
There are three levels of data abstraction;
1) Physical Level or Internal Level;
* The physical view is a representation of the entire database.
* It is expressed by the internal schema, which contains the
definitions of stored records.
* It defines the methods of representing the data fields and the
access to the data.
* Only one physical view is defined per database.
2) Logical or Conceptual Level;
* The logical level expresses all the database entities and their
relationships.
* It is a representation of the entire information content of the
database in the physical storage.
* It includes the conceptual view of the data elements stored.
* Only one conceptual level is defined per database.
3) View Level or External Level;
* This level is closest to the user, as it describes the logical
records and relationships the user sees.
* It contains the methods of deriving the objects in the database.
The objects include entities, attributes and relationships.
* As many external views can be defined as are required.
Database Schema;
A schema describes the view at each level. A schema is an
outline or plan that describes the records and relationships existing
in the view. The schema also describes the way in which entities at
one level of abstraction can be mapped to the next level. The overall
design of the database is called the database schema.
A database schema includes;
* Characteristics of data items such as entities and attributes.
* Logical structure and relationships among those data items.
* Format of storage representation.
* Integrity parameters such as physical authorization and backup
policies.
Since each view is defined by a schema, there are three levels of
schemas: at the lowest level one Physical Schema, at the middle level
one Conceptual Schema and at the highest level several Sub-Schemas.
Each user group refers to its own sub-schema. The DBMS
transforms a request specified on an external schema into a request
on the conceptual schema, and then into a request on the internal
schema for processing over the database. The process of transforming
requests and results between the levels is called mapping.
Data independence;
* Data independence is a major objective of the database system,
which is implemented through the three levels of schema.
* The ability to make a change at one level without affecting the
applications or definitions at the other levels is termed data
independence.
Logical data independence is the ability to modify the logical
schema without causing application programs to be rewritten or any
change to the external schema. When data is added or removed, only
the view definition and the mapping need to be changed in a
DBMS that supports logical data independence.
Physical data independence is the ability to modify the internal
schema without causing any change to the external schema and
without causing application programs to be rewritten.
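Both kinds of data independence can be tried out on a small scale. The sketch below is only an illustration, assuming Python's built-in sqlite3 module and a hypothetical `student` table and `management_students` view (these names are not part of the notes). A query written against the view keeps returning the same answer after an index is added (a physical change) and after a column is added to the table (a logical change).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no TEXT PRIMARY KEY, std_name TEXT, faculty TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("BBA-00156-009", "Rabindra", "Management"),
        ("BSc-00176-010", "Laxmi", "Science"),
    ],
)

# External level: a view that one user group works with.
conn.execute(
    "CREATE VIEW management_students AS "
    "SELECT roll_no, std_name FROM student WHERE faculty = 'Management'"
)


def management_names(connection):
    """The 'application program': it only knows about the view."""
    rows = connection.execute(
        "SELECT std_name FROM management_students ORDER BY std_name"
    )
    return [row[0] for row in rows]


before = management_names(conn)

# Physical-level change: add an index. The application is untouched.
conn.execute("CREATE INDEX idx_student_faculty ON student (faculty)")
after_index = management_names(conn)

# Logical-level change: add a column. The view hides it from the application.
conn.execute("ALTER TABLE student ADD COLUMN semester TEXT")
after_column = management_names(conn)

print(before, after_index, after_column)
```

The function `management_names` is never edited, which is the point: the changes are absorbed by the levels below the view.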
# Data models;
* The physical or logical structure of the database is termed the
data model.
* It is a conceptual tool to describe the data, data relationships,
data semantics, and data constraints.
* Two major types of data models are in use, and the evolution of
data models is believed to be still in progress.
# Object based data model;
* The object based concept is based on the data and the data
relationships.
* It is gaining wide acceptance for its flexible structuring
capabilities.
* Various data integrity constraints can be specified explicitly by
using the object based model.
The Entity Relationship model and the Object Oriented model are two
data models based on objects.
Entity relationship model;
* The model was published by Peter Chen in 1976.
* It is built on objects called entities and the relationships
among these entities.
* Each entity has a set of attributes that describe the object.
* A relationship is an association among entities.
Hierarchical model;
* A database model in which data records are related to each
other in a hierarchical relationship.
* It was introduced by IBM through its Information Management
System.
* In this model, child record types belong to a parent record
type.
* This model is not very flexible but is very easy to maintain.
* It always follows the One-to-N type of relationship.
Relational model;
* It was introduced by E. F. Codd in 1970 and is considered an
important concept in DBMS.
* It is based on the mathematical notion of a relation, consisting
of rows and columns of data.
* The columns are termed attributes (fields) and the rows are
termed tuples (records).
* For each relation there is a set of attributes that uniquely
determines the tuples; this is called the key.
* It relieves the user from the details of storage structures and
access methods.
* It is based on the relations among the database elements and is
popularly known as RDBMS.
Procedural DML and non-procedural DML are the two
types of DML used in a database.
Procedural DML requires the user to specify what data are
needed and how to get those data.
Non-procedural DML requires the user to specify what data
are needed without specifying how to get them.
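The difference can be illustrated with a small sketch. It assumes Python with the built-in sqlite3 module; the table name `employee` and the two helper functions are hypothetical names chosen for the illustration. The first function spells out how to find the data (procedural style), while the second only states what is wanted and leaves the method to the DBMS (non-procedural style, here SQL).

```python
import sqlite3

rows = [
    (101, "Sagar", "Bhaktapur"),
    (109, "Supraj", "Pokhara"),
    (111, "Sarad", "Pokhara"),
]


def names_procedural(records, address):
    """Procedural style: we say WHAT we want and HOW to find it."""
    result = []
    for eno, ename, addr in records:   # visit every record ourselves
        if addr == address:            # test the condition ourselves
            result.append(ename)
    return result


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (eno INTEGER, ename TEXT, address TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)


def names_declarative(connection, address):
    """Non-procedural style: we only say WHAT we want."""
    cursor = connection.execute(
        "SELECT ename FROM employee WHERE address = ? ORDER BY eno", (address,)
    )
    return [row[0] for row in cursor]


print(names_procedural(rows, "Pokhara"))
print(names_declarative(conn, "Pokhara"))
```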
DBA must specify such constraints explicitly. The integrity
constraints are kept in a special structure that is consulted by the
database system whenever an update takes place in the system.
Routine maintenance: The DBA has overall responsibility for the
database system, so routine maintenance of the database system is
required. The routine maintenance includes:
* Periodic backup of the database to prevent data loss.
* Ensuring that enough free space is available for operation.
* Monitoring job scheduling on the database system.
* Overseeing the activities of the different users.
* Security provision against the various problems that may arise.
# Database System Architecture (Environment);
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system. The functional components of a
database system can be broadly divided into the storage manager and the
query processor components.
System Structure;
# Storage Manager;
The storage manager is a program module that provides the interface
between the low level data stored in the database and the application
programs and queries submitted to the system. The storage manager is
responsible for the interaction with the file manager. The raw data are
stored on the disk using the file system, which is usually provided by a
conventional OS. The storage manager translates the various DML
statements into low level file system commands. The storage manager is
responsible for storing, retrieving and updating data in the database. The
storage manager components include;
* Authorization and Integrity Manager, which tests for the
satisfaction of integrity constraints and checks the authority of
users to access data.
* Transaction Manager, which ensures that the database remains
in a consistent (correct) state despite system failures and that
concurrent transaction executions proceed without conflicting.
* File Manager, which manages the allocation of space on disk
storage and the data structures used to represent information
stored on disk.
* Buffer Manager, which is responsible for fetching data from disk
storage into main memory and deciding what data to cache in main
memory. The buffer manager is a critical part of the database
system, since it enables the database to handle data sizes that
are much larger than the size of main memory.
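The idea of "deciding what data to cache" can be sketched in a few lines. The example below is a simplified illustration in Python, not the design of any particular DBMS: the class name `BufferManager`, the dictionary standing in for the disk, and the least-recently-used (LRU) replacement rule are all assumptions made for the sketch.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool: keeps at most `capacity` blocks in memory (LRU)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # stand-in for disk: block_id -> data
        self.capacity = capacity      # number of blocks that fit in memory
        self.pool = OrderedDict()     # cached blocks, least recent first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.pool:
            # Hit: block already in main memory, mark it as recently used.
            self.pool.move_to_end(block_id)
            return self.pool[block_id]
        if len(self.pool) >= self.capacity:
            # Pool is full: throw out the least recently used block.
            self.pool.popitem(last=False)
        # Miss: fetch the block from "disk" into main memory.
        self.disk_reads += 1
        self.pool[block_id] = self.disk[block_id]
        return self.pool[block_id]


disk = {i: f"block-{i}" for i in range(5)}   # 5 blocks on disk
bm = BufferManager(disk, capacity=2)         # only 2 fit in memory
for block_id in [0, 1, 0, 2, 1]:
    bm.get_block(block_id)
print(bm.disk_reads, list(bm.pool))
```

Five blocks exist on "disk" but only two fit in memory at a time, which mirrors how a database larger than main memory can still be processed.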
Query Processor;
The query processing components are;
* DDL Interpreter, which interprets DDL statements and records
the definitions in the data dictionary.
* DML Compiler, which translates DML statements in a query
language into an evaluation plan consisting of low level
instructions that the query evaluation engine understands.
* Query Evaluation Engine, which executes the low level
instructions generated by the DML compiler.
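An evaluation plan can be inspected in a real system. As one illustration, SQLite (through Python's built-in sqlite3 module) reports the plan it has chosen for a statement when the statement is prefixed with `EXPLAIN QUERY PLAN`. The `employee` table below is a hypothetical example, and the exact wording of the plan differs between SQLite versions, so no particular output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (eno INTEGER PRIMARY KEY, ename TEXT, address TEXT)"
)

# Ask the query processor how it intends to evaluate the DML statement.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT ename FROM employee WHERE address = 'Pokhara'"
).fetchall()

for step in plan:
    # The last column of each row is a human-readable description of the step.
    print(step[-1])
```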
The attributes used in the ER model can be characterized into the
following types.
Simple and Composite attributes;
An attribute that is divisible into sub-parts (such as Name, which
can be divided into First, Middle and Last Name) is termed a
composite attribute. A composite attribute is a concatenation of
simple attributes. Most attributes are simple, or atomic.
Single valued & Multi valued attributes;
An attribute that has only one value is termed a
single valued attribute. An attribute that has a set of values for
a specific entity is termed a multi valued attribute. Age is a
single valued attribute, whereas phone number is a multi valued
attribute, as one person can have more than one phone number.
Many to one;
An entity in A is associated with at most one entity in B. An
entity in B, however, can be associated with any number of
entities in A.
Many to many;
An entity in A is associated with any number of entities in B,
and an entity in B is associated with any number of entities in A.
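These definitions can be checked mechanically on a list of associated pairs. The sketch below is an illustration in Python; the function name `mapping_cardinality` is hypothetical, and the one-to-one and one-to-many cases follow the usual textbook definitions (at most one partner on both sides, or at most one partner on the B side only). Note that it classifies one particular set of pairs, whereas in a design the cardinality is a rule that every possible set of pairs must obey.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship given as (entity_in_A, entity_in_B) pairs."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does every entity in A have at most one partner in B (and vice versa)?
    a_has_one = all(len(partners) <= 1 for partners in a_to_b.values())
    b_has_one = all(len(partners) <= 1 for partners in b_to_a.values())
    if a_has_one and b_has_one:
        return "one-to-one"
    if a_has_one:
        return "many-to-one"
    if b_has_one:
        return "one-to-many"
    return "many-to-many"


# Employee number -> department number: many employees, one department each.
works_for = [(101, 10), (109, 10), (107, 20)]
# Employee number -> project number: both sides may repeat.
works_in = [(107, "V208"), (107, "M110"), (102, "M110")]

print(mapping_cardinality(works_for))
print(mapping_cardinality(works_in))
```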
# The Entity Relationship Diagram Notations;
To construct an ER diagram different types of notations are used.
These are;
Cardinality Constraints;
One-to-One;
One-to-Many;
Many-to-One ;
Many to Many;
Exercises;
Unit 3; Filing and File Structure;
The physical or internal level of organization of a database system is
concerned with the efficient storage of information in the secondary
storage devices. At this level we are no longer concerned with the
application programmer’s views of the database. The physical to
conceptual level mapping must provide the necessary shield to the user.
The basic problem in physical database representation is to select a
suitable file system to store the desired information. A file consists of
records, and a record may consist of several fields.
The typical operations that may be performed on the information
stored in the file are as follows.
Retrieve: Finds a record or set of records having a particular
value in a particular field, or where the field values satisfy certain
conditions.
Insert: Inserts a record or set of records at some specific location.
Update: Modifies the field values of a record or set of records.
Delete: Deletes a particular record or set of records.
# Overview on Physical Storage Media;
Various types of physical storage devices are used in a database.
Data storage requires devices of very large capacity.
These devices vary in speed of access, cost, reliability etc. Some
common physical storage devices are;
Cache memory:
Cache is the fastest and most expensive type of memory used in a
computer.
It bridges the speed gap between the processor and main memory.
While processing, the cache acts as a high speed buffer between
the CPU and main memory.
It temporarily holds the most active and immediately needed data and
instructions.
While processing, the CPU searches for data first in the
cache and then in main memory.
Main Memory;
The storage area where data and instructions are stored while
working.
The CPU accesses the data and instructions stored in it.
It is a semiconductor memory and volatile in nature (it holds
its contents only while electricity is present).
It is generally small in size and expensive too.
RAM and ROM are used as main memory in a computer.
Flash Memory;
A semiconductor type of memory, which is also called EEPROM.
The data stored in flash memory survive a power failure.
Accessing data in flash memory is a little complicated.
(It takes about 10 nanoseconds to read data and 4-10 microseconds to
write.)
To overwrite data, it erases the old data and then writes the new data.
It became very popular due to its small size and accessibility.
Magnetic storage memory;
It stores data in the form of magnetic fields on a magnetically coated
disk platter.
It is termed direct access storage, as the data contents can be
read and written in any order.
It is non-volatile in nature. (Data survive even a power failure.)
The storage capacity is very large. (It depends upon the number of
elements used to store the data contents.)
Mostly used as a mass storage device for server side storage.
Optical memory;
Optical disks are read and written using optical or laser rays.
CD-ROM, DVD and Blu-ray Disc etc. are common optical storage
devices.
Many optical media are WORM (Write Once, Read Many) in nature.
Magnetic tapes;
A sequential access device where data are stored on a tape
coated with magnetic oxide.
A cheaper and much slower access device, used for backup
provision.
Used as protection against disk failure.
The hierarchies of the storage section are mentioned below;
Random Access;
Data are stored in or accessed directly from the stored
location.
It is also termed the direct access method.
It is one of the fastest methods of data access.
Magnetic disks, core memories and flash memories are common
examples.
Different types of File Allocation Tables are used to allocate
the storage locations in this method as well (similar to the Index
Sequential Method).
# File organizations;
A file is organized logically as a sequence of records. These records
are mapped onto disk blocks. Files are provided as a basic construct in
operating systems (OS), so we shall assume the existence of an
underlying file system. We need to consider ways of representing logical
data models in terms of files. Although blocks are of a fixed size
determined by the physical properties of the disk and the OS, record
sizes vary. In an RDBMS, tuples of distinct relations may be of
different sizes.
One approach to mapping a database to files is to use several files and
to store records of only one fixed length in a given file. An alternative is
to structure files to accommodate variable length records (fixed length
is easier to implement). Many of the techniques used for the former can
be applied to the variable length case. Thus we begin by considering a
file of fixed length records.
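A file of fixed length records is easy to address: record number n starts at byte n multiplied by the record size. The sketch below illustrates this in Python using the standard `struct` and `io` modules. The record layout (a 10-character account number followed by a balance) and the account numbers are hypothetical values chosen for the illustration, and an in-memory buffer stands in for the disk file.

```python
import io
import struct

# Hypothetical layout: 10-byte account number + 8-byte floating point balance.
RECORD = struct.Struct("<10sd")


def write_record(f, slot, account_number, balance):
    """Store a record in the given slot (slot 0 is the first record)."""
    f.seek(slot * RECORD.size)            # every record has the same size
    f.write(RECORD.pack(account_number.encode("ascii"), balance))


def read_record(f, slot):
    """Fetch the record in the given slot without reading the others."""
    f.seek(slot * RECORD.size)
    raw_number, balance = RECORD.unpack(f.read(RECORD.size))
    # Short account numbers are padded with zero bytes; strip the padding.
    return raw_number.rstrip(b"\x00").decode("ascii"), balance


f = io.BytesIO()                          # stands in for a file on disk
write_record(f, 0, "A-102", 400.0)
write_record(f, 1, "A-305", 350.0)
write_record(f, 2, "A-215", 700.0)

print(RECORD.size)
print(read_record(f, 1))
```

Because the position of every record can be computed, no searching is needed to reach a given slot. This is what makes the fixed length case simpler than the variable length case.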
When a record is deleted, we could move the record that comes after
it into the space formerly occupied by the deleted record, and so on until
every record following the deleted record has been moved ahead. Such
an approach requires moving a large number of records. It might be
simpler to move the final record of the file into the space occupied by the
deleted record. It is undesirable to move records to occupy the space
freed by a deleted record, since doing so requires additional block
accesses. Since insertions tend to be more frequent than deletions, it is
acceptable to leave open the space occupied by the deleted record and to
wait for a subsequent insertion before reusing the space. A simple
marker on a deleted record is not sufficient, since it is hard to find this
    account-info : array [1 .. ∞] of
        record
            account-number : char(10);
            balance : real;
        end
end
We define account-info as an array with an arbitrary number of
elements. That is, the type definition does not limit the number of
elements in the array, although any actual record will have a specific
number of elements in its array. There is no limit on how large a record
can be (up to the size of the disk storage).
# Organization of Records in a File;
So far, we have studied how records are represented in a file
structure. An instance of a relation is a set of records. Given a set of
records, the next question is how to organize them in a file. Several of
the possible ways of organizing records in files are:
Heap File Organization:
Any record can be placed anywhere in the file where there is
space for the record. There is no ordering of records. Typically,
there is a single file for each relation.
Sequential File Organization:
Records are stored in sequential order, according to the value of a
“search key” of each record.
Hashing File Organization:
A hash function is computed on some attribute of each record.
The result of the hash function specifies in which block of the file
the record should be placed.
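The hashing idea can be sketched as follows. This is only an illustration in Python: the number of blocks, the use of `zlib.crc32` as the hash function and the helper names are assumptions made for the sketch, not a prescribed method. Any function that always maps the same key to the same block number would serve.

```python
import zlib

NUM_BLOCKS = 4                                  # assumed size of the file
blocks = [[] for _ in range(NUM_BLOCKS)]        # each block holds records


def block_for(search_key):
    """Hash function: maps a key to a block number 0 .. NUM_BLOCKS - 1."""
    return zlib.crc32(search_key.encode("utf-8")) % NUM_BLOCKS


def insert(record, key_field):
    """Place the record in the block chosen by hashing its key."""
    blocks[block_for(record[key_field])].append(record)


def lookup(search_key, key_field):
    """Only one block has to be examined to find the record."""
    for record in blocks[block_for(search_key)]:
        if record[key_field] == search_key:
            return record
    return None


insert({"Roll_No": "BBA-00156-009", "Std_Name": "Rabindra"}, "Roll_No")
insert({"Roll_No": "BSc-00176-010", "Std_Name": "Laxmi"}, "Roll_No")

print(lookup("BSc-00176-010", "Roll_No"))
```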
Generally, a separate file is used to store the records of each
relation. However, in a clustering file organization, records of several
different relations are stored in the same file; further, related records of
the different relations are stored on the same block, so that one I/O
operation fetches related records from all the relations.
There are basically two methods of organizing records in a file.
# Sequential File Organization:
A sequential file is designed for efficient processing of records in
sorted order based on some search-key. A search key is any attribute or
set of attributes; it need not be the primary key, or even a super-key. To
permit fast retrieval of records in search-key order, we chain together
records by pointers. The pointer in each record points to the next record
in search-key order. Furthermore, to minimize the number of block
accesses in sequential file processing, we store records physically in
search-key order, or as close to search-key order as possible.
We can manage deletion by using pointer chains, as we saw
previously. For insertion, we apply the following rules:
1) Locate the record in the file that comes before the record to be
inserted in search-key order.
2) If there is a free record (that is, space left after a deletion) within the
same block as this record, insert the new record there. Otherwise,
insert the new record in an overflow block. In either case, adjust the
pointers so as to chain together the records in search-key order.
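The two insertion rules can be traced with a small model. The sketch below is a simplified illustration in Python, with assumptions that are not in the text: a block holds only two records, the file is modelled as a list of slots in which `None` marks a free slot, overflow space is simply appended at the end, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

BLOCK_SIZE = 2  # records per block (kept tiny for the illustration)


@dataclass
class Record:
    key: str
    next: Optional[int] = None   # slot number of the next record in key order


class SequentialFile:
    def __init__(self, slots: List[Optional[Record]], head: Optional[int]):
        self.slots = slots            # main blocks; None marks a free slot
        self.main_size = len(slots)   # slots beyond this are overflow space
        self.head = head              # slot of the first record in key order

    def _predecessor(self, key):
        """Rule 1: follow the chain to the record that comes before `key`."""
        prev, cur = None, self.head
        while cur is not None and self.slots[cur].key < key:
            prev, cur = cur, self.slots[cur].next
        return prev

    def _free_slot_in_block(self, slot):
        """Look for a free slot in the same main block as `slot`."""
        start = (slot // BLOCK_SIZE) * BLOCK_SIZE
        for i in range(start, min(start + BLOCK_SIZE, self.main_size)):
            if self.slots[i] is None:
                return i
        return None

    def insert(self, key):
        prev = self._predecessor(key)
        # Rule 2: reuse free space in the predecessor's block if possible,
        # otherwise put the new record in overflow space.
        target = self._free_slot_in_block(prev) if prev is not None else None
        if target is None:
            self.slots.append(None)
            target = len(self.slots) - 1
        # In either case, adjust the pointers to keep the chain in key order.
        following = self.slots[prev].next if prev is not None else self.head
        self.slots[target] = Record(key, following)
        if prev is None:
            self.head = target
        else:
            self.slots[prev].next = target
        return target

    def keys_in_order(self):
        keys, cur = [], self.head
        while cur is not None:
            keys.append(self.slots[cur].key)
            cur = self.slots[cur].next
        return keys


# Block 0 = slots 0-1 (slot 1 is free after a deletion), block 1 = slots 2-3.
f = SequentialFile([Record("A", 2), None, Record("C", 3), Record("D")], head=0)
slot_b = f.insert("B")   # free slot exists in the predecessor's block
slot_e = f.insert("E")   # block is full, so the record goes to overflow
print(slot_b, slot_e, f.keys_in_order())
```

Reading the file by following the pointers still gives the records in search-key order, even though "E" is physically stored outside the main blocks.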
# Clustering File Organization:
Many relational-database systems store each relation in a separate
file, so that they can take full advantage of the file system that the
operating system provides. Usually, tuples of a relation can be
represented as fixed-length records. Thus, relations can be mapped to a
simple file structure. This simple implementation of a relational
database system is well suited to low-cost database implementations as
in, for example, embedded systems or portable devices.
In such systems, the size of the database is small, so little is gained
from a sophisticated file structure. Furthermore, in such environments, it
is essential that the overall size of the object code for the database
system be small. A simple file structure reduces the amount of code
needed to implement the system. This simple approach to relational-
database implementation becomes less satisfactory as the size of the
database increases. We have seen that there are performance advantages
to be gained from careful assignment of records to blocks, and from
careful organization of the blocks themselves. Clearly, a more
complicated file structure may be beneficial, even if we retain the
strategy of storing each relation in a separate file.
Exercise:
1) What are the various operations performed with the information
stored in the file?
2) Define different types of physical storage media used to store
database. OR
Draw a hierarchical structure of the storage media and explain each
of them.
3) What are the data accessing methods followed to access the data
elements from the storage sections?
4) What is file organization? Explain its types.
5) How are records organized in a file? Explain.
6) What is sequential file organization? Explain.
7) Explain the clustering file organization.
Database example;
Roll No         Std_Name     D_o_B        Faculty      Semester
BBA-00156-009   Rabindra     12/21/1989   Management   VI Sem
BSc-00176-010   Laxmi        09/11/1990   Science      IV Sem
BA-00454-010    Dil Bhadur   01/09/1991   Humanities   III Sem
:               :            :            :            :
:               :            :            :            :
data in format, logical etc. that we collect in an attribute is a
domain. A domain is the data type of the values collected in an
attribute. Different attributes in an entity may have the same domain.
Null Value;
A blank or empty value in an attribute is termed a null value.
Attributes that must always hold a value (such as key attributes)
cannot be left blank or null.
Tuple:
Each set of related data elements in a relation is known as a
tuple. It is a collection of the data elements of one object or
person. A tuple is also known as a record. The number of records is
called the cardinality of the relation. Individual attributes of
different tuples may have the same value, but two duplicate tuples
are not allowed.
# Database Schema;
A database schema is the logical design of the database that we set
for extracting the data elements from the database.
A database instance is a snapshot of the data in the database at a
given instant in time.
A relation schema is a list of attributes in a specific order
(logical design). It doesn't contain any tuples.
The concept of a relation schema corresponds to the
programming language notion of a type definition. The concept of a
relation instance corresponds to the programming language notion of
the value of a variable. The value of a variable may change with time;
similarly, the contents of a relation instance may change with time as
the relation is updated.
STD_schema = (Roll_No, Std_Name, D_o_B, Faculty, Semester)
This denotes that the facts about students are held in a relation on
STD_schema.
Similarly,
EXAM_schema = (Semester, Roll_No, Subject)
TEACHER_schema = (Teacher_Name, Course, Lecture_hrs)
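The analogy between schema and type definition, and between instance and variable value, can be written out directly. The sketch below is an illustration in Python: the class `Student` plays the role of the relation schema, and the set `student_relation` plays the role of the relation instance. Both names are hypothetical, and the rows are taken from the example table of students. A Python set is used because, like a relation, it cannot hold two identical tuples.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """Relation schema: an ordered list of attributes, with no tuples in it."""
    roll_no: str
    std_name: str
    d_o_b: str
    faculty: str
    semester: str


# Relation instance: the tuples present at one moment in time.
student_relation = {
    Student("BBA-00156-009", "Rabindra", "12/21/1989", "Management", "VI Sem"),
    Student("BSc-00176-010", "Laxmi", "09/11/1990", "Science", "IV Sem"),
    Student("BA-00454-010", "Dil Bhadur", "01/09/1991", "Humanities", "III Sem"),
}

cardinality_before = len(student_relation)

# Adding an exact duplicate changes nothing: duplicate tuples are not allowed.
student_relation.add(
    Student("BSc-00176-010", "Laxmi", "09/11/1990", "Science", "IV Sem")
)
cardinality_after = len(student_relation)

print(Student._fields)
print(cardinality_before, cardinality_after)
```

The schema (`Student._fields`) stays fixed, while the instance can change each time the relation is updated.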
Constraints;
A constraint is a rule that restricts the values that may be present
in the database. Several constraints are used to validate the data
in a relational database. They are;
o Entity Integrity Constraints;
o Referential Integrity Constraints;
Referential Integrity;
Referential integrity is defined through a foreign key field in a
database. A foreign key is a set of attributes whose values are linked
to the values of the primary key of a relation. It provides the
requested data elements from the relation whose primary key
matches the same value.
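A DBMS can enforce this rule itself. The sketch below is an illustration using SQLite through Python's built-in sqlite3 module. The `student` and `exam` tables are simplified, hypothetical versions of the student and exam schemas. SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued, and the helper `try_insert_exam` is a name made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked

conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY, std_name TEXT)")
conn.execute(
    "CREATE TABLE exam ("
    " semester TEXT,"
    " roll_no TEXT REFERENCES student (roll_no),"   # the foreign key
    " subject TEXT)"
)
conn.execute("INSERT INTO student VALUES ('BBA-00156-009', 'Rabindra')")


def try_insert_exam(roll_no):
    """Return True if the exam row is accepted, False if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO exam VALUES ('VI Sem', ?, 'DBMS')", (roll_no,)
        )
        return True
    except sqlite3.IntegrityError:
        # The roll number does not match any primary key value in student.
        return False


accepted = try_insert_exam("BBA-00156-009")   # student exists
rejected = try_insert_exam("XX-00000-000")    # no such student
print(accepted, rejected)
```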
Key Fields;
We must have a way to specify how the entities within an entity set are
distinguished. Two entities in an entity set are not allowed to have
exactly the same values for all the attributes. A key allows us to identify
a set of attributes that suffices to distinguish entities from each
other.
o Super Key is a set of one or more attributes that, taken
collectively, allow us to identify uniquely an entity in the entity
set. Soc_Sec_No, Symbol_No, Cust_No etc. are some key fields
used as super keys.
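Whether a set of attributes distinguishes the entities in a given set of data can be tested as below. The sketch is an illustration in Python: the function name `is_superkey` and the three sample rows are assumptions made for the example. Strictly, a key is a property of the schema (it must hold for every possible set of data), so a test on one sample can only show that a set of attributes is not a key.

```python
def is_superkey(rows, attributes):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False          # two entities cannot be told apart
        seen.add(value)
    return True


employees = [
    {"ENO": 101, "ENAME": "Sagar", "ADDRESS": "Bhaktapur"},
    {"ENO": 102, "ENAME": "Namrata", "ADDRESS": "Bhaktapur"},
    {"ENO": 109, "ENAME": "Supraj", "ADDRESS": "Pokhara"},
]

print(is_superkey(employees, ["ENO"]))             # ENO alone is enough
print(is_superkey(employees, ["ADDRESS"]))         # two share Bhaktapur
print(is_superkey(employees, ["ENO", "ADDRESS"]))  # a superset still works
```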
powerful concept - the creation of new relations from old ones - makes
possible an infinite variety of data manipulation.
The select, project and rename operations are called unary operations,
because they operate on one relation. The other three operations operate
on pairs of relations and are therefore binary operations.
The relation EMPLOYEE;
ENO  ENAME     ADDRESS     SALARY  JOB_STATUS       AGE  SpvNo  DEPTNO
101  Sagar     Bhaktapur   3500    Research fellow  28   102    10
107  Rashmi    Kathmandu   4200    Office Asst      24   102    20
102  Namrata   Bhaktapur   5000    Secretary        23   112    30
112  Sachin    Kathmandu   9000    Administrator    25   107    20
109  Supraj    Pokhara     3700    Office Asst      30   112    10
115  K. Singh  Lalitpur    8500    Professor        45   113    20
111  Sarad     Pokhara     12000   Director         35   115    40
113  Bidur     Chitwan     4500    Research fellow  28   115    40
116  Naren     Biratnagar  9000    Professor        49   107    10
103  Pooja     Pokhara     4500    Office Asst      24   107    Null
The relation PROJECT;
PNO   PNAME    S_DATE       LOCATION   PMGR  DEPTNO
M110  MARS     03-DEC-2007  Kathmandu  102   10
J220  JUPITOR  03-JAN-2008  Pokhara    107   30
V208  VENUS    15-OCT-2007  Hetauda    112   40
E005  EARTH    01-FEB-2008  Butwal     252   50
The relation WORK_IN;
ENO  PNO   P_JOB        HOURS
101  M110  Coordinator  3
107  V208  Engineer     4
102  M110  Typist       5
112  E005  Accountant   9
107  M110  Scientist    3
112  V208  Coordinator  8
111  M110  Engineer     2
107  E005  Scientist    4
102  V208  Scientist    9
107  J220  Scientist    2
All the tuples that matched the address Pokhara will be selected.
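The same selection can be imitated in a few lines of Python. This is only an illustrative sketch: the EMPLOYEE relation is represented as a list of dictionaries with three of its attributes, and the function name `select` is chosen for the example.

```python
EMPLOYEE = [
    {"ENO": 101, "ENAME": "Sagar", "ADDRESS": "Bhaktapur"},
    {"ENO": 107, "ENAME": "Rashmi", "ADDRESS": "Kathmandu"},
    {"ENO": 102, "ENAME": "Namrata", "ADDRESS": "Bhaktapur"},
    {"ENO": 112, "ENAME": "Sachin", "ADDRESS": "Kathmandu"},
    {"ENO": 109, "ENAME": "Supraj", "ADDRESS": "Pokhara"},
    {"ENO": 115, "ENAME": "K. Singh", "ADDRESS": "Lalitpur"},
    {"ENO": 111, "ENAME": "Sarad", "ADDRESS": "Pokhara"},
    {"ENO": 113, "ENAME": "Bidur", "ADDRESS": "Chitwan"},
    {"ENO": 116, "ENAME": "Naren", "ADDRESS": "Biratnagar"},
    {"ENO": 103, "ENAME": "Pooja", "ADDRESS": "Pokhara"},
]


def select(relation, predicate):
    """Return the tuples of the relation for which the predicate is true."""
    return [row for row in relation if predicate(row)]


pokhara = select(EMPLOYEE, lambda row: row["ADDRESS"] == "Pokhara")
print([row["ENAME"] for row in pokhara])
```

The result keeps every attribute of the matching tuples. Selection reduces the number of rows, not the number of columns.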
* Project Operation ( π );
Project is the operation of selecting certain attributes from a relation
to form a new relation. The symbol pi (π) is used to denote the project
operation. The desired columns are simply specified by name during the
projection. With the help of this operation, any number of columns can
be omitted from a table, or the columns of a table can be rearranged.
General
Syntax of projection is;
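In the usual relational algebra notation, a projection is written as π followed by the list of wanted attributes and then the relation in brackets, for example π ENAME, ADDRESS (EMPLOYEE). The sketch below imitates the operation in Python as an illustration only. The function name `project` and the four sample rows are assumptions made for the example, and duplicate result rows are dropped because a relation cannot contain duplicate tuples.

```python
def project(relation, attributes):
    """Keep only the named attributes, in the order given, without duplicates."""
    projected = (tuple(row[a] for a in attributes) for row in relation)
    # dict.fromkeys removes duplicates while keeping the first-seen order.
    return list(dict.fromkeys(projected))


employees = [
    {"ENO": 101, "ENAME": "Sagar", "ADDRESS": "Bhaktapur", "DEPTNO": 10},
    {"ENO": 107, "ENAME": "Rashmi", "ADDRESS": "Kathmandu", "DEPTNO": 20},
    {"ENO": 102, "ENAME": "Namrata", "ADDRESS": "Bhaktapur", "DEPTNO": 30},
    {"ENO": 109, "ENAME": "Supraj", "ADDRESS": "Pokhara", "DEPTNO": 10},
]

# Omitting columns: only ADDRESS is kept, and repeated values appear once.
print(project(employees, ["ADDRESS"]))
# Rearranging columns: DEPTNO is listed before ENAME.
print(project(employees, ["DEPTNO", "ENAME"]))
```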
* Intersection Operation ( ∩ );
Intersection is the operation of selecting the tuples that are
common to both of two relations. The symbol (∩) is used to denote
the intersection operation. The relational intersection
operator also requires the two relations to be union compatible.
those elements that are not common in compared relation. The relational
difference operator also requires the two relations to be union compatible.
* Cartesian Product ( X );
Cartesian product is the operation that combines every tuple of one
relation with every tuple of another relation, generating all possible
combinations. The symbol ( X ) is used to denote the Cartesian product
operation. The number of resulting tuples is the product of the numbers
of tuples in the two relations. Unlike union, intersection and
difference, the Cartesian product does not require the two relations to
be union compatible.
The Cartesian product of relations Employee ( E ) and Project ( P ),
denoted E X P, is the set of all combinations of tuples of the two
operand relations. Each resultant tuple consists of all the attributes
of E and P.
The Cartesian product of EMPLOYEE and PROJECT;
ENO  ENAME     ADDRESS    ……  DEPTNO  PNO   ……  LOCATION
101  Sagar     Bhaktapur      10      M110      Kathmandu
101  Sagar     Bhaktapur      10      J220      Pokhara
101  Sagar     Bhaktapur      10      V208      Hetauda
101  Sagar     Bhaktapur      10      E005      Butwal
107  Rashmi    Kathmandu      20      M110      Kathmandu
107  Rashmi    Kathmandu      20      J220      Pokhara
107  Rashmi    Kathmandu      20      V208      Hetauda
107  Rashmi    Kathmandu      20      E005      Butwal
102  Namrata   Bhaktapur      30      M110      Kathmandu
102  Namrata   Bhaktapur      30      J220      Pokhara
102  Namrata   Bhaktapur      30      V208      Hetauda
102  Namrata   Bhaktapur      30      E005      Butwal
112  Sachin    Kathmandu      20      M110      Kathmandu
112  Sachin    Kathmandu      20      J220      Pokhara
112  Sachin    Kathmandu      20      V208      Hetauda
112  Sachin    Kathmandu      20      E005      Butwal
109  Supraj    Pokhara        10      M110      Kathmandu
109  Supraj    Pokhara        10      J220      Pokhara
109  Supraj    Pokhara        10      V208      Hetauda
109  Supraj    Pokhara        10      E005      Butwal
115  K. Singh  Lalitpur       20      M110      Kathmandu
115  K. Singh  Lalitpur       20      J220      Pokhara
115  K. Singh  Lalitpur       20      V208      Hetauda
115  K. Singh  Lalitpur       20      E005      Butwal
111  Sarad     Pokhara        40      M110      Kathmandu
111  Sarad     Pokhara        40      J220      Pokhara
111  Sarad     Pokhara        40      V208      Hetauda
111  Sarad     Pokhara        40      E005      Butwal
113  Bidur     Chitwan        40      M110      Kathmandu
113  Bidur     Chitwan        40      J220      Pokhara
113  Bidur     Chitwan        40      V208      Hetauda
……
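The pairing of every employee tuple with every project tuple can be
sketched in Python with itertools.product. The relations below are
shortened to two attributes each, which is an assumption made to keep
the example small.

```python
from itertools import product

E = [(101, "Sagar"), (107, "Rashmi"), (102, "Namrata")]   # (ENO, ENAME)
P = [("M110", "Kathmandu"), ("J220", "Pokhara")]          # (PNO, LOCATION)

# Every tuple of E is paired with every tuple of P.
E_x_P = [e + p for e, p in product(E, P)]

for row in E_x_P:
    print(row)
```

With 3 tuples in E and 2 tuples in P the product holds 3 * 2 = 6 tuples,
and the employee number stays the same across the rows generated for one
employee.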
* Join Operation ( ⋈ );
Join is a binary operation that allows the user to combine two
relations in a specific way. The join of two relations R and N is a
restriction of their Cartesian product such that a specific condition is
met. Join is normally defined on attributes from the same domain.
Attribute a of relation R (R.a) and attribute b of relation N (N.b) are
joined with a specific condition (R.a θ N.b), where θ is a comparison
operator. The specified condition is called the join predicate.
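A join can be sketched in Python as a Cartesian product followed by a
filter. The helper name theta_join is hypothetical, and the PROJECT and
EMPLOYEE relations are cut down to the attributes needed for the join
predicate PMGR = ENO.

```python
PROJECT = [("M110", "MARS", 102), ("J220", "JUPITOR", 107),
           ("V208", "VENUS", 112), ("E005", "EARTH", 252)]   # (PNO, PNAME, PMGR)
EMPLOYEE = [(101, "Sagar"), (102, "Namrata"),
            (107, "Rashmi"), (112, "Sachin")]                # (ENO, ENAME)


def theta_join(r, n, predicate):
    """Join: the tuples of the Cartesian product r X n that satisfy the predicate."""
    return [a + b for a in r for b in n if predicate(a, b)]


# Join predicate: PROJECT.PMGR = EMPLOYEE.ENO
managers = theta_join(PROJECT, EMPLOYEE, lambda p, e: p[2] == e[0])
```

Project E005 has manager 252, which matches no employee number, so it
does not appear in the result of this join.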
In the left outer join ( ⟕ ), the result contains all the tuples of the
left operand, and in the right outer join ( ⟖ ), the result contains all
the tuples of the right operand. Similarly, in the full outer join ( ⟗ ),
the resulting relation contains all the tuples from both operands.
* Rename operation ( ρ );
While working with multiple relations, two relations may have
attributes with the same name. Such relations are difficult to use
together, so we can change the names of those attributes so that the two
relations have disjoint sets of attributes. The rename operator is
denoted by the Greek letter rho ( ρ ). The expression is as follows;
ρ X (E)
The above expression returns the result of expression E under the new
name X;
ρ ( R (a1, a2, …, an), T )
This expression renames relation T to R and names its attributes
a1, a2, …, an in the new schema.
* Assignment Operation ( ← );
The assignment operator is used to assign the result of a relational
algebra expression to a temporary variable. It is denoted by a left
arrow (←), which works like assignment in a programming language.
Temp1 ← (relational algebra)
The evaluation of an assignment does not result in any relation
being displayed to the user. The result of the expression at the right
of the arrow is assigned to the relation variable at the left of the
arrow. The variable may then be used in subsequent expressions.
* Division Operation ( ÷ );
A division operation consists of a dividend and a divisor, in which the
dividend contains the divisor several times. In relational algebra, the
quotient is the answer of the division and we do not bother about the
remainders. Let us see the expression R ÷ N;
       R                 N               R ÷ N
R.a  R.b  R.c     N.a  N.b  N.c     _.a  _.b  _.c
1    a    x       1    a    x       1    a    x
1    b    y       2    a    x       1    b    y
2    a    x       3    a    x       2    a    x
2    b    y       1    b    x       2    b    y
3    c    x       2    b    x
3    d    y       3    b    x
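Division is often used for "for all" queries. The sketch below follows
the usual textbook definition (the values of the dividend that are
paired with every tuple of the divisor) and uses the WORK_IN and PROJECT
data given earlier rather than the R and N table above. The helper name
divide is hypothetical.

```python
WORK_IN = {(101, "M110"), (107, "V208"), (102, "M110"), (112, "E005"),
           (107, "M110"), (112, "V208"), (111, "M110"), (107, "E005"),
           (102, "V208"), (107, "J220")}          # (ENO, PNO)
PROJECT_PNO = {"M110", "J220", "V208", "E005"}    # divisor: projection of PNO from PROJECT


def divide(dividend, divisor):
    """Return the ENO values that are paired with every PNO in the divisor."""
    candidates = {eno for eno, _ in dividend}
    return {eno for eno in candidates
            if all((eno, pno) in dividend for pno in divisor)}


# Employees who work in all the projects
on_all_projects = divide(WORK_IN, PROJECT_PNO)
```

Only employee 107 is recorded against all four projects, so the quotient
contains that single employee number.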
# Kinds of relation;
The relational algebra operations have been extended in several ways.
A simple extension is to allow arithmetic operations as part of
projection. An important extension is to allow aggregate operations such
as computing the sum of the elements of a set, or their average. Another
important extension is the outer join operation, which allows
relational algebra expressions to deal with null values, which model
missing information.
* Generalized Projection;
The generalized projection operation extends the projection
operation by allowing arithmetic functions to be used in the projection
list. The generalized projection operation has the form:
Π F1, F2, … …, Fn (E)
Where E is any relational algebra expression, and each of F1, F2,
… …, Fn is an arithmetic expression involving constants and attributes
in the schema of E. As a special case, the arithmetic expression may be
simply an attribute or a constant.
Π ename, salary + 500 (Employee)
The attribute resulting from the expression salary + 500 does not
have a name. We can apply the rename operation to the result of
generalized projection in order to give it a name. As a notational
convenience, renaming of attributes can be combined with generalized
projection.
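The expression Π ename, salary + 500 (Employee), combined with a rename
of the computed column, can be sketched in Python. The column name
new_salary is an assumption introduced here for illustration.

```python
EMPLOYEE = [
    {"ename": "Sagar", "salary": 3500},
    {"ename": "Rashmi", "salary": 4200},
]

# Generalized projection with the arithmetic result renamed to new_salary
raised = [{"ename": t["ename"], "new_salary": t["salary"] + 500}
          for t in EMPLOYEE]
print(raised)
```

The stored relation is not changed; the projection only produces a new
relation with the computed values.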
* Aggregate Function;
The aggregate functions take a collection of values and return a
single value as a result. For example, the aggregate function sum takes
a collection of values and returns the sum of the values. Some of the
commonly used aggregate functions are SUM, AVG, COUNT, MAX and MIN.
The keyword DISTINCT is not an aggregate function itself, but it can be
combined with them to remove duplicate values before aggregating.
The general form of the aggregate is as follows;
SUM (Salary (PAYROLL))
AVG (Salary (PAYROLL))
COUNT (NAME (EMPLOYEE))
MIN (Salary (PAYROLL))
MAX (Salary (PAYROLL))
The general form of the aggregate operation (𝒢) is as follows;
G1, G2, … …, Gn 𝒢 F1 (A1), F2 (A2), … …, Fm (Am) (E)
Where E is any relation-algebra expression, G1, G2, … …Gn
constitute a list of attributes on which to group, each Fi is an aggregate
function, each Ai is an attribute name.
The tuples in the result of expression E are partitioned into groups
in such a way that:
1) All tuples in a group have the same values for G1, G2, … … Gn.
2) Tuples in different groups have different values for G1, G2, …
… Gn.
As in generalized projection, the result of an aggregation operation
does not have a name. We can apply the rename operation to the result in
order to give it a name.
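Grouping and aggregation can be sketched in Python as follows. The
helper name group_sum is hypothetical, and the tuples are a subset of
the EMPLOYEE relation reduced to (ENO, DEPTNO, SALARY).

```python
from collections import defaultdict

EMPLOYEE = [(101, 10, 3500), (107, 20, 4200), (102, 30, 5000),
            (112, 20, 9000), (109, 10, 3700), (115, 20, 8500)]   # (ENO, DEPTNO, SALARY)


def group_sum(relation, group_index, value_index):
    """Partition the tuples by the grouping attribute and sum the value per group."""
    totals = defaultdict(int)
    for t in relation:
        totals[t[group_index]] += t[value_index]
    return dict(totals)


# Roughly: DEPTNO 𝒢 SUM (SALARY) (EMPLOYEE)
salary_by_dept = group_sum(EMPLOYEE, 1, 2)
print(salary_by_dept)
```

All tuples with the same DEPTNO fall into one group, and each group
yields exactly one result value.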
* Outer Join;
The outer join operation is an extension of the join operation to deal
with missing information. In both the theta join and the natural join,
the tuples that have no matching values in the other relation are not
displayed, which may result in loss of data. To retain all the
information of both relations, it is desirable to have a join which
keeps the tuples having no corresponding values in the other relation
and pads them with null values. This is the external join or outer
join. The outer join is extended into the following;
In the left outer join ( ⟕ ), the result contains all the tuples of the
left operand, and in the right outer join ( ⟖ ), the result contains all
the tuples of the right operand. Similarly, in the full outer join ( ⟗ ),
the resulting relation contains all the tuples from both operands.
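A left outer join can be sketched in Python as shown here. The helper
name left_outer_join is hypothetical, None stands in for the null value,
and the relations are reduced versions of PROJECT and EMPLOYEE.

```python
PROJECT = [("M110", 102), ("J220", 107), ("E005", 252)]   # (PNO, PMGR)
EMPLOYEE = [(102, "Namrata"), (107, "Rashmi")]            # (ENO, ENAME)


def left_outer_join(left, right, predicate, right_width):
    """Keep every left tuple; pad with None when no right tuple matches."""
    result = []
    for left_row in left:
        matches = [left_row + r for r in right if predicate(left_row, r)]
        if matches:
            result.extend(matches)
        else:
            result.append(left_row + (None,) * right_width)
    return result


rows = left_outer_join(PROJECT, EMPLOYEE, lambda p, e: p[1] == e[0], 2)
```

Project E005 is kept even though manager 252 is not an employee; its
employee attributes are filled with null values. A right outer join is
the same operation with the operands swapped.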
* Null Value;
Operations and comparisons on null values should be avoided where
possible. A null value means that the value is unknown or nonexistent.
Any arithmetic operation involving a null value returns null as its
result. Any comparison with a null value results in a special value
called unknown.
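These rules can be observed in a real DBMS. The sketch below assumes
SQLite through Python's standard sqlite3 module, where both null and
unknown come back to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic with NULL gives NULL. Comparing with NULL gives unknown,
# which SQLite also reports as NULL. Only IS NULL gives a definite answer.
arith, compare, is_null = conn.execute(
    "SELECT NULL + 500, NULL = NULL, NULL IS NULL"
).fetchone()
conn.close()

print(arith, compare, is_null)
```

Because NULL = NULL is unknown rather than true, a condition such as
WHERE DEPTNO = NULL selects nothing; IS NULL has to be used instead.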
# Modification on database;
We express database modifications by using the assignment
operation. We make assignments to the actual database relations by
using the same notation as in assignment.
* Deletion;
We can delete only whole tuples; we cannot delete the values of only
particular attributes. In relational algebra a deletion is expressed by
R ← R − E
In the above expression R is a relation and E is a relational algebra
query. The tuples returned by the query E are removed from relation R.
* Insertion;
To insert data into a relation, we either specify a tuple to be inserted
or write a query whose result (a set of tuples) will be inserted. The
attribute values of the inserted tuples must belong to the domains of
the corresponding attributes. In relational algebra an insertion
operation is expressed by;
R ← R ∪ E
In the above expression R is a relation and E is a relational algebra
query. The tuples resulting from the query E are inserted into relation
R. The relation and the query result must have the same number of
attributes, drawn from the same domains.
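The deletion and insertion assignments map directly onto set difference
and set union. The sketch below models relation R as a Python set of
(ENO, ENAME) tuples, which is an assumption for illustration.

```python
R = {(101, "Sagar"), (107, "Rashmi"), (102, "Namrata")}   # relation R

# Deletion: R ← R − E, where E selects the tuples to remove
E_delete = {t for t in R if t[0] == 107}
R = R - E_delete

# Insertion: R ← R ∪ E, where E holds the tuples to add
E_insert = {(112, "Sachin")}
R = R | E_insert

print(sorted(R))
```

In both cases a whole new relation is assigned back to R; individual
attribute values are never removed on their own.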
* Updating;
In a relation, when we have to change the value of particular
attributes without changing the whole tuple, we update the tuple. We can
use the generalized projection operator to do this task. Various
predicates (conditions) can be applied while updating the values of the
tuples.
Temp1 ← Π ename, eadd, salary + salary * 0.10, Job_Status
(σ dept = "Account" (employee))
In the above example, the salary of all the employees whose dept is
Account is increased by 10%.
Temp2 ← Π ename, eadd, salary + salary * 0.10, Job_Status
(σ dept = "Account" (employee)) ∪
Π ename, eadd, salary + salary * 0.05, Job_Status
(σ Age >= 45 (employee))
In the above example, the salary is increased by 10% for the employees
whose dept is Account and by 5% for those whose age is 45 or above.
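An update of this kind can be sketched in Python as follows. The helper
name raise_salary is hypothetical, and the sketch goes one step beyond
the expressions above by also keeping the tuples that do not satisfy the
predicate, so that the result can replace the whole relation.

```python
employee = [
    {"ename": "Sagar", "dept": "Account", "salary": 3500},
    {"ename": "Rashmi", "dept": "Admin", "salary": 4200},
]


def raise_salary(relation, predicate, rate):
    """Return a new relation where matching tuples get salary + salary * rate."""
    return [
        {**t, "salary": round(t["salary"] + t["salary"] * rate, 2)}
        if predicate(t) else t
        for t in relation
    ]


# 10% raise for the Account department; other tuples are carried over as they are
employee = raise_salary(employee, lambda t: t["dept"] == "Account", 0.10)
```

Only the salary attribute of the matching tuples changes; all other
attributes and tuples are copied unchanged.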
# Views;
Exercise
1) What is a relation? Explain the components of a relation.
2) What is a database schema? Differentiate between relation schema
and relation instance.
3) What is a constraint? Differentiate between Entity Integrity
Constraints and Referential Integrity Constraints.
4) What is a key field? Explain the various types of keys used in a
relation.
5) What is relational algebra? List out the operations used in
relational algebra.
6) Give an expression in the relational algebra for the following
queries.
a. Select all names from the relation Employee.
b. Select Name, Address, Phone, Dept from the relation Employee.
c. Select Name, Address, Phone, Dept of the employees who work in the
Account department.
d. Select Name, Address, Phone, Dept of the employees who work in the
Account department and earn more than 25000.
# Introduction;
As security is a major feature of a database, we must make sure
that the changes made by users are authorized and do not introduce
errors, even accidentally. Integrity constraints are very useful for
this purpose.
Integrity constraints provide a means of ensuring that changes
made to the database by authorized users do not result in a loss of data
consistency. Thus integrity constraints guard against accidental damage
to the database.
In general, an integrity constraint can be an arbitrary predicate
pertaining to the database. However, arbitrary predicates may be costly
to test. Thus, we usually limit ourselves to integrity constraints that
can be tested with minimal overhead.
* Domain Constraints;
A domain constraint specifies that the value of each attribute
must be an atomic value from the domain of that attribute. It keeps
the data collected in an attribute consistent and ensures that the
comparisons made in queries make sense. Data is said to have domain
integrity when the value of a column is drawn from the domain of
that column.
Domain constraints are the most elementary form of integrity
constraint. They are tested easily by the system whenever a new data
item is entered into the database. It is possible to have several
attributes with the same domain. For e.g., the attributes Emp_Name,
Std_Name and Cust_Name might have the same domain.
constraint Salary-null-test check (value not null)
* Referential Integrity;
In a relational model, data elements are stored in different
relations and they are linked together when the data is extracted. For
e.g., in a PERSONAL relation all the details of the students are stored,
and in a PARENTS relation the details of the parents are stored. If we
have to find the details of the parents, we link these two relations
through a unique and common attribute (PAR_ID) among them. The attribute
PAR_ID is set as a primary key in the PARENTS table whereas it is
Check Constraints;
A check constraint is used to verify that the value entered
satisfies a given condition. The value entered in an attribute must
lie within the specified range or satisfy the specified condition.
The syntax of a check constraint is
[CONSTRAINT <name>] CHECK (<condition>)
For e.g:
CREATE TABLE employee
(………,
E_ID Integer CONSTRAINT chk_empid
CHECK (E_ID IS NOT NULL AND E_ID < 10000),
Dept Char(3) CONSTRAINT chk_dept
CHECK (Dept IN ('ACC', 'ADM', 'MAN', 'DIR',
'TEC', 'HLP')),
… … … );
In the above example, the attribute E_ID must not be null and its
value must be less than 10000. The attribute Dept can take only the
values ACC (Accountant), ADM (Administration), MAN (Manager),
DIR (Director), TEC (Technical) and HLP (Helper). Besides these
values, other values are not allowed.
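The effect of such check constraints can be tried out with SQLite
through Python's standard sqlite3 module. This is a sketch under the
assumption that only the two constrained columns exist; other DBMSs
report the violation with their own error messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        E_ID INTEGER CONSTRAINT chk_empid
             CHECK (E_ID IS NOT NULL AND E_ID < 10000),
        Dept CHAR(3) CONSTRAINT chk_dept
             CHECK (Dept IN ('ACC', 'ADM', 'MAN', 'DIR', 'TEC', 'HLP'))
    )
""")
conn.execute("INSERT INTO employee VALUES (101, 'ACC')")   # satisfies both checks

rejected = []
for row in [(20000, "ACC"), (102, "XYZ"), (None, "ADM")]:
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row)   # the DBMS refuses the violating tuple

accepted = conn.execute("SELECT E_ID, Dept FROM employee").fetchall()
conn.close()
```

The three rows that break a condition are refused by the database
itself, so the application does not have to repeat the test.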
Assertions;
An assertion is a predicate expressing a condition that we wish
the database always to satisfy. Domain constraints and referential
integrity constraints are special forms of assertions. However, there
are many constraints that we cannot express by using only these
special forms. Some examples of such constraints are;
* Every loan has at least one customer who maintains an account
with a minimum balance of 1000.
* The sum of all loan amounts for each branch must be less than
the sum of all account balance at the branch.
An assertion in SQL takes the form
CREATE ASSERTION <assertion name> CHECK <predicate>
Triggers;
A trigger is a statement that the system executes automatically as
a side effect of a modification to the database. To design a trigger
mechanism, we must meet two requirements.
* Specify when a trigger is to be executed. This is broken up into
an event that causes the trigger to be checked and a condition that
must be satisfied for the trigger to execute.
* Specify the action to be taken when the trigger executes.
Once we enter a trigger into a database, the database system
takes on the responsibility of executing it whenever the specified
event occurs and the corresponding condition is satisfied.
Triggers are a useful mechanism for alerting humans or for
starting certain tasks automatically when certain conditions are met.
As an example of the use of a trigger, suppose a warehouse wishes to
maintain a minimum inventory of each item; when the inventory
level of an item falls below the minimum level, an order should be
placed automatically. This is how a business rule can be
implemented by a trigger. On an update of the inventory level of an
item, the trigger should compare the level with the minimum inventory
level for the item, and if the level is at or below the minimum, a new
order is added to an orders relation.
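The warehouse rule can be sketched as a trigger in SQLite, run here
through Python's standard sqlite3 module. The table and column names,
the order quantity (twice the minimum level) and the extra condition
that the old level was still above the minimum (so that only one order
is placed) are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (item TEXT PRIMARY KEY, level INTEGER, min_level INTEGER);
    CREATE TABLE orders (item TEXT, quantity INTEGER);

    -- Event: an update of level on inventory
    -- Condition: new level at or below the minimum, old level still above it
    -- Action: add a new order to the orders relation
    CREATE TRIGGER reorder AFTER UPDATE OF level ON inventory
    WHEN NEW.level <= NEW.min_level AND OLD.level > OLD.min_level
    BEGIN
        INSERT INTO orders VALUES (NEW.item, NEW.min_level * 2);
    END;

    INSERT INTO inventory VALUES ('pen', 50, 20);
    UPDATE inventory SET level = 30 WHERE item = 'pen';
    UPDATE inventory SET level = 15 WHERE item = 'pen';
    UPDATE inventory SET level = 10 WHERE item = 'pen';
""")
orders = conn.execute("SELECT item, quantity FROM orders").fetchall()
conn.close()
```

The first update leaves the level above the minimum, the second one
fires the trigger, and the third one does not place a second order
because the level was already low.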
Database system:
Some database-system users may be authorized to access
only a limited portion of the database. Other users may be
allowed to issue queries, but may be forbidden to modify the
electronic commerce. The bibliographic notes list textbook coverage of
the basic principles of network security. We shall present our discussion
of security in terms of the relational-data model, although the concepts
of the security are equally applicable to all data models.
no authorization can be granted, the system will deny the view creation
request.
# Encryption Techniques
There are a vast number of techniques for the encryption of data.
Simple encryption techniques may not provide adequate security, since
it may be easy for an unauthorized user to break the code. As an
example of a weak encryption technique, consider the substitution of
each character with the next character in the alphabet. Thus,
Perryridge
becomes
Qfsszsjehf
If an unauthorized user sees only “Qfsszsjehf,” she probably has
insufficient information to break the code. However, if the intruder sees
a large number of encrypted branch names, she could use statistical data
regarding the relative frequency of characters to guess what substitution
is being made (for example, E is the most common letter in English text,
followed by T, A, O, N, I and so on).
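The weak substitution technique described above can be sketched in
Python. The function name shift_encrypt is hypothetical, and wrapping
from z back to a is an assumption, since the text does not say what
happens to the last letter of the alphabet.

```python
def shift_encrypt(text, shift=1):
    """Replace each letter with the letter `shift` places later, wrapping z to a."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)   # digits, spaces and punctuation are left as they are
    return "".join(out)


cipher = shift_encrypt("Perryridge")
plain = shift_encrypt(cipher, shift=-1)   # shifting back recovers the text
print(cipher, plain)
```

Since there are only 25 possible shifts, an intruder can simply try
them all, which is one more reason why this technique is weak.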
A good encryption technique has the following properties:
* It is relatively simple for authorized users to encrypt and decrypt
data.
* It depends not on the secrecy of the algorithm, but rather on a
parameter of the algorithm called the encryption key.
# Authentication;
Authentication refers to the task of verifying the identity of a
person/software connecting to a database. The simplest form of
authentication consists of a secret password which must be presented
when a connection is opened to a database.
Password-based authentication is used widely by operating systems
as well as databases. However, the use of passwords has some
drawbacks, especially over a network. If an eavesdropper is able to
“sniff” the data being sent over the network, she may be able to find the
password as it is being sent across the network. Once the eavesdropper
has a user name and password, she can connect to the database,
pretending to be the legitimate user.
security constraints. Individual implementations of SQL may differ in
details, or may support only a subset of the full language.
* IBM developed the original version of SQL at its San Jose Research
Laboratory.
* IBM implemented the language, originally called Sequel, as part of
the System R project in the early 1970s.
* Its name was later changed to SQL (Structured Query Language), and it
has clearly established itself as the standard relational-database
language.
* In 1986, the American National Standards Institute (ANSI) and the
International Organization for Standardization (ISO) published an
SQL standard, called SQL-86.
* IBM published its own corporate SQL standard, the Systems
Application Architecture Database Interface (SAA-SQL) in 1987.
* ANSI published an extended standard for SQL, SQL-89, in 1989.
The next version of the standard was SQL-92, and the most recent
version is SQL:1999.
* The SQL:1999 standard is a superset of the SQL-92 standard.
* Many database systems support some of the new constructs in
SQL:1999, although currently no database system supports all the
new constructs.
dept varchar2 (20),
proj_no number (4) NOT NULL,
b_sal number (10,2),
joindate date,
Primary Key (emp_no),
Check (e_add in ('Kathmandu', 'Lalitpur',
'Bhaktapur', 'Kirtipur' ) ) );
Eg:- INSERT INTO employee
(emp_no, e_name, e_add, dept, proj_no, b_sal)
VALUES (101, 'Ramesh', 'Kathmandu', 'Acc',
5001, 12034);
# Arranging Tuples;
* ORDER BY (Arranging in Ascending or Descending Order);
SELECT emp_no, e_name, e_add FROM employee
ORDER BY b_sal;
SELECT emp_no, e_name, e_add FROM employee
ORDER BY b_sal DESC;
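The two orderings can be tried out with SQLite through Python's
standard sqlite3 module. The table is cut down to three columns, and the
rows other than Ramesh are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER, e_name TEXT, b_sal REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Ramesh", 12034), (102, "Sita", 9000), (103, "Hari", 15000)],
)

# Ascending order is the default; DESC is written without parentheses.
ascending = [r[0] for r in conn.execute(
    "SELECT emp_no FROM employee ORDER BY b_sal")]
descending = [r[0] for r in conn.execute(
    "SELECT emp_no FROM employee ORDER BY b_sal DESC")]
conn.close()
```

The column used for ordering does not have to appear in the SELECT list,
as with b_sal here.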
UPDATE marks
SET eng = 32
WHERE (eng >= 27 AND eng < 32);
# Set operation;
In SQL various types of SET operators are also used. Some of
them are;
* UNION operator;
Syntax:- <Query1> UNION <Query2>
Eg:- SELECT * FROM employee
UNION
SELECT * FROM employee1;
* INTERSECT operator;
Syntax:- <Query1> INTERSECT <Query2>
Eg:- SELECT * FROM employee
INTERSECT
SELECT * FROM employee1;
* MINUS operator;
Syntax:- <Query1> MINUS <Query2>
Eg:- SELECT * FROM employee
MINUS
SELECT * FROM employee1;
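The three set operators can be compared side by side with SQLite through
Python's standard sqlite3 module. Two points are assumptions of this
sketch: the sample rows are invented, and SQLite spells the difference
operator EXCEPT, whereas MINUS is the Oracle spelling used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee  (emp_no INTEGER, e_name TEXT);
    CREATE TABLE employee1 (emp_no INTEGER, e_name TEXT);
    INSERT INTO employee  VALUES (101, 'Ramesh'), (102, 'Sita');
    INSERT INTO employee1 VALUES (102, 'Sita'), (103, 'Hari');
""")


def run(operator):
    """Combine the two queries with the given set operator."""
    sql = (f"SELECT * FROM employee {operator} "
           "SELECT * FROM employee1 ORDER BY emp_no")
    return conn.execute(sql).fetchall()


union = run("UNION")
intersect = run("INTERSECT")
minus = run("EXCEPT")
conn.close()
```

Note that only the last query of the compound statement ends with a
semicolon, and that both queries must return the same number of columns.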
# Sub Queries;
Syntax:- SELECT <col_name1, col_name2, … … …, col_nameX>
FROM <table_name>
WHERE <col_name> <operator> (<SUB QUERY>);
Eg:- SELECT emp_no, e_name, e_add FROM employee
WHERE emp_no IN
(SELECT emp_no FROM employee
WHERE dept = 'Kathmandu');
SELECT emp_no, e_name, e_add FROM employee
WHERE emp_no IN
(SELECT emp_no FROM employee
WHERE dept IN
(SELECT dept FROM employee
WHERE emp_no < 110));
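A sub query can be tried out with SQLite through Python's standard
sqlite3 module. The table is reduced to three columns and the rows are
invented sample data, so this is only a sketch of how the inner and
outer queries cooperate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no INTEGER, e_name TEXT, dept TEXT);
    INSERT INTO employee VALUES
        (101, 'Ramesh', 'Acc'), (105, 'Sita', 'Adm'),
        (112, 'Hari', 'Acc'),   (115, 'Gita', 'Tec');
""")

# Inner query: the departments that have an employee numbered below 110.
# Outer query: every employee who works in one of those departments.
rows = conn.execute("""
    SELECT emp_no, e_name FROM employee
    WHERE dept IN (SELECT dept FROM employee WHERE emp_no < 110)
    ORDER BY emp_no
""").fetchall()
conn.close()
```

The inner query is evaluated first and returns Acc and Adm, so employee
112 is selected as well even though its own number is not below 110.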
# Equivalence of expressions;
# Query optimization;
# Query decomposition;
Unit 8; Object Oriented Model;
# Introduction;
# Design of object oriented model;