DBMS- Unit-1
Unit-1
Introduction to Database management Systems
Syllabus:
Introduction: Database system.
Characteristics (Database Vs File System)
Database Users.
Advantages of Database Systems.
Database applications.
Introduction of Data Models.
Concepts of Schema, Instance and data-independence -Three tier schema architecture for
data independence.
Database system structure, environment, Centralized and Client Server architecture for
the database.
Entity Relationship Model: Introduction, Representation of entities, attributes, entity set,
relationship, relationship set.
Specialization, Generalization using ER Diagrams.
Constraints, sub classes, super class, inheritance using ER Diagrams
Database:
Database is a collection of related data. Data is a collection of facts and figures that can be
processed to produce information. Mostly data represents recordable facts. Data aids in
producing information, which is based on facts. For example, if we have data about marks
obtained by all students, we can then make a report about toppers and average marks.
A database management system stores data in such a way that it becomes easier to retrieve,
manipulate, and produce information.
Characteristics of DBMS:
Traditionally, data was organized in file formats. DBMS was a new concept then, and all the
research was done to make it overcome the deficiencies in traditional style of data management.
A DBMS offers different levels of security features, which enable multiple users to have different
views with different features. For example, a user in the Sales department cannot see the data that
belongs to the Purchase department. Additionally, it can also be managed how much data of the Sales
department should be displayed to the user. Because a DBMS does not keep its data as plain, directly
readable files the way a traditional file system does, and all access goes through the DBMS, it is
much harder for an intruder to read or tamper with the data.
------------------------------------------------------
A typical DBMS has users with different rights and permissions, based on the purpose for which
they use it. Some users retrieve data and some back it up. The users of a DBMS can be
broadly categorized as follows:
Administrators − Administrators maintain the DBMS and are responsible for administering
the database. They decide how it is used and by whom. They create access profiles for users
and apply limitations to maintain isolation and enforce security. Administrators also look after
DBMS resources such as system licenses, required tools, and other software and hardware related
maintenance.
Designers − Designers are the people who actually work on designing the database. They decide
what data should be kept and in what format. They identify and design the whole set of entities,
relations, constraints, and views.
End Users − End users are those who actually benefit from having a DBMS. End users can range
from simple viewers who look up data that interests them, such as market rates, to
sophisticated users such as business analysts.
------------------------------------------------------
If the architecture of DBMS is 2-tier, then it must have an application through which the DBMS
can be accessed. Programmers use 2-tier architecture where they access the DBMS by means of
an application. Here the application tier is entirely independent of the database in terms of
operation, design, and programming.
3-tier architecture:
A 3-tier architecture separates its tiers from each other based on the complexity of the users and
how they use the data present in the database. It is the most widely used architecture to design a
DBMS.
Database (Data) Tier − At this tier, the database resides along with its query processing
languages. We also have the relations that define the data and their constraints at this level.
Application (Middle) Tier − At this tier, the application server and the programs that access
the database reside. For a user, this application tier presents an abstracted view of the database.
End-users are unaware of any existence of the database beyond the application. At the other end,
the database tier is not aware of any other user beyond the application tier. Hence, the application
layer sits in the middle and acts as a mediator between the end-user and the database.
User (Presentation) Tier − End-users operate on this tier and they know nothing about any
existence of the database beyond this layer. At this layer, multiple views of the database can be
provided by the application. All views are generated by applications that reside in the application
tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are
independent and can be changed independently.
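The separation between the three tiers can be sketched in a few lines of Python. This is only an illustrative sketch, not a real deployment: it assumes an in-memory SQLite database stands in for the data tier, and the table `marks` and the functions `open_data_tier`, `top_scorer`, and `render_report` are hypothetical names chosen for the example.

```python
import sqlite3
from typing import Tuple

# Data tier: the database and its relations (in-memory for this sketch).
def open_data_tier() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE marks (student TEXT PRIMARY KEY, score INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO marks VALUES (?, ?)",
        [("Asha", 91), ("Ravi", 78), ("Meena", 85)],
    )
    conn.commit()
    return conn

# Application (middle) tier: the only code that issues queries.
def top_scorer(conn: sqlite3.Connection) -> Tuple[str, int]:
    row = conn.execute(
        "SELECT student, score FROM marks ORDER BY score DESC LIMIT 1"
    ).fetchone()
    return row[0], row[1]

# Presentation tier: formats results; it never sees SQL or table names.
def render_report(name: str, score: int) -> str:
    return f"Topper: {name} ({score} marks)"

conn = open_data_tier()
report = render_report(*top_scorer(conn))
print(report)  # Topper: Asha (91 marks)
```

Because `render_report` depends only on the values handed to it, the storage engine or the query in `top_scorer` could be changed without touching the presentation code.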
------------------------------------------------------
Topic-4: Advantages of DBMS
A Database Management System (DBMS) is a collection of interrelated data and a set of software
tools/programs that access, process, and manipulate that data. It allows access, retrieval, and use
of the data subject to appropriate security measures, and it supports better data integration and
security.
The advantages of database management systems are:
1. Data Security: The more accessible and usable a database is, the more exposed it is to security
issues. As the number of users grows, more data is shared and transferred, which increases the
risk to data security. A DBMS provides a platform for enforcing data privacy and security
policies, thus helping companies improve data security.
2. Data Integration: A DBMS gives access to well-managed and synchronized data. This makes
data handling easier, gives an integrated view of how an organization is working, and helps to
keep track of how one segment of the company affects another.
3. Data Abstraction: A major purpose of a database system is to provide users with an
abstract view of the data, hiding the details of how the data is stored and maintained. Different
users interact with the DBMS at different levels of abstraction, which also helps to protect the
data.
4. Reduction in data Redundancy: When working with a structured database, a DBMS provides
features to prevent duplicate items from being entered. For example, duplicate records in a
table can be rejected or removed.
5. Data sharing: A DBMS provides a platform for sharing data across multiple applications
and users, which can increase productivity and collaboration.
6. Data consistency and accuracy: DBMS ensures that data is consistent and accurate by
enforcing data integrity constraints and preventing data duplication. This helps to eliminate
data discrepancies and errors that can occur when data is stored and managed.
7. Data organization: A DBMS provides a systematic approach to organizing data in a
structured way, which makes it easier to retrieve and manage data efficiently.
8. Efficient data access and retrieval: DBMS allows for efficient data access and retrieval by
providing indexing and query optimization techniques that speed up data retrieval. This
reduces the time required to process large volumes of data and increases the overall
performance of the system.
9. Concurrency and Atomicity: Atomicity means that an operation (transaction) either takes
effect completely or not at all, so the database never reflects a partially applied change.
The DBMS also allows multiple users to access the data concurrently by using
synchronization techniques such as locking.
10. Scalability and flexibility: DBMS is highly scalable and can easily accommodate any
changes in data volumes and user requirements. DBMS can easily handle large volumes of
data, and can scale up or down depending on the needs of the organization.
DBMS offers numerous advantages, including data security, integrity, and reduced redundancy.
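Points 4 and 6 above (reduced redundancy and enforced integrity constraints) can be illustrated with a small sketch. It assumes SQLite via Python's standard `sqlite3` module; the `student` table and its columns are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,                 -- no duplicate e-mail addresses
        marks   INTEGER CHECK (marks BETWEEN 0 AND 100)
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'asha@example.com', 91)")

rejected = []
candidates = [
    (2, "asha@example.com", 80),   # duplicate e-mail: violates UNIQUE
    (3, "ravi@example.com", 140),  # marks out of range: violates CHECK
]
for row in candidates:
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        # The DBMS refuses the row; the application does not have to check.
        rejected.append(row[0])

stored = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, stored)  # [2, 3] 1
```

The checks live in the schema, so every application that uses the table gets the same protection.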
-------------------------------------------------------
There are different areas where a database can be employed. Following are a few applications
that utilize the database information:
1. Banking: For customer information, accounts, and loans, and banking transactions.
2. Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner - terminals situated around the world
access the central database system through phone lines and other data networks.
3. Universities: For student information, course registrations, and grades.
4. Credit card transactions: For purchases on credit cards and generation of monthly statements.
5. Telecommunication: For keeping records of calls made, generating monthly bills,
maintaining balances on prepaid calling cards, and storing information about the
communication networks.
6. Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
7. Sales: For customer, product, and purchase information.
8. Manufacturing: For management of supply chain and for tracking production of items in
factories, inventories of items in warehouses / stores, and orders for items.
9. Human resources: For information about employees, salaries, payroll taxes and benefits, and
for generation of paychecks.
----------------------------------------------------------------
A Data Model in a Database Management System (DBMS) is a collection of conceptual tools for
describing the data in a database, the relationships among the data, and the constraints on it.
Data models give us a clear picture of the data, which helps us in creating an actual database.
A data model takes us from the design of the data to its implementation.
An entity is a real-world object. It can be a person, place, object, class, etc. Entities
are represented by a rectangle symbol in an ER Diagram.
An attribute is a property that describes an entity. For a Student, it can be Student-Id, Age,
Roll Number, or Marks. Attributes are represented by an ellipse symbol in an ER Diagram.
Relationships define associations among different entities. A diamond (rhombus) symbol is
used to show a relationship.
-------------------------------------------------------
Topic-7: Schema- Instance- Data Independence
A Database Schema is a logical representation of data that shows how the data in a database
is organized and how the tables relate to each other. A database schema contains tables, fields,
views, and the relationships between tables expressed through keys such as primary keys and
foreign keys. It also provides a set of rules that govern the database, along with information
about how the data is accessed and modified. There are 3 kinds of schemas as shown below:
A Physical schema defines how the data is stored physically in the storage system, in the form
of files and indices. It corresponds to the actual code or syntax needed to create the
structure of a database; when we design a database at the physical level, the result is
called the physical schema. The database administrator chooses where and how to store the data
in the different blocks of storage.
A logical schema defines all the logical constraints that need to be applied to the stored data,
and also describes tables, views, entity relationships, and integrity constraints.
A View schema is a view-level design that defines the interaction between the end-user
and the database. The user interacts with the database through an interface without
needing to know how the data is stored.
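As a rough illustration of the three kinds of schema, the sketch below uses SQLite through Python's `sqlite3` module. The mapping is an approximation made for teaching purposes: the `CREATE TABLE` statement plays the role of the logical schema, the index stands for a physical storage decision, and the view stands for a view schema. The names `employee`, `idx_employee_dept`, and `sales_directory` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: tables, columns, and integrity constraints.
conn.execute("""
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL,
        dept   TEXT    NOT NULL,
        salary INTEGER NOT NULL
    )
""")

# Physical-level decision: an index changes how rows are located, not what they mean.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

# View schema: what one group of users is allowed to see (no salary, Sales only).
conn.execute("""
    CREATE VIEW sales_directory AS
    SELECT id, name FROM employee WHERE dept = 'Sales'
""")

conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 50000),
     (2, "Ravi", "Purchase", 45000),
     (3, "Meena", "Sales", 52000)],
)

view_rows = conn.execute("SELECT * FROM sales_directory ORDER BY id").fetchall()
print(view_rows)  # [(1, 'Asha'), (3, 'Meena')]
```

A user who queries only `sales_directory` sees neither the Purchase department's rows nor anyone's salary.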
Instance:
An instance shows the data or information that is stored in the database at a specific point in
time. Instances are also called the current state or database state. The database schema is the
design that defines the variables in tables that belong to a particular database. There may be
many instances that correspond to a certain database schema. The new data items in a record
can be inserted, modified, or deleted at any time. So, according to this, we can say that the data
can change from one state to another.
(Instance - example)
Taken together, the 5 rows in the above-provided table form an instance, because they give
the information stored in the database at the current point in time. In general, an instance
gives the contents of the database at a particular point in time.
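The difference between schema and instance can be observed directly. In this sketch (SQLite via `sqlite3`; the `student` table and the helper names `schema` and `instance` are assumptions made for the example), the schema text stays the same while the instance changes with every insert or delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema(c: sqlite3.Connection) -> str:
    # SQLite keeps the table definition in its catalog table sqlite_master.
    return c.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]

def instance(c: sqlite3.Connection) -> list:
    # The instance is the set of rows stored right now.
    return c.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()

schema_before = schema(conn)
state_1 = instance(conn)                                 # empty database state
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
state_2 = instance(conn)                                 # two rows
conn.execute("DELETE FROM student WHERE roll_no = 1")
state_3 = instance(conn)                                 # one row
schema_after = schema(conn)

print(state_1, state_2, state_3)
print(schema_before == schema_after)  # True
```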
Data Independence:
Data independence is a property of a database management system by which we can change the
database schema at one level of the database system without changing the database schema at
the next higher level. It allows data to be kept independent of the programs that use it, and
it is easiest to understand through the three-schema architecture.
There are two types of data independence.
Logical data independence
Physical data independence
If we change the storage size of the database system server, it will not affect the conceptual
structure of the database.
It is used to keep the conceptual level separate from the internal level.
Example – Changing the location of the database from C drive to D drive will not affect the
conceptual level.
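Physical data independence can be demonstrated with a small experiment. The sketch below assumes SQLite via `sqlite3` and a hypothetical `student` table: a physical-level change (adding an index) is made between two runs of the same query, and neither the query text nor its result changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 91), (2, "Ravi", 78), (3, "Meena", 85)],
)

# The application's query is written against the conceptual (logical) level only.
QUERY = "SELECT name FROM student WHERE marks >= 80 ORDER BY name"

before = conn.execute(QUERY).fetchall()

# Physical-level change: a new access path is added on disk.
conn.execute("CREATE INDEX idx_student_marks ON student (marks)")

after = conn.execute(QUERY).fetchall()

print(before == after)  # True
print(after)            # [('Asha',), ('Meena',)]
```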
----------------------------------------------------------------
Structure of DBMS:
The DBMS accepts SQL commands generated from a variety of user interfaces, produces query
evaluation plans, executes these plans against the database, and returns the answers. SQL
commands can also be embedded in host language application programs, e.g., Java or Python
programs, in order to access data from the database.
When a user issues a query, the parsed query is presented to a query optimizer, which uses
information about how the data is stored to produce an efficient execution plan for evaluating the
query. An execution plan is a blueprint for evaluating a query, and is usually represented as a
tree of relational operators.
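Many systems let you inspect the plan chosen by the optimizer. As one example, SQLite offers `EXPLAIN QUERY PLAN`; the sketch below uses it through Python's `sqlite3` module. The table and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions, so the code only checks that the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# Ask the optimizer how it would evaluate the query, without running it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM student WHERE name = ?", ("Asha",)
).fetchall()

# The last column of each row is a human-readable description of one plan step.
plan_text = " | ".join(row[-1] for row in plan)
print(plan_text)

uses_index = "idx_student_name" in plan_text
print(uses_index)  # True: the equality lookup is answered through the index
```

The optimizer picks the index because it knows, from the catalog, that `name` is indexed; without the index it would have to scan the whole table.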
The code that implements relational operators sits on top of the File and access methods layer.
This layer includes a variety of software for supporting the concept of a file, which, in a DBMS,
is a collection of pages or a collection of records. This layer typically supports a heap file, or file
of unordered pages, as well as indexes. In addition to keeping track of the pages in a file, this
layer organizes the information within a page.
The Files and access methods layer code sits on top of the buffer manager, which brings
pages in from disk to main memory as needed in response to read requests. The lowest layer of
the DBMS software deals with management of space on disk, where the data is stored. Higher
layers allocate, de-allocate, read, and write pages through (routines provided by) this layer,
called the disk space manager.
The DBMS supports concurrency and crash recovery by carefully scheduling user requests and
maintaining a log of all changes to the database. DBMS components associated with concurrency
control and recovery include the transaction manager, which ensures that transactions request
and release locks according to a suitable locking protocol and schedules the execution of
transactions; the lock manager, which keeps track of requests for locks and grants locks on
database objects when they become available; and the recovery manager, which is responsible
for maintaining a log, and restoring the system to a consistent state after a crash. The disk space
manager, buffer manager, and File and access methods layers must interact with these
components. Finally, the database itself resides on disk and consists of the data files, the
index files, and a system catalog that records information about the stored data.
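From the application's side, the work of the transaction and recovery components shows up as all-or-nothing behaviour. The sketch below is an illustration using SQLite through `sqlite3`, with a hypothetical `account` table and `transfer` function: a transfer that fails halfway is rolled back, so the credit that was already applied disappears again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(c: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    try:
        # 'with c' commits if the block succeeds and rolls back if it raises.
        with c:
            c.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                      (amount, dst))
            c.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                      (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit violated CHECK (balance >= 0); the credit was undone as well.
        return False

ok = transfer(conn, "A", "B", 30)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # the debit fails, the credit is rolled back

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)  # True False {'A': 70, 'B': 80}
```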
Database Environment:
A database environment is the collective system of components that make up and regulate the
collection, management, and use of data. It consists of software, hardware, people,
techniques for handling the database, and the data itself.
Here, the hardware in a database environment means the computers and computer peripherals
used to manage a database, and the software means everything from the operating system (OS)
to the application programs, including database management software such as MS Access or
SQL Server. The people in a database environment are those who administer and use the
system. The techniques are the rules, concepts, and instructions given to both the people and
the software, along with the data.
Centralized Architecture:
A centralized architecture for DBMS is one in which all data is stored on a single server, and all
clients connect to that server in order to access and manipulate the data. This type of architecture
is also known as a monolithic architecture. One of the main advantages of a centralized
architecture is its simplicity - there is only one server to manage, and all clients use the same
data.
However, there are also some drawbacks to this type of architecture. One of the main downsides
is that, because all data is stored on a single server, that server can become a bottleneck as the
number of clients and/or the amount of data increases. Additionally, if the server goes down for
any reason, all clients lose access to the data.
Client-Server Architecture:
A client-server architecture for DBMS is one in which data is stored on a central server, and
clients connect to that server in order to access and manipulate the data.
One of the main benefits of a client-server architecture is that it is more scalable than a
centralized architecture. As the number of clients and/or the amount of data increases, the server
can be upgraded or additional servers can be added to handle the load. This allows the system to
continue functioning smoothly even as it grows in size.
----------------------------------------------------------------
ER model stands for Entity-Relationship model. It is a high-level data model used to define
the data elements and relationships for a specified system. It produces a conceptual design
for the database and gives a simple, easy-to-design view of the data. In ER modeling, the
database structure is portrayed as a diagram called an entity-relationship diagram.
For example, consider the design of a school database. In this database, the student will be
an entity with attributes like address, name, id, age, etc. The address can be another entity
with attributes like city, street name, pin code, etc., and there will be a relationship between
them.
Components of an ER Diagram:
Entity: An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle. Taking an organization as an example, manager, product, employee,
department, etc. can each be taken as an entity.
An entity that depends on another entity is called a weak entity. The weak entity doesn't
contain any key attribute of its own. The weak entity is represented by a double rectangle.
Attribute: The attribute is used to describe the property of an entity. Ellipse is used to represent
an attribute. For example: id, age, address, name, etc. can be attributes of a student which are
represented in the Entity diagram above.
Key Attribute: The key attribute uniquely identifies an entity within its entity set. It
represents a primary key. The key attribute is represented by an ellipse with the text underlined
as shown in the Entity diagram above.
Composite Attribute: An attribute that is composed of other attributes is known as a
composite attribute. The composite attribute is represented by an ellipse, and the ellipses of
its component attributes are connected to it.
Multi-valued Attribute: An attribute that can have more than one value is known as a
multi-valued attribute. A double oval is used to represent a multi-valued attribute. For
example, a student can have more than one phone number.
Derived Attribute: An attribute that can be derived from other attributes is known as a derived
attribute. It is represented by a dashed ellipse as shown in the Entity diagram above. For
example, a person's age changes over time and can be derived from another attribute like Date
of birth.
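The attribute kinds described above map naturally onto a small data structure. The following Python sketch is only an analogy, not part of ER notation; the class and field names (`Student`, `Address`, `phone_numbers`, `age_on`) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    pin_code: str

@dataclass
class Student:
    student_id: int                  # key attribute: identifies the student
    name: str                        # simple attribute
    date_of_birth: date              # stored attribute
    address: Address                 # composite attribute
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        birthday_pending = (today.month, today.day) < (
            self.date_of_birth.month, self.date_of_birth.day)
        return years - 1 if birthday_pending else years

s = Student(
    student_id=1,
    name="Asha",
    date_of_birth=date(2004, 8, 15),
    address=Address("MG Road", "Pune", "411001"),
    phone_numbers=["555-0101", "555-0102"],
)
print(s.age_on(date(2024, 8, 14)))  # 19 (birthday not yet reached)
print(s.age_on(date(2024, 8, 15)))  # 20
```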
Relationship:
A relationship describes the association between entities. A diamond (rhombus) is used to
represent a relationship.
One-to-One Relationship: When one instance of an entity is associated with at most one
instance of the other entity, and vice versa, it is known as a one-to-one relationship. For
example, a female can marry one male, and a male can marry one female.
One-to-many relationship: When one instance of the entity on the left can be associated with
more than one instance of the entity on the right, but each instance on the right is associated
with only one instance on the left, it is known as a one-to-many relationship. For example, a
scientist can make many inventions, but each invention is made by one specific scientist.
Many-to-one relationship: When more than one instance of the entity on the left can be
associated with only one instance of the entity on the right, it is known as a many-to-one
relationship. For example, a student enrolls in only one course, but a course can have many
students.
Many-to-many relationship: When more than one instance of the entity on the left can be
associated with more than one instance of the entity on the right, it is known as a
many-to-many relationship. For example, an employee can be assigned to many projects, and a
project can have many employees.
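When an ER design is later turned into tables, these cardinalities are commonly expressed with keys. The sketch below shows one conventional mapping, assuming SQLite via `sqlite3`: a many-to-one relationship becomes a foreign key column, and a many-to-many relationship becomes a separate linking table. All table names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce REFERENCES clauses
conn.executescript("""
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Many-to-one: each student row points at exactly one course.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        course_id  INTEGER NOT NULL REFERENCES course(course_id)
    );

    CREATE TABLE employee (emp_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Many-to-many: one row per (employee, project) pair.
    CREATE TABLE works_on (
        emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),
        proj_id INTEGER NOT NULL REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );

    INSERT INTO course   VALUES (1, 'DBMS');
    INSERT INTO student  VALUES (1, 'Asha', 1), (2, 'Ravi', 1);
    INSERT INTO employee VALUES (1, 'Meena'), (2, 'John');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Portal');
    INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")

def count(sql: str) -> int:
    return conn.execute(sql).fetchone()[0]

students_in_dbms  = count("SELECT COUNT(*) FROM student  WHERE course_id = 1")
projects_of_meena = count("SELECT COUNT(*) FROM works_on WHERE emp_id = 1")
staff_on_payroll  = count("SELECT COUNT(*) FROM works_on WHERE proj_id = 10")
print(students_in_dbms, projects_of_meena, staff_on_payroll)  # 2 2 2
```

A one-to-one relationship can be handled like the many-to-one case with an additional UNIQUE constraint on the foreign key column.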
Relationship set:
A relationship is a single connection between entities. A relationship set is a set of
relationships of the same type; in other words, it collects all the connections of one kind
between entities in a database. In a relational database, relationship sets are implemented
using keys, such as primary and foreign keys, that link related records across different tables.
Relationship sets help organize data in a database and allow for more complex and structured
representations of connections between entities.
In this example, teaches is a relationship-set which has 2 entity-sets associated with it. Hence,
the degree of the relationship-set in this case is 2. Sometimes the degree can be 3 or more.
Unary relationship-set is the one in which the relationship is associated with one entity. Ex:
Binary relationship-set is the one in which 2 entities are associated with the relationship-set. Ex:
Teacher teaches Students as shown above.
Ternary relationship-set is the one in which 3 entities are associated with the relationship-set. Ex:
N-ary relationship-set is the one in which more than 3 entities are associated with the
relationship-set.
----------------------------------------------------------------
Topic-10: Generalization-Specialization-Aggregation
Generalization: It is the process of extracting common properties from a set of entities and
creating a generalized entity from it. It is a bottom-up approach in which two or more entities
can be generalized to a higher-level entity if they have some attributes in common. For
Example, STUDENT and FACULTY can be generalized to a higher-level entity called
PERSON as shown below:
Specialization:
In specialization, an entity is divided into sub-entities based on its characteristics. It is a
top-down approach where the higher-level entity is specialized into two or more lower-level
entities. For example, an EMPLOYEE entity in an Employee management system can be
specialized into DEVELOPER, TESTER, etc. as shown.
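Specialization and generalization resemble class inheritance in programming languages, which gives a convenient way to experiment with the idea. The sketch below is an analogy in Python, not ER notation; the attribute names `language` and `tool` are hypothetical additions for the example.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    """Higher-level entity: attributes common to every employee."""
    emp_id: int
    name: str

@dataclass
class Developer(Employee):
    """Lower-level entity: inherits emp_id and name, adds its own attribute."""
    language: str

@dataclass
class Tester(Employee):
    """Another lower-level entity specialized from Employee."""
    tool: str

dev = Developer(emp_id=1, name="Asha", language="Python")
tester = Tester(emp_id=2, name="Ravi", tool="Selenium")

# Every specialized entity is still an Employee, so common attributes work on both.
staff = [dev, tester]
names = [e.name for e in staff]
print(names)                      # ['Asha', 'Ravi']
print(isinstance(dev, Employee))  # True
```

Read top-down (Employee to Developer and Tester) the hierarchy is a specialization; read bottom-up it is a generalization.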
Aggregation:
Aggregation is an abstraction through which we can represent relationships as higher-level
entity sets. For example, an employee working on a project may require some machinery. So, a
REQUIRE relationship is needed between the relationship WORKS_FOR and the entity
MACHINERY. Using aggregation, the WORKS_FOR relationship with its entities EMPLOYEE
and PROJECT is aggregated into a single entity, and the relationship REQUIRE is created between
the aggregated entity and MACHINERY.
---------------------------------------------------------
Enhanced ERDs are high-level models that represent the requirements and complexities of
complex databases. The EER model includes all modeling concepts of the ER model.
In addition, EER includes the following concepts:
Subclasses and Super classes
Specialization and Generalization
Category or Union type
Attribute and relationship inheritance
Constraints
There are two types of constraints on the “Sub-class” relationship.
Total or Partial – A sub-classing relationship is total if every super-class entity must be
associated with some sub-class entity; otherwise it is partial. The sub-classing “job type
based employee category” is partial: not every employee is necessarily one of
(secretary / engineer / technician), i.e. the union of these three types is a proper subset of
all employees. In contrast, the sub-classing “Salaried Employee AND Hourly Employee” is
total; the union of entities from the sub-classes is equal to the total employee set, i.e. every
employee necessarily has to be one of them.
Overlapped or Disjoint – If an entity from a super-class can occur in multiple sub-class
sets, then it is overlapped sub-classing; otherwise it is disjoint. Both examples, job-type
based and salaried/hourly employee sub-classing, are disjoint.
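Both constraints can be stated precisely with sets: a sub-classing is total when the union of the sub-classes equals the super-class, and disjoint when no entity appears in two sub-classes. The sketch below checks this in Python; the employee identifiers and the helper names `is_total` and `is_disjoint` are made up for illustration.

```python
from typing import List, Set

def is_total(superclass: Set[str], subclasses: List[Set[str]]) -> bool:
    """Total: every super-class entity belongs to at least one sub-class."""
    return set().union(*subclasses) == superclass

def is_disjoint(subclasses: List[Set[str]]) -> bool:
    """Disjoint: no entity belongs to more than one sub-class."""
    seen: Set[str] = set()
    for sub in subclasses:
        if seen & sub:
            return False
        seen |= sub
    return True

employees = {"e1", "e2", "e3", "e4"}

# Salaried / hourly: every employee falls into exactly one group.
salaried, hourly = {"e1", "e2"}, {"e3", "e4"}

# Job type: e4 is none of secretary / engineer / technician.
secretaries, engineers, technicians = {"e1"}, {"e2"}, {"e3"}

pay_total = is_total(employees, [salaried, hourly])
pay_disjoint = is_disjoint([salaried, hourly])
job_total = is_total(employees, [secretaries, engineers, technicians])
job_disjoint = is_disjoint([secretaries, engineers, technicians])

print(pay_total, pay_disjoint)  # True True   (total and disjoint)
print(job_total, job_disjoint)  # False True  (partial and disjoint)
```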
Attribute Inheritance: EER model allows attributes to be inherited from a superclass to its
subclasses. This means that attributes defined in the super class are automatically inherited by
all its subclasses.
Subtypes and Supertypes: The EER model allows for the creation of subtypes and super
types. A super type is a generalization of one or more subtypes, while a subtype is a
specialization of a super type. For example, a vehicle could be a super type, while car, truck,
and motorcycle could be subtypes.
Constraints: The EER model allows for the specification of constraints that must be satisfied
by entities and relationships. Examples of constraints include cardinality constraints, which
specify the number of relationships that can exist between entities, and participation
constraints, which specify whether an entity is required to participate in a relationship.
Example
A Teaching Assistant can be a subclass of both Employee and Student. A faculty member in a
university system can be a subclass of Employee and Alumnus.
Union
Set of Library Members is UNION of Faculty, Student, and Staff. A union relationship
indicates either type; for example, a library member is either Faculty or Staff or Student.
Below are two examples that show how UNION can be depicted in ERD – Vehicle Owner
is UNION of PERSON and Company, and RTO Registered Vehicle is UNION of Car and
Truck.
---------------------------------------------------------------