Unit-1 DBMS

Uploaded by

heykrishnaa1
Unit I: Introduction to DBMS

What is Data?
Data is defined as unorganized and unrefined facts or figures that are stored in or used by a
computer.
Data – a collection of facts (numbers, words, measurements, observations, images, etc.) that
has been translated into a form that computers can process.
Example: Name, age, height, weight, numbers, measurements etc.
What is Information?
When data are processed, organized, structured, and interpreted in a given context so as to
make them useful and meaningful, they are called information.
Example: Name – Ankit, City – Delhi, Class – 12, Marks – 80.

What is Database?
A database is an organized collection of inter-related data, which helps in insertion, deletion,
and retrieval of data efficiently. The database is also used to organize the data or information
in the form of tables, views, schemas, reports, etc.
Note: Using the database, you can easily access, update, and delete any information.
A database is an organized collection of data, so that it can be easily accessed and managed. DBMS
stands for Database Management System. We can break it down like this: DBMS = Database +
Management System. The database is a collection of data, and the management system is a set
of programs to store and retrieve those data. Based on this we can define a DBMS as follows:
a DBMS is a collection of inter-related data and a set of programs to store and access those data
in an easy and effective manner.
What is the need of DBMS?
Database systems are basically developed for large amounts of data. When dealing with huge
amounts of data, there are two things that require optimization: storage of data and retrieval of
data.
Database Management System (DBMS)
A database management system or DBMS is software used for creating and managing the
data in the database easily and effectively. It is basically a set of programs that allow
users to store, modify/update, and retrieve information from the database as per the
requirements. DBMS also provides security and protection to the database. DBMS acts as a
middle layer between the database and the user.
Example: MySQL, MS SQL Server, Oracle, DB2, Microsoft Access, etc. are different database
management systems. (SQL itself is a query language, not a DBMS.)
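The store, modify, and retrieve cycle described above can be sketched with SQLite, a small DBMS that ships with Python's standard library. This is only an illustration: the table name student and its columns are assumptions, with values borrowed from the "Ankit" example earlier.

```python
import sqlite3


def demo_basic_operations():
    """Store, modify, and retrieve one record through a DBMS (SQLite)."""
    # An in-memory database; nothing is written to disk.
    conn = sqlite3.connect(":memory:")
    # Hypothetical table for the student example.
    conn.execute(
        "CREATE TABLE student (name TEXT, city TEXT, class_no INTEGER, marks INTEGER)"
    )
    # Store a record.
    conn.execute(
        "INSERT INTO student VALUES (?, ?, ?, ?)", ("Ankit", "Delhi", 12, 80)
    )
    # Modify it.
    conn.execute("UPDATE student SET marks = ? WHERE name = ?", (85, "Ankit"))
    # Retrieve it.
    rows = conn.execute(
        "SELECT name, city, class_no, marks FROM student"
    ).fetchall()
    conn.close()
    return rows


print(demo_basic_operations())  # [('Ankit', 'Delhi', 12, 85)]
```

The program never touches a file format directly; it only sends requests to the DBMS, which is the "middle layer" role described above.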

What is Database Management System?

• The Database Management System or DBMS is an effective and easy way to store
data, mainly when data maintenance and security are the primary concerns of the user.
• The database management system stores the data or information in the form of
interrelated tables and files.
• In this type of system, data security is strengthened using encryption/decryption,
password protection, granting of authorized access, and other techniques.
• DBMS helps users to easily retrieve, insert, and manipulate data in a database.
• It also helps to perform data recovery, transactions, and more.
Handling a DBMS is more difficult than handling a file system, but it provides more
advantages than a file system.
Functions of a DBMS
So, what does a DBMS really do? It organizes your files to give you more control over your
data. A DBMS makes it possible for users to create, edit and update data in database files.
Once created, the DBMS makes it possible to store and retrieve data from those database
files.
More specifically, a DBMS provides the following functions:
• Concurrency: concurrent access (meaning 'at the same time') to the same database by
multiple users
• Security: security rules to determine access rights of users
• Backup and recovery: processes to back up the data regularly and recover data
if a problem occurs
• Integrity: database structure and rules improve the integrity of the data
• Data descriptions: a data dictionary provides a description of the data
Characteristics of DBMS
There are various characteristics of a database management system; the following are some
important ones:
• A database management system (DBMS) should be able to store any kind of data
in a database.
• Any database management system should be able to support the ACID (atomicity,
consistency, isolation, durability) properties.
• The database management system allows more than one user to access the same
database at the same time.
• Backup and recovery are the two main methods that allow users to protect their data
from damage or loss.
• It provides multiple views for different users in one organization.
• DBMS follows the concept of normalization to minimize the redundancy of a relation.
• It provides users with a query language, with which they can easily insert, retrieve, update,
and delete the data in a database.
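To make the atomicity part of ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the ids 'A' and 'B', and the helper names are all assumptions made for illustration. The transfer credits first and debits second, so a failing debit shows the earlier credit being rolled back.

```python
import sqlite3


def make_bank():
    """Create a hypothetical two-account bank in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # Fails the CHECK constraint if src would go below zero.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account"))


bank = make_bank()
print(transfer(bank, "A", "B", 30), balances(bank))   # True {'A': 70, 'B': 80}
print(transfer(bank, "A", "B", 500), balances(bank))  # False {'A': 70, 'B': 80}
```

In the second call the credit to B has already been applied when the debit fails, yet the final balances are unchanged: the transaction happened either completely or not at all.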
What is File Management System?
A file management system is a collection of programs that manage and store data in files and
folders in a computer hard disk. A file management system manages the way of reading and
writing data to the hard disk. It is also known as conventional file system.
This system actually stores data in the isolated files which have their own physical location
on the drive, and users manually go to these locations to access these files. It is the easiest
way to store the data like text, videos, images, audios, etc. in general files. Data redundancy
is high in file management system, and it cannot be controlled easily. Data consistency is not
met, and the integration of data is hard to achieve.
Operating systems such as Linux and Windows have their own file systems. For
example, NTFS is the Windows file system, and EXT is the Linux file system.
These operating systems provide less security for these files; they offer only options such as
hiding files, locks, and sharing on files.
Drawbacks of File system
• Data redundancy: Data redundancy refers to the duplication of data. Let's say we are
managing the data of a college where a student is enrolled in two courses. The same
student details will then be stored twice, which takes more storage than
needed. Data redundancy often leads to higher storage costs and poor access time.
• Data inconsistency: Data redundancy leads to data inconsistency. Take the same
example: a student is enrolled in two courses, so the student's address is stored twice.
If the student asks to change the address and it is changed in one record but not in
all of them, the records disagree with each other. This is data inconsistency.
• Data isolation: Because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult.
• Dependency on application programs: Changing the files forces changes in the
application programs that use them.
• Atomicity issues: Atomicity of a transaction means "all or nothing": either all the
operations in a transaction execute or none of them do. This is difficult to guarantee in a
file-processing system.
• Data security: Data should be secured from unauthorised access. For example, a student
in a college should not be able to see the payroll details of the teachers. Security
constraints of this kind are difficult to apply in file-processing systems.
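The redundancy and inconsistency drawbacks can be sketched in a few lines of plain Python. The records below are made up for illustration (the student name, courses, and cities are assumptions): the first layout repeats the address in every enrolment record, as a file-based system might; the second stores it once.

```python
# File-style storage: one record per enrolment, so the address is repeated.
enrolment_file = [
    {"student": "Ankit", "course": "DBMS", "address": "Delhi"},
    {"student": "Ankit", "course": "Networks", "address": "Delhi"},
]


def addresses_on_file(records, student):
    """Return the set of distinct addresses stored for one student."""
    return {r["address"] for r in records if r["student"] == student}


# The address is changed in one record but not in the other.
enrolment_file[0]["address"] = "Mumbai"
inconsistent = addresses_on_file(enrolment_file, "Ankit")
print(len(inconsistent))  # 2: the two copies now disagree

# Normalized layout: the address is stored once, enrolments only refer to it.
students = {"Ankit": {"address": "Delhi"}}
enrolments = [("Ankit", "DBMS"), ("Ankit", "Networks")]

students["Ankit"]["address"] = "Mumbai"  # a single update, nothing to miss
print(students["Ankit"]["address"])  # Mumbai
```

With a single stored copy there is no second place that can be forgotten, which is the idea behind removing redundancy through normalization.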
Advantage of DBMS over file system
There are several advantages of a database management system over a file system. A few of them
are as follows:
• No redundant data: Redundancy is removed by data normalization. Avoiding data duplication
saves storage and improves access time.
• Data consistency and integrity: As discussed earlier, the root cause of data
inconsistency is data redundancy. Since data normalization takes care of data
redundancy, data inconsistency is also taken care of as part of it.
• Data security: It is easier to apply access constraints in database systems so that only
authorized users are able to access the data. Each user has a different set of access rights, so
data is protected from issues such as identity theft, data leaks, and misuse of data.
• Privacy: Limited access means privacy of data.
• Easy access to data: Database systems manage data in such a way that the data is
easily accessible with fast response times.
• Easy recovery: Since database systems keep backups of data, it is easier to do a
full recovery of data in case of a failure.
• Flexible: Database systems are more flexible than file-processing systems.
• Minimal data redundancy or data duplication.
• Easy access to data from the database using the query language.
• DBMS provides backup and recovery methods, which automatically back up data to guard
against software and hardware failures and restore the data if required.
• Minimized data inconsistency.
• Better data integration.
• DBMS can apply integrity constraints to the data in the database.
• DBMS increases consistency and reduces updating errors.
Disadvantages of DBMS:
• DBMS implementation cost is high compared to the file system.
• Complexity: Database systems are complex to understand.
• Performance: Database systems are generic, making them suitable for various
applications. However, this generality affects their performance for some applications.
• A database management system is complex and time-consuming to design.
• The cost of the software and hardware needed to run DBMS software is high.
• DBMS consumes a large amount of main memory as well as a huge amount of disk
space to run efficiently.
• If the database is damaged because of any software or hardware failure, all the
application programs that depend on it are affected.
• Initial training is required for all the users and programmers to use the DBMS software.

Difference between File System and Database Management System

Each point below compares the File System with the Database Management System (DBMS).

1. File system: It is a software system that manages and controls the data files in a computer system.
   DBMS: It is a software system used for creating and managing databases. DBMS provides a
   systematic way to access, update, and delete data.

2. File system: Does not support multi-user access.
   DBMS: Supports multi-user access.

3. File system: Data consistency is less.
   DBMS: Data consistency is more due to the use of normalization.

4. File system: Is not secured.
   DBMS: Is highly secured.

5. File system: Is used for storing unstructured data.
   DBMS: Is used for storing structured data.

6. File system: Data redundancy is high.
   DBMS: Data redundancy is low.

7. File system: No data backup and recovery process is present.
   DBMS: There is backup and recovery for data.

8. File system: Handling is easy.
   DBMS: Handling is complex.

9. File system: Cost is less than that of a DBMS.
   DBMS: Cost is more than that of a file system.

10. File system: If one application fails, it does not affect other applications in the system.
    DBMS: If the database fails, it affects all applications that depend on it.

11. File system: Data cannot be shared because it is distributed in different files.
    DBMS: Data can be shared as it is stored at one place in a database.

12. File system: Does not provide a concurrency facility.
    DBMS: Provides a concurrency facility.

13. File system: Example: NTFS (New Technology File System), EXT (Extended File System), etc.
    DBMS: Example: Oracle, MySQL, MS SQL Server, DB2, Microsoft Access, etc.

Components of DBMS
A DBMS has the following components:
1. Software
2. Hardware
3. Procedures
4. Data
5. Users

Software
• The main component of a database management system is the software. It is the set of
programs used to manage the database and to control the overall
computerized database.
• The DBMS software provides an easy-to-use interface to store, retrieve, and update
data in the database.
• This software component is capable of understanding the Database Access Language
and converts it into actual database commands to execute or run them on the database.
Hardware
• This component of DBMS consists of a set of physical electronic devices such as
computers, I/O channels, storage devices, etc. that create an interface between
computers and the users.
• This DBMS component is used for keeping and storing the data in the database.
Procedures
• Procedures refer to general rules and instructions that help to design the database and
to use a database management system.
• Procedures are used to set up and install a new database management system
(DBMS), to log in to and log out of DBMS software, to manage DBMS or application
programs, to take backups of the database, to change the structure of the database,
etc.
Data
• It is the most important component of the database management system.
• The main task of DBMS is to process the data. Here, databases are defined and
constructed, and then data is stored, retrieved, and updated to and from the databases.
• The database contains both the metadata (description about data or data about data)
and the actual (or operational) data.
Users
• The users are the people who control and manage the databases and perform
different types of operations on the databases in the database management system.
There are three types of users who play different roles in DBMS:
• Application Programmers
• Database Administrators
• End-Users
1. Application Programmers
The users who write application programs in programming languages (such as Java,
C++, or Visual Basic) to interact with databases are called application programmers.
2. Database Administrators (DBA)
A person who manages the overall DBMS is called a database administrator or simply DBA.
3. End-Users
The end-users are those who interact with the database management system to perform
different operations on the data by using database commands such as insert,
update, retrieve, and delete.
Purpose of Database Systems:

A typical file-processing system is supported by a conventional operating system. The system stores
permanent records in various files, and it needs different application programs to extract records from, and
add records to, the appropriate files. Before database management systems (DBMSs) were introduced,
organizations usually stored information in such systems.
Keeping organizational information in a file-processing system has a number of major disadvantages,
which are listed above under "Drawbacks of File system".
The main purpose of database systems is to manage data. Consider a university that keeps
the data of students, teachers, courses, books, etc. To manage this data we need to store it
somewhere we can add new data, delete unused data, update outdated data, and retrieve data. To
perform these operations we need a database management system that allows us to store
the data in such a way that all these operations can be performed on it efficiently.

Main purposes of database systems are:


1. Less prone to issues: Information stored in databases is less prone to disasters than
information in paper files. With the advent of cloud computing, data can be
stored and replicated over multiple servers in disparate locations, which makes it
far more resilient to natural or human-made disasters.
2. Scalability: You can create a custom database for your business and store information
in it as your business grows. You can even migrate this information to another, bigger
database when you feel the current database can't handle your information. You may
even change the way information is stored in your database to keep up with
ever-changing business needs. This type of scalability is not possible for manual files.
3. Data Security: Another purpose of using a database system is data security. You can put
security on your data to make it unavailable to unauthorized people. There are
numerous techniques to do that. Some database systems even provide multiple
layers of security to keep your data out of the hands of an unauthorized person. From a
simple password on an Excel file to the advanced data encryption techniques available
in Oracle, this is a strong reason to switch from manual files.
4. Computing capabilities: Modern database systems are not just places where you
store information. They also offer advanced computing facilities, such as built-in
calculations and aggregations. These come in handy when you need to do data
mining or have to generate last-minute reports on your data.
5. Easy to maintain: Databases are easy to maintain compared to traditional ways of storing
information such as manual files. The housekeeping and maintenance cost is said to be roughly
5-10% of that required for keeping and maintaining a manual file system. Faster data access
also gives databases an edge over traditional systems.
Purpose of Database Systems
1. To see why database management systems are necessary, let's look at a typical
"file-processing system" supported by a conventional operating system.

The application is a savings bank:

o Savings account and customer records are kept in permanent system files.
o Application programs are written to manipulate files to perform the
following tasks:
   • Debit or credit an account.
   • Add a new account.
   • Find an account balance.
   • Generate monthly statements.
2. Development of the system proceeds as follows:
o New application programs must be written as the need arises.
o New permanent files are created as required.
o Over a long period of time, files may be in different formats.
o Application programs may be in different languages.
3. So we can see there are problems with the straight file-processing approach:
o Data redundancy and inconsistency
   • Same information may be duplicated in several places.
   • All copies may not be updated properly.
o Difficulty in accessing data
   • May have to write a new application program to satisfy an unusual
   request.
   • E.g. find all customers with the same postal code.
   • Could generate this data manually, but it is a long job.
o Data isolation
   • Data in different files.
   • Data in different formats.
   • Difficult to write new application programs.
o Multiple users
   • Want concurrency for faster response time.
   • Need protection for concurrent updates.
   • E.g. two customers withdrawing funds from the same account at the
   same time: the account has $500 in it, and they withdraw $100 and $50.
   The result could be $350, $400 or $450 if there is no protection.
o Security problems
   • Every user of the system should be able to access only the data they
   are permitted to see.
   • E.g. payroll people only handle employee records, and cannot see
   customer accounts; tellers only access account data and cannot see
   payroll data.
   • Difficult to enforce this with application programs.
o Integrity problems
   • Data may be required to satisfy constraints.
   • E.g. no account balance below $25.00.
   • Again, difficult to enforce or to change constraints with the
   file-processing approach.

These problems and others led to the development of database management systems.
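The three possible outcomes in the concurrent withdrawal example ($350, $400 or $450) can be reproduced with a small simulation. This is a sketch, not how a real DBMS schedules work: the function name and the "read"/"write" step encoding are assumptions. Each withdrawal reads the balance and later writes back its own result, and the schedule decides how the steps interleave.

```python
def withdraw_unprotected(initial, amounts, schedule):
    """Replay the read/write steps of several withdrawals in a given order.

    schedule is a list of (customer_index, step) pairs, where step is
    "read" (copy the shared balance) or "write" (store copy minus amount).
    """
    balance = initial
    local_copy = {}
    for who, step in schedule:
        if step == "read":
            local_copy[who] = balance
        else:
            balance = local_copy[who] - amounts[who]
    return balance


amounts = [100, 50]  # customer 0 withdraws $100, customer 1 withdraws $50

# One withdrawal finishes before the other starts: the correct result.
serial = [(0, "read"), (0, "write"), (1, "read"), (1, "write")]
# Both read $500 first; whichever write comes last overwrites the other.
first_wins = [(0, "read"), (1, "read"), (1, "write"), (0, "write")]
second_wins = [(0, "read"), (1, "read"), (0, "write"), (1, "write")]

print(withdraw_unprotected(500, amounts, serial))       # 350
print(withdraw_unprotected(500, amounts, first_wins))   # 400
print(withdraw_unprotected(500, amounts, second_wins))  # 450
```

Only the serial schedule gives the correct $350. A DBMS's concurrency control exists to ensure that interleaved transactions produce the same result as some serial order.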
Applications of DBMS
There are various fields where a database management system is used. The following are
some applications that make use of a database management system:

1. Railway Reservation System: In the railway reservation system, the database is required
to store the records of ticket bookings and the status of train arrivals and departures. Also,
if trains run late, people get to know it through database updates.
2. Library Management System: There are lots of books in a library, so it is tough to
store the record of all the books in a register or copy. So, a database management system
(DBMS) is used to maintain all the information related to the name of the book, issue date,
availability of the book, and its author.
3. Banking: A database management system is used to store the transaction information of the
customer in the database.
4. Education Sector: Presently, examinations are conducted online by many colleges and
universities. They manage all examination data through the database management system
(DBMS). In addition, students' registration details, grades, courses, fees, attendance, results,
etc. are all stored in the database.
5. Credit card transactions: A database management system is used for purchases on credit
cards and generation of monthly statements.
6. Social Media Sites: We all use social media websites to connect with friends and to
share our views with the world. Daily, millions of people sign up for social media
accounts on sites like Pinterest, Facebook, Twitter, and Google Plus. With a database
management system, all the information of users is stored in the database, and we are
able to connect with other people.
7. Telecommunications: No telecommunication company can operate without a DBMS. The
database management system is necessary for these companies to store call details and
monthly postpaid bills in the database.
8. Finance: The database management system is used for storing information about sales,
holdings, and purchases of financial instruments such as stocks and bonds in a database.
9. Online Shopping: These days, online shopping has become a big trend. Many people prefer
to shop from home through online shopping websites (such as Amazon, Flipkart, Snapdeal)
rather than visit a shop. Products are added and sold with the help of the database
management system (DBMS). Invoice bills, payments, and purchase information are all
handled with the help of DBMS.
10. Human Resource Management: Big firms or companies have many workers or
employees working under them. They store information about employees' salary, tax, and
work with the help of a database management system (DBMS).
11. Manufacturing: Manufacturing companies make different types of products and sell
them on a daily basis. A database management system (DBMS) is used to keep the
information about their products, such as bills, purchases, quantities, and supply chain
management.
12. Airline Reservation System: This system is the same as the railway reservation system.
It also uses a database management system to store the records of flight departures,
arrivals, and delay status.
Data Abstraction in DBMS
Data abstraction is the process of hiding the implementation details (such as how the data are
stored and maintained) and representing only the essential features to simplify the user's
interaction with the system.
The major purpose of a database system is to provide users with an abstract view of the system.

LEVELS OF ABSTRACTION
To simplify the user's interaction with the system, the complexity is hidden from the database users
through three levels of abstraction.
Physical Level:
• Lowest level of abstraction.
• Describes how the data are stored.
• Complex low-level data structures are described in detail.
• E.g.: index, B-tree, hashing.
Logical / Conceptual Level:
• Next highest level of abstraction.
• Describes what data are stored and what relationships exist among those data.
• Database administrator level.
View Level:
• Highest level of abstraction.
• Describes only part of the database for a particular group of users.
• There can be many different views of a database.
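The three levels can be illustrated with SQLite from Python's standard library. This is a rough sketch under assumptions: the employee table, the index, the view name employee_directory, and the sample rows are all made up. The table definition stands in for the logical level, the index for a physical-level choice, and the view for the view level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: what data are stored (the table definition).
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
# Physical level: how access is organized (an index); queries never mention it.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

conn.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Asha", "Sales", 50000), ("Ravi", "Payroll", 60000)],
)

# View level: one group of users sees names and departments, not salaries.
conn.execute("CREATE VIEW employee_directory AS SELECT name, dept FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
view_columns = [col[0] for col in cursor.description]
directory = cursor.fetchall()

print(view_columns)  # ['name', 'dept']
print(directory)     # [('Asha', 'Sales'), ('Ravi', 'Payroll')]
```

Users of the view are unaffected if the index is dropped or replaced, which is the point of separating the levels.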
Structure of Database Management System

• Applications: – An application can be considered as a user-friendly web page where the user
enters requests. Here the user simply enters the required details and presses buttons to get the
data.
• End User: – They are the real users of the database. They can be developers, designers,
administrators, or the actual users of the database.
• DDL: – Data Definition Language (DDL) is the language used to create the database, schema,
tables, mappings, etc. in the database. These are the commands used to create objects like
tables and indexes in the database for the first time. In other words, they create the structure
of the database.
• DDL Compiler: – This part of the database is responsible for processing the DDL
commands. That means this compiler breaks down each command into
machine-understandable code. It is also responsible for storing metadata such as the table
name, space used by it, number of columns in it, mapping information, etc.
• DML Compiler: – When the user inserts, deletes, updates, or retrieves a record from
the database, the request is sent in a form the user understands, for example by pressing some
buttons. But for the database to understand the request, it must be broken down into object
code. This is done by this compiler. One can imagine this as how a question asked to a
person is broken down into waves to reach the brain!
• Query Optimizer: – When a user fires a request, the user is not concerned with how it will be
executed on the database and is not at all aware of the database or its way of working. But
whatever the request, it should be executed efficiently to fetch, insert, update, or delete
the data in the database. The query optimizer decides the best way to execute the user
request received from the DML compiler. It is similar to selecting the best nerve
to carry the waves to the brain!
• Stored Data Manager: – This is also known as the Database Control System. It is one of the
main central systems of the database. It is responsible for various tasks:
   • It converts the requests received from the query optimizer to machine-understandable
   form. It makes the actual requests inside the database. It is like fetching the exact part of
   the brain to answer.
   • It helps to maintain consistency and integrity by applying the constraints. For example,
   it does not allow updating or deleting a record that still has child entries. Similarly,
   it does not allow entering duplicate values where the table requires unique values.
   • It controls concurrent access. If multiple users access the database at the
   same time, it makes sure all of them see correct data. It guarantees that no
   data loss or data mismatch happens between the transactions of multiple users.
   • It helps to back up the database and recover data whenever required. In a
   huge database, reverting the changes after an unexpected failure of a transaction
   is not easy. It maintains a backup of all data so that the data can be
   recovered.
• Data Files: – These hold the real data. They can be stored on magnetic tapes, magnetic
disks, or optical disks.
• Compiled DML: – Some of the processed DML statements (insert, update, delete) are
stored here so that they can be re-used for similar requests.
• Data Dictionary: – It contains all the information about the database. As the name
suggests, it is the dictionary of all the data items. It contains a description of all
the tables, views, materialized views, constraints, indexes, triggers, etc.
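As a small illustration of DDL and the data dictionary, the sketch below runs two DDL statements in SQLite and then reads back what the system recorded about them. The book table and index are hypothetical. sqlite_master and PRAGMA table_info are SQLite's own catalog mechanisms; other DBMSs expose their data dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database (hypothetical library table).
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL, author TEXT)"
)
conn.execute("CREATE INDEX idx_book_author ON book (author)")

# Data dictionary: SQLite records every object it created in sqlite_master.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(book)")]

print(catalog)  # [('table', 'book', 'book'), ('index', 'idx_book_author', 'book')]
print(columns)  # [('id', 'INTEGER'), ('title', 'TEXT'), ('author', 'TEXT')]
```

No data rows were inserted here, yet the catalog is already populated: the metadata describes the structure, not the contents.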
Structure of Relational model
Domains: A domain is a set of values permitted for an attribute in a table. The values of a
domain are atomic. For example, age can only be a positive integer. A data type or format is
also specified for each domain; for example, the data type for employee ages is an integer
number. It is possible for several attributes to have the same domain. Some examples of
domains follow:
■ Mobile_numbers: The set of valid ten-digit phone numbers
■ Local_phone_number: The set of seven-digit phone numbers valid within a
particular area code
■ Social_security_numbers: The set of valid nine-digit Social Security numbers
Attribute: Each column in a table. Attributes are the properties which define a relation,
e.g., Student_Rollno, NAME, etc.
Tuple – A single row of a table, which contains a single record.
Relations – Relations are stored in table format. A table has two properties:
rows and columns. Rows represent records and columns represent attributes.
Relation schema – A relation schema is the design for the table. It includes none of the
actual data, but is like a blueprint for the table: it describes what columns are in
the table and their data types. It may show basic table constraints (e.g. whether a column can
be null) but not how the table relates to other tables.
A relation schema R, denoted by R(A1, A2, ..., An), is made up of a relation name R and a
list of attributes A1, A2, ..., An. Each attribute Ai is the name of a role played by some domain
D in the relation schema R. D is called the domain of Ai and is denoted by dom(Ai). A
relation r of the relation schema R(A1, A2, ..., An), also denoted by r(R), is a set of n-tuples
r = {t1, t2, ..., tm}.
Degree – The degree of a relation is the number of attributes n of its relation schema. A
relation of degree seven, which stores information about university students, would contain
seven attributes describing each student.
Cardinality: Total number of rows present in the table.
Relation instance – A relation instance is a finite set of tuples at a given time. The current
relation state reflects only the valid tuples that represent a particular state of the real world.
Null value: A field with a NULL value is a field with no value. A primary key can't have a null
value.

Relational Model Concepts


1. Attribute: Each column in a table. Attributes are the properties which define a relation,
e.g., Student_Rollno, NAME, etc.
2. Tables – In the relational model, relations are saved in table format. A table has two
properties: rows and columns. Rows represent
records and columns represent attributes.
3. Tuple – A single row of a table, which contains a single record.
4. Relation Schema: A relation schema represents the name of the relation with its
attributes.
5. Degree: The total number of attributes in the relation is called the degree of the
relation.
6. Cardinality: Total number of rows present in the table.
7. Column: The column represents the set of values for a specific attribute.
8. Relation instance – A relation instance is a finite set of tuples in the RDBMS.
Relation instances never have duplicate tuples.
9. Relation key – One or more attributes whose values uniquely identify each row are
called the relation key.
10. Attribute domain – Every attribute has a pre-defined set of permitted values, which is
known as the attribute domain.
Relation Schema and Instance:
A1, A2, …, An are attributes
R = (A1, A2, …, An) is a relation schema
Example:
instructor = (ID, name, dept_name, salary)
A relation instance r defined over schema R is denoted by r(R).
The current values of a relation are specified by a table.
An element t of relation r is called a tuple and is represented by a row in a table.
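The schema and instance terminology can be sketched in Python. This is only an illustration: the schema is the instructor example from the text, while the three sample rows are made up and are not part of the original notes.

```python
# A relation schema modelled as a tuple of attribute names (from the text).
instructor_schema = ("ID", "name", "dept_name", "salary")

# A relation instance modelled as a set of tuples. The rows are illustrative only.
instructor = {
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("12121", "Wu", "Finance", 90000),
    ("15151", "Mozart", "Music", 40000),
}

degree = len(instructor_schema)   # number of attributes in the schema
cardinality = len(instructor)     # number of tuples in the instance

# Every tuple must supply exactly one value per attribute of the schema.
assert all(len(t) == degree for t in instructor)

# A set cannot hold duplicates, so re-adding an existing tuple changes nothing.
instructor.add(("12121", "Wu", "Finance", 90000))

print(degree, cardinality, len(instructor))
```

Using a set for the instance mirrors the rule that a relation instance never contains duplicate tuples.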
Relational Model: Attributes and Domains
 The set of allowed values for each attribute is called the domain of the attribute
 Attribute values are (normally) required to be atomic; that is, indivisible
 The special value null is a member of every domain. It indicates that the value is “unknown”
 The null value causes complications in the definition of many operations

Operations in Relational Model


The four basic operations performed on the relational database model are insert, modify
(update), delete and select.
 Insert is used to insert data into the relation.
 Delete is used to delete tuples from the table.
 Modify allows you to change the values of some attributes in existing tuples.
 Select allows you to choose a specific range of data.
Whenever one of these operations is applied, the integrity constraints specified on the
relational database schema must never be violated.
Insert Operation
The insert operation provides the attribute values for a new tuple that is to be inserted
into a relation.

Update Operation
The update operation changes attribute values in an existing tuple. In the below-given relation
table, the tuple with CustomerName = 'Apple' is updated from Inactive to Active.

Delete Operation
To specify deletion, a condition on the attributes of the relation selects the tuple to be deleted.

In the above-given example, CustomerName= "Apple" is deleted from the table.


The Delete operation could violate referential integrity if the tuple which is deleted is referenced by foreign keys
from other tuples in the same database.
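The insert, update and delete operations, together with the integrity checks they must respect, can be sketched as follows. This is a minimal sketch under stated assumptions: the customers and orders tables, their rows, and the function names are all hypothetical; only the 'Apple' status change comes from the text.

```python
# customers: primary key CustomerID -> (CustomerName, Status). Illustrative data.
customers = {1: ("Google", "Active"), 2: ("Amazon", "Active"), 3: ("Apple", "Inactive")}
# orders: (OrderID, CustomerID), where CustomerID is a foreign key into customers.
orders = {(100, 1), (101, 2)}

def insert_customer(cid, name, status):
    # Key constraint: the primary key must be non-null and unique.
    if cid is None or cid in customers:
        raise ValueError("primary key must be non-null and unique")
    customers[cid] = (name, status)

def modify_status(name, new_status):
    # Change one attribute value in every tuple that matches the condition.
    for cid, (cname, _) in list(customers.items()):
        if cname == name:
            customers[cid] = (cname, new_status)

def delete_customer(name):
    # Referential integrity: refuse to delete a tuple that a foreign key still references.
    for cid, (cname, _) in list(customers.items()):
        if cname == name:
            if any(fk == cid for _, fk in orders):
                raise ValueError("tuple is referenced by a foreign key")
            del customers[cid]

insert_customer(4, "Alibaba", "Active")
modify_status("Apple", "Active")
apple_after_update = customers[3]
delete_customer("Apple")            # allowed: no order references customer 3
try:
    delete_customer("Google")       # rejected: order 100 references customer 1
    rejected = False
except ValueError:
    rejected = True
```

Rejecting the delete is only one possible policy; a real DBMS may instead cascade the delete or set the referencing values to NULL.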
Select Operation
What is Relational Algebra?
Every database management system must define a query language to allow users to access the data
stored in the database. Relational Algebra is a procedural query language used to query the
database tables to access data in different ways.
In relational algebra, the input is a relation (the table from which data has to be accessed) and the
output is also a relation (a temporary table holding the data asked for by the user).

Relational algebra works on the whole table at once, so we do not have to use loops to iterate
over the rows (tuples) of data one by one. All we have to do is specify the table name from
which we need the data, and a single relational algebra expression describes how the
entire table is processed to fetch that data.
Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input and
yields instances of relations as output. It uses operators to perform queries. An operator can be
either unary or binary. They accept relations as their input and yield relations as their output.
Relational algebra is performed recursively on a relation and intermediate results are also
considered relations.
Relational algebra is a procedural language consisting of a set of operations that take one or two
relations as input and produce a new relation as their result.
The six basic operators are as follows −
select: σ
project: ∏
union: ∪
set difference: –
Cartesian product: x
rename: ρ
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language;
that is, it tells what to do but never explains how to do it.
Relational calculus exists in two forms −
Tuple Relational Calculus (TRC)
In TRC, the filtering variable ranges over tuples.
Notation − {T | Condition}
Returns all tuples T that satisfy a condition.
For example −
{T.name | Author(T) AND T.article = 'database'}
Output − Returns tuples with 'name' from Author who has written an article on 'database'.
TRC can be quantified. We can use Existential (∃) and Universal Quantifiers (∀).
For example −
{R | ∃T ∈ Author (T.article = 'database' AND R.name = T.name)}
Output − The above query will yield the same result as the previous one.
Domain Relational Calculus (DRC)
In DRC, the filtering variable ranges over the domains of attributes instead of entire tuple values (as
done in TRC, mentioned above).
Notation −
{ a1, a2, a3, ..., an | P (a1, a2, a3, ..., an)}
Where a1, a2, ..., an are attributes and P stands for a formula built from those attributes.
For example −
{< article, page, subject > | < article, page, subject > ∈ TutorialsPoint ∧ subject = 'database'}
Output − Yields Article, Page, and Subject from the relation TutorialsPoint, where subject is
database.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also
involves relational operators.
The expressive power of Tuple Relational Calculus and Domain Relational Calculus is
equivalent to that of Relational Algebra.
1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).
Notation: σ p(r)
Where:
σ denotes the selection operator
r is the relation
p is the selection predicate, a propositional logic formula which may use connectives like AND, OR and
NOT, and relational operators like =, ≠, ≥, <, >, ≤.
For example: LOAN Relation

BRANCH_NAME LOAN_NO AMOUNT

Downtown L-17 1000

Redwood L-23 2000

Perryride L-15 1500


Downtown L-14 1500

Mianus L-13 500

Roundhill L-11 900

Perryride L-16 1300


Input:
σ BRANCH_NAME="Perryride" (LOAN)
Output:

BRANCH_NAME LOAN_NO AMOUNT

Perryride L-15 1500

Perryride L-16 1300
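The select operation can be sketched in Python as a filter over a list of tuples. The LOAN rows are taken from the table above; the function name select and the second query are assumptions made for illustration.

```python
# LOAN relation from the example, as (BRANCH_NAME, LOAN_NO, AMOUNT) tuples.
LOAN = [
    ("Downtown", "L-17", 1000),
    ("Redwood", "L-23", 2000),
    ("Perryride", "L-15", 1500),
    ("Downtown", "L-14", 1500),
    ("Mianus", "L-13", 500),
    ("Roundhill", "L-11", 900),
    ("Perryride", "L-16", 1300),
]

def select(relation, predicate):
    """sigma_p(r): keep only the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]

# sigma BRANCH_NAME = "Perryride" (LOAN)
perryride_loans = select(LOAN, lambda t: t[0] == "Perryride")

# A predicate may combine comparisons with AND, OR and NOT.
big_downtown = select(LOAN, lambda t: t[0] == "Downtown" and t[2] > 1200)

print(perryride_loans)
print(big_downtown)
```

The result has the same attributes as the input; only the number of rows changes.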

2. Project Operation:
o This operation shows the list of those attributes that we wish to appear in the result.
The rest of the attributes are eliminated from the table.
o It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of relation r.

Example: CUSTOMER RELATION

NAME STREET CITY

Jones Main Harrison

Smith North Rye

Hays Main Harrison

Curry North Rye

Johnson Alma Brooklyn

Brooks Senator Brooklyn


Input:
∏ NAME, CITY (CUSTOMER)
Output:

NAME CITY

Jones Harrison

Smith Rye

Hays Harrison

Curry Rye

Johnson Brooklyn

Brooks Brooklyn
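A possible Python sketch of the project operation follows. The CUSTOMER rows come from the table above; the function name project is an assumption. The result is built as a set, so duplicate rows that appear after dropping columns are removed, as relational algebra requires.

```python
SCHEMA = ("NAME", "STREET", "CITY")
CUSTOMER = [
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Hays", "Main", "Harrison"),
    ("Curry", "North", "Rye"),
    ("Johnson", "Alma", "Brooklyn"),
    ("Brooks", "Senator", "Brooklyn"),
]

def project(relation, schema, attrs):
    """Keep only the listed attributes; the set removes duplicate rows."""
    positions = [schema.index(a) for a in attrs]
    return {tuple(row[p] for p in positions) for row in relation}

name_city = project(CUSTOMER, SCHEMA, ["NAME", "CITY"])   # six distinct rows
cities = project(CUSTOMER, SCHEMA, ["CITY"])              # duplicates collapse

print(sorted(name_city))
print(sorted(cities))
```

Projecting on CITY alone shows the duplicate elimination: six customers yield only three distinct cities.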

3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all the tuples that
are either in R or in S or in both R and S.
o It eliminates duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must satisfy the following conditions:
o R and S must have the same number of attributes.
o Duplicate tuples are eliminated automatically.
Example:
DEPOSITOR RELATION

CUSTOMER_NAME ACCOUNT_NO

Johnson A-101

Smith A-121

Mayes A-321

Turner A-176

Johnson A-273

Jones A-472

Lindsay A-284
BORROW RELATION

CUSTOMER_NAME LOAN_NO

Jones L-17

Smith L-23

Hayes L-15

Jackson L-14

Curry L-93

Smith L-11

Williams L-17
Input:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:

CUSTOMER_NAME

Johnson

Smith

Hayes

Turner

Jones

Lindsay

Jackson

Curry

Williams

Mayes

4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation contains all
tuples that are in both R and S.
o It is denoted by ∩.
Notation: R ∩ S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:

CUSTOMER_NAME

Smith

Jones

5. Set Difference:
o Suppose there are two relations R and S. The set difference operation contains all
tuples that are in R but not in S.
o It is denoted by minus (-).
Notation: R - S
Input:
∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Jackson

Hayes

Williams

Curry
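Union, set intersection and set difference map directly onto Python's built-in set operators. The sketch below uses the DEPOSITOR and BORROW rows from the tables above and first projects each relation onto CUSTOMER_NAME; the variable names are assumptions.

```python
DEPOSITOR = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"),
    ("Turner", "A-176"), ("Johnson", "A-273"), ("Jones", "A-472"),
    ("Lindsay", "A-284"),
]
BORROW = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"),
]

# Projection onto CUSTOMER_NAME; building a set removes duplicate names.
depositors = {name for name, _ in DEPOSITOR}
borrowers = {name for name, _ in BORROW}

union = borrowers | depositors          # in either relation
intersection = borrowers & depositors   # in both relations
difference = borrowers - depositors     # borrowers without a deposit account

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Both operands have the same single attribute, which is the compatibility condition these operations require.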
6. Cartesian product
o The Cartesian product is used to combine each row in one table with each row in
the other table. It is also known as a cross product.
o It is denoted by X.
Notation: E X D
Example:
EMPLOYEE

EMP_ID EMP_NAME EMP_DEPT

1 Smith A

2 Harry C

3 John B
DEPARTMENT

DEPT_NO DEPT_NAME

A Marketing
B Sales
C Legal
Input:
EMPLOYEE X DEPARTMENT
Output:

EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME

1 Smith A A Marketing

1 Smith A B Sales

1 Smith A C Legal

2 Harry C A Marketing

2 Harry C B Sales

2 Harry C C Legal

3 John B A Marketing

3 John B B Sales

3 John B C Legal
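The Cartesian product can be sketched with itertools.product from the Python standard library, using the EMPLOYEE and DEPARTMENT rows above. The variable name cross is an assumption.

```python
from itertools import product

EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# Pair every employee row with every department row and concatenate the tuples.
cross = [emp + dept for emp, dept in product(EMPLOYEE, DEPARTMENT)]

for row in cross:
    print(row)
```

The result has 3 x 3 = 9 rows and 3 + 2 = 5 attributes, matching the output table above.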

7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation to STUDENT1.
ρ(STUDENT1, STUDENT)

What is Relational Calculus?


Relational calculus is a non-procedural query language that tells the system what data is to
be retrieved but does not tell it how to retrieve that data.
Types of Relational Calculus
1. Tuple Relational Calculus (TRC)
Tuple relational calculus is used for selecting those tuples that satisfy the given condition.

Table: Student
First_Name Last_Name Age
Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28

Let's write relational calculus queries.


Query to display the last name of those students whose age is greater than 30
{ t.Last_Name | Student(t) AND t.Age > 30 }
In the above query you can see two parts separated by the | symbol. In the second part we
define the condition, and in the first part we specify the fields which we want to display for
the selected tuples.
The result of the above query would be:
Last_Name
Singh
Query to display all the details of students whose last name is 'Singh'
{ t | Student(t) AND t.Last_Name = 'Singh' }
Output:
First_Name Last_Name Age

Ajeet Singh 30

Chaitanya Singh 31
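A TRC query reads much like a Python comprehension in which the variable t ranges over whole tuples. The sketch below is only an analogy, not a TRC implementation; it uses the Student rows from the table above, and the namedtuple layout is an assumption.

```python
from collections import namedtuple

Student = namedtuple("Student", ["First_Name", "Last_Name", "Age"])
STUDENT = [
    Student("Ajeet", "Singh", 30),
    Student("Chaitanya", "Singh", 31),
    Student("Rajeev", "Bhatia", 27),
    Student("Carl", "Pratap", 28),
]

# { t.Last_Name | Student(t) AND t.Age > 30 }
last_names = {t.Last_Name for t in STUDENT if t.Age > 30}

# { t | Student(t) AND t.Last_Name = 'Singh' }
singhs = [t for t in STUDENT if t.Last_Name == "Singh"]

print(last_names)
print(singhs)
```

As in the calculus, the part before the condition says what to return and the condition says which tuples qualify; nothing states how the rows are scanned.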

2. Domain Relational Calculus (DRC)


In domain relational calculus the records are filtered based on the domains.
Again we take the same table to understand how DRC works.

Table: Student
First_Name Last_Name Age
Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28

Query to find the first name and age of students whose age is greater than 27
{< First_Name, Age > | ∃ Last_Name (< First_Name, Last_Name, Age > ∈ Student ∧ Age > 27)}
Note:
The symbols used for logical operators are: ∧ for AND, ∨ for OR and ¬ for NOT.
Output:
First_Name Age
Ajeet 30
Chaitanya 31
Carl 28
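In DRC the variables range over attribute values rather than whole tuples. As a rough Python analogy (not a DRC implementation), each row can be unpacked into one variable per attribute; the variable names are assumptions, and the rows come from the Student table above.

```python
STUDENT = [
    ("Ajeet", "Singh", 30),
    ("Chaitanya", "Singh", 31),
    ("Rajeev", "Bhatia", 27),
    ("Carl", "Pratap", 28),
]

# One variable per attribute; last_name is bound but not returned,
# which plays the role of an existentially quantified domain variable.
result = {
    (first_name, age)
    for (first_name, last_name, age) in STUDENT
    if age > 27
}

print(sorted(result))
```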

Entity Relationship Model/Diagram


What is ER Diagram?
An ER Diagram (Entity Relationship Diagram), also known as an ERD, is a diagram that
displays the relationships of entity sets stored in a database. In other words, ER diagrams help
to explain the logical structure of databases. ER diagrams are created based on three basic
concepts: entities, attributes and relationships.
ER diagrams use different symbols: rectangles to represent entities, ovals to
define attributes and diamond shapes to represent relationships.
What is ER Model?
The ER Model (Entity Relationship Model) is a high-level conceptual data model.
It helps to systematically analyze data requirements to produce a well-designed
database. The ER Model represents real-world entities and the relationships
between them. Creating an ER Model is considered a best practice before
implementing your database.
ER model
o ER model stands for Entity-Relationship model. It is a high-level data model.
This model is used to define the data elements and relationships for a specified system.
o It develops a conceptual design for the database. It also develops a very simple and
easy-to-design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an
entity-relationship diagram.
For example, Suppose we design a school database. In this database, the student will be an
entity with attributes like address, name, id, age, etc. The address can be another entity with
attributes like city, street name, pin code, etc and there will be a relationship between them.
Components of ER Diagram:
It comprises:
 Entity: Any object that can have data stored in it.

 Relationships between entities: Define how the entities are associated or related
with each other.
 Attributes of entities & relationships: Represent the characteristics or properties of
an entity.
Component of ER Diagram

WHAT IS ENTITY?
An entity is a real-world thing, either living or non-living, that can be distinguished from
other things. It is anything in the enterprise that is to be represented in our database. It may
be a physical thing or simply a fact about the enterprise or an event that happens in the real
world. An entity can be a place, person, object, event or concept about which data is stored in the
database. An entity must have attributes and a unique key. Every
entity is made up of some 'attributes' which represent that entity.
Examples of entities:
 Person: Employee, Student, Patient
 Place: Store, Building
 Object: Machine, product, and Car
 Event: Sale, Registration, Renewal
 Concept: Account, Course
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.
Consider an organization as an example: manager, product, employee, department, etc. can
be taken as entities.
a. Strong Entity
A strong entity is independent of any other entity in the schema. A strong entity always has a
primary key. In an ER diagram, a strong entity is represented by a rectangle. The relationship between
two strong entities is represented by a diamond. A set of strong entities is known as a strong
entity set.
b. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't
contain any key attribute of its own. The weak entity is represented by a double rectangle.

Sr. No. | Key | Strong Entity | Weak Entity
1 | Key | A strong entity always has one primary key. | A weak entity has a foreign key referencing the primary key of the strong entity.
2 | Dependency | A strong entity is independent of other entities. | A weak entity is dependent on a strong entity.
3 | Represented by | A strong entity is represented by a single rectangle. | A weak entity is represented by a double rectangle.
4 | Relationship representation | The relationship between two strong entities is represented by a single diamond. | The relationship between a strong and a weak entity is represented by a double diamond.
5 | Participation | A strong entity may or may not participate in entity relationships. | A weak entity always participates in entity relationships.

Strong Entity Set | Weak Entity Set
A strong entity set always has a primary key. | A weak entity set does not have enough attributes to build a primary key.
It is represented by a rectangle symbol. | It is represented by a double rectangle symbol.
It contains a primary key represented by the underline symbol. | It contains a partial key which is represented by a dashed underline symbol.
A member of a strong entity set is called a dominant entity. | A member of a weak entity set is called a subordinate entity.
The primary key is one of its attributes and helps to identify its members. | The primary key is a combination of the primary key of the strong entity set and the partial key of the weak entity set.
In the ER diagram, the relationship between two strong entity sets is shown by using a diamond symbol. | The relationship between a strong and a weak entity set is shown by using the double diamond symbol.
The line connecting the strong entity set with the relationship is single. | The line connecting the weak entity set with the identifying relationship is double.

2. Attribute
The attribute is used to describe a property of an entity. An ellipse is used to represent
an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute
The key attribute is used to represent the main characteristic of an entity. It represents a
primary key. The key attribute is represented by an ellipse with the text underlined.

b. Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute.
The composite attribute is represented by an ellipse, and the ellipses of its component
attributes are connected to that ellipse.
c. Multivalued Attribute
An attribute that can have more than one value is known as a multivalued
attribute. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.

d. Derived Attribute
An attribute that can be derived from another attribute is known as a derived attribute. It
is represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another attribute
like date of birth.
3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.

Cardinality in DBMS
In data modeling, cardinality defines the number of entities in one entity set that can be
associated with the number of entities of another entity set via a relationship set. In
simple words, it refers to the relationship one table can have with another table.
A relationship can be of four types: one-to-one, one-to-many, many-to-one and many-to-many.
The types of relationship are as follows:
a. One-to-One Relationship
When only one instance of an entity is associated with the relationship, then it is known as
a one-to-one relationship.
For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of the entity on
the right, are associated with the relationship, then this is known as a one-to-many relationship.
For example, a scientist can make many inventions, but each invention is made by only one
specific scientist.

c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of the entity on
the right, are associated with the relationship, then it is known as a many-to-one relationship.
For example, a student enrolls in only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one instance of the
entity on the right, are associated with the relationship, then it is known as a many-to-many
relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
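The four mapping cardinalities can be illustrated with a small Python sketch that inspects the (left, right) pairs of a relationship instance. The function name and the sample pairs are hypothetical. Note that this only describes the rows it is given; the cardinality of a relationship is a design decision, so sample data can contradict a declared cardinality but cannot establish it.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a set of (left, right) pairs by how many partners each side has."""
    rights_of = defaultdict(set)   # left entity  -> right entities it is linked to
    lefts_of = defaultdict(set)    # right entity -> left entities it is linked to
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)
    left_has_many = any(len(r) > 1 for r in rights_of.values())
    right_has_many = any(len(l) > 1 for l in lefts_of.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"

marries = {("f1", "m1"), ("f2", "m2")}
invents = {("scientist1", "inv1"), ("scientist1", "inv2")}
enrolls = {("student1", "course1"), ("student2", "course1")}
works_on = {("emp1", "proj1"), ("emp1", "proj2"), ("emp2", "proj1")}

print(mapping_cardinality(marries))
print(mapping_cardinality(invents))
print(mapping_cardinality(enrolls))
print(mapping_cardinality(works_on))
```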

Entity Relationship Diagram (ERD) Symbols and Notations

Notation of ER diagram
A database can be represented using notations. In an ER diagram, several notations are
used to express the cardinality. These notations are as follows:
Introduction to Database Keys
Keys are a very important part of the relational database model. They are used to establish and
identify relationships between tables and also to uniquely identify any record or row of data
inside a table.
A key can be a single attribute or a group of attributes, where the combination may act as a key.
Why do we need a Key?
In real-world applications, the number of tables required for storing the data is huge, and the
different tables are related to each other as well.
Also, tables store a lot of data. Tables generally extend to thousands of records,
unsorted and unorganized.
To fetch a particular record from such a dataset, you have to apply some
conditions. But if duplicate data is present, a query with a given condition may
return the wrong row, and it is unclear how many attempts are needed before you get
the right data.
To avoid all this, keys are defined to easily identify any row of data in a table.
Let's try to understand about all the keys using a simple example.
Let's take a simple Student table, with fields student_id, name, phone and age.

Student_id name phone age


1 Akon 9876723452 17
2 Akon 9991165674 19
3 Bkon 7898756543 18
4 Ckon 8987867898 19
5 Dkon 9990080080 17
Super Key
A super key is defined as a set of attributes within a table that can uniquely identify each record
within the table. A super key is a superset of a candidate key.
In the table defined above, super keys would include student_id, (student_id, name), phone, etc.
Confused? The first one is pretty simple: student_id is unique for every row of data,
hence it can be used to identify each row uniquely.
Next comes (student_id, name). The names of two students can be the same, but their student_id
values cannot be, hence this combination can also be a key.
Similarly, the phone number for every student will be unique, hence phone can also be a
key. So they are all super keys.
Candidate Key
A candidate key is a minimal set of fields which can uniquely identify each
record in a table. It is an attribute or a set of attributes that can act as a Primary Key for a
table to uniquely identify each record in that table. There can be more than one candidate key.
In our example, student_id and phone are both candidate keys for the table Student.
 A candidate key can never be NULL or empty, and its value should be unique.
 There can be more than one candidate key for a table.
 A candidate key can be a combination of more than one column (attribute).
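The difference between a super key and a candidate key (uniqueness versus minimal uniqueness) can be checked against the sample Student rows. This is a sketch with assumed function names; it tests the rows it is given, so it can show that an attribute set is not a key, but a real key is a design rule that must hold for every possible row.

```python
from itertools import combinations

ATTRS = ("student_id", "name", "phone", "age")
STUDENT = [
    (1, "Akon", "9876723452", 17),
    (2, "Akon", "9991165674", 19),
    (3, "Bkon", "7898756543", 18),
    (4, "Ckon", "8987867898", 19),
    (5, "Dkon", "9990080080", 17),
]

def is_superkey(attrs):
    """True if no two rows agree on all of the given attributes."""
    positions = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[p] for p in positions) for row in STUDENT]
    return len(set(projected)) == len(STUDENT)

def is_candidate_key(attrs):
    """A candidate key is a super key with no smaller super key inside it."""
    if not is_superkey(attrs):
        return False
    return not any(
        is_superkey(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_superkey(("student_id",)), is_superkey(("student_id", "name")))
print(is_superkey(("name",)))
print(is_candidate_key(("student_id", "name")), is_candidate_key(("phone",)))
```

In these five rows the pair (name, age) also happens to be unique, so the check would accept it; that is a coincidence of the sample and not a rule of the schema, which is why keys are declared by the designer rather than inferred from data.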
Primary Key
A primary key is a column or a combination of columns in a relation that helps us
uniquely identify a row in that particular table. There can be no duplicates in a primary
key, meaning that no two rows in the table can have the same value. We have a few rules for
choosing a key as the primary key. They are:
 The primary key field cannot be left NULL; the primary key column must always
hold a value.
 No two rows in the table can have identical values for that column.
 If a foreign key refers to the primary key, then a referenced value in this primary key
column should not be altered or modified, since that would break the reference.
A primary key is the candidate key that the database designer selects while designing the
database, usually the one most appropriate to become the main key of the
table. It is a key that can uniquely identify each record in a table.
For the table Student we can make the student_id column the primary key.
Composite Key
A key that consists of two or more attributes that together uniquely identify any record in a table is
called a composite key. The attributes which together form the composite key are not
keys independently or individually.

Consider a Score table which stores the marks scored by a student in a
particular subject.
In this table student_id and subject_id together form the primary key, hence it is a
composite key.
Alternate Key:
A candidate key other than the primary key is called an alternate key. For example,
student_id and phone are both candidate keys for the Student table; if student_id is chosen
as the primary key, phone is an alternate key.
Foreign Key
Foreign keys are the columns of a table that point to the primary key of another table. They
act as a cross-reference between tables.
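A composite key and a foreign key can be checked together on a small Score table. The sketch below is hypothetical: the Score rows, the subject identifiers and the function names are made up for illustration; only the column names student_id and subject_id come from the text.

```python
# Primary key values of the Student table (the parent table).
STUDENT_IDS = {1, 2, 3, 4, 5}

# Score rows as (student_id, subject_id, marks). Illustrative data only.
SCORE = [
    (1, "S1", 70),
    (1, "S2", 82),
    (2, "S1", 65),
]

def composite_key_is_unique(rows):
    """(student_id, subject_id) together must be unique across all rows."""
    keys = [(student_id, subject_id) for student_id, subject_id, _ in rows]
    return len(keys) == len(set(keys))

def foreign_key_violations(rows, parent_keys):
    """Rows whose student_id does not exist in the parent table."""
    return [row for row in rows if row[0] not in parent_keys]

print(composite_key_is_unique(SCORE))
print(foreign_key_violations(SCORE, STUDENT_IDS))
print(foreign_key_violations(SCORE + [(9, "S1", 50)], STUDENT_IDS))
```

Neither student_id nor subject_id is unique on its own in SCORE, which is why only the combination can serve as the key.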
Advantages of ER Diagram
These are listed below:
 Offers a visual representation of the overall structure.
 Helps database designers create an efficient design/architecture.
 Helps to show the flow of data and the working of the entire system.
 Acts as a blueprint for the existing database.
 Aids in effective communication, as readers are able to understand relationships
among different fields.
 Provides flexibility in establishing and deriving relationships from the existing ones.
 Good support for DBMS.
Disadvantages of ER Diagram
These are listed below:
 Expression is limited.
 Can be sometimes ambiguous.
 It may not be always concise.
 There are no industry-defined standards for documentation, so it may be confusing
sometimes.
 Hard to display information control.
 Some data may be lost or covered up.

Examples of ER diagram
ER diagram of Bank Management System
An ER diagram is an Entity-Relationship diagram. It is used to analyze the structure of the
database. It shows relationships between entities and their attributes. An ER model provides
a means of communication.

ER diagram of Bank has the following description :


 A bank has customers.
 Banks are identified by a name, code and address of the main office.
 Banks have branches.
 Branches are identified by a branch_no., branch_name and address.
 Customers are identified by name, cust-id, phone number and address.
 A customer can have one or more accounts.
 Accounts are identified by acc_no., acc_type and balance.
 A customer can avail loans.
 Loans are identified by loan_id, loan_type and amount.
 Accounts and loans are related to the bank's branch.
ER Diagram of Bank Management System :

Table for Bank


Name Code Address

Table for Branch


Branch_Id Name Address

Table for Loan


Loan_Id Loan_Type Amount

Table for Account


Account_number Account_Type Balance

Customer Table:
Customer_id Name Address Phone
Entities and their Attributes are :
 Bank Entity : Attributes of Bank Entity are Bank Name, Code and Address.
Code is Primary Key for Bank Entity.
 Customer Entity: Attributes of Customer Entity are Customer_id, Name, Phone
Number and Address. Customer_id is Primary Key for Customer Entity.
 Branch Entity: Attributes of Branch Entity are Branch_id, Name and Address.
Branch_id is Primary Key for Branch Entity.
 Account Entity: Attributes of Account Entity are Account_number, Account_Type
and Balance.
Account_number is Primary Key for Account Entity.
 Loan Entity: Attributes of Loan Entity are Loan_id, Loan_Type and Amount.
Loan_id is Primary Key for Loan Entity.
Relationships are:
 Bank has Branches => 1 : N
One Bank can have many Branches but one Branch cannot belong to many Banks, so
the relationship between Bank and Branch is a one-to-many relationship.
 Branch maintains Accounts => 1 : N
One Branch can have many Accounts but one Account cannot belong to many
Branches, so the relationship between Branch and Account is a one-to-many
relationship.
 Branch offers Loans => 1 : N
One Branch can have many Loans but one Loan cannot belong to many Branches, so
the relationship between Branch and Loan is a one-to-many relationship.
 Account held by Customers => M : N
One Customer can have more than one Account and one Account can be held
by one or more Customers, so the relationship between Account and Customer is a
many-to-many relationship.
 Loan availed by Customer => M : N
(Assume a loan can be jointly held by many Customers.)
One Customer can have more than one Loan and one Loan can be availed by
one or more Customers, so the relationship between Loan and Customer is a many-to-
many relationship.
ER diagram for College:

Table Structure ER diagram for College

ER Diagram Design Issues


Here are some of the issues that can occur during the ER diagram design process:
1. Choosing Entity Set vs Attributes
Here we discuss how choosing an entity set vs an attribute can change the whole ER
design semantics. To understand this, let's take an example. Say we have an entity set
Student with attributes such as student-name and student-id. Alternatively, student-id
itself can be an entity with attributes like student-class and student-section.
If we compare the two cases, in the first case the
student can have only one student id, whereas in the second case, choosing student id
as an entity implies that a student can have more than one student id.
2. Choosing Entity Set vs. Relationship Sets
It is often hard to decide whether an object is best represented by an entity set or a relationship set.
To make this choice, the designer
needs to consider whether the object would need a new relationship if a requirement
arises in future; if so, it is better to choose an entity set rather than a relationship
set.
Let's take an example to understand it better: A person takes a loan from a bank. Here we
have two entities, person and bank, and their relationship is loan. This is fine until there is a
need to disburse a joint loan; in that case a new relationship needs to be created to define the
relationship between the two individuals who have taken the joint loan. In this scenario, it is
better to choose loan as an entity set rather than a relationship set.

3. Choosing Binary vs n-ary Relationship Sets


In most cases, the relationships described in an ER diagram are binary. n-ary
relationships are those that involve more than two entity sets; if there are only two entity sets,
their relationship is termed a binary relationship.
n-ary relationships can make an ER design complex; however, the good news is that we can
convert and represent any n-ary relationship using multiple binary relationships.
This may sound confusing, so let's take an example to understand how we can convert an n-ary
relationship to multiple binary relationships. Say we have to describe a relationship
between four family members: father, mother, son and daughter. This can easily be
represented in the form of multiple binary relationships: the father-mother relationship as “spouse”,
the son-daughter relationship as “siblings” and the relationship of father and mother with their child
as “child”.
4. Placing Relationship Attributes
The cardinality ratio can help us determine where to place
relationship attributes. It is recommended to represent the attributes of one-to-one or one-to-
many relationship sets with one of the participating entity sets rather than with the relationship set.
However, if an attribute cannot be determined by a single entity and is instead determined by
the combination of participating entity sets, it is better to associate that attribute
with the many-to-many relationship set.
Extended E-R Diagram
Importance of Enhanced ER Diagram
These are listed below:
 Enhanced Entity Relationship (EER) diagrams represent an expanded version of
ER diagrams.
 EER models are more helpful while designing databases with high-level models.
 With more enhanced features, databases can be planned more efficiently by delving
into the properties and constraints with much more precision.
 It aids you in having a more detailed look at your information.
 If the database would contain a larger amount of data, then it is advisable to switch to
the enhanced model to gain a deeper understanding of your model.
Extended E-R Features
Although the basic E-R concepts can model most database features, some aspects of a database may be
more aptly expressed by certain extensions to the basic E-R model. The extended E-R
features are specialization, generalization, higher- and lower-level entity sets, attribute
inheritance, and aggregation.
1. Specialization
An entity set may include subgroupings of entities that are distinct in some way from other
entities in the set. For instance, a subset of entities within an entity set may have attributes
that are not shared by all the entities in the entity set. The E-R model provides a means for
representing these distinctive entity groupings.
Consider an entity set person, with attributes name, street, and city. A person may be further
classified as one of the following:
• customer
• employee
Each of these person types is described by a set of attributes that includes all the attributes of
entity set person plus possibly additional attributes. For example, customer entities may be
described further by the attribute customer-id, whereas employee entities may be described
further by the attributes employee-id and salary. The process of designating subgroupings
within an entity set is called specialization. The specialization of person allows us to
distinguish among persons according to whether they are employees or customers.
2. Generalization
The refinement from an initial entity set into successive levels of entity subgroupings
represents a top-down design process in which distinctions are made explicit. The design
process may also proceed in a bottom-up manner, in which multiple entity sets are
synthesized into a higher-level entity set on the basis of common features. The database
designer may have first identified a customer entity set with the attributes name, street, city,
and customer-id, and an employee entity set with the attributes name, street, city, employee-
id, and salary.
There are similarities between the customer entity set and the employee entity set in the
sense that they have several attributes in common. This commonality can be
expressed by generalization, which is a containment relationship that exists between a
higher-level entity set and one or more lower-level entity sets. In our example, person is the
higher-level entity set and customer and employee are lower-level entity sets. Higher- and
lower-level entity sets also may be designated by the terms superclass and subclass,
respectively. The person entity set is the superclass of the customer and employee subclasses.
For all practical purposes, generalization is a simple inversion of specialization.
3. Attribute Inheritance
A crucial property of the higher- and lower-level entities created by specialization and
generalization is attribute inheritance. The attributes of the higher-level entity sets are said to
be inherited by the lower-level entity sets. For example, customer and employee inherit the
attributes of person. Thus, customer is described by its name, street, and city attributes, and
additionally a customer-id attribute; employee is described by its name, street, and city
attributes, and additionally employee-id and salary attributes.
A lower-level entity set (or subclass) also inherits participation in the relationship sets in
which its higher-level entity set (or superclass) participates. For example, the officer, teller,
and secretary entity sets, which are lower-level entity sets of employee, can participate in the
works-for relationship set, since the superclass employee participates in the works-for
relationship. Attribute inheritance applies through all tiers of lower-level entity sets, so these
entity sets can also participate in any relationship in which the person entity set participates.
Whether a given portion of an E-R model was arrived at by specialization or generalization,
the outcome is basically the same:
• A higher-level entity set with attributes and relationships that apply to all of its lower-level
entity sets
• Lower-level entity sets with distinctive features that apply only within a particular lower-
level entity set
Figure 2.17 depicts a hierarchy of entity sets. In the figure, employee is a lower-level entity
set of person and a higher-level entity set of the officer, teller, and secretary entity sets. In a
hierarchy, a given entity set may be involved as a lower-level entity set in only one ISA
relationship; that is, entity sets in this diagram have only single inheritance. If an entity set is a
lower-level entity set in more than one ISA relationship, then the entity set has multiple
inheritance, and the resulting structure is said to be a lattice.
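The hierarchy described above can be sketched as a small Python dictionary in which every entity set lists its own attributes and its higher-level entity sets. The attributes given to officer, teller, and secretary are placeholders assumed for illustration (the text does not list them), and the helper names are hypothetical.

```python
# Each entity set maps to its own attributes and its higher-level (parent) sets.
HIERARCHY = {
    "person": {"parents": [], "attributes": ["name", "street", "city"]},
    "customer": {"parents": ["person"], "attributes": ["customer_id"]},
    "employee": {"parents": ["person"], "attributes": ["employee_id", "salary"]},
    # The attributes of the three sets below are assumed for this sketch.
    "officer": {"parents": ["employee"], "attributes": ["office_number"]},
    "teller": {"parents": ["employee"], "attributes": ["station_number"]},
    "secretary": {"parents": ["employee"], "attributes": ["hours_worked"]},
}


def all_attributes(entity_set, hierarchy=HIERARCHY):
    """Collect inherited and own attributes, higher-level sets first."""
    collected = []
    for parent in hierarchy[entity_set]["parents"]:
        for attr in all_attributes(parent, hierarchy):
            if attr not in collected:
                collected.append(attr)
    for attr in hierarchy[entity_set]["attributes"]:
        if attr not in collected:
            collected.append(attr)
    return collected


def is_lattice(hierarchy=HIERARCHY):
    """True if some entity set is a lower-level set in more than one ISA relationship."""
    return any(len(node["parents"]) > 1 for node in hierarchy.values())


officer_attrs = all_attributes("officer")
print(officer_attrs)
print(is_lattice())
```

Because every entity set here has at most one parent, the structure is a single-inheritance hierarchy; giving any entity set two parents would make `is_lattice` return True.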
Converting ER to Tables
• Convert entity sets and relationships to tables
• Convert all attributes to columns
• Add the primary key attributes of the participating entity sets to the relationship table as columns
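The last rule can be sketched with Python's built-in sqlite3 module. The Department entity set, the Works_In relationship, and all column names and types are hypothetical, introduced only to show a relationship table that takes the primary keys of the participating entity sets as its columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    Emp_Id INTEGER PRIMARY KEY,
    Name   TEXT
);
CREATE TABLE Department (
    Dept_Id   INTEGER PRIMARY KEY,
    Dept_Name TEXT
);
-- Relationship table: its columns are the primary keys of both entity sets.
CREATE TABLE Works_In (
    Emp_Id  INTEGER REFERENCES Employee(Emp_Id),
    Dept_Id INTEGER REFERENCES Department(Dept_Id),
    PRIMARY KEY (Emp_Id, Dept_Id)
);
""")

# Column names are the second field of each PRAGMA table_info row.
relationship_columns = [row[1] for row in conn.execute("PRAGMA table_info(Works_In)")]
print(relationship_columns)
```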
ER Diagram to Table Conversion
We have already covered ER diagrams and ER design issues. Here we will cover how to
convert an ER diagram into database tables.
First we will convert simple ER diagrams to tables. In the end, we will take a complex
ER diagram and convert it into a set of tables.
1. Strong Entity set with Simple attributes
A strong entity set becomes a table, and the attributes of the entity set become the
attributes (columns) of the table. The key attribute of the entity set becomes the primary key
of the table.
Let's take an example. Here we have an entity set Employee with the attributes Name, Age,
Emp_Id and Salary. When we convert this ER diagram to a table, the entity set becomes a
table named “Employee”, as shown in the following diagram, and the attributes of the entity
set become the attributes of the table.
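A minimal sketch of this conversion using Python's built-in sqlite3 module is given below. It assumes Emp_Id is the key attribute; the column types and the sample row are also assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Employee (
    Emp_Id INTEGER PRIMARY KEY,  -- key attribute (assumed) becomes the primary key
    Name   TEXT,
    Age    INTEGER,
    Salary REAL
)
""")
conn.execute("INSERT INTO Employee VALUES (?, ?, ?, ?)", (1, "Ankit", 30, 50000.0))

# A second row with the same key violates the primary key constraint.
try:
    conn.execute("INSERT INTO Employee VALUES (?, ?, ?, ?)", (1, "Other", 25, 40000.0))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT Emp_Id, Name, Age, Salary FROM Employee").fetchall()
print(rows, duplicate_rejected)
```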
2. Strong Entity Set With Composite Attributes
Now we will see how to convert a strong entity set with composite attributes to a table. The
conversion is fairly simple in this case as well. The entity set becomes the table, and the
simple attributes that make up each composite attribute become attributes of the table, while
the composite attribute itself is ignored during conversion.
Let's take an example. As you can see, we have a composite attribute Name, and this
composite attribute has two simple attributes, First_N and Last_N. While converting this
ER diagram to a table, we have not used the composite attribute itself in the table; instead we
have used the simple attributes of this composite attribute as the table's attributes.
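The composite case might look as follows with sqlite3. The Emp_Id key, the column types, and the sample name are assumptions for the sketch; only First_N and Last_N come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite attribute Name gets no column of its own;
# only its simple parts First_N and Last_N are stored.
conn.execute("""
CREATE TABLE Employee (
    Emp_Id  INTEGER PRIMARY KEY,
    First_N TEXT,
    Last_N  TEXT
)
""")
conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", (1, "Ankit", "Sharma"))

columns = [row[1] for row in conn.execute("PRAGMA table_info(Employee)")]

# The full name can still be rebuilt from its parts at query time.
full_name = conn.execute(
    "SELECT First_N || ' ' || Last_N FROM Employee WHERE Emp_Id = ?", (1,)
).fetchone()[0]
print(columns, full_name)
```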
3. Strong Entity Set With Multi Valued Attributes
An entity set with multi-valued attributes requires two tables in the relational model.
We will understand this conversion with the help of a diagram. Let's take the same example
that we have seen above, now with a new multi-valued attribute Dept. An employee can
work in multiple departments, so the Dept attribute is marked as multi-valued. Whenever we
have a multi-valued attribute, more than one table is needed to represent the ER diagram. As
you can see, we have created two tables to represent this ER diagram.
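One common way to lay out the two tables is sketched below with sqlite3: the main table keeps the single-valued attributes, and a second table stores one row per (employee, department) pair. The table name Employee_Dept, the Emp_Id key, the column types, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    Emp_Id  INTEGER PRIMARY KEY,
    First_N TEXT,
    Last_N  TEXT
);
-- One row per value of the multi-valued attribute Dept.
CREATE TABLE Employee_Dept (
    Emp_Id INTEGER NOT NULL REFERENCES Employee(Emp_Id),
    Dept   TEXT NOT NULL,
    PRIMARY KEY (Emp_Id, Dept)
);
""")
conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", (1, "Ankit", "Sharma"))
conn.executemany(
    "INSERT INTO Employee_Dept VALUES (?, ?)",
    [(1, "Sales"), (1, "Support")],
)

depts = [
    row[0]
    for row in conn.execute(
        "SELECT Dept FROM Employee_Dept WHERE Emp_Id = ? ORDER BY Dept", (1,)
    )
]
print(depts)
```

The composite primary key (Emp_Id, Dept) lets one employee appear with several departments while preventing the same pair from being stored twice.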