
Module 1 Final

DBMS VTU MODULE 1

Introduction to DBMS

MODULE 1

INTRODUCTION TO DATABASE MODELLING

Databases and Database Users
1. Introduction
2. An Example
3. Characteristics of the Database Approach
4. Advantages of Using the DBMS Approach
5. A Brief History of Database Applications
6. Next Gen Databases (Graph DB, Vector DB)

Overview of Database Languages and Architectures
1. Data Models, Schemas, and Instances
2. Three-Schema Architecture and Data Independence
3. Database Languages and Interfaces
4. The Database System Environment

Conceptual Data Modelling using Entities and Relationships
1. Entity Types, Entity Sets, Attributes, Roles
2. Relationship Types, Relationship Sets, Roles, and Structural Constraints
3. Weak Entity Types
4. Refining the ER Design for the COMPANY Database
5. ER Diagrams, Naming Conventions, and Design Issues
6. Example

Sudarsanan D Assistant Professor, CITECH-ISE. Page 1



1.1 INTRODUCTION

“Good decisions require good information that is derived from raw facts.”
These raw facts are known as data. Data are likely to be managed most
efficiently when they are stored in a database.

What is Data?
Data are known raw facts that can be recorded and that have implicit meaning.
For example, consider the names, telephone numbers, and addresses of the people you know.
Note: The word raw indicates that the facts have not yet been processed to
reveal their meaning.

What is Information?
Information is the result of processing raw data to reveal its meaning. Data processing
can be as simple as organizing data to reveal patterns or as complex as making forecasts or
drawing inferences using statistical modeling.

Why Databases:
Imagine trying to operate a business without knowing who your customers are,
what products you are selling, who is working for you, who owes you money, and whom
you owe money. All businesses have to keep this type of data and much more; and just as
importantly, they must have those data available to decision makers when they need them.
It can be argued that the ultimate purpose of all business information systems is to help
businesses use information as an organizational resource. At the heart of all of these
systems are the collection, storage, aggregation, manipulation, dissemination, and
management of data.

What is a Database? Explain.


A database is a collection of related data. It is a collection of large volumes of facts and
figures stored in an orderly manner. A database has the following implicit properties:
i. A database represents some aspect of the real world (sometimes called the mini-world).
ii. It is a logically coherent collection of data with some inherent meaning.
iii. A database is designed, built, and populated with data for a specific purpose.
It has an intended group of users.


What is a Database Management System (DBMS)?
(or)
What are the four main types of actions involved in databases? Briefly
discuss each.
(or)
What does defining, constructing, manipulating, and sharing of a database
mean?
A database management system (DBMS) is a collection of programs that enables users to
create and maintain a database. The DBMS is a general-purpose software system that
facilitates the processes of defining, constructing, manipulating, and sharing databases
among various users and applications.
Defining: Defining a database involves specifying the data types, structures, and constraints of the data
to be stored in the database.
Constructing: Constructing the database is the process of storing the data on some storage medium that
is controlled by the DBMS.
Manipulating: Manipulating a database includes functions such as querying the database to retrieve
specific data, updating the database to reflect changes in the mini-world, and generating
reports from the data.
Sharing: Sharing a database allows multiple users and programs to access the database
simultaneously.
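
The first three actions can be sketched in a few lines. This is a minimal illustration only, assuming Python's built-in sqlite3 module as a stand-in for a full DBMS; the student table, its columns, and the sample rows are hypothetical and do not come from the notes. Sharing is not shown, since it needs several concurrent connections.

```python
import sqlite3

def run_demo():
    # In-memory SQLite database; SQLite stands in for a full DBMS here.
    conn = sqlite3.connect(":memory:")

    # Defining: specify data types, structure, and constraints.
    conn.execute(
        "CREATE TABLE student ("
        " roll_no INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " semester INTEGER CHECK (semester BETWEEN 1 AND 8))"
    )

    # Constructing: store the data on a medium controlled by the DBMS.
    conn.executemany(
        "INSERT INTO student (roll_no, name, semester) VALUES (?, ?, ?)",
        [(1, "Asha", 3), (2, "Ravi", 5), (3, "Meena", 3)],
    )

    # Manipulating: update to reflect a change in the mini-world, then query.
    conn.execute("UPDATE student SET semester = 4 WHERE roll_no = 1")
    rows = conn.execute(
        "SELECT name FROM student WHERE semester = 3 ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]

third_semester = run_demo()
```

After the update, only one of the hypothetical students remains in semester 3, so the query returns a single name.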

1.2 DATABASE-SYSTEM APPLICATIONS (AN EXAMPLE)

Databases are widely used. Here are some representative applications:


• Enterprise Information
➢ Sales: For customer, product, and purchase information.
➢ Accounting: For payments, receipts, account balances, assets and other
accounting information.
➢ Human resources: For information about employees, salaries, payroll taxes,
and benefits, and for generation of paychecks.
➢ Manufacturing: For management of the supply chain and for tracking
production of items in factories, inventories of items in warehouses and
stores, and orders for items.
➢ Online retailers: For sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product
evaluations.
• Banking and Finance
➢ Banking: For customer information, accounts, loans, and banking
transactions.
➢ Credit card transactions: For purchases on credit cards and generation of
monthly statements.
➢ Finance: For storing information about holdings, sales, and purchases of
financial instruments such as stocks and bonds; also for storing real-time
market data to enable online trading by customers and automated trading
by the firm.
• Universities: For student information, course registrations, and grades (in
addition to standard enterprise information such as human resources and
accounting).
• Airlines: For reservations and schedule information. Airlines were among the
first to use databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly
bills, maintaining balances on prepaid calling cards, and storing information
about the communication networks.

1.3 CHARACTERISTICS OF THE DATABASE APPROACH


Discuss the main characteristics of the database approach and how it differs from
the traditional file system.

A number of characteristics distinguish the database approach from the traditional
approach of programming with files:
1) In traditional file processing, each user defines and implements the files needed
for a specific application. Each user maintains separate files, which promotes
redundancy, wastes valuable storage space, and leads to inconsistency.
The database approach, on the other hand, maintains a single repository of data
that can be accessed by various users. Hence it reduces redundancy and
inconsistency.
Fig: Simple file system

Fig: Contrasting database and file systems

2) Self-describing nature of the database system

A traditional file processing system does not store a description of its own
data. The database approach, however, stores not only the database itself but also
a complete description of the database structure and constraints in a
“catalog”. The information stored in the catalog is called “metadata”.
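
To make the catalog idea concrete, the sketch below reads metadata back from a database. It assumes SQLite (through Python's sqlite3 module), whose catalog is a table named sqlite_master; other DBMSs expose their catalogs differently. The course table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")

# sqlite_master is SQLite's catalog table: it holds metadata, not user data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'course'"
).fetchall()

# PRAGMA table_info lists each column's position, name, declared type, and so on.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
```

No rows were inserted into course, yet the DBMS can already describe the table. That description is the metadata.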


3) Insulation between Programs and Data, and Data Abstraction

In the traditional file processing approach, the data definition is part of the application
program. Hence a program can work with only one specific database.
In the database approach, however, the data definition is stored in the DBMS catalog,
separately from the access programs. This property is called “program-data
independence”. Further, application programs can operate on the data by invoking
operations (functions) regardless of how these operations are implemented. This is
termed “program-operation independence”.

The characteristic that allows program-data independence and
program-operation independence is called data abstraction.

4) Support of Multiple Views of the Data

A traditional file processing approach supports a single view of the data. The
database approach supports multiple views of the data: a database typically has
many users, each of whom may require a different view of the database. Hence
the DBMS approach provides facilities for defining multiple views.
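
A small sketch of multiple views over the same stored data follows, assuming SQLite through Python's sqlite3 module. The employee table and the two view names (directory, payroll) are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Asha', 'ISE', 50000), (2, 'Ravi', 'CSE', 60000);

    -- View for a directory user: names and departments, no salaries.
    CREATE VIEW directory AS SELECT name, dept FROM employee;

    -- View for a payroll user: total salary per department.
    CREATE VIEW payroll AS SELECT dept, SUM(salary) AS total FROM employee GROUP BY dept;
""")
directory_rows = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll ORDER BY dept").fetchall()
```

Both views are derived from one employee table, so neither user group keeps a separate copy of the data.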

5) Sharing of data and multi-user transaction processing

The traditional file processing approach did not support sharing of data. The
database approach supports sharing of data as well as multi-user
transactions. For this, the DBMS includes features such as concurrency control to
ensure that several users trying to update the same data do so in a controlled
manner. It also enforces transaction properties such as isolation and atomicity.

1.4 ADVANTAGES OF USING THE DBMS APPROACH

What are the advantages of using a DBMS approach? (or) Discuss the capabilities
that must be provided by a DBMS.

i. Controlling Redundancy in data storage. Storing the same
data multiple times leads to several problems. First, there is the need to perform
a single logical update (such as entering data on a new student) multiple times,
which leads to duplication of effort. Second, storage space is wasted when the same
data is stored repeatedly, and this problem may be serious for large databases.
Third, files that represent the same data may become inconsistent. This may happen
because an update is applied to some of the files but not to others.


ii. Restricting unauthorized access to data. When multiple users share a large
database, it is likely that most users will not be authorized to access all
information in the database. For example, financial data is often considered
confidential, and only authorized persons are allowed to access it. In addition,
some users may only be permitted to retrieve data,
whereas others are allowed to retrieve and update. A DBMS should provide a
security and authorization subsystem.

iii. Providing Persistent Storage for Program Objects. Databases can be used to
provide persistent storage for program objects and data structures. The values
of program variables or objects are discarded once a program terminates, unless
the programmer explicitly stores them in permanent files, which often involves
converting these complex structures into a format suitable for file storage.

iv. The persistent storage of program objects and data structures is an important
function of database systems. Traditional database systems often suffered from the
so-called impedance mismatch problem, because the data structures provided by the
DBMS were incompatible with the data structures of the programming language.

v. Providing Storage Structures and Search Techniques for Efficient Query
Processing. Database systems must provide capabilities for efficiently executing
queries and updates. Because the database is typically stored on disk, the DBMS
must provide specialized data structures and search techniques to speed up disk
search for the desired records. Auxiliary files called indexes are used for this
purpose.
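
The effect of an index can be observed by asking the DBMS how it plans to run a query. The sketch below assumes SQLite through Python's sqlite3 module and its EXPLAIN QUERY PLAN statement; the table, the generated rows, and the index name idx_student_city are hypothetical. The exact wording of the plan text varies between SQLite versions, so the code only keeps the text and does not depend on a particular phrasing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"student{i}", f"city{i % 10}") for i in range(1000)],
)

def plan_for(query):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

query = "SELECT name FROM student WHERE city = 'city3'"
plan_before = plan_for(query)   # no index on city yet
conn.execute("CREATE INDEX idx_student_city ON student (city)")
plan_after = plan_for(query)    # the planner may now use the new index
matching_rows = conn.execute(query).fetchall()
```

The query text and its result are the same before and after the index is created. Only the access path chosen by the DBMS changes.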

vi. Providing Backup and Recovery. A DBMS must provide facilities for recovering
from hardware or software failures. The backup and recovery subsystem of
the DBMS is responsible for recovery.

vii. For example, if the computer system fails in the middle of a complex update
transaction, the recovery subsystem is responsible for making sure that the
database is restored to the state it was in before the transaction started executing.
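
The all-or-nothing behaviour described above can be imitated on a small scale. This is only a sketch, assuming SQLite through Python's sqlite3 module: the failure is simulated with an exception rather than a real system crash, and the account table and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.commit()
    except Exception:
        conn.rollback()  # undo the partial update
        raise

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))
```

Calling transfer(conn, 1, 2, 200, fail_midway=True) raises the simulated error and leaves both balances as they were, while the same call without the failure flag moves the amount.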

viii. Providing Multiple User Interfaces. Because many types of users with varying
levels of technical knowledge use a database, a DBMS should provide a variety of
user interfaces. Forms-style interfaces and menu-driven interfaces are
commonly known as graphical user interfaces (GUIs). Many specialized
languages and environments exist for specifying GUIs.

ix. Representing Complex Relationships among Data. A database may include
numerous varieties of data that are interrelated in many ways. A DBMS must
have the capability to represent a variety of complex relationships among the
data, to define new relationships as they arise, and to retrieve and update
related data easily and efficiently.


x. Enforcing Integrity Constraints. Most database applications have certain
integrity constraints that must hold for the data. A DBMS should provide
capabilities for defining and enforcing these constraints. The simplest type of
integrity constraint involves specifying a data type for each data item.
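
A short sketch of constraint enforcement follows, assuming SQLite through Python's sqlite3 module. SQLite is flexibly typed by default, so the data type rule is written here as an explicit CHECK using typeof; many other DBMSs enforce declared column types directly. The student table and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " semester INTEGER NOT NULL"
    " CHECK (typeof(semester) = 'integer' AND semester BETWEEN 1 AND 8))"
)

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, try_insert((1, "Asha", 3)) is accepted, while a row with a missing name, a semester of 9, a non-numeric semester, or a repeated roll_no is rejected by the DBMS rather than by application code.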

1.5 A BRIEF HISTORY OF DATABASE APPLICATIONS

▪ Early Database Applications:


• The Hierarchical and Network Models were introduced in the mid-1960s and
dominated during the seventies.
A hierarchical database model is a data model in which the data are
organized into a tree-like structure. The data are stored as records which are
connected to one another through links.

(Hierarchical database model)

Disadvantages

• When a user needs to store a record in a child table that is not related to any
record in a parent table, the record is difficult to store: the user must first add an
entry to the parent table.
• This type of database cannot support complex relationships, and there is also a problem
of redundancy, which can produce inaccurate information when the same
data is recorded inconsistently at various sites.


Network model (popular on mainframe computers)

Disadvantages

The disadvantages of the network model are as follows:
• The database contains a complex array of pointers.
• System complexity limits efficiency.
• Structural changes require changes in all application programs.
• Navigational access makes implementation and management complex.
• The complex structure puts heavy pressure on programmers.
• Any change, such as an update, deletion, or insertion, is very complex.

▪ Relational Model based Systems:
• The relational model was originally introduced in 1970 and was heavily researched
and experimented with at IBM Research and several universities.
• Relational DBMS products emerged in the 1980s.
▪ Object-oriented and emerging applications:
• Object-Oriented Database Management Systems (OODBMSs) were
introduced in the late 1980s and early 1990s to cater to the need for complex data
processing in CAD and other applications. Their use has not taken off much.
• Many relational DBMSs have incorporated object database concepts, leading to a
new category called object-relational DBMSs (ORDBMSs).
• Extended relational systems add further capabilities (e.g. for multimedia data, XML,
and other data types).
▪ Data on the Web and E-commerce Applications:
• The Web contains data in HTML (Hypertext Markup Language) with links among pages.
• This has given rise to a new set of applications, and E-commerce is using new
standards like XML (eXtensible Markup Language).
• Script programming languages such as PHP and JavaScript allow generation of
dynamic Web pages that are partially generated from a database. They also allow
database updates through Web pages.
▪ New functionality is being added to DBMSs in the following areas:
• Scientific Applications
• XML (eXtensible Markup Language)
• Image Storage and Management
• Audio and Video data management
• Data Warehousing and Data Mining
• Spatial data management
• Time Series and Historical Data Management
▪ The above gives rise to new research and development in incorporating new data
types, complex data structures, new operations, and storage and indexing schemes
in database systems.

1.6 NEXT GENERATION DATABASES (GRAPH DB, VECTOR DB)

Vector Database (Vector DB)

Purpose

A Vector Database (Vector DB) is optimized for storing and querying high-dimensional
vector embeddings generated from different data types, such as text, images, or numerical
data, using models like BERT, Sentence Transformers, or CLIP.

Usage in RAG

In a retrieval-augmented generation (RAG) setup, a query (such as a text input) is
transformed into a vector representation
using an encoder model. This vector is then matched against pre-stored vectors within the
database to identify the most relevant documents or pieces of information. These retrieved
documents are subsequently fed into the generative model to craft a response.

Key Components

• Vector Storage: Efficiently stores high-dimensional vectors, enabling rapid access.

• Similarity Search: Uses fast nearest-neighbor search algorithms, such as
Approximate Nearest Neighbor (ANN) search, for quick and efficient retrieval of relevant
data.

• Indexing Structures: Employs indexing mechanisms such as Hierarchical Navigable
Small World (HNSW) or Inverted File (IVF) to optimize the search process, ensuring
swift retrieval of the most relevant vectors.
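
The matching step can be sketched without any vector database at all. The code below is a plain-Python illustration of exact (brute-force) nearest-neighbor search using cosine similarity. It is not ANN, HNSW, or IVF, which are approximations that avoid comparing the query against every stored vector. The three-dimensional vectors and document names are hypothetical; real encoder models produce vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Exact nearest-neighbor search: score every stored vector, keep the best k."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings standing in for the output of an encoder model.
store = {
    "doc_sql": [0.9, 0.1, 0.0],
    "doc_er_model": [0.7, 0.6, 0.1],
    "doc_cricket": [0.0, 0.1, 0.9],
}
nearest = top_k([1.0, 0.2, 0.0], store, k=2)
```

The query vector points in nearly the same direction as doc_sql and doc_er_model, so those two are returned and doc_cricket is left out. In a RAG setup, the returned documents would then be passed to the generative model.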

Benefits

• Scalability: Can handle millions to billions of vectors efficiently, making it suitable
for large-scale applications.

• Speed: Optimized for real-time applications with fast similarity search capabilities.

• Precision: High accuracy in retrieving semantically similar data, enhancing the
quality of the generated responses.

Drawbacks

• Complexity: Requires specialized knowledge to fine-tune vector encoding and
indexing for optimal performance.

• Limited Contextual Relationships: Primarily focuses on similarity matching,
lacking the ability to understand complex relationships between data points beyond
their vector space proximity.

Graph Database (Graph DB)

Purpose

Graph Databases (Graph DB) are designed to manage data in terms of entities (nodes) and
relationships (edges) between these entities. This design is ideal for applications where
understanding relationships between data points is crucial.

Usage in RAG

In a RAG application, a Graph DB can retrieve contextually relevant information based on
relationships between entities. For example, if a query involves a specific entity, the Graph
DB can retrieve not just the entity itself but also its related entities, providing more
comprehensive context for the generative model.

Key Components

• Nodes: Represent entities like documents, concepts, or individuals.

• Edges: Represent the relationships between entities, such as "is author of," "related
to," or "belongs to."

• Traversal Algorithms: Employ efficient graph traversal techniques to explore and
retrieve data connected through relationships.
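
Nodes, edges, and traversal can be sketched with an ordinary adjacency list. The code below is plain Python rather than a real graph database, and it uses breadth-first search as one common traversal technique. The small knowledge graph and the function name related_entities are hypothetical.

```python
from collections import deque

# Hypothetical knowledge graph as an adjacency list: node -> [(relationship, node)].
graph = {
    "Codd": [("is author of", "Relational Model paper")],
    "Relational Model paper": [("related to", "SQL"), ("related to", "Normalization")],
    "SQL": [("belongs to", "Query Languages")],
    "Normalization": [],
    "Query Languages": [],
}

def related_entities(graph, start, max_hops):
    """Breadth-first traversal: collect nodes reachable within max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for _relationship, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found
```

Raising max_hops widens the context that is retrieved: one hop from "Codd" reaches only the paper, while two hops also reach the topics related to it.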

Benefits

• Contextual Richness: Offers richer context by retrieving data based on complex
relationships, improving the depth and relevance of generated content.

• Flexibility: Allows the schema to evolve easily, enabling the addition of new
relationship types without restructuring the database.

• Insight into Relationships: Provides a deeper understanding of data by leveraging
the relationships between data points, useful in domains with highly relational
contexts like social networks or knowledge graphs.

Drawbacks

• Performance: Graph traversal can be slower than vector similarity search,
especially for large and complex graphs.

• Scalability: While graph databases can scale, they might struggle with performance
in very large datasets compared to Vector DBs optimized for large-scale similarity
searches.

• Complexity in Querying: Requires a more intricate understanding of graph theory
for effective querying and retrieval.


Comparing Vector DB and Graph DB in RAG Applications


Vector DB Advantages

• Speed and Scalability: Ideal for real-time applications requiring quick, similarity-based
retrieval across large datasets.

• Precision: Excellent for retrieving data based on high-dimensional vector similarity,
ensuring accurate and relevant responses.

Graph DB Advantages

• Rich Contextual Understanding: Retrieves data by understanding complex relationships,
offering more contextually rich information to the generative model.

• Flexibility and Evolution: Adaptable to changing data structures, making it easier to
incorporate new types of relationships over time.

Use Cases

• Vector DB is ideal for applications like:
➢ Semantic Search: Where high-dimensional similarity matching is essential.
➢ Image and Text Retrieval: When the goal is to find items that are semantically
similar to the query.

• Graph DB is better suited for applications like:
➢ Knowledge Graphs: Where understanding relationships between data points is crucial.
➢ Social Networks and Recommendation Systems: Where insights into complex
relationships between entities add significant value.

Combining Both Databases

In many RAG applications, employing both Vector and Graph DBs can be the most effective
approach. This hybrid method allows the system to leverage the strengths of each database
type, providing both precise similarity matching and rich contextual understanding.


OVERVIEW OF DATABASE LANGUAGES AND ARCHITECTURE

1.7 DATA MODELS, SCHEMAS, AND INSTANCES

Data model: A data model is a relatively simple representation, usually graphical, of
more complex real-world data structures.
(or)
A data model represents data structures and their characteristics,
relations, constraints, transformations, and other constructs with the purpose of
supporting a specific problem domain.

Categories of Data Models


Discuss the main categories of data models.
Many data models have been proposed, which we can categorize according to the
types of concepts they use to describe the database structure.

High-level or conceptual data models provide concepts that are close to the way
many users perceive data.

Low-level or physical data models provide concepts that describe the details of
how data is stored on the computer storage media.

Between these two extremes is a class of representational (or implementation) data
models, which provide concepts that may be easily understood by end users but that
are not too far removed from the way data is organized in computer storage.

Database Schema and Database State


What is the difference between a database schema and a database state?
The description of a database is called the “database schema”. It is specified during
the database design phase and is not expected to change frequently.
Schema Diagram: A pictorial display of (most aspects of) a database schema.


Schema Construct: A component of the schema or an object within the schema, e.g.,
STUDENT, COURSE.

Database State: The actual data present in the database at a particular moment in time is
called the database state (or snapshot). It is also called the current set of occurrences
or instances in the database. The database state may change frequently, because every
insertion, deletion, or update produces a new state.

Initial Database State: The database state when the database is first populated with data.

Valid State: A state that satisfies the structure and constraints of the database.

The database schema is sometimes called the “intension”, and a database state is called
an “extension” of the schema.
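
The difference between schema and state can be observed directly. The sketch below assumes SQLite through Python's sqlite3 module, where the stored CREATE statement in sqlite_master plays the role of the schema. The student table and the helper names schema and state are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema(conn):
    # The stored CREATE statement serves as the schema (intension).
    return conn.execute("SELECT sql FROM sqlite_master WHERE name = 'student'").fetchone()[0]

def state(conn):
    # The rows currently stored are the database state (extension).
    return conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()

schema_before, empty_state = schema(conn), state(conn)
conn.execute("INSERT INTO student VALUES (1, 'Asha')")  # first data loaded
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")  # each update yields a new state
schema_after, current_state = schema(conn), state(conn)
```

The schema text is identical before and after the inserts, while the state goes from empty to two rows.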


1.8 THREE-SCHEMA ARCHITECTURE AND DATA INDEPENDENCE


The goal of the three-schema architecture, illustrated in the figure, is to separate the user
applications from the physical database.

CREATE TABLE EMP
(
    Emp_No     INT PRIMARY KEY,   -- unique employee number
    First_Name VARCHAR(20),
    Last_Name  VARCHAR(20),
    Dept_Num   VARCHAR(10)        -- number of the employee's department
);

1. The external or view level includes a number of external schemas or user views.
Each external schema describes the
part of the database that a particular user group is interested in and hides the rest of
the database from that user group.

2. The conceptual level has a conceptual schema, which describes the structure of
the whole database for a community of users. The conceptual schema hides the
details of physical storage structures and concentrates on describing entities, data
types, relationships, user operations, and constraints.

3. The internal level has an internal schema, which describes the physical storage
structure of the database. The internal schema uses a physical data model and
describes the complete details of data storage and access paths for the database.


1.9 DATA INDEPENDENCE

What is the difference between logical data independence and physical data independence?
Which one is harder to achieve? Why?
The three-schema architecture can be used to achieve both kinds of data independence:
1. Logical data independence
2. Physical data independence

1. Logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs. We may change the conceptual
schema to expand the database (by adding a record type or data item), to change
constraints, or to reduce the database (by removing a record type or data item).
2. Physical data independence is the capacity to change the internal schema without
having to change the conceptual schema. Hence, the external schemas need not be changed
as well. Changes to the internal schema may be needed because some physical files were
reorganized (for example, by creating additional access structures) to improve the
performance of retrieval or update. If the same data as before remains in the database, we
should not have to change the conceptual schema.

Logical data independence is harder to achieve than physical data independence, because
it must allow structural and constraint changes to the conceptual schema without affecting
application programs, which is a much stricter requirement.
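
Both kinds of independence can be imitated on a small scale. This is a sketch only, assuming SQLite through Python's sqlite3 module: a view stands in for an external schema, an index for a change to the internal schema, and an added column for a change to the conceptual schema. The emp table, the emp_names view, and the application_query function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, dept_num TEXT);
    INSERT INTO emp VALUES (1, 'Asha', 'Rao', 'D1'), (2, 'Ravi', 'Kumar', 'D2');
    -- External schema: the only object the application below queries.
    CREATE VIEW emp_names AS SELECT emp_no, first_name, last_name FROM emp;
""")

def application_query(conn):
    return conn.execute("SELECT * FROM emp_names ORDER BY emp_no").fetchall()

before = application_query(conn)

# Physical change: a new access structure (internal schema only).
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept_num)")
after_physical = application_query(conn)

# Logical change: the conceptual schema grows by one data item.
conn.execute("ALTER TABLE emp ADD COLUMN phone TEXT")
after_logical = application_query(conn)
```

The application function is never edited, and it returns the same rows after each change. Removing a column that the view uses would break it, which hints at why logical data independence is the harder of the two.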

1.10 DATABASE LANGUAGES AND INTERFACES


Write a note on the different DBMS languages.
The DBMS must provide appropriate languages and interfaces for each category of users.
DBMS Languages
• Data Definition Language (DDL)
• Storage Definition Language (SDL)
• View Definition Language (VDL)
• Data Manipulation Language (DML)

Data definition language (DDL): Used by the DBA and database designers to specify the
conceptual schema of a database. The DBMS will have a DDL compiler whose function is to
process DDL statements in order to identify descriptions of the schema constructs and to
store the schema description in the DBMS catalog.
Storage definition language (SDL): Used to specify the internal schema. The mappings
between the conceptual and internal schemas may be specified in either one of these languages.
View definition language (VDL): Used to specify user views and their mappings to the
conceptual schema, but in most DBMSs the DDL is used to define both conceptual and
external schemas.
Data manipulation language (DML): Used to specify database retrievals and
updates. DML commands (data sublanguage) can be embedded in a general-purpose
programming language (host language), such as COBOL, C, C++, or Java.
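
A minimal sketch of DML used from a host language is shown below. It assumes Python as the host language and SQLite through the sqlite3 module, which passes SQL statements as strings through function calls rather than through a precompiler. The emp table and the two functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, name TEXT, dept_num TEXT)")

def add_employee(conn, emp_no, name, dept_num):
    # Update DML: the ? placeholders bind host-language variables to the statement.
    conn.execute("INSERT INTO emp VALUES (?, ?, ?)", (emp_no, name, dept_num))

def employees_in(conn, dept_num):
    # Retrieval DML: the result rows come back as ordinary Python tuples.
    cursor = conn.execute(
        "SELECT name FROM emp WHERE dept_num = ? ORDER BY name", (dept_num,)
    )
    return [name for (name,) in cursor]

add_employee(conn, 1, "Asha", "D1")
add_employee(conn, 2, "Ravi", "D2")
add_employee(conn, 3, "Meena", "D1")
```

The SQL text is the data sublanguage, and the surrounding Python supplies variables, control flow, and result handling.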

DBMS Interfaces:
Discuss the different types of user-friendly interfaces and the types of users who
typically use each.
The DBMS provides many user-friendly interfaces that enable users to interact with
the data in the database, such as:

Menu-Based Interfaces: These interfaces present the user with lists of options (called
menus) that help the user to make a request. The advantage of this is that the user need
not memorize the specific commands and syntax.

Forms-Based Interfaces: A forms-based interface displays a form to each user. Users can
fill out all of the form entries to insert new data into the database. Forms are usually
designed and programmed for naive users.

Graphical User Interface: A GUI presents the schema in pictorial form. The user can then use a
pointing device (such as a mouse) to make a choice out of the many options provided by the
GUI.

Natural Language Interfaces: These interfaces accept requests written in English or some
other language and attempt to understand them. A natural language interface would have a
dictionary of important words. If the interpretation is successful, it generates a high-level
query. Otherwise, a dialogue is started with the user to clarify the request.

Speech Input and Output: Limited use of speech as an input query and speech as an
answer to a question or result of a request is becoming commonplace. Applications with
limited vocabularies such as inquiries for telephone directory, flight arrival/departure, and
credit card account information are allowing speech for input and output to enable
customers to access this information.

Interfaces for Parametric Users: Parametric users, such as bank tellers, often have a
small set of operations that they must perform repeatedly.

Interfaces for the DBA: Most database systems contain privileged commands that can be
used only by the DBA staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing the
storage structures of a database.


1.11 THE DATABASE SYSTEM ENVIRONMENT/TYPICAL COMPONENTS OF DBMS MODULE AND INTERACTIONS

What other computer system software does a DBMS interact with? With a neat diagram,
explain the component modules of a DBMS and their interaction.

DBMS Component Modules:

Fig: Component modules of a DBMS and their interactions

A DBMS is a complex software system. This section describes the types of software
components that constitute a DBMS and the types of computer system software with which
the DBMS interacts.

The figure is divided into two parts. The top part of the figure refers to the various users of
the database environment and their interfaces. The lower part shows the internals of the
DBMS responsible for storage of data and processing of transactions. The database and the
DBMS catalog are usually stored on disk. Access to the disk is controlled primarily by the
operating system (OS).

Many DBMSs have their own buffer management module to schedule disk
reads and writes, because this has a considerable effect on performance. The top part of the figure
shows interfaces for the DBA staff, casual users who work with interactive interfaces to
formulate queries, application programmers who create programs using some host
programming languages, and parametric users who do data entry work by supplying
parameters to predefined transactions.

The DBA staff works on defining the database and tuning it by making changes to its
definition using the DDL and other privileged commands. Queries are parsed and
validated for correctness of the query syntax, the names of files and data elements, and so
on by a query compiler that compiles them into an internal form. The query optimizer is
concerned with the rearrangement and possible reordering of operations, elimination of
redundancies, and use of correct algorithms and indexes during execution. The
precompiler extracts DML commands from an application program written in a host
programming language. Concurrency control and the backup and recovery
systems are shown as a separate module in the figure.

The DBMS interacts with the operating system when disk accesses—to the database or to
the catalog—are needed. If the computer system is shared by many users, the OS will
schedule DBMS disk access requests and DBMS processing along with other processes. On
the other hand, if the computer system is mainly dedicated to running the database server,
the DBMS will control main memory buffering of disk pages.

Database System Utilities:


What are database utilities? List a few common functions that the utilities perform.

Database utilities are additional facilities that help the DBA manage the database
system. Some of the common utilities are:

i. Loading: A loading utility is used to load existing data files (such as text files or
sequential files) into the database. Usually, the current (source) format of the data
file and the desired (target) database file structure are specified to the utility, which
then automatically reformats the data and stores it in the database.

ii. Backup: A backup utility creates a backup copy of the database, usually by dumping
the entire database onto tape or other mass storage medium. The backup copy can
be used to restore the database in case of catastrophic disk failure.

iii. Database storage re-organization: This utility can be used to reorganize a set of
database files into different file organizations and create new access paths to
improve performance.

iv. Performance monitoring: This utility monitors database usage and provides statistics
to the DBA. The DBA uses the statistics in making decisions such as whether or not to
reorganize files or whether to add or drop indexes to improve performance.
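
The loading and backup utilities described above can be illustrated with a minimal sketch. It assumes SQLite through Python's standard sqlite3 module as a stand-in DBMS; the table name employee, the column layout, and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source file contents (a text file in CSV format).
SOURCE_TEXT = "ssn,name,salary\n111,Asha,30000\n222,Ravi,45000\n"


def load_employees(conn, text):
    """Loading utility sketch: reformat text rows and store them in a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS employee "
        "(ssn TEXT PRIMARY KEY, name TEXT, salary INTEGER)"
    )
    rows = [
        (r["ssn"], r["name"], int(r["salary"]))  # convert text to the target types
        for r in csv.DictReader(io.StringIO(text))
    ]
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    return len(rows)


def backup_database(source, target):
    """Backup utility sketch: copy the whole database into another database."""
    source.backup(target)


db = sqlite3.connect(":memory:")
loaded = load_employees(db, SOURCE_TEXT)

copy = sqlite3.connect(":memory:")
backup_database(db, copy)
restored = copy.execute("SELECT name FROM employee ORDER BY ssn").fetchall()
print(loaded, restored)
```

A production loading utility would read the source and target formats from a specification rather than hard-coding them, and a real backup would go to disk, tape, or other mass storage instead of a second in-memory database.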

Tools, Application Environments, and Communications Facilities


Tools: Other tools are often available to database designers, users, and the DBMS. CASE
tools are used in the design phase of database systems. Another tool that can be quite
useful in large organizations is an expanded data dictionary (or data repository) system.

Application development environments: Environments such as PowerBuilder (Sybase),
JBuilder (Borland), and XAMPP have been quite popular. These systems provide an
environment for developing database applications and include facilities that help in many
facets of database systems, including database design, GUI development, querying and
updating, and application program development.

Communications software: The DBMS also needs to interface with communications
software, whose function is to allow users at locations remote from the database system
site to access the database through computer terminals, workstations, or personal
computers. These are connected to the database site through data communications
hardware such as Internet routers, phone lines, long-haul networks, local networks, or
satellite communication devices. Many commercial database systems have communication
packages that work with the DBMS.

CONCEPTUAL DATA MODELLING USING ENTITIES AND RELATIONSHIPS


1.11 ENTITIES, ENTITY TYPES, ENTITY SETS, ATTRIBUTES, AND KEYS

ENTITIES and ATTRIBUTES: Entities are the basic objects of the ER model. An entity
represents a “thing” in the real world that has an independent existence.
Each entity has attributes. Attributes are properties that more fully describe an entity.

Eg: the EMPLOYEE entity would be described by the name, age, address, salary, sex,
etc., which become the attributes of the entity.

Types of Attributes:
What are the different types of attributes? Explain.
1. Composite Attributes
2. Simple (Atomic) Attributes
3. Single-Valued Attributes
4. Multivalued Attributes
5. Derived Attributes
6. Stored Attributes
7. Complex Attributes

1. Composite attributes: Attributes that can be divided into smaller sub-parts, where the
sub-parts represent more basic attributes.
Ex: the address attribute can be further divided into street number, city, state, zip code,
etc.

2. Simple/Atomic attributes: Attributes that cannot be further subdivided are called
simple or atomic attributes.
Ex: the sex attribute cannot be further subdivided and hence is an atomic attribute.

3. Single valued attributes: Most attributes have a single value for a particular entity;
such attributes are called single-valued.
Ex: Age is a single-valued attribute of a person

4. Multivalued attributes: In some cases an attribute can have a set of values for the same
entity; such attributes are called multivalued.
Ex: color: {red, blue}, phone_no

5. Derived attribute: In some cases, the value of one attribute can be obtained using the
value of another attribute.
Ex: the Age attribute can be derived by subtracting the date of birth (DOB) from the
current date.

6. Stored attribute: An attribute whose value cannot be obtained from the value of another
attribute, and is instead entered directly for the entity, is called a stored attribute.
Ex: the date of birth attribute is a stored attribute.

7. Complex attributes: In general, composite and multivalued attributes can
be nested arbitrarily. We can represent arbitrary nesting by grouping components of a
composite attribute between parentheses ( ) and separating the components with commas,
and by displaying multivalued attributes between braces { }. Such attributes are called
complex attributes.

Eg: A complex attribute: Address_phone

{Address_phone({Phone(Area_code,Phone_number)},Address(Street_address
(Number,Street,Apartment_number),City,State,Zip))}

Both Phone and Address are themselves composite attributes.
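
The attribute types above can be sketched in code. The following is only an illustration, assuming Python dataclasses as a stand-in for an ER notation; the class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    # Composite attribute: built from simpler component attributes.
    street: str
    city: str
    state: str
    zipcode: str


@dataclass
class Employee:
    ssn: str                      # simple (atomic), single-valued attribute
    name: str
    birth_date: date              # stored attribute
    address: Address              # composite attribute
    phones: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        # Derived attribute: computed from birth_date, never stored.
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


emp = Employee(
    ssn="123456789",
    name="John",
    birth_date=date(1990, 6, 15),
    address=Address("12 Main St", "Bengaluru", "KA", "560001"),
    phones=["080-1111", "080-2222"],
)
print(emp.age(date(2024, 6, 14)))  # one day before the birthday
print(emp.age(date(2024, 6, 15)))  # on the birthday
```

Nesting a list of composite objects inside another composite object would give a complex attribute in the sense described above.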

What is a NULL value?

In some cases, a particular entity may not have an applicable value for an attribute.
For example, the college-degree attribute is applicable only to employees who are
educated up to the college level. For such situations, a special value called NULL is
used.

ENTITY TYPES:
Define the terms entity type and entity set.
ENTITY TYPE: An entity type defines a collection of entities that have the same attributes.

Key Attributes of an Entity Type: An important constraint on the entities of an entity type
is the key or uniqueness constraint on attributes. An entity type usually has one or more
attributes whose values are distinct for each individual entity in the entity set. Such an
attribute is called a key attribute, and its values can be used to identify each entity
uniquely. In an ER diagram, each key attribute has its name underlined inside the oval.

ENTITY SET: An entity set is a set of entities of the same type that share the same
properties or attributes.
OR
A collection of entities of the same entity type is called an entity set.
Ex: [Figure: an entity type and its corresponding entity set]

VALUE SETS (Domains) of Attributes: Each simple attribute of an entity type is
associated with a value set (or domain of values), which specifies the set of values that
may be assigned to that attribute for each individual entity.
Ex: if the range of ages allowed for employees is between 16 and 70, we can specify the
value set of the Age attribute of EMPLOYEE to be the set of integer numbers between 16
and 70.
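
As a rough sketch of how a key constraint and a value set restrict an entity set, the following example rejects an entity with a duplicate key value and an entity whose Age falls outside the range 16 to 70. The class EntitySet, the choice of Ssn as the key attribute, and the sample entities are assumptions made for illustration.

```python
AGE_VALUE_SET = range(16, 71)  # integers 16..70, as in the Age example


class EntitySet:
    """Holds entities of one entity type and enforces a key attribute."""

    def __init__(self, key):
        self.key = key
        self.entities = {}

    def add(self, entity):
        key_value = entity[self.key]
        if key_value in self.entities:
            raise ValueError(f"duplicate key value: {key_value}")
        if entity["Age"] not in AGE_VALUE_SET:
            raise ValueError(f"Age {entity['Age']} is outside the value set")
        self.entities[key_value] = entity


employees = EntitySet(key="Ssn")
employees.add({"Ssn": "111", "Name": "Asha", "Age": 30})

errors = []
for candidate in (
    {"Ssn": "111", "Name": "Ravi", "Age": 41},   # violates the key constraint
    {"Ssn": "222", "Name": "Mira", "Age": 12},   # violates the value set
):
    try:
        employees.add(candidate)
    except ValueError as exc:
        errors.append(str(exc))
print(errors)
```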

INITIAL CONCEPTUAL DESIGN OF THE COMPANY DATABASE

1. Identifying all entity sets
2. Identifying the attributes of each entity set (being aware of the different attribute types)
3. Identifying feasible relationship types
4. Identifying cardinality ratios
5. Identifying participation constraints
6. Identifying participation roles (if any)

Entity types for the COMPANY database.

We can identify four entity types, one corresponding to each of the four items in the
specification:
1. An entity type DEPARTMENT with attributes Name, Number, Locations,
Manager, and Manager_start_date. Locations is the only multivalued
attribute.

2. An entity type PROJECT with attributes Name, Number, Location, and
Controlling_department. Both Name and Number are (separate) key
attributes.

3. An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address,
Salary, Birth_date, Department, and Supervisor. Both Name and Address may
be composite attributes, depending on whether users refer to the individual
components of Name (First_name, Middle_initial, Last_name) or of Address.

4. An entity type DEPENDENT with attributes Employee, Dependent_name,
Sex, Birth_date, and Relationship (to the employee).
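
The initial design can be written down as plain data so that it is easy to inspect. This is only a sketch: the dictionary layout and the helper describe are hypothetical, and only the properties stated in the list above (Locations is multivalued; Name and Number are keys of PROJECT) are recorded.

```python
# Attribute lists copied from the four entity types above.
COMPANY_ENTITY_TYPES = {
    "DEPARTMENT": ["Name", "Number", "Locations", "Manager", "Manager_start_date"],
    "PROJECT": ["Name", "Number", "Location", "Controlling_department"],
    "EMPLOYEE": ["Name", "Ssn", "Sex", "Address", "Salary", "Birth_date",
                 "Department", "Supervisor"],
    "DEPENDENT": ["Employee", "Dependent_name", "Sex", "Birth_date", "Relationship"],
}

# Properties stated in the text, kept as (entity type, attribute) pairs.
MULTIVALUED = {("DEPARTMENT", "Locations")}
KEYS = {("PROJECT", "Name"), ("PROJECT", "Number")}


def describe(entity_type):
    """Return one line per attribute, tagging keys and multivalued attributes."""
    lines = []
    for attr in COMPANY_ENTITY_TYPES[entity_type]:
        tags = []
        if (entity_type, attr) in KEYS:
            tags.append("key")
        if (entity_type, attr) in MULTIVALUED:
            tags.append("multivalued")
        suffix = " [" + ", ".join(tags) + "]" if tags else ""
        lines.append(f"{entity_type}.{attr}{suffix}")
    return lines


for line in describe("PROJECT"):
    print(line)
```

Attributes such as Manager, Department, Supervisor, and Employee refer to other entity types; in this initial design they are still listed as attributes.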

1.12 RELATIONSHIP TYPES, RELATIONSHIP SETS, ROLES, AND STRUCTURAL CONSTRAINTS

What is meant by Relationship Type and Relationship Sets?

RELATIONSHIPS: A relationship relates two or more distinct entities with a specific
meaning; that is, it is an association among entities.
Entities do not exist in isolation.
Ex: EMPLOYEE John works on the Pro-X PROJECT.

What is meant by Degree of a Relationship type?

The degree of a relationship type is the number of participating entity types.

UNARY/RECURSIVE RELATIONSHIP: an association maintained within a single entity type

BINARY RELATIONSHIP: an association maintained between two entity types

TERNARY RELATIONSHIP: an association maintained among three entity types

Note: although relationships of higher degree exist, they are not specifically named.

RELATIONSHIP TYPES: A relationship type R among n entity types E1, E2, E3, ..., En
defines a set of associations among entities from these entity types.

RELATIONSHIP SETS: The relationship set R is a set of relationship instances ri, where
each ri associates n individual entities.
Example: [Figure: some instances in the WORKS-FOR relationship set, which represents a
relationship type between EMPLOYEE and DEPARTMENT]
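
A relationship set can be pictured as a set of tuples, one tuple per relationship instance. The sketch below assumes a WORKS_FOR relationship set of (employee, department) pairs; the identifiers e1, d1, and so on are hypothetical.

```python
# Hypothetical entities, identified here by simple names.
EMPLOYEES = {"e1", "e2", "e3"}
DEPARTMENTS = {"d1", "d2"}

# The relationship set WORKS_FOR: each instance associates one employee
# with one department.
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

DEGREE = 2  # two participating entity types: EMPLOYEE and DEPARTMENT


def is_valid_relationship_set(instances):
    """Every instance must pair an existing employee with an existing department."""
    return all(
        len(r) == DEGREE and r[0] in EMPLOYEES and r[1] in DEPARTMENTS
        for r in instances
    )


def employees_of(department):
    """List the employees related to the given department."""
    return sorted(e for e, d in WORKS_FOR if d == department)


print(is_valid_relationship_set(WORKS_FOR))
print(employees_of("d1"))
```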

ROLES

What is a participation role? When is it necessary to use role names in the description of
relationship types?
The role name signifies the role that a participating entity plays in each relationship
instance. For example, in the WORKS-FOR relationship type between the EMPLOYEE and
DEPARTMENT entity types, EMPLOYEE plays the role of employee (worker) and DEPARTMENT
plays the role of department (employer).

Role names are not strictly required when the participating entity types are distinct,
because each entity type name can serve as the role name. However, when the same entity
type participates more than once in a relationship type, role names become essential for
distinguishing the meaning of each participation. Such relationships are called
“RECURSIVE RELATIONSHIPS”.

[Figure: the recursive relationship SUPERVISION on EMPLOYEE]

In the SUPERVISION example, the EMPLOYEE entity type participates twice: once in the
role of supervisor and once in the role of supervisee. In such recursive relationships the
role names are essential.
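
Role names in a recursive relationship can be sketched as named fields of a tuple. The example assumes a SUPERVISION relationship set whose instances each pair two EMPLOYEE entities; the employee names are hypothetical sample data.

```python
from typing import NamedTuple


class Supervision(NamedTuple):
    # Both fields hold EMPLOYEE entities; the role names tell them apart.
    supervisor: str
    supervisee: str


SUPERVISION = {
    Supervision(supervisor="Franklin", supervisee="John"),
    Supervision(supervisor="Franklin", supervisee="Joyce"),
    Supervision(supervisor="James", supervisee="Franklin"),
}


def supervisees_of(employee):
    """Employees for whom the given employee plays the supervisor role."""
    return sorted(r.supervisee for r in SUPERVISION if r.supervisor == employee)


def supervisor_of(employee):
    """The employee who supervises the given employee, or None."""
    matches = [r.supervisor for r in SUPERVISION if r.supervisee == employee]
    return matches[0] if matches else None


print(supervisees_of("Franklin"))
print(supervisor_of("Franklin"))
```

Without the two role names, a pair such as (James, Franklin) would not say who supervises whom.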

CONSTRAINTS

There are two types of relationship constraints:
1. Cardinality Ratio
2. Participation Constraint

CARDINALITY RATIO:
The cardinality ratio of a binary relationship specifies the maximum number of
relationship instances that an entity can participate in. The possible cardinality ratios
for binary relationship types are:
1. One to One (1:1)
2. One to Many (1:N)
3. Many to One (N:1)
4. Many to Many (M:N)

▪ ONE TO ONE (1:1): An example of a 1:1 binary relationship is MANAGES, which
relates a DEPARTMENT entity to the EMPLOYEE who manages the department. This
represents the constraint that, at any point in time, an employee can manage only one
department and a department can have only one manager.

Ex: employee manages department

▪ MANY TO ONE (N:1): An example of an N:1 binary relationship is WORKS-FOR, which
relates an EMPLOYEE entity to a DEPARTMENT entity. This represents the constraint that,
at any point in time, a DEPARTMENT may have many employees but an EMPLOYEE works
for only one department.

Ex: employee works for department

▪ MANY TO MANY (M:N): An example of an M:N binary relationship is WORKS-ON, which
relates the EMPLOYEE entity to the PROJECT entity. This represents the constraint that, at
any point in time, an employee may work on more than one PROJECT and a PROJECT can
have more than one EMPLOYEE.

Ex: many employees work on many projects
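
To make the ratios concrete, the following sketch classifies a binary relationship set from its instances. The function cardinality_ratio and the sample pairs are hypothetical; note that sample data can only suggest a ratio, while the actual constraint comes from the requirements.

```python
from collections import Counter


def cardinality_ratio(instances):
    """Classify a binary relationship set given as (left, right) pairs.

    This only describes the instances supplied; the real constraint comes
    from the requirements, not from sample data.
    """
    instances = set(instances)
    left_counts = Counter(left for left, _ in instances)
    right_counts = Counter(right for _, right in instances)
    # A left entity related to several right entities.
    left_many = max(left_counts.values(), default=0) > 1
    # A right entity related to several left entities.
    right_many = max(right_counts.values(), default=0) > 1
    if left_many and right_many:
        return "M:N"
    if right_many:
        return "N:1"
    if left_many:
        return "1:N"
    return "1:1"


MANAGES = {("e1", "d1"), ("e2", "d2")}                   # (employee, department)
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}   # (employee, department)
WORKS_ON = {("e1", "p1"), ("e1", "p2"), ("e2", "p1")}    # (employee, project)

print(cardinality_ratio(MANAGES))
print(cardinality_ratio(WORKS_FOR))
print(cardinality_ratio(WORKS_ON))
```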

PARTICIPATION CONSTRAINTS

What is meant by participation constraints? Explain.

A participation constraint specifies whether the existence of an entity depends on its being
related to another entity via a relationship type. There are two types of participation
constraints, namely:
1. Total participation (existence dependency)
2. Partial participation

TOTAL PARTICIPATION: If the company policy states that every employee must work for
a department, then an employee entity can exist only if it participates in the WORKS-FOR
relationship. Thus, the participation of EMPLOYEE in WORKS-FOR is called TOTAL
PARTICIPATION (also called existence dependency). Total participation is
represented by double lines in an ER diagram.
Ex: PROFESSOR teaches CLASS; EMPLOYEE WORKS-FOR DEPARTMENT

PARTIAL PARTICIPATION: We do not expect every EMPLOYEE to manage a department,
and hence the participation of EMPLOYEE in the MANAGES relationship is partial. Partial
participation is represented by a single line in an ER diagram.

Ex: EMPLOYEE MANAGES DEPARTMENT (partial participation of EMPLOYEE)
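
The difference between total and partial participation can be checked on sample data. In this sketch the function participation and the sample sets are hypothetical; the check reports whether every entity of an entity set appears in at least one relationship instance.

```python
def participation(entity_set, instances, position):
    """Return 'total' if every entity appears in some instance, else 'partial'."""
    participating = {instance[position] for instance in instances}
    return "total" if entity_set <= participating else "partial"


EMPLOYEES = {"e1", "e2", "e3"}
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}  # (employee, department)
MANAGES = {("e1", "d1"), ("e3", "d2")}                  # (employee, department)

print(participation(EMPLOYEES, WORKS_FOR, 0))
print(participation(EMPLOYEES, MANAGES, 0))
```

Here every employee works for a department, but employee e2 manages none, so EMPLOYEE participates totally in WORKS_FOR and only partially in MANAGES.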

NOTE:
Write a note on the structural constraints of relationship types.
If this question is asked in the exams, discuss both (i) the cardinality ratio and (ii)
participation constraints.

1.13 WEAK ENTITY TYPES

When is the concept of a weak entity used in data modeling? Define the terms owner
entity type, weak entity type, identifying relationship type and partial key.

A weak entity is one that meets two conditions:
1. The entity is existence-dependent; that is, it cannot exist without the entity with
which it has a relationship.
2. The entity has a primary key that is partially or totally derived from the parent
entity in the relationship.

Entities that do not have key attributes of their own are called “weak entities”. In
contrast, strong entities are entities that have a key of their own.

A weak entity is identified through its relationship to a strong entity, in combination
with one of its own attributes. Such a strong entity type is called the “identifying or
owner entity type”. The relationship type that relates a weak entity type to its owner is
called the “identifying relationship”.

A weak entity normally has a “partial key”, which is an attribute (or set of attributes) that
can uniquely identify the weak entities that are related to the same owner entity.

In ER diagrams, both a weak entity type and its identifying relationship are distinguished
by surrounding their boxes and diamonds with double lines. Further, the partial key
attribute is underlined with a dashed line.

Consider the following example:

[ER diagram: EMPLOYEE (ssn, e_name, e_sex, e_ph, e_dob (date, month, year)) has
DEPENDENT (name, sex, dob (date, month, year), relation)]
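
The EMPLOYEE and DEPENDENT example can be sketched as follows. The sketch assumes that ssn is the key of the owner EMPLOYEE and that the dependent's name acts as the partial key; both are assumptions for illustration, and the class DependentStore is hypothetical.

```python
class DependentStore:
    """Stores DEPENDENT entities, identified by (owner key, partial key)."""

    def __init__(self, employee_ssns):
        self.employee_ssns = set(employee_ssns)  # keys of the owner entity type
        self.dependents = {}

    def add(self, owner_ssn, name, relation):
        if owner_ssn not in self.employee_ssns:
            raise KeyError("a dependent cannot exist without its owner employee")
        full_key = (owner_ssn, name)  # owner key + partial key
        if full_key in self.dependents:
            raise ValueError("duplicate dependent name for this employee")
        self.dependents[full_key] = relation

    def remove_employee(self, ssn):
        # Existence dependency: deleting the owner deletes its dependents.
        self.employee_ssns.discard(ssn)
        self.dependents = {
            key: value for key, value in self.dependents.items() if key[0] != ssn
        }


store = DependentStore(employee_ssns=["111", "222"])
store.add("111", "Anu", "daughter")
store.add("222", "Anu", "spouse")   # same name is fine under a different owner
store.remove_employee("111")
print(sorted(store.dependents))
```

The name alone does not identify a dependent; it does so only together with the key of the owning employee.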

1.14 SUMMARY OF NOTATIONS FOR ER-DIAGRAMS

List the summary of notations for ER diagrams. Also discuss the naming conventions used
for ER schema diagrams.

The naming conventions used in the ER schema diagram are:

i. One should choose names that convey, as much as possible, the meanings attached to
them in the ER schema.
ii. Normally singular names are chosen for entities rather than plural ones.
iii. Normally entity and relationship names are in upper case letters, whereas
attribute names have the initial letter capitalized. Role names are in lower case.
iv. As a general practice, given a narrative description of the database requirements,
the nouns appearing in the narrative tend to give rise to entities, whereas verbs
tend to indicate relationships.
v. Another naming consideration involves choosing binary relationship names to make
the ER diagram of the schema readable from left to right and from top to bottom.

1.15 ER DIAGRAMS

MOVIE DATABASE

COMPANY DATABASE/EMPLOYEE DATABASE

AIRLINE DATABASE

BANKING DATABASE

MUSIC DATABASE

LIBRARY MANAGEMENT DATABASE

HOSPITAL MANAGEMENT DATABASE
