
DBMS Module 3.3 PDF

Uploaded by

Nitya
Copyright
© © All Rights Reserved

DATABASE MANAGEMENT SYSTEM

MCA 104
Module-3
3.3 Physical Database Design

Physical database design is a crucial phase in the process of creating a
database management system (DBMS). It involves converting the logical
design, which defines the database's structure and relationships, into a
physical implementation on the actual storage media, such as hard disks
or solid-state drives. The primary goal of physical database design is to
optimize performance, storage efficiency, and data retrieval speed while
ensuring data integrity and security.

Physical Data Model

In database management systems (DBMS), the physical data model
represents the actual implementation of the database on the physical
storage media, such as disks, tapes, or solid-state drives. It defines how
the data will be physically stored and organized to achieve optimal
performance and efficiency. The physical data model is built upon the
logical data model, which defines the data structure, relationships, and
constraints in a database, but it goes a step further by translating these
logical structures into physical storage details.

Aspects of the physical data model in DBMS include:

• File Structures: The physical data model determines the file
structures used to store the data on the storage media. Common
file structures include sequential files, indexed sequential files,
hashed files, and B-trees. The choice of file structure depends on
the data access patterns and the types of queries expected to be
executed on the database.

• Indexing: Indexing is an essential component of the physical data
model, allowing for faster data retrieval. Indexes are data structures
that provide pointers to the actual data rows based on specific
attributes. They speed up search operations and enhance query
performance by reducing the number of disk reads.

• Data Partitioning: Data partitioning involves dividing the database
into smaller, manageable segments or partitions. Partitioning can
be done based on attributes like date ranges or geographical
locations. This technique improves parallelism, reduces contention,
and enhances performance in multi-processor systems.

• Clustering: Clustering involves physically organizing related data
together on the storage media. By clustering data that is frequently
accessed together, the number of disk accesses can be minimized,
leading to improved data retrieval speed.

• Denormalization: In certain cases, denormalization may be
employed in the physical data model to reduce the need for joins
and improve query performance. Denormalization involves
redundantly storing some data to simplify and speed up common
queries.

• Data Compression: Data compression techniques may be applied in
the physical data model to reduce storage requirements.
Compression reduces the disk space needed to store data, resulting
in cost savings and potentially improved I/O performance.

• Data Replication: Replication involves creating multiple copies of
the database on different physical storage devices. Replication
enhances fault tolerance and improves data availability.

• Data Archiving: Data archiving is the process of moving infrequently
accessed or historical data to separate storage to free up space in
the active database and improve performance.

• Physical Security and Backup: The physical data model also
addresses security and backup strategies to protect the data from
unauthorized access, corruption, or loss. This includes defining
access controls, encryption, and disaster recovery plans.
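The data partitioning aspect listed above can be illustrated with a small sketch. The Python below is illustrative only: the `orders` rows, the `partition_key` function, and the by-year scheme are assumptions made for this example, not something the module prescribes. It shows range partitioning on a date attribute, and how a query restricted to one year needs to read only one partition.

```python
from collections import defaultdict
from datetime import date


def partition_key(order_date):
    """Range-partition by calendar year (a hypothetical scheme)."""
    return order_date.year


def build_partitions(rows):
    """Group rows into partitions keyed by the year of their order date."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row["order_date"])].append(row)
    return partitions


def orders_in_year(partitions, year):
    """Partition pruning: only the partition for `year` is read."""
    return partitions.get(year, [])


# Hypothetical sample data.
orders = [
    {"id": 1, "order_date": date(2022, 3, 14), "amount": 120},
    {"id": 2, "order_date": date(2023, 1, 9), "amount": 75},
    {"id": 3, "order_date": date(2023, 7, 30), "amount": 310},
    {"id": 4, "order_date": date(2024, 2, 2), "amount": 42},
]

partitions = build_partitions(orders)
ids_2023 = [row["id"] for row in orders_in_year(partitions, 2023)]
print(sorted(partitions), ids_2023)
```

In a real DBMS the partitions would be separate storage segments, possibly on different devices, which is what allows them to be scanned in parallel.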

The physical data model plays a critical role in translating the logical data
model into an efficient and optimized database structure. By carefully
considering file structures, indexing, data partitioning, and other physical
implementation details, the physical data model ensures that the
database operates efficiently, meets performance requirements, and
remains secure and reliable.

Objectives of Physical Database Design:

• Performance Optimization: One of the main objectives of physical
database design is to enhance the performance of the DBMS. This
involves organizing data on storage media in a way that minimizes
disk seek times and maximizes data retrieval speed. Properly
chosen indexing, clustering, and partitioning strategies can
significantly improve the efficiency of data access operations.

• Storage Efficiency: Physical database design aims to make the most
efficient use of available storage space. Techniques like data
compression and appropriate data organization can reduce the
amount of disk space required for storing data, leading to cost
savings and better resource utilization.

• Scalability: A well-designed physical database should be able to
scale as the data volume and user load grow over time. Proper
partitioning and distribution of data can support horizontal scaling,
allowing the database to handle increasing workloads without
sacrificing performance.

• Data Integrity and Security: Ensuring data integrity and security is a
critical aspect of physical database design. Access controls,
encryption, and other security measures are implemented to
protect sensitive data from unauthorized access and maintain its
consistency.

• Backup and Recovery: Physical database design includes planning
for reliable backup and recovery mechanisms. Regular backups and
well-defined recovery procedures help protect against data loss
due to hardware failures, software errors, or other unforeseen
events.

• Maintenance and Administration: A well-organized physical
database design simplifies database maintenance tasks. It should
allow for straightforward data maintenance operations, such as
adding or removing data, without significant disruptions to the
system.

• Data Distribution: For distributed databases that span multiple
locations, physical database design involves deciding how data will
be distributed across different nodes or servers. Proper data
distribution ensures optimal data access and query performance
across the distributed environment.
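As a rough illustration of the storage efficiency objective above, the sketch below compresses a block of repetitive row data with Python's standard `zlib` module and checks that it can be restored without loss. The row contents are invented for the example, and real DBMS compression schemes (page-level or column-level) differ in detail; this only shows the general effect.

```python
import zlib

# Hypothetical table rows with repetitive values, serialized as text.
rows = [f"{i},Bengaluru,ACTIVE\n" for i in range(1000)]
raw = "".join(rows).encode("utf-8")

# Compress with a mid-range compression level (6), then restore.
compressed = zlib.compress(raw, 6)
restored = zlib.decompress(compressed)

ratio = len(compressed) / len(raw)
print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.2f}")
```

The trade-off is CPU time: every read of compressed data pays for decompression, so the I/O savings have to outweigh that cost.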

Considerations in Physical Database Design:

• Data File Organization: Choosing the appropriate file organization
method (e.g., heap files, indexed files, hashed files) based on the
access patterns and types of queries expected in the application.

• Indexing Strategy: Identifying the right attributes to be indexed and
selecting suitable index structures (e.g., B-trees, hash indexes) to
speed up data retrieval.

• Data Partitioning: Deciding how to partition large tables into
smaller segments to enhance parallel processing and distribution
across multiple storage devices.

• Clustering: Placing related data together physically on disk to
reduce the need for disk seeks during data retrieval operations.

• Denormalization: In some cases, denormalization may be
considered to improve performance by reducing the need for
complex joins.
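The indexing strategy consideration above can be tried out with SQLite through Python's standard `sqlite3` module. This is a minimal sketch under assumptions: the `orders` table, its columns, and the index name `idx_orders_customer` are hypothetical, and the wording of the query plan differs between SQLite versions, so the example only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT * FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the planner's description of how `sql` would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # the planner can now search via the index

matches = conn.execute(query).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```

Indexing `customer_id` pays off here because the query filters on it. An index on a column that is never used in a predicate would only add maintenance cost to inserts and updates.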

Physical database design is a critical step in the database development
process. It involves translating the logical database model into an
efficient and well-organized physical implementation that meets
performance, scalability, and security requirements. A well-designed
physical database ensures smooth data management, fast data access,
and reliable data storage for applications relying on the database system.

Physical Database and Logical Database Design:

| Aspect | Physical Database Design | Logical Database Design |
|---|---|---|
| Definition | The physical database design is the process of converting the logical database model into an actual database structure that can be implemented on hardware and storage media. | The logical database design is the process of defining the conceptual data model that represents the structure, relationships, and constraints of the data, independent of the implementation details. |
| Scope | Concerned with the implementation of the database on hardware and storage media. | Concerned with the high-level representation of data and its logical organization. |
| Focus | Focuses on physical storage, data access, and retrieval efficiency. | Focuses on data modelling, relationships, and data semantics. |
| Design Elements | Deals with data files, indexes, data partitioning, data compression, clustering, backup and recovery strategies, and other storage-related considerations. | Involves data entities, attributes, relationships, primary keys, foreign keys, and other logical data structures. |
| Concerns | Performance optimization, storage efficiency, data security, and disaster recovery. | Data integrity, data dependencies, normalization, and user requirements. |
| Hardware Dependency | Dependent on the hardware and storage medium of the underlying system. | Independent of the specific hardware or storage medium. |
| Data Manipulation | Focuses on how data is stored, accessed, and retrieved from the physical storage media. | Focuses on what data is stored, its relationships, and its meaning to the users. |
| Visibility to Users | Not directly visible to end-users. Users interact with the database through applications and queries without knowledge of the physical storage structure. | Provides a conceptual view to end-users through data models and schema definitions. |
| Abstraction Level | Low-level abstraction with a focus on implementation details. | High-level abstraction with a focus on data organization and semantics. |
| Examples of Artifacts | Data files, indexes, tables, disk partitions, backup scripts. | Entity-relationship diagrams (ERDs), data dictionaries, data flow diagrams. |

In summary, physical database design deals with the concrete
implementation of the database on hardware and storage, optimizing for
performance and efficiency. It addresses issues such as storage structure,
indexing, and data partitioning. Logical database design, on the other
hand, focuses on the conceptual representation of the database,
independent of physical implementation details. It deals with data
modelling, relationships, and data semantics, ensuring data integrity and
meeting user requirements. Both design phases are crucial in the
database development process, and they complement each other to
create a well-designed and efficient database system.
7.1 File Structures

In physical database design, various file structures are employed to
optimize data storage and retrieval performance. Each file structure has
its advantages, disadvantages, features, and applications, depending on
specific use cases and requirements. Let's examine some common file
structures used in physical database design:

Heap File Structure:

In the context of database management systems (DBMS), a heap file
structure refers to a method of organizing and storing data on disk
without imposing any particular order on the records. It is a basic and
straightforward way to store data when the primary concern is efficient
insertion of records rather than fast retrieval based on specific key
values.

Fig- Heap file Structure

Advantages:

• Simple to implement.

• Efficient for insertions of new records as they can be appended at
the end of the file.

Disadvantages:

• Searching for specific records can be inefficient as the entire file
must be scanned.

• Deletion and update operations can be time-consuming: because the
records are in no specific order, the target record must first be found
by scanning the file.

Features:

• Records are stored sequentially in the order they are inserted.

• No particular sorting or indexing is used.

Applications:

• Suitable for temporary storage or small datasets where frequent
updates are not expected.
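The heap file behaviour described above (cheap appends, expensive searches) can be sketched in a few lines of Python. The `HeapFile` class and the student records are hypothetical; an in-memory list stands in for the sequence of records on disk.

```python
class HeapFile:
    """Unordered record file: append on insert, full scan on search."""

    def __init__(self):
        self.records = []  # stands in for the records stored on disk

    def insert(self, record):
        self.records.append(record)   # no ordering to maintain
        return len(self.records) - 1  # position of the new record

    def search(self, key):
        """Linear scan; also reports how many records were examined."""
        scanned = 0
        for record in self.records:
            scanned += 1
            if record["id"] == key:
                return record, scanned
        return None, scanned


heap = HeapFile()
for student_id in (42, 7, 19, 3):
    heap.insert({"id": student_id, "name": f"student-{student_id}"})

found, scanned = heap.search(3)         # last record inserted
missing, scanned_all = heap.search(99)  # absent key: every record is read
print(found, scanned, missing, scanned_all)
```

A search for an absent key always reads the whole file, which is why heap files are usually paired with an index once the data grows.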

B-Tree and B+Tree File Structures:

B-Tree and B+Tree are two types of balanced tree data structures used in
database management systems to efficiently organize and index large
amounts of data on disk. They are commonly employed for indexing in
database systems, file systems, and other applications where fast data
retrieval based on key values is crucial.

B-Tree:

A B-Tree is a self-balancing tree data structure that maintains data in a
sorted order. It is designed to keep the data balanced by ensuring that all
leaf nodes are at the same level. B-Trees are particularly useful in
scenarios where the data is frequently updated and stored on disk.

Fig- B- tree Structure

B+Tree:

A B+Tree is an extension of the B-Tree data structure with some
additional properties optimized for disk storage systems. B+Trees are
commonly used in databases and file systems to create efficient indexing
structures.

Fig- B+ tree file structure

Advantages:

• Efficient for both searching and inserting records.

• Balanced tree structure ensures consistent performance even with
large datasets.

• Suitable for range queries due to their ordered nature.

Disadvantages:

• Complex to implement compared to simple file structures.

• Can lead to fragmentation if not managed properly.

Features:

• Balanced tree structure with variable numbers of keys and children
per node.

• In a B+Tree, non-leaf nodes contain only keys; the records (or
pointers to them) are stored in the leaf nodes, which are usually linked
together for efficient sequential and range scans.

Applications:

• Commonly used as index structures for databases to improve data
retrieval speed.
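To make the B+Tree features above concrete, here is a minimal in-memory sketch in Python. It is a teaching aid under stated assumptions, not a production index: `ORDER` (the maximum number of keys per node) is an arbitrary choice, deletion is omitted, and nodes are Python objects rather than disk pages. It shows keys-only internal nodes, records in linked leaves, node splitting on insert, and a range scan that walks the leaf chain.

```python
import bisect

ORDER = 4  # hypothetical maximum number of keys per node


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []  # child nodes (internal nodes only)
        self.values = []    # record payloads (leaf nodes only)
        self.next = None    # right sibling (leaf nodes only)


class BPlusTree:
    def __init__(self):
        self.root = Node(leaf=True)

    def _find_leaf(self, key):
        node = self.root
        while not node.leaf:
            node = node.children[bisect.bisect_right(node.keys, key)]
        return node

    def search(self, key):
        leaf = self._find_leaf(key)
        i = bisect.bisect_left(leaf.keys, key)
        if i < len(leaf.keys) and leaf.keys[i] == key:
            return leaf.values[i]
        return None

    def range_scan(self, lo, hi):
        """Return (key, value) pairs with lo <= key <= hi via the leaf chain."""
        leaf, out = self._find_leaf(lo), []
        while leaf is not None:
            for k, v in zip(leaf.keys, leaf.values):
                if k > hi:
                    return out
                if k >= lo:
                    out.append((k, v))
            leaf = leaf.next
        return out

    def insert(self, key, value):
        split = self._insert(self.root, key, value)
        if split is not None:  # root was split: the tree grows one level
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys, new_root.children = [sep], [self.root, right]
            self.root = new_root

    def _insert(self, node, key, value):
        """Insert below `node`; return (separator, new_right_node) on split."""
        if node.leaf:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # overwrite an existing key
                return None
            node.keys.insert(i, key)
            node.values.insert(i, value)
            if len(node.keys) <= ORDER:
                return None
            mid = len(node.keys) // 2
            right = Node(leaf=True)
            right.keys, node.keys = node.keys[mid:], node.keys[:mid]
            right.values, node.values = node.values[mid:], node.values[:mid]
            right.next, node.next = node.next, right
            return right.keys[0], right  # separator is copied up
        i = bisect.bisect_right(node.keys, key)
        split = self._insert(node.children[i], key, value)
        if split is None:
            return None
        sep, child = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, child)
        if len(node.keys) <= ORDER:
            return None
        mid = len(node.keys) // 2
        up = node.keys[mid]  # separator is moved up, not copied
        right = Node(leaf=False)
        right.keys, node.keys = node.keys[mid + 1:], node.keys[:mid]
        right.children, node.children = node.children[mid + 1:], node.children[:mid + 1]
        return up, right


tree = BPlusTree()
for k in [(i * 37) % 101 for i in range(101)]:  # keys 0..100 in scrambled order
    tree.insert(k, f"row-{k}")
print(tree.search(50), tree.range_scan(10, 15))
```

The range scan descends the tree once and then follows the `next` links between leaves, which is the property that makes B+Trees well suited to range queries.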

Hash File Structure:

In database management systems (DBMS), a hash file
structure is a method of organizing and storing data on disk that uses a
hash function to determine the location (address) of a record within the
file. The primary goal of a hash file structure is to provide fast and
efficient data retrieval based on a unique key value.

Fig- Hash file structure

Advantages:

• Provides direct access to records based on key values, resulting in
fast retrieval.

• Excellent performance for known key values.

Disadvantages:

• Handling collisions can be challenging, potentially leading to
performance issues.

• Not suitable for range queries or partial matches.

Features:

• Uses a hash function to determine the physical location of records
based on key values.

• Requires collision resolution techniques, such as chaining or open
addressing.

Applications:

• Ideal for situations where direct access to records is essential, such
as fast retrieval of unique keys.
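The hash file idea above, including collision resolution by chaining, can be sketched as follows. The bucket count, the modulo hash function, and the sample records are assumptions chosen to keep the example small; a real DBMS hashes to disk blocks and uses a stronger hash function.

```python
NUM_BUCKETS = 8  # hypothetical number of buckets (disk blocks)


def bucket_address(key):
    """Hash function: map an integer key to a bucket number."""
    return key % NUM_BUCKETS


class HashFile:
    def __init__(self):
        # Each bucket holds a chain (list) of (key, record) pairs.
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, key, record):
        chain = self.buckets[bucket_address(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, record)  # keys are unique: overwrite
                return
        chain.append((key, record))       # chaining resolves collisions

    def lookup(self, key):
        # Only one bucket is examined, regardless of file size.
        for existing_key, record in self.buckets[bucket_address(key)]:
            if existing_key == key:
                return record
        return None


hash_file = HashFile()
hash_file.insert(3, "Asha")
hash_file.insert(11, "Ravi")   # 11 % 8 == 3, so this collides with key 3
hash_file.insert(20, "Meera")
print(hash_file.lookup(11), hash_file.lookup(99))
```

Because the hash function scatters neighbouring keys across buckets, a query such as "all keys between 3 and 20" would have to visit every bucket, which is why hash files do not suit range queries.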

Indexed Sequential Access Method (ISAM):

Indexed Sequential Access Method (ISAM) is a data storage and access
method that combines elements of both sequential access and indexed
access. It is used in database management systems and file systems to
efficiently organize and retrieve data from disk storage, particularly when
the data is stored in sorted order based on a specific key.

Fig-Indexed Sequential Access Method (ISAM)

Advantages:

• Faster data retrieval compared to heap files due to indexing.

• Efficient for both sequential and direct access to records.

Disadvantages:

• Insertion and deletion operations can be slower due to the need to
update indexes.

Features:

• Combines sequential file organization with an additional index file.

• Index file contains pointers to physical locations of records in the
data file.

Applications:

• Used when a balance between sequential and random access is
required.
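The combination of a sorted data file with a separate index, as described above, can be sketched like this. The example assumes a sparse index (one entry per data block, holding that block's first key), a hypothetical `BLOCK_SIZE`, and a non-empty, read-only file; inserts and overflow handling are left out.

```python
import bisect

BLOCK_SIZE = 3  # hypothetical number of records per data block


class IsamFile:
    def __init__(self, records):
        """Build from a non-empty list of (key, value) pairs."""
        data = sorted(records)  # the data file is kept in key order
        self.blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        # Sparse index: first key of each block; list position = block number.
        self.index = [block[0][0] for block in self.blocks]

    def _block_no(self, key):
        """Use the index to pick the only block that could hold `key`."""
        return max(bisect.bisect_right(self.index, key) - 1, 0)

    def lookup(self, key):
        """Direct access: index search, then a scan of a single block."""
        for k, value in self.blocks[self._block_no(key)]:
            if k == key:
                return value
        return None

    def scan_from(self, key):
        """Sequential access: read records in key order starting at `key`."""
        for block in self.blocks[self._block_no(key):]:
            for k, value in block:
                if k >= key:
                    yield k, value


isam = IsamFile([(k, f"rec-{k}") for k in (50, 10, 40, 20, 70, 30, 60)])
print(isam.index, isam.lookup(50), list(isam.scan_from(45)))
```

Both access styles share the same sorted blocks: `lookup` uses the index to jump to one block, while `scan_from` uses it only to find a starting point and then reads on sequentially.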

Bitmap Index:

A bitmap index is a type of data structure used in computer science and
databases to optimize the speed of data retrieval operations, especially
in read-heavy scenarios. It is a specialized indexing technique that
efficiently represents and stores Boolean or categorical data for rapid
searching and filtering.

In a traditional database index, you would have a B-tree or a similar
structure that stores pointers to the actual data rows based on the
indexed column's values. However, in a bitmap index, instead of storing
pointers, it uses bitmaps to represent the presence or absence of specific
values in the indexed column.

Fig-Bitmap Index

Advantages:

• Space-efficient for columns with a small number of distinct values
(low cardinality).

• Fast for filtering data based on the presence or absence of specific
values.

Disadvantages:

• Can be memory-intensive for columns with many distinct values, since
one bitmap is needed per distinct value.

• Not suitable for columns with high cardinality.

Features:

• Uses a bitmap to represent the presence or absence of each value
for a column.

Applications:

• Commonly used in data warehousing and OLAP applications for
quickly filtering data based on specific attributes.
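A bitmap index as described above can be sketched with Python integers used as bit vectors, where bit i is set when row i holds the value. The `region` and `status` columns are invented sample data, and real systems additionally compress the bitmaps; this only shows how filters combine with bitwise AND and OR.

```python
def build_bitmap_index(column):
    """Map each distinct value to an integer whose bit i marks row i."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def matching_rows(bitmap):
    """Decode a bitmap back into a list of row numbers."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical low-cardinality columns of a six-row table.
region = ["North", "South", "North", "East", "South", "North"]
status = ["open", "closed", "open", "open", "open", "closed"]

region_index = build_bitmap_index(region)
status_index = build_bitmap_index(status)

# WHERE region = 'North' AND status = 'open'
north_and_open = region_index["North"] & status_index["open"]
# WHERE region = 'South' OR region = 'East'
south_or_east = region_index["South"] | region_index["East"]

print(matching_rows(north_and_open), matching_rows(south_or_east))
```

Each distinct value costs one bitmap as long as the table, which is why the approach works well for a column like `status` and poorly for something like a unique identifier.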

The choice of file structure in physical database design depends on
factors such as data size, query types, insert/update frequency, and the
hardware/storage environment. Selecting the appropriate file structure
is crucial for achieving optimal performance and efficiency in the
database system.
