8 Query Optimization

The document covers disk storage and indexing in databases, detailing the types of storage, file organizations, and indexing methods such as primary, clustering, and secondary indexes. It emphasizes the importance of indexing for efficient data retrieval and query optimization, which reduces resource usage and improves performance. Additionally, it discusses query optimization techniques, including cost-based and rule-based optimization, to enhance database query execution strategies.

Uploaded by

hudaaghazi22

University of Salahaddin

College of Engineering
Software & Informatics Engineering Department

DISK STORAGE AND INDEXING
2ND STAGE
Lecturer:
Dr. Hanan Kamal

Content
• Introduction
• Storage of Database
• Files
• Types of Single-Level Ordered Indexes
• Primary Indexes
• Clustering Indexes
• Secondary Indexes

Introduction
• A computerized database must be physically stored on some computer storage medium.
• Computer storage media form a storage hierarchy that includes two main categories:
1. Primary storage
2. Secondary and tertiary storage

• Primary storage usually:
• Provides fast access to data but has limited storage capacity.
• Is more expensive per unit of storage than secondary and tertiary storage devices.

Storage of Databases
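• Databases typically store large amounts of data that must persist over long periods of time; such data is often referred to as persistent data.
• Parts of this data are accessed and processed repeatedly during this period. This contrasts with transient data, which persists only for a limited time during program execution.
• Databases are stored permanently (or persistently) on magnetic disk, for the following reasons:
• Size: databases are generally too large to fit entirely in main memory.
• Losses: permanent loss of stored data is less frequent on disk than in primary storage.
• Cost: the cost of storage per unit of data is much lower for disk than for primary storage.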

Storage of Databases (cont.)

• The data stored on disk is organized as files of records.
• Each record is a collection of data values that can be interpreted as facts about entities, their attributes, and their relationships.
• There are several primary file organizations:
• Heap file (or unordered file)
• Sorted file (or sequential file), ordered on a sort key
• Hashed file, placed using a hash key
• B-trees

Files
Unordered (heap) files
1. New records are inserted at the end of the file.
2. To search for a record, a linear search is required.
3. Record insertion is quite efficient.
4. Reading the records in order of a particular field requires sorting the file records.

Ordered (sorted) files
1. File records are kept sorted.
2. A binary search can be used.
3. Insertion is expensive: records must be inserted in the correct order.
4. Reading the records in order of the ordering field is quite efficient.

Figure: Some blocks of an ordered (sequential) file of EMPLOYEE records with Name as the ordering key field.

What is Indexing?
• A data structure used to locate records quickly without scanning the entire table.
• Acts like a book index for faster access.
• Improves read/query performance significantly.

Why Indexing Matters
• Reduces the I/O cost of query execution.
• Improves performance for SELECT, JOIN, and ORDER BY.
• Supports constraints like UNIQUE and PRIMARY KEY.
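
One way to see the effect is to ask a DBMS for its plan before and after creating an index. The sketch below uses Python's built-in sqlite3 module; the employee table, its columns, and the index name are invented for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [(f"emp{i:04d}", 1000 + i) for i in range(1000)],
)

def plan(sql):
    """Return the plan description strings SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM employee WHERE name = 'emp0500'"
plan_before = plan(query)   # no index on name yet: expect a full table scan
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
plan_after = plan(query)    # expect a search that uses idx_employee_name
```

Printing plan_before and plan_after shows the optimizer switching from scanning the table to searching the index once it exists.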

Types of Single-Level Ordered Indexes

• Indexing idea: similar to the index of a book.
• An index access structure is usually defined on a single field of a file, called an indexing field (or indexing attribute).
• Indexes can also be characterized as dense or sparse.
• A dense index has an index entry for every search key value (and hence every record) in the data file.
• A sparse (or nondense) index has index entries for only some of the search values.
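
The difference in size between the two kinds of index can be sketched as follows. The block size, the key values, and the choice to index the first key of each block in the sparse case are assumptions made for this illustration.

```python
# A sorted data file split into blocks of 3 records (hypothetical layout).
BLOCK_SIZE = 3
keys = [5, 8, 12, 17, 21, 30, 34, 40, 47]
blocks = [keys[i:i + BLOCK_SIZE] for i in range(0, len(keys), BLOCK_SIZE)]

# Dense index: one (key, block number) entry for every search key value.
dense_index = [(k, b) for b, block in enumerate(blocks) for k in block]

# Sparse index: one entry per block, keyed on the first record in that block.
sparse_index = [(block[0], b) for b, block in enumerate(blocks)]
```

Here the dense index holds nine entries and the sparse index three, so the sparse index is smaller to store and search but only works because the data file itself is ordered.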

Primary Indexes
• A primary index is an ordered file whose records are of fixed length with two fields.
• Each index entry (or index record) i in the index file is written as <K(i), P(i)>, its two field values.
• K(i) is of the same data type as the ordering key field, called the primary key, of the data file.
• P(i) is a pointer to a disk block (a block address).
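
A lookup through such an index might look like the sketch below. It assumes, as is common in textbook treatments, that K(i) is the key of the first record in block i; the records, names, and the lookup function are hypothetical.

```python
from bisect import bisect_right

# Data file ordered on the primary key; each inner list is one disk block.
blocks = [
    [(101, "Aaron"), (104, "Adams"), (109, "Akers")],
    [(115, "Allen"), (120, "Alex"), (127, "Ames")],
    [(133, "Baker"), (140, "Bell"), (152, "Boyd")],
]

# Index entries <K(i), P(i)>: K(i) is the first key in block i, P(i) its block number.
primary_index = [(block[0][0], p) for p, block in enumerate(blocks)]
anchor_keys = [k for k, _ in primary_index]

def lookup(key):
    """Binary-search the index, then search only the one block it points to."""
    i = bisect_right(anchor_keys, key) - 1   # last entry with K(i) <= key
    if i < 0:
        return None
    _, p = primary_index[i]
    for k, name in blocks[p]:
        if k == key:
            return name
    return None
```

For example, lookup(120) consults the three index entries and then reads only the second block, rather than scanning the whole file.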

Clustering Indexes
• If file records are physically ordered on a nonkey field, which does not have a distinct value for each record, that field is called the clustering field and the data file is called a clustered file.
• A clustering index is also an ordered file with two fields:
1. A field of the same type as the clustering field of the data file.
2. A disk block pointer.
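
A minimal sketch of a clustering index follows, assuming one index entry per distinct clustering value that points to the first block containing that value. The dept_no field, the records, and find_all are invented for illustration.

```python
# Records ordered on the nonkey field dept_no (the clustering field).
blocks = [
    [(1, "Ann"), (1, "Bob"), (2, "Cid")],
    [(2, "Dee"), (2, "Eve"), (3, "Fay")],
    [(3, "Gus"), (5, "Hal"), (5, "Ivy")],
]

# One entry per distinct clustering value -> first block that contains it.
clustering_index = {}
for p, block in enumerate(blocks):
    for dept_no, _ in block:
        clustering_index.setdefault(dept_no, p)

def find_all(dept_no):
    """Start at the pointed-to block and read forward while the value can still occur."""
    p = clustering_index.get(dept_no)
    if p is None:
        return []
    result = []
    for block in blocks[p:]:
        result.extend(name for d, name in block if d == dept_no)
        if block[-1][0] > dept_no:   # file is ordered, so no later block can match
            break
    return result
```

Because records with the same value are stored next to each other, all matches are found by reading consecutive blocks starting from a single pointer.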

Secondary Indexes
• A secondary index provides a secondary means of accessing a data file for which some primary access already exists.
• The data file records could be ordered, unordered, or hashed.
• The secondary index may be created on:
1. A candidate key field, which has a unique value in every record.
2. A nonkey field with duplicate values.
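
Both cases can be sketched over an unordered file. In this illustration the list position stands in for a record pointer, and the ssn and dept fields, the records, and find_by_ssn are all hypothetical.

```python
from bisect import bisect_left

# Unordered data file: the list position plays the role of a record pointer.
records = [
    {"ssn": 305, "name": "Ivy", "dept": 4},
    {"ssn": 112, "name": "Bob", "dept": 2},
    {"ssn": 871, "name": "Zed", "dept": 4},
    {"ssn": 240, "name": "Eve", "dept": 1},
]

# Case 1: index on a candidate key. Dense, one pointer per key value,
# kept sorted on the key so it can be binary-searched.
ssn_index = sorted((r["ssn"], ptr) for ptr, r in enumerate(records))

# Case 2: index on a nonkey field. Each value maps to a list of pointers.
dept_index = {}
for ptr, r in enumerate(records):
    dept_index.setdefault(r["dept"], []).append(ptr)

def find_by_ssn(ssn):
    """Binary-search the sorted index, then follow the pointer to the record."""
    i = bisect_left(ssn_index, (ssn,))
    if i < len(ssn_index) and ssn_index[i][0] == ssn:
        return records[ssn_index[i][1]]
    return None
```

The index on the key needs an entry for every record because the file is not ordered on that field, and the nonkey index returns several pointers for a value such as dept 4.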

Figure: Secondary index on a key, nonordering field of a file.

Figure: Secondary index on a nonkey, nonordering field of a file.

Summary of Indexes

Thank you

1. What is indexing?
2. Talk about:
A. Primary Indexes
B. Clustering Indexes
C. Secondary Indexes

University of Salahaddin
College of Engineering
Software & Informatics Engineering Department

QUERY OPTIMIZATION
2ND STAGE

Lecturer:
Dr. Hanan Kamal

Introduction
• In network and hierarchical DBMSs, a low-level procedural query language is generally embedded in a high-level programming language.
• It is the programmer's responsibility to select the most appropriate execution strategy.
• With declarative languages such as SQL, the user specifies what data is required rather than how it is to be retrieved.
• This relieves the user of knowing what constitutes a good execution strategy.

Introduction (cont.)
• Also gives the DBMS more control over system performance.
• Two main techniques for query optimization:
• Heuristic rules that order operations in a query;
• Comparing different strategies based on their relative costs, and selecting the one that minimizes resource usage.
• Disk access tends to be the dominant cost in query processing for a centralized DBMS.

Query Processing

Activities involved in retrieving data from the database.

• Aims of QP:
• Transform a query written in a high-level language (e.g. SQL) into a correct and efficient execution strategy expressed in a low-level language (implementing the relational algebra, RA);
• Execute the strategy to retrieve the required data.

Introduction to Query Optimization
• Query optimization (QO) is the process of choosing the most efficient way to execute a SQL query.
• The importance of QO:
- Reduces response time
- Minimizes resource usage
- Essential for large-scale databases

Query Optimization
• A query optimizer evaluates multiple query execution plans and chooses the best one based on cost estimates.
• As there are many equivalent transformations of the same high-level query, the aim of QO is to choose the one that minimizes resource usage.
• Generally, the aim is to reduce the total execution time of the query.
• It may also be to reduce the response time of the query.
• The problem is computationally intractable for a large number of relations, so the strategy adopted is reduced to finding a near-optimum solution.

Goals of Query Optimization
- Minimize I/O cost
- Reduce CPU usage
- Minimize response time
- Avoid unnecessary computations

Cost-Based Optimization
This method evaluates alternative plans using a cost model based on:
- Disk I/O
- CPU cost
- Network latency

Rule-Based Optimization
Uses heuristics (rules) to transform queries:
- Push selections early
- Combine projections
- Join order optimization

Example 21.1 - Different Strategies

Find all Managers who work at a London branch.

SELECT *
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo AND
(s.position = 'Manager' AND b.city = 'London');

Example 21.1 - Different Strategies (cont.)

• Three equivalent RA queries are:
(1) σ (position='Manager') ∧ (city='London') ∧ (Staff.branchNo=Branch.branchNo) (Staff × Branch)
(2) σ (position='Manager') ∧ (city='London') (Staff ⋈ Staff.branchNo=Branch.branchNo Branch)
(3) (σ position='Manager' (Staff)) ⋈ Staff.branchNo=Branch.branchNo (σ city='London' (Branch))
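
To see why the choice of strategy matters, the three expressions can be compared by counting disk accesses. The figures below are assumptions, not taken from the slides: 1000 Staff tuples, 50 Branch tuples, 50 Managers, 5 London branches, no indexes, one disk access per tuple read or written, intermediate results written to disk, and every staff member belonging to exactly one branch.

```python
# Assumed cardinalities (illustrative only, not given in the slides).
n_staff, n_branch = 1000, 50
n_managers, n_london = 50, 5

# (1) Read both relations, write the Cartesian product,
#     then read it back to apply the selection.
cost_1 = (n_staff + n_branch) + 2 * (n_staff * n_branch)

# (2) Read both relations for the join, write the join result
#     (one tuple per Staff row), then read it back for the selection.
cost_2 = (n_staff + n_branch) + 2 * n_staff

# (3) Read Staff and write the Managers, read Branch and write the
#     London branches, then read both reduced relations to join them.
cost_3 = (n_staff + n_managers) + (n_branch + n_london) + (n_managers + n_london)
```

Under these assumptions the costs are 101,050, 3,050, and 1,160 disk accesses, so applying the selections before the join is cheaper than the Cartesian product by a factor of about 87.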
Join Optimization
Join operations can be expensive. Types of join algorithms:
- Nested Loop Join
- Hash Join
- Merge Join
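
The three algorithms can be sketched in memory as follows. This is a simplified illustration on made-up (staffNo, branchNo) and (branchNo, city) tuples; a real DBMS works block by block on disk, and the function names are hypothetical.

```python
staff = [("SL21", "B005"), ("SG37", "B003"), ("SG14", "B003"), ("SA9", "B007")]
branch = [("B003", "Glasgow"), ("B005", "London"), ("B007", "Aberdeen")]

def nested_loop_join(outer, inner):
    """Compare every outer tuple with every inner tuple."""
    return [(s, b) for s in outer for b in inner if s[1] == b[0]]

def hash_join(probe, build):
    """Build a hash table on one input, then probe it with the other."""
    table = {}
    for b in build:
        table.setdefault(b[0], []).append(b)
    return [(s, b) for s in probe for b in table.get(s[1], [])]

def merge_join(left, right):
    """Sort both inputs on the join attribute, then scan them together."""
    left = sorted(left, key=lambda s: s[1])
    right = sorted(right, key=lambda b: b[0])
    out, j = [], 0
    for s in left:
        while j < len(right) and right[j][0] < s[1]:
            j += 1                      # skip right tuples that are too small
        k = j
        while k < len(right) and right[k][0] == s[1]:
            out.append((s, right[k]))   # emit every matching pair
            k += 1
    return out
```

All three return the same pairs; they differ in work done: the nested loop makes len(outer) × len(inner) comparisons, the hash join reads each input once, and the merge join pays for sorting unless the inputs are already ordered.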
Subquery Optimization
• Rewriting a subquery as a join or as an EXISTS test can improve performance.
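
As a sketch, the same question can be written with IN, with EXISTS, and as a join. The example uses Python's sqlite3 module with made-up Staff and Branch rows; the join form is equivalent here only because branchNo is unique in Branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, position TEXT, branchNo TEXT);
    INSERT INTO Branch VALUES ('B003','Glasgow'), ('B005','London'), ('B007','Aberdeen');
    INSERT INTO Staff VALUES ('SL21','Manager','B005'), ('SG37','Assistant','B003'),
                             ('SL41','Assistant','B005'), ('SA9','Manager','B007');
""")

with_in = """SELECT staffNo FROM Staff
             WHERE branchNo IN (SELECT branchNo FROM Branch WHERE city = 'London')
             ORDER BY staffNo"""
with_exists = """SELECT s.staffNo FROM Staff s
                 WHERE EXISTS (SELECT 1 FROM Branch b
                               WHERE b.branchNo = s.branchNo AND b.city = 'London')
                 ORDER BY s.staffNo"""
with_join = """SELECT s.staffNo FROM Staff s
               JOIN Branch b ON s.branchNo = b.branchNo
               WHERE b.city = 'London'
               ORDER BY s.staffNo"""

results = [conn.execute(q).fetchall() for q in (with_in, with_exists, with_join)]
```

All three queries return the same rows, so the optimizer (or the person writing the query) is free to pick whichever form has the cheapest plan.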
View Materialization
• Views can be materialized (stored) or virtual (computed per query). Materialization improves query performance at the cost of storage and of keeping the stored copy up to date.
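
The difference can be sketched with sqlite3. SQLite has no built-in materialized views, so the stored copy is emulated here with CREATE TABLE ... AS SELECT; the table, view, and column names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (staffNo TEXT, branchNo TEXT, salary INTEGER);
    INSERT INTO Staff VALUES ('SL21','B005',30000), ('SL41','B005',9000),
                             ('SG37','B003',12000);
    -- Virtual view: the query is re-run every time the view is read.
    CREATE VIEW BranchPay AS
        SELECT branchNo, SUM(salary) AS total FROM Staff GROUP BY branchNo;
    -- Emulated materialized view: the result is stored once as a table.
    CREATE TABLE BranchPayMat AS SELECT * FROM BranchPay;
""")

# A later change to the base table.
conn.execute("INSERT INTO Staff VALUES ('SG14','B003',18000)")

virtual = dict(conn.execute("SELECT branchNo, total FROM BranchPay").fetchall())
stored = dict(conn.execute("SELECT branchNo, total FROM BranchPayMat").fetchall())
```

After the insert, the virtual view reports the new total for B003 while the stored copy still holds the old one, which is the maintenance cost that comes with materialization.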
Statistics for Optimization
Database optimizers use table statistics:
- Row counts
- Value distributions
- Index usage
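
A rough sketch of how such statistics feed a row estimate for a predicate like city = 'London' follows. The column contents are made up, and the two estimators are simplified versions of what a real optimizer might do.

```python
from collections import Counter

# Hypothetical contents of a Branch.city column.
cities = ["London"] * 5 + ["Glasgow"] * 20 + ["Aberdeen"] * 25

row_count = len(cities)          # statistic: number of rows
histogram = Counter(cities)      # statistic: value distribution
distinct = len(histogram)        # statistic: number of distinct values

# With only row and distinct counts, assume every value is equally common.
uniform_estimate = row_count / distinct

# With a value distribution, use the recorded frequency instead.
histogram_estimate = histogram["London"]
```

With these numbers the uniform assumption predicts about 16.7 matching rows while the distribution gives 5, which shows how skewed data can mislead an optimizer that lacks detailed statistics.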
Query Rewriting Techniques
- Remove redundant conditions
- Simplify expressions
- Use EXISTS instead of IN

Phases of Query Processing

Summary
✔ Query optimization is crucial for efficient database systems
✔ Use indexes, rewrite queries, and analyze execution plans
✔ Know your data and schema

Thank you
