8 Query Optimization
College of Engineering
Software & Informatics Engineering Department
Content
• Introduction
• Storage of Database
• Files
• Types of Single-Level Ordered Indexes
• Primary Indexes
• Clustering Indexes
• Secondary Indexes
Introduction
• A computerized database must be physically stored on some
computer storage medium.
• Computer storage media form a storage hierarchy that
includes two main categories:
1. Primary storage
2. Secondary and tertiary storage
• Primary storage usually:
• Provides fast access to data but has limited storage capacity.
• Is more expensive per unit of storage than secondary and
tertiary storage devices.
Storage of Databases
• Databases typically store large amounts of data that must
persist over long periods of time, and hence the data is often
referred to as persistent data.
• Parts of this data are accessed and processed repeatedly
during this period, in contrast to transient data, which
persists only for a limited time during program execution.
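• Databases are stored permanently (or persistently) on
magnetic disk, for the following reasons:
• Size: databases are generally too large to fit entirely in
main memory.
• Losses: permanent loss of stored data occurs less often on
disk than in primary storage.
• Cost: storage per unit of data is much cheaper on disk than
in primary storage.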
Files
• Unordered (heap) files
• Ordered (sorted) files
• Figure: some blocks of an ordered (sequential) file of
EMPLOYEE records with Name as the ordering key field.
What is Indexing?
Primary Indexes
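• A primary index is an ordered file whose records are of
fixed length with two fields: a key of the same data type as
the ordering key field of the data file, and a pointer to a
disk block.
• There is one index entry (or index record) in the index file
for each block in the data file; the two field values of
index entry i are written <K(i), P(i)>.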
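A minimal sketch of a lookup through a primary index, assuming a tiny in-memory "data file" of three blocks. The block contents, the `lookup` function, and the use of list positions as block pointers are illustrative assumptions, not part of the lecture material.

```python
from bisect import bisect_right

# Hypothetical data file: each block holds (name, record_id) records,
# and the whole file is sorted on the ordering key field (name).
blocks = [
    [("Aaron", 1), ("Abbott", 2), ("Acosta", 3)],
    [("Adams", 4), ("Akers", 5), ("Alexander", 6)],
    [("Allen", 7), ("Anderson", 8), ("Archer", 9)],
]

# Primary index: one entry <K(i), P(i)> per block. K(i) is the key of the
# first (anchor) record of block i; P(i) is the block's position.
index = [(block[0][0], i) for i, block in enumerate(blocks)]
index_keys = [k for k, _ in index]


def lookup(key):
    """Return the record with the given key, or None if it is absent."""
    # Binary search for the last index entry whose K(i) <= key.
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key sorts before the first anchor record
    _, block_no = index[pos]
    # Read that single block and scan it for the key.
    for record in blocks[block_no]:
        if record[0] == key:
            return record
    return None
```

Because the index has one entry per block rather than one per record, it is much smaller than the data file, so the binary search touches far fewer entries than a search over the records themselves.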
Clustering Indexes
• If file records are physically ordered on a nonkey field,
which does not have a distinct value for each record, that
field is called the clustering field and the data file is
called a clustered file.
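A small sketch of how a clustering index over such a file might look, assuming the usual textbook layout of one index entry per distinct value of the clustering field, pointing to the first block that contains that value. The records, `BLOCK_SIZE`, and `find_by_dept` are hypothetical names chosen for illustration.

```python
# Hypothetical records (dept_no, name), physically ordered on the nonkey
# field dept_no, which repeats across records.
records = [
    (1, "Ann"), (1, "Bob"), (2, "Cy"), (2, "Di"),
    (2, "Ed"), (3, "Flo"), (5, "Gus"),
]
BLOCK_SIZE = 2  # assumed number of records per block
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Clustering index: one entry per distinct dept_no, pointing to the first
# block in which that value appears.
clustering_index = {}
for block_no, block in enumerate(blocks):
    for dept_no, _ in block:
        clustering_index.setdefault(dept_no, block_no)


def find_by_dept(dept_no):
    """Return all records whose clustering field equals dept_no."""
    start = clustering_index.get(dept_no)
    if start is None:
        return []
    result = []
    # Matching records are contiguous, so scan forward from the first block
    # and stop as soon as a larger value is seen.
    for block in blocks[start:]:
        for rec in block:
            if rec[0] == dept_no:
                result.append(rec)
            elif rec[0] > dept_no:
                return result
    return result
```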
Secondary Indexes
• A secondary index provides a secondary means of
accessing a data file for which some primary access
already exists.
• The data file records may be ordered, unordered, or hashed.
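As an illustration, the following sketch builds a dense secondary index on a key field of an unordered (heap) file. The file contents and the `find_by_id` helper are assumptions made for the example; record positions in a Python list stand in for record pointers.

```python
from bisect import bisect_left

# Hypothetical heap file: (emp_id, name) records in arrival order,
# so the file is NOT sorted on emp_id.
heap_file = [
    (17, "Smith"), (3, "Wong"), (42, "Zelaya"), (8, "Borg"), (25, "Narayan"),
]

# Dense secondary index on emp_id: one entry per record, sorted on the
# indexed field; each entry points to the record's position in the file.
secondary_index = sorted((rec[0], pos) for pos, rec in enumerate(heap_file))
keys = [k for k, _ in secondary_index]


def find_by_id(emp_id):
    """Return the record with this emp_id using the index, or None."""
    i = bisect_left(keys, emp_id)  # binary search on the sorted index
    if i < len(keys) and keys[i] == emp_id:
        return heap_file[secondary_index[i][1]]
    return None
```

Since the data file is not ordered on the indexed field, the index needs an entry for every record; without it, a search on `emp_id` would require a linear scan of the file.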
Summary of Indexes
Thank you
1. What is indexing?
2. Talk about:
A. Primary Indexes
B. Clustering Indexes
C. Secondary Indexes
University of Salahaddin
QUERY OPTIMIZATION
2nd Stage
Lecturer: Dr. Hanan Kamal
Introduction
• In network and hierarchical DBMSs, the low-level
procedural query language is generally
embedded in a high-level programming language.
• It is the programmer's responsibility to select the most
appropriate execution strategy.
• With declarative languages such as SQL, the user
specifies what data is required rather than how it
is to be retrieved.
• This relieves the user of knowing what constitutes a good
execution strategy.
• A declarative language also gives the DBMS more control
over system performance.
Query Processing
• Aims of query processing (QP):
• Transform a query written in a high-level language
(e.g. SQL) into a correct and efficient execution
strategy expressed in a low-level language
(implementing the relational algebra, RA);
• Execute the strategy to retrieve the required data.
Introduction to Query Optimization
• Query optimization (QO) is the process of
choosing the most efficient way to execute a SQL
query.
Query Optimization
• A query optimizer evaluates multiple query execution
plans and chooses the best one based on cost
estimates.
• As there are many equivalent transformations of the same
high-level query, the aim of QO is to choose the one that
minimizes resource usage.
• Generally, reduce the total execution time of the query.
• May also reduce the response time of the query.
• The problem is computationally intractable for a large number
of relations, so the strategy adopted is reduced to finding a
near-optimum solution.
Goals of Query Optimization
- Minimize I/O cost
- Reduce CPU usage
- Ensure fastest response time
- Avoid unnecessary computations
Cost-Based Optimization
This method evaluates alternative plans using a
cost model based on:
- Disk I/O
- CPU cost
- Network latency
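The idea of comparing plans by estimated cost can be sketched as follows. This is a toy model only: the `Plan` fields, the weights, and the candidate figures are all invented for illustration and do not correspond to any real DBMS's cost formulas.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    disk_ios: int      # estimated number of block reads/writes
    cpu_ops: int       # estimated number of tuple comparisons
    network_msgs: int  # estimated number of messages between sites


# Hypothetical relative weights of each cost component.
WEIGHTS = {"disk_io": 100, "cpu": 1, "network": 500}


def estimated_cost(plan):
    """Weighted sum of the three cost components for one plan."""
    return (plan.disk_ios * WEIGHTS["disk_io"]
            + plan.cpu_ops * WEIGHTS["cpu"]
            + plan.network_msgs * WEIGHTS["network"])


def choose_plan(plans):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=estimated_cost)


# Made-up candidate plans for the same query.
candidates = [
    Plan("full scan", disk_ios=1000, cpu_ops=50000, network_msgs=0),
    Plan("index lookup", disk_ios=40, cpu_ops=2000, network_msgs=0),
    Plan("remote join", disk_ios=20, cpu_ops=1000, network_msgs=30),
]
best = choose_plan(candidates)
```

With these assumed numbers the index lookup wins; changing the weights (for example, making network messages cheap) can change which plan is chosen, which is why the cost model matters.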
Rule-Based Optimization
Uses heuristics (rules) to transform queries:
- Perform selections as early as possible
- Combine projections
- Optimize join order
SELECT *
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo AND
(s.position = 'Manager' AND b.city = 'London');
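To see why the order of operations matters for a query like this one, the sketch below evaluates it in two equivalent ways over tiny in-memory relations: forming the full Cartesian product and then filtering, versus applying the selections first and then joining. The sample tuples and function names are hypothetical; only the attribute names come from the query.

```python
# Hypothetical sample relations.
staff = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B1"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B1"},
    {"staffNo": "S3", "position": "Manager", "branchNo": "B2"},
    {"staffNo": "S4", "position": "Supervisor", "branchNo": "B3"},
]
branch = [
    {"branchNo": "B1", "city": "London"},
    {"branchNo": "B2", "city": "Glasgow"},
    {"branchNo": "B3", "city": "London"},
]


def product_then_select():
    """Cartesian product first, then apply every predicate."""
    pairs = [(s, b) for s in staff for b in branch]
    result = [(s, b) for s, b in pairs
              if s["branchNo"] == b["branchNo"]
              and s["position"] == "Manager"
              and b["city"] == "London"]
    return result, len(pairs)  # also report intermediate size


def select_then_join():
    """Apply the selections first, then join the reduced relations."""
    managers = [s for s in staff if s["position"] == "Manager"]
    london = [b for b in branch if b["city"] == "London"]
    pairs = [(s, b) for s in managers for b in london]
    result = [(s, b) for s, b in pairs if s["branchNo"] == b["branchNo"]]
    return result, len(pairs)


naive_result, naive_pairs = product_then_select()
pushed_result, pushed_pairs = select_then_join()
```

Both strategies return the same answer, but the second compares far fewer tuple pairs. On relations of realistic size the gap in intermediate results, and hence in disk accesses, grows accordingly.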
Thank you