Physical Database Design and Tuning

Uploaded by

Chiran Govinna
22/05/2024

Factors that Influence Physical Database Design

A. Analyzing the database queries and transactions
• Queries and transactions expected to run on the database
B. Analyzing the expected frequency of invocation of queries and transactions
• Expected frequency of use for all queries and transactions.
• 80-20 rule: roughly 80% of the processing is accounted for by 20% of the queries and transactions.
C. Analyzing the time constraints of queries and transactions
• Any timing constraints, e.g. a transaction should terminate within 5 seconds on 95% of occasions and should never take more than 20 seconds.
D. Analyzing the expected frequencies of update operations
• A minimum number of access paths (e.g. indexes) should be specified for a file that is frequently updated.
E. Analyzing the uniqueness constraints on attributes
• Access paths should be specified on all candidate key attributes or sets of attributes that are either the primary key of a file or unique attributes.
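
The 80-20 rule in factor B can be checked against a workload log. The sketch below is only an illustration: the query names, frequencies and per-execution costs are invented, and "load" is assumed to be frequency multiplied by average cost.

```python
# Hypothetical workload log: query name -> (executions per day, avg cost per execution).
# All names and numbers are invented for illustration.
workload = {
    "lookup_order_by_id": (50_000, 2.0),
    "list_customer_orders": (20_000, 4.0),
    "monthly_sales_report": (30, 400.0),
    "update_stock_level": (5_000, 1.0),
    "search_product_name": (1_000, 3.0),
}

def heaviest_queries(workload, share=0.8):
    """Return the heaviest queries whose combined load first reaches
    `share` of the total load (load = frequency * average cost)."""
    load = {name: freq * cost for name, (freq, cost) in workload.items()}
    total = sum(load.values())
    selected, running = [], 0.0
    for name, value in sorted(load.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(name)
        running += value
        if running >= share * total:
            break
    return selected

hot = heaviest_queries(workload)
print(hot)  # the queries worth tuning first
```

With these made-up numbers, two of the five queries already account for more than 80% of the load, so physical design effort would go to them first.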

Physical Database Design Decisions


1. Design decisions about indexing
• Whether to index an attribute?
• What attribute or attributes to index on?
• Whether to set up a clustered index?
• Whether to use a hash index over a tree index?
• Whether to use dynamic hashing for the file?
2. Denormalization as a design decision for speeding up queries
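
Whether an index on an attribute pays off can be judged by comparing the query plan before and after creating it. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in DBMS; the employee table, its columns and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT, name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(f"{i:09d}", f"emp{i}", f"dept{i % 10}", 30000.0 + i) for i in range(1000)],
)

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT name, salary FROM employee WHERE dept = 'dept3'"

plan_before = plan(query)   # no index on dept yet: the table is scanned
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
plan_after = plan(query)    # the planner can now search the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, and other DBMSs report plans in their own formats.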
Database Tuning

Database tuning

• The process of continuing to revise/adjust the physical database design by monitoring resource utilization as well as internal DBMS processing to reveal bottlenecks such as contention for the same data or devices.
• Goals
  • To make applications run faster
  • To lower the response time of queries/transactions
  • To improve the overall throughput of transactions

Inputs to the tuning process

• Statistics internally collected in DBMSs
  • Size of individual tables
  • Number of distinct values in a column
  • The number of times a particular query or transaction is submitted/executed in an interval of time
  • The times required for different phases of query and transaction processing
• Statistics obtained from monitoring
  • Storage statistics
  • I/O and device performance statistics
  • Query/transaction processing statistics
  • Locking/logging related statistics
  • Index statistics
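
Two of the statistics named above, table size and the number of distinct values in a column, can also be gathered by hand with plain SQL. This is a rough sketch, assuming SQLite via Python's sqlite3 module; the orders table and its columns are hypothetical, and a real DBMS would normally expose such figures through its own catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 50, ("open", "shipped", "closed")[i % 3]) for i in range(600)],
)

def column_stats(table, column):
    """Row count and distinct-value count for one column.
    Identifiers cannot be bound as parameters, so pass trusted names only."""
    row_count, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()
    return {"rows": row_count, "distinct": distinct,
            "distinct_ratio": distinct / row_count}

stats = {col: column_stats("orders", col) for col in ("customer_id", "status")}
print(stats)
```

A column with very few distinct values, such as status here, is usually a weaker candidate for an index than one with many.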

Problems to be considered in tuning

• How to avoid excessive lock contention?
• How to minimize overhead of logging and unnecessary dumping of data?
• How to optimize buffer size and scheduling of processes?
• How to allocate resources such as disks, RAM and processes for most efficient utilization?

Tuning Indexes

• Reasons for tuning indexes
  • Certain queries may take too long to run for lack of an index;
  • Certain indexes may not get utilized at all;
  • Certain indexes may be causing excessive overhead because the index is on an attribute that undergoes frequent changes
• Options for tuning indexes
  • Drop and/or build new indexes
  • Change a non-clustered index to a clustered index (and vice versa)
  • Rebuild the index
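
One way to spot indexes that "may not get utilized at all" is to run the plans of a representative workload and see which index names never appear. The sketch below assumes SQLite via Python's sqlite3 module; the table, the three indexes and the two workload queries are hypothetical, and a production DBMS would more likely offer its own usage statistics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT, name TEXT, dept TEXT, phone TEXT)")
conn.execute("CREATE INDEX idx_emp_ssn ON employee (ssn)")
conn.execute("CREATE INDEX idx_emp_dept ON employee (dept)")
conn.execute("CREATE INDEX idx_emp_phone ON employee (phone)")

# Hypothetical representative workload.
workload = [
    "SELECT name FROM employee WHERE ssn = '123456789'",
    "SELECT name FROM employee WHERE dept = 'sales'",
]

def indexes_on(table):
    """Names of all indexes currently defined on `table`."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (table,),
    ).fetchall()
    return {name for (name,) in rows}

all_indexes = indexes_on("employee")
used = set()
for sql in workload:
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall():
        # row[3] is the plan detail text; record every index it mentions.
        used.update(ix for ix in all_indexes if ix in row[3])

unused = all_indexes - used
for name in sorted(unused):
    conn.execute(f"DROP INDEX {name}")   # drop indexes the workload never touches

print(sorted(unused))
```

The result is only as good as the workload sample: an index that looks unused here may still serve a rare but important query.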

Tuning the Database Design

• Dynamically changing processing requirements need to be addressed by making changes to the conceptual schema if necessary, and by reflecting those changes in the logical schema and physical design.
• Possible changes to the database design
  • Existing tables may be joined (denormalized) because certain attributes from two or more tables are frequently needed together. This reduces the normalization level from BCNF to 3NF, 2NF, or 1NF.
  • For the given set of tables, there may be alternative design choices, all of which achieve 3NF or BCNF. One may be replaced by the other.

Tuning the Database Design (cont.)

• Possible changes to the database design
  • A relation of the form R(K, A, B, C, D, …) that is in BCNF can be stored in multiple tables that are also in BCNF by replicating the key K in each table. This is called vertical partitioning.
    • E.g. EMPLOYEE(SSN, Name, Phone, Grade, Salary) may be split into EMP1(SSN, Name, Phone) and EMP2(SSN, Grade, Salary)
  • Attribute(s) from one table may be repeated in another even though this creates redundancy and potential anomalies.
    • E.g. Part_name may appear wherever the Part# appears. However, there may be one master file such as PART_MASTER(Part#, Part_name, …)
  • Horizontal partitioning takes horizontal slices of a table and stores them as distinct tables.
    • E.g. product sales data may be separated into different product lines. Each table has the same attributes but contains a distinct set of products.
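
The EMPLOYEE example of vertical partitioning can be tried out directly. This is a small sketch, assuming SQLite via Python's sqlite3 module; the three sample rows are invented. It splits the table into EMP1 and EMP2 and checks that joining them on the replicated key SSN gives back the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, phone TEXT,
                       grade INTEGER, salary REAL);
INSERT INTO employee VALUES
  ('111', 'Ann', '555-0101', 3, 52000),
  ('222', 'Bob', '555-0102', 2, 41000),
  ('333', 'Cho', '555-0103', 5, 78000);

-- Vertical partitioning: the key ssn is replicated in both tables.
CREATE TABLE emp1 (ssn TEXT PRIMARY KEY, name TEXT, phone TEXT);
CREATE TABLE emp2 (ssn TEXT PRIMARY KEY, grade INTEGER, salary REAL);
INSERT INTO emp1 SELECT ssn, name, phone FROM employee;
INSERT INTO emp2 SELECT ssn, grade, salary FROM employee;
""")

original = conn.execute("SELECT * FROM employee ORDER BY ssn").fetchall()

# Joining the partitions on the shared key reconstructs the original relation.
rejoined = conn.execute("""
    SELECT emp1.ssn, name, phone, grade, salary
    FROM emp1 JOIN emp2 ON emp1.ssn = emp2.ssn
    ORDER BY emp1.ssn
""").fetchall()

print(original == rejoined)
```

Queries that need only Name and Phone can then read EMP1 alone, at the price of a join whenever attributes from both halves are needed.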

Tuning Queries

• Indications for tuning queries
  • A query issues too many disk accesses
  • The query plan shows that relevant indexes are not being used.
• Typical instances for query tuning
  1. Many query optimizers do not use indexes in the presence of arithmetic expressions (e.g. Salary/365 > 10.50), numerical comparisons of attributes of different sizes and precision (e.g. comparing INTEGER with SMALLINTEGER type attributes), NULL comparisons (e.g. Bdate IS NULL), and sub-string comparisons (e.g. LIKE '%mann').
  2. Indexes are often not used for nested queries using IN;

Tuning Queries (cont.)

• Typical instances for query tuning (cont.)
  3. Some DISTINCTs may be redundant and can be avoided without changing the result.
  4. Unnecessary use of temporary result tables can be avoided by collapsing multiple queries into a single query unless the temporary relation is needed for some intermediate processing.
  5. In some situations involving the use of correlated queries, temporaries are useful.
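
The arithmetic-expression case (Salary/365 > 10.50) can be illustrated by moving the arithmetic to the constant side of the comparison. This is a sketch under stated assumptions: SQLite via Python's sqlite3 module stands in for the DBMS, the table and index names are hypothetical, and Salary is stored as a REAL so that the two predicates are equivalent for the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT, name TEXT, salary REAL)")
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(str(i), f"emp{i}", 1000.0 * i) for i in range(1, 11)],  # salaries 1000..10000
)

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# The indexed column is wrapped in an expression: the index cannot be used.
with_expression = "SELECT name FROM employee WHERE salary / 365 > 10.50"
# The bare column is compared with a constant: the index can be used.
rewritten = "SELECT name FROM employee WHERE salary > 10.50 * 365"

rows_expression = sorted(conn.execute(with_expression).fetchall())
rows_rewritten = sorted(conn.execute(rewritten).fetchall())

print(plan(with_expression))
print(plan(rewritten))
print(rows_expression == rows_rewritten)
```

Whether a particular optimizer performs this rewrite on its own varies, so checking the plan is more reliable than assuming either way.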

Tuning Queries (cont.)

• Typical instances for query tuning (cont.)
  6. If multiple options for a join condition are possible, choose one that uses a clustering index and avoid those that contain string comparisons.
    • E.g. use the EMPLOYEE.SSN = STUDENT.SSN join condition rather than EMPLOYEE.Name = STUDENT.Name
  7. The order of tables in the FROM clause may affect the join processing.
  8. Some query optimizers perform worse on nested queries compared to their equivalent un-nested counterparts.
  9. Many applications are based on views that define the data of interest to those applications. Sometimes these views become overkill.

Additional Query Tuning Guidelines

1. A query with multiple selection conditions that are connected via OR may not prompt the query optimizer to use any index. Such a query may be split up and expressed as a union of queries, each with a condition on an attribute that causes an index to be used.

SELECT Fname, Lname, Salary, Age
FROM EMPLOYEE
WHERE Age>45 OR Salary<50000;

can be expressed as

SELECT Fname, Lname, Salary, Age
FROM EMPLOYEE
WHERE Age>45
UNION
SELECT Fname, Lname, Salary, Age
FROM EMPLOYEE
WHERE Salary<50000;
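
The OR-to-UNION rewrite of guideline 1 can be checked for equivalence on a small data set. The sketch assumes SQLite via Python's sqlite3 module, and the four sample employees are invented. It also shows a caveat: UNION removes duplicate rows, so the two forms agree only when the selected rows are distinct, and UNION ALL would return a row matching both conditions twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (fname TEXT, lname TEXT, salary REAL, age INTEGER)")
conn.execute("CREATE INDEX idx_employee_age ON employee (age)")
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("Ann", "Lee", 42000, 50),   # matches both conditions
    ("Bob", "Roy", 90000, 61),   # matches the age condition only
    ("Cho", "Kim", 30000, 28),   # matches the salary condition only
    ("Dev", "Rao", 75000, 33),   # matches neither
])

or_rows = conn.execute("""
    SELECT fname, lname, salary, age FROM employee
    WHERE age > 45 OR salary < 50000
""").fetchall()

union_rows = conn.execute("""
    SELECT fname, lname, salary, age FROM employee WHERE age > 45
    UNION
    SELECT fname, lname, salary, age FROM employee WHERE salary < 50000
""").fetchall()

union_all_rows = conn.execute("""
    SELECT fname, lname, salary, age FROM employee WHERE age > 45
    UNION ALL
    SELECT fname, lname, salary, age FROM employee WHERE salary < 50000
""").fetchall()

print(sorted(or_rows) == sorted(union_rows))
print(len(or_rows), len(union_all_rows))
```

Including a key attribute in the SELECT list is one way to keep the UNION form safe when two employees could otherwise produce identical rows.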

Additional Query Tuning Guidelines (cont.)

2. Apply the following transformations
  • A NOT condition may be transformed into a positive expression.
  • Embedded SELECT blocks (e.g. IN, ALL, SOME) may be replaced by joins.
  • If an equality join is set up between two tables, the range predicate on the joining attribute set up in one table may be repeated for the other table
3. WHERE conditions may be rewritten to utilize the indexes on multiple columns.

Index on Region# (a) vs composite index on (Region#, Prod_type) (b)

(a) SELECT Region#, Prod_type, Month, Sales
FROM SALES_STATISTICS
WHERE Region# = 3 AND ((Prod_type BETWEEN 1 AND 3) OR (Prod_type BETWEEN 8 AND 10));

(b) SELECT Region#, Prod_type, Month, Sales
FROM SALES_STATISTICS
WHERE (Region# = 3 AND (Prod_type BETWEEN 1 AND 3)) OR (Region# = 3 AND (Prod_type BETWEEN 8 AND 10));

Query optimization in MySQL

• The EXPLAIN statement in MySQL provides information about how MySQL executes statements.
• When EXPLAIN is used with a statement, MySQL displays information from the optimizer about the statement execution plan.
• MySQL Workbench provides a graphical representation for EXPLAIN (called Visual Explain).
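
Queries (a) and (b) from guideline 3 can be compared by running both and inspecting their plans. The sketch below does not use MySQL's EXPLAIN; it assumes SQLite's EXPLAIN QUERY PLAN (through Python's sqlite3 module) as a lightweight stand-in. The column name region_no replaces Region#, which is not a plain SQL identifier, and the data and index name are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# region_no is a hypothetical stand-in for the slide's Region# attribute.
conn.execute(
    "CREATE TABLE sales_statistics "
    "(region_no INTEGER, prod_type INTEGER, month INTEGER, sales REAL)"
)
conn.execute(
    "CREATE INDEX idx_region_prod ON sales_statistics (region_no, prod_type)"
)
conn.executemany(
    "INSERT INTO sales_statistics VALUES (?, ?, ?, ?)",
    [(r, p, m, 100.0 * r + p)
     for r in range(1, 5) for p in range(1, 11) for m in (1, 2)],
)

query_a = """SELECT region_no, prod_type, month, sales FROM sales_statistics
WHERE region_no = 3
  AND ((prod_type BETWEEN 1 AND 3) OR (prod_type BETWEEN 8 AND 10))"""

query_b = """SELECT region_no, prod_type, month, sales FROM sales_statistics
WHERE (region_no = 3 AND (prod_type BETWEEN 1 AND 3))
   OR (region_no = 3 AND (prod_type BETWEEN 8 AND 10))"""

# Both formulations must return the same rows.
rows_a = sorted(conn.execute(query_a).fetchall())
rows_b = sorted(conn.execute(query_b).fetchall())

# Print each plan; the wording depends on the SQLite version.
for label, sql in (("a", query_a), ("b", query_b)):
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall():
        print(label, row[3])
```

Which formulation an optimizer handles better is product-specific, so the plans are printed for inspection rather than assumed.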

Explain Output

• id: the sequential identifier of the SELECT within the statement
• select_type: the type of SELECT
• table: the name of the table or alias
• type: the type of join (access method) for the table
• possible_keys: which indexes MySQL could use
• key: which index MySQL actually chose
• key_len: the length of the key used
• ref: any columns or constants compared with the key to retrieve results
• rows: estimated number of rows to be examined
• Extra: any additional information

Explain tells you

• In which order the tables are read
• What types of read operations are made
• Which indexes could have been used
• Which indexes are used
• How the tables refer to each other
• How many rows the optimizer estimates to retrieve from each table
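
Reading EXPLAIN output usually comes down to a few checks on the columns listed above. The following is a minimal sketch and not real MySQL output: the two rows are hand-written to have the shape of EXPLAIN rows (MySQL spells the columns key and Extra), the table aliases are invented, and the 10,000-row threshold is an arbitrary assumption.

```python
# Hand-written, hypothetical rows shaped like MySQL EXPLAIN output.
sample_plan = [
    {"id": 1, "select_type": "SIMPLE", "table": "e", "type": "ALL",
     "possible_keys": None, "key": None, "key_len": None, "ref": None,
     "rows": 120000, "Extra": "Using where"},
    {"id": 1, "select_type": "SIMPLE", "table": "d", "type": "eq_ref",
     "possible_keys": "PRIMARY", "key": "PRIMARY", "key_len": "4",
     "ref": "company.e.dno", "rows": 1, "Extra": None},
]

def review_plan(rows, big_table=10_000):
    """Flag plan rows that commonly indicate a tuning opportunity."""
    findings = []
    for row in rows:
        # type = ALL means the whole table is read.
        if row["type"] == "ALL" and row["rows"] >= big_table:
            findings.append(
                f"{row['table']}: full table scan over ~{row['rows']} rows"
            )
        # Candidate indexes were listed but the optimizer chose none.
        if row["possible_keys"] and not row["key"]:
            findings.append(
                f"{row['table']}: candidate indexes exist but none was chosen"
            )
    return findings

findings = review_plan(sample_plan)
print(findings)
```

Here only table e is flagged, because it is scanned in full with no index available, while table d is reached through its primary key.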

Visual Explain / Execution Plan in MySQL Workbench
