Top SQL Tuning Tips for Db2 Developers
Tony Andrews
Themis Inc.
Session code: E04
Monday 4:30 Cross Platform
Most relational tuning experts agree that the majority of performance problems with
applications that access a relational database are caused by poorly coded programs or
improperly coded SQL. Industry experts agree that poorly performing SQL is responsible for
many response-time issues. SQL developers should be informed of the many performance
issues associated with the SQL language and the way they design their programs. It would be
especially helpful if more developers were educated in how to read and analyze Db2 Explain
output. This presentation was a hit at many RUGs years ago, and I thought it was time to
reinvent it due to the many SQL and optimization changes in the past years. This is good for
both z/OS and LUW platform developers. Attendees will leave with many tips, standards, and
guidelines for good SQL programming 'Best Practices'.
IDUG Db2 Tech Conference
Charlotte, NC | June 2 – 6, 2019
Agenda - Objectives
So many times programmers today lose sight of the second goal. They either:
But we still need some coding rules for the best efficiencies!!
Developers should review their own SQL code and make sure it is:
- Not doing more work than needed
- Not bringing back more data (columns and/or rows) than needed
- Not doing unneeded sorts
Multi Row Fetch, Update, and Inserting. Recursive SQL. Select from Insert.
‘Merge’ processing. Fetch First / Order By within subqueries.
2) Minimize the number of fetches by using multi row fetch. Take advantage of multi row
deletes/inserts also.
3) Distributed Apps: Once in compatibility mode in V8, the blocks used for block fetching are
built using the multi-row capability without any code change. This results in automatic savings
for distributed SQLJ applications, for example.
Access Path
Through the Data Manipulation Language (DML) the user of a Db2 database supplies the “WHAT”; that
is, the data that is needed from the database to satisfy the business requirements. Db2 then uses the
information in the Db2 Catalog to resolve “WHERE” the data resides. The Db2 Optimizer is then
responsible for determining the all important “HOW” to access the data most efficiently.
Ideally, the user of a relational database is not concerned with how the system accesses data. This is
probably true for an end user of Db2, who writes SQL queries quickly for one-time or occasional use. It
is less true for developers who write application programs and transactions, some of which will be
executed thousands of times a day. For these cases, some attention to Db2 access methods can
significantly improve performance.
Db2’s access paths can be influenced in four ways:
♦ By rewriting a query in a more efficient form.
♦ By creating, altering, or dropping indexes.
♦ By updating the catalog statistics that Db2 uses to estimate access costs.
♦ By utilizing Optimizer Hints.
Watch out for different RID pool sizing between production and test environments. The row ID (RID) pool is
used for the RID sorts that accompany optimizer access path techniques such as list prefetch, hybrid
join, and multi-index access. These pool sizes may vary between production and test
environments, typically with more RID pool sort area in production. This can at times cause the access
path to differ between environments.
Other items affecting optimization:
♦ Buffer pools.
♦ RID pools.
Stage 1 = Sargable
Stage 2 = Non-sargable. Stage 2 predicates are processed by the RDS (Relational Data Services) area of Db2, which is much more
expensive than Stage 1 processing in the Data Manager: additional processing and additional code path.
Indexable predicates are evaluated first, Stage 1 predicates next, and Stage 2 predicates last.
Use Visual Explain in IBM Data Studio, or query the
DSN_PREDICAT_TABLE directly, to see any Stage 2 predicates. Note the filter factor
information also.
Stage 2 Predicates
1) Click on the FETCH box to see any/all predicates not associated with the index chosen
2) Click on the IXSCAN boxes to see matching index and screening index predicate information
Index on expressions.
1) If the expression contains a column reference and there exists an index on the expression,
then the following become Stage 1 predicates:
[Link] = T2 col expr, [Link] <> T2 col expr, expression = value, expression <> value,
expression op value, expression op (subquery)
Note: When an index uses UPPER or LOWER, the locale must be specified.
APAR (PK68295) will remove the locale requirement.
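As a minimal sketch of an index on an expression, assuming the EMP table and LASTNAME column used elsewhere in these notes (the index name XEMP_UPNAME and the locale 'En_US' are illustrative choices, not taken from the slides):

```sql
-- Hypothetical index on an expression; the locale is given explicitly,
-- as the note above requires for UPPER and LOWER.
CREATE INDEX XEMP_UPNAME
    ON EMP (UPPER(LASTNAME, 'En_US'));

-- A predicate coded with the same expression can now be matched to the index.
SELECT EMPNO, LASTNAME
  FROM EMP
 WHERE UPPER(LASTNAME, 'En_US') = 'ANDREWS';
```

The predicate expression has to match the indexed expression for Db2 to consider the index, so it is worth confirming the result with Explain.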
This table contains an index (PRSTDATE, PRENDATE) and because of the YEAR function, Db2 did
not choose to use the index, and the predicate is shown as Stage 2.
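The slide's query is not reproduced in these notes, so the following is only a sketch of the kind of rewrite involved. The table name PROJ and the year value are assumptions; PRSTDATE comes from the index mentioned above.

```sql
-- Function on the column: shown as Stage 2, index on PRSTDATE not used.
SELECT PROJNO, PRSTDATE, PRENDATE
  FROM PROJ
 WHERE YEAR(PRSTDATE) = 2019;

-- Same rows, but the column stands alone, so the predicate is indexable.
SELECT PROJNO, PRSTDATE, PRENDATE
  FROM PROJ
 WHERE PRSTDATE BETWEEN '2019-01-01' AND '2019-12-31';
```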
Db2 11 rewrites some of the more common stage 2 local predicates, including the following
predicates, to an indexable form:
Db2 9 for z/OS delivered the ability to create an index on an expression, which required the
developer or DBA to identify the candidate queries and create the targeted indexes. The Db2 11
predicate rewrites allow optimal performance without that intervention.
Note: Db2 will only rewrite if there is no index on expression that matches.
Example 1:
WHERE SUBSTR(LASTNAME,1,3) = :hv is a Stage 2, non-indexable predicate.
In V11, this becomes:
WHERE LASTNAME = (exp), which is Stage 1 indexable (exp is a Db2-computed value for the boundaries of the
column).
Example: SUBSTR(LASTNAME,1,3) = 'AND' becomes LASTNAME BETWEEN 'AND……' AND
'ANDzzzzzzz'
Example 2:
WHERE SUBSTR(LASTNAME,1,3) <= :hv is a Stage 2, non-indexable predicate.
In V11, this becomes:
WHERE LASTNAME <= (exp), which is Stage 1 indexable (exp is a Db2-computed value for the
boundaries of the column).
Example: SUBSTR(LASTNAME,1,3) <= 'AND' becomes LASTNAME <= 'C1D5C4FFFFFFFFFFFFFFFFFF'
For example:
SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE SALARY * 1.1 > 50000.00

SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE HIREDATE - 7 DAYS > ?
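One possible way to restate the two predicates above so that the column stands alone on one side of the comparison is sketched below. The explicit CAST of the parameter marker is an assumption added so the date arithmetic has a typed operand. Decimal division can round differently from the original multiplication, so boundary values should be tested.

```sql
-- Arithmetic moved from the column to the literal side.
SELECT EMPNO, LASTNAME
  FROM EMPLOYEE
 WHERE SALARY > 50000.00 / 1.1;

-- Date arithmetic moved to the parameter marker side.
SELECT EMPNO, LASTNAME
  FROM EMPLOYEE
 WHERE HIREDATE > CAST(? AS DATE) + 7 DAYS;
```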
‘Distinct’ is optimized just like ‘Group By’ and looks to take advantage of any index
(unique or non-unique) to eliminate duplicates without a sort (sort
avoidance). The questions to ask: Is the Distinct or Group By needed?
Where are the duplicates coming from?
1) There are sort enhancements for both ‘Distinct’ and ‘Group By’ with no column function.
This was already available prior to V9 for ‘Group By’ with a column function. Db2 now handles
the duplicates more efficiently in the input phase, eliminating a second pass of the data to a
sort merge.
2) But developers should first ask: does the query need the Distinct or Group By?
Often times, if one of the tables has no columns being selected from it, it can then be moved to
a subquery.
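A sketch of that rewrite, assuming a hypothetical EMPPROJACT table that can hold several rows per employee and an EMP table where EMPNO is unique:

```sql
-- Join form: EMPPROJACT contributes no selected columns, and DISTINCT
-- is needed only to remove the duplicates the join creates.
SELECT DISTINCT E.EMPNO, E.LASTNAME
  FROM EMP E
       INNER JOIN EMPPROJACT EPA
          ON EPA.EMPNO = E.EMPNO;

-- Subquery form: no duplicates are created, so no DISTINCT is needed.
SELECT E.EMPNO, E.LASTNAME
  FROM EMP E
 WHERE EXISTS
       (SELECT 1
          FROM EMPPROJACT EPA
         WHERE EPA.EMPNO = E.EMPNO);
```

The two forms return the same rows only under the uniqueness assumption stated above.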
Visual Explain in IBM Data Studio shows a folder of any Stage 2 predicates involved in the
query.
The Stage 1 and Stage 2 predicate information is loaded into the DSN_PREDICAT_TABLE when
executing a Bind with Explain.
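For readers without Visual Explain, a direct query might look like the sketch below. The column names (PREDNO, FILTER_FACTOR, SEARCHARG, TEXT) are recalled from the IBM-supplied table definition and should be checked against the DDL for your Db2 level. QUERYNO 100 and the explained statement are arbitrary examples.

```sql
-- Populate the explain tables for one statement (dynamic EXPLAIN).
EXPLAIN ALL SET QUERYNO = 100 FOR
  SELECT EMPNO, LASTNAME
    FROM EMP
   WHERE SALARY * 1.1 > 50000.00;

-- List the predicates with their filter factors.
-- SEARCHARG = 'N' is assumed to mark predicates that cannot be
-- evaluated in Stage 1 and therefore go to Stage 2.
SELECT QBLOCKNO, PREDNO, FILTER_FACTOR, SEARCHARG, TEXT
  FROM DSN_PREDICAT_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PREDNO;
```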
Stage 1 indexable predicates. This is a subset of the many predicates from the manual.
Stage 1 non-indexable predicates. These predicates might be evaluated during Stage 1
processing, during index screening, or after data page access.
Stage 2 predicates
These predicates must be processed during Stage 2, after the data is returned. This is a subset of
the many Stage 2 predicates.
Deviate only when performance is an issue and all other efforts have not provided
significant enough improvement in performance.
Oftentimes developers may execute a ‘Fast Unload’ to a mainframe file, and have a program
read every record from the file and skip unwanted records. At times this can be very efficient,
especially if the program is going to process a large percentage of the rows in the table.
The recommendation is to start with 100-row fetches, inserts, or updates, and then
test other numbers. It has been shown many times that this process reduces
runtime by an average of 35%. Consult the IBM Db2 manuals for further detail and
coding examples.
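A minimal sketch of rowset processing in embedded SQL (Db2 for z/OS style) follows. The cursor name, the host-variable arrays (:HVA_EMPNO, :HVA_LASTNAME, each assumed to be declared in the host program with at least 100 elements), :ROW_CNT, and the EMP_HIST table are all hypothetical.

```sql
-- Cursor declared for rowset (multi-row) fetching.
DECLARE EMPCSR CURSOR WITH ROWSET POSITIONING FOR
  SELECT EMPNO, LASTNAME
    FROM EMP
   WHERE DEPTNO = 'C11';

OPEN EMPCSR;

-- One call returns up to 100 rows into the host-variable arrays.
-- SQLERRD(3) in the SQLCA holds the number of rows actually returned.
FETCH NEXT ROWSET FROM EMPCSR
  FOR 100 ROWS
  INTO :HVA_EMPNO, :HVA_LASTNAME;

CLOSE EMPCSR;

-- Multi-row insert from the same arrays; :ROW_CNT is the row count to insert.
INSERT INTO EMP_HIST (EMPNO, LASTNAME)
  VALUES (:HVA_EMPNO, :HVA_LASTNAME)
  FOR :ROW_CNT ROWS
  ATOMIC;
```

In a real program the FETCH sits in a loop that ends on SQLCODE +100, and the last rowset is usually only partially filled.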
Many times the output needed from SQL development requires a combination of
detail and aggregate data together. There are typically a number of ways to code
this with SQL, but with the scalar fullselect now part of Db2, there is another
option that is very efficient as long as indexes are being used.
This is not necessarily the best option, but at times it may be, and it gives developers another
way to code for certain results.
Prior Options:
-----------------
1) SELECT [Link], [Link], [Link], [Link], X.DEPT_AVG_SAL
FROM EMPLOYEE E1 INNER JOIN
(SELECT [Link], DEC(ROUND(AVG([Link]),2),9,2)
AS DEPT_AVG_SAL
FROM EMP E2
GROUP BY [Link]) AS X ON [Link] = [Link]
ORDER BY [Link], [Link]
2) WITH X AS
(SELECT [Link], DEC(ROUND(AVG([Link]),2),9,2)
AS DEPT_AVG_SAL
FROM EMPLOYEE E2
GROUP BY [Link])
SELECT [Link], [Link], [Link], [Link], X.DEPT_AVG_SAL
FROM EMP E1 INNER JOIN
X ON [Link] = [Link]
ORDER BY [Link], [Link]
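The scalar fullselect version is not shown in these notes. A sketch of what it could look like is below, reusing the same DEC(ROUND(AVG(...))) expression; the qualified column names (EMPNO, LASTNAME, SALARY, DEPTNO) are assumptions, since the originals were lost in extraction.

```sql
-- Detail columns plus a per-department average computed by a
-- correlated scalar fullselect in the SELECT list.
SELECT E1.EMPNO, E1.LASTNAME, E1.SALARY, E1.DEPTNO,
       (SELECT DEC(ROUND(AVG(E2.SALARY), 2), 9, 2)
          FROM EMP E2
         WHERE E2.DEPTNO = E1.DEPTNO) AS DEPT_AVG_SAL
  FROM EMP E1
 ORDER BY E1.DEPTNO, E1.LASTNAME;
```

An index leading with DEPTNO and including SALARY would let the fullselect run index-only, which is the condition the text above attaches to this option being efficient.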
What do you do? If you as a developer see that a tablespace scan is occurring in your
SQL execution, then go through the following checklist to help figure out why.
Typically we do not want to see tablespace scans, but there are times when a tablespace scan
will be more efficient than index processing:
2) When processing using a non-clustered index, and the number of pages hit in the table is
high, whether many rows were returned or not.
Tablespace scans:
1) Will kick in ‘Sequential Prefetch’ and take advantage of asynchronous processing. V9 now
has a larger prefetch quantity (from 32 to 64 pages for SQL processing, 128 pages for utilities).
2) Without non-clustered index processing, there will be no list prefetch sort.
10) Only code the columns needed in the Select portion of the SQL statement.
For example: The optimizer may choose a Merge Scan join if:
- The qualifying rows of both the new and composite tables are many
- The join predicate does not provide a significant amount of filtering
- Few columns are selected from the new table, which makes Db2's sort
of the new table more efficient
Sorts can be expensive. At times an SQL query may execute multiple sorts in order to get
the result set back as needed. Take a look at the Db2 explain tool to see if any sorting
is taking place, then take a look at the SQL statement and determine if anything can be
done to eliminate sorts. Data sorts are caused by:
- ‘Order By’
- ‘Group By’
- ‘Distinct’
- ‘Union’ versus ‘Union All’
- Join processing. Pay attention to the clustering order of data in tables.
- IN-list subqueries
Many times developers have sorts carried over from copied code.
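As one illustration of removing an unneeded sort, the sketch below replaces UNION with UNION ALL. It assumes the two branches cannot return the same row (different DEPTNO values) and that EMPNO is unique, so there are no duplicates for UNION to remove.

```sql
-- UNION sorts the combined result to eliminate duplicates.
SELECT EMPNO, LASTNAME FROM EMP WHERE DEPTNO = 'C11'
UNION
SELECT EMPNO, LASTNAME FROM EMP WHERE DEPTNO = 'D11';

-- UNION ALL returns the same rows here without the duplicate-removal sort.
SELECT EMPNO, LASTNAME FROM EMP WHERE DEPTNO = 'C11'
UNION ALL
SELECT EMPNO, LASTNAME FROM EMP WHERE DEPTNO = 'D11';
```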
Each of these will produce the same results, but operate very differently. Typically one will
perform better than the other depending on data distributions. For example:
Global Query Optimization. The optimizer now tries to determine how the access path of one query block may affect the
others. This can be seen at times by Db2 rewriting an ‘Exists’ subquery into a join, or an ‘In’ subquery into an
‘Exists’ subquery. This is called ‘correlating’ and ‘de-correlating’.
V9 – The optimizer takes into consideration correlated, non correlated, and join
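The slide's own queries are not included in these notes. A hypothetical trio of equivalent forms is sketched below; the ADMRDEPT column and its value are assumptions, and the join form matches the other two only if DEPTNO is unique in DEPT.

```sql
-- Non-correlated IN subquery.
SELECT E.EMPNO, E.LASTNAME
  FROM EMP E
 WHERE E.DEPTNO IN
       (SELECT D.DEPTNO FROM DEPT D WHERE D.ADMRDEPT = 'A00');

-- Correlated EXISTS subquery.
SELECT E.EMPNO, E.LASTNAME
  FROM EMP E
 WHERE EXISTS
       (SELECT 1 FROM DEPT D
         WHERE D.DEPTNO = E.DEPTNO
           AND D.ADMRDEPT = 'A00');

-- Join.
SELECT E.EMPNO, E.LASTNAME
  FROM EMP E
       INNER JOIN DEPT D
          ON D.DEPTNO = E.DEPTNO
 WHERE D.ADMRDEPT = 'A00';
```

Running Explain on each form shows whether the optimizer has already rewritten one into another.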
This is done by executing the Runstats utility on each specific table and associated
indexes. This utility loads up the system catalog tables with data distribution information
that the optimizer looks for when selecting access paths. Some of the information that
the Runstats utility can provide is:
- The size of the tables (# of rows)
- The cardinalities of columns
- The percentage of rows (frequency) for columns with an uneven distribution of values
- The physical characteristics of the data and index files
- Information by partition
Pay attention to the ‘Statstime’ column in the catalog tables, as it states the last
time Runstats was executed on each table.
Volatile tables: Db2 considers using index access no matter the statistics.
Created GTTs can have manual statistics added.
1) FREQVAL statistics are important for any columns containing an uneven distribution of data
values. For example, some tables may contain a status code column containing multiple values.
If any of the values accounts for a high or low percentage of rows in the table, then
FREQVAL statistics should be run on that column.
2) Statistics are typically up to date in production, but many times are behind (or even reset) in
test environments.
3) Volatile tables are always an issue. Statistics only reflect a point in time.
By declaring the table volatile, the optimizer will consider using an index scan rather than a table
scan. The access plans that use declared volatile tables will not depend on the existing statistics
for that table.
4) By creating a Global Temporary Table, manual statistics can then be added to the catalog
tables for the average number of rows, average cardinalities, etc.
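Two of the points above can be sketched in SQL. The catalog query assumes Db2 for z/OS (SYSIBM.SYSTABLES; on LUW the equivalent view is SYSCAT.TABLES with a STATS_TIME column), and the schema THEMIS81 is borrowed from the RUNSTATS example in these notes. The table name EMP is illustrative.

```sql
-- When were statistics last collected, and what row count does the
-- optimizer currently believe?
SELECT CREATOR, NAME, CARDF, STATSTIME
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'THEMIS81'
   AND TYPE = 'T'
 ORDER BY STATSTIME;

-- Mark a table whose contents swing widely as volatile, so that
-- index access is favored regardless of the statistics.
ALTER TABLE THEMIS81.EMP VOLATILE;
```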
All tables in all environments should have the following statistics run:
NOTE: Frequency and quantile stats are only as good as Db2 knowing the values
in associated predicates at optimization time. Hard coded? Dynamic? REOPT?
1) Correlated columns:
*** Since 4 < 12, the columns CITY and STATE are said to be correlated.
2) Quantile Statistics
RUNSTATS INDEX("THEMIS81"."XEMP02"
HISTOGRAM NUMCOLS 1 NUMQUANTILES 20)
SHRLEVEL CHANGE REPORT YES
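The slide behind the "4 < 12" remark is not reproduced here. One plausible reading, offered only as an assumption, is that the number of distinct (CITY, STATE) pairs (4) is smaller than the product of the individual column cardinalities (12). A query along these lines could be used to check for that kind of correlation; the table name CUSTOMER is hypothetical.

```sql
-- If PAIR_CARD is well below CITY_CARD * STATE_CARD, the two columns
-- are correlated and multi-column statistics are worth collecting.
SELECT COUNT(DISTINCT CITY)  AS CITY_CARD,
       COUNT(DISTINCT STATE) AS STATE_CARD,
       (SELECT COUNT(*)
          FROM (SELECT DISTINCT CITY, STATE
                  FROM CUSTOMER) AS P) AS PAIR_CARD
  FROM CUSTOMER;
```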
For example: There exists a table with 1 million rows. In this table exists an index on the column Status_Cd. After a
typical Runstats utility is executed against this table, the optimizer will know that there are 3 different values for
the status code. After a special Runstats that specifies frequency value statistics for that column, Db2 will know
the following data distributions:
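The distribution table from the slide is missing from these notes. As a sketch, the same kind of information that frequency statistics record can be seen with a simple aggregate; the table name ORDERS is hypothetical, and Status_Cd is the column from the example above.

```sql
-- Row count and percentage of the table for each status code value.
SELECT STATUS_CD,
       COUNT(*) AS ROW_COUNT,
       DEC(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM ORDERS), 5, 2)
         AS PCT_OF_ROWS
  FROM ORDERS
 GROUP BY STATUS_CD
 ORDER BY ROW_COUNT DESC;
```

A heavily skewed result is the signal that frequency statistics on the column would give the optimizer better filter factor estimates than the default assumption of an even spread.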
There are a couple of things to pay attention to when an SQL statement is processing using a
Correlated subquery. Correlated subqueries can get executed many times in order to fulfill the SQL
request. With this in mind, the subquery must be processed using an index to alleviate multiple
tablespace scans. If the correlated subquery is getting executed hundreds of thousands or millions
of times, then it is best to make sure the subquery gets executed using an index with Indexonly =
‘Yes’. This may require the altering of an already existing index.
For example:
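The slide's example is not included in these notes, so the following is a hypothetical one. The column names and the index name XEMP_DEPT_SAL are assumptions.

```sql
-- The subquery is correlated on DEPTNO and may be executed once per
-- outer row (or once per distinct DEPTNO value).
SELECT E1.EMPNO, E1.LASTNAME, E1.SALARY
  FROM EMP E1
 WHERE E1.SALARY >
       (SELECT AVG(E2.SALARY)
          FROM EMP E2
         WHERE E2.DEPTNO = E1.DEPTNO);

-- An index containing both the correlation column and the column
-- being aggregated lets the subquery run with index-only access.
CREATE INDEX XEMP_DEPT_SAL
    ON EMP (DEPTNO, SALARY);
```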
Oftentimes queries with expressions can be non-indexable and/or Stage 2. When coding predicates
containing expressions, it’s best to:
1) Execute the expression prior to the SQL and put the answer into a variable that
matches the column’s definition.
2) Apply the appropriate scalar function to match the column definition.
Note: The comparison of a column to a host variable of a different definition has been
improved a lot in Db2. In older versions, if the data types and lengths of the operands did not match,
the predicate was automatically a Stage 2 predicate. As of V8, if the data types and lengths do
not match but are within the same data type category (char, integers, decimals, etc.), the
predicates can be processed as Stage 1 and some may be indexable. This was to help other
languages like C and Java. C does not have a decimal data type, but needs to access data from
Db2 with decimal columns. Java does not have fixed-length character string data types, only
variable character.
Still, it’s a good habit for developers to have variables that match the column definitions,
mostly to avoid confusion of logic.
For example: SELECT * FROM EMP WHERE EDLEVEL = 12.25, where EDLEVEL is defined as a SMALLINT.
Will this return any rows?
SELECT *
FROM EMP
WHERE EMPNO = 10

SELECT *
FROM EMP
WHERE NORMALIZE_DECFLOAT(CAST(EMPNO AS
DECFLOAT(34))) = NORMALIZE_DECFLOAT(10)
SELECT *
FROM EMP
WHERE EMPNO = 10

SELECT *
FROM EMP
WHERE SALARY = '52750.00'

SELECT *
FROM EMP
WHERE EDLEVEL = 12.23

SELECT *
FROM EMP
WHERE EMPNO = '10'

Getting the correct data precision to match a column’s data type is now more a matter of logic errors than
of performance. Db2 now handles incorrect precision in hard-coded values and host
variables better, since almost all such predicates are Stage 1 and indexable.
But do most developers know logically what happens when the comparison is of two different
data types?
4) For EMPNO = '10', Db2 looks for any EMPNO that equals '10    ' (the characters 10 followed by 4 spaces).
Tables should be physically clustered in the order in which they are typically processed by
the queries processing the most data. This ensures the fewest ‘Getpages’ when
processing, and can take advantage of sequential and dynamic prefetching.
Long-running queries with ‘List Prefetch’ and ‘Sorts’ in many join processes are good
indicators that a table may not be in the correct physical order.
Queries that return the larger result sets using a non-clustered index might be an
indicator.
Joins to a table that are mostly by a foreign key and not the primary key might be an indicator.
Getting the correct physical clustering can save sorts and I/Os.
Db2 allows you to select what was just inserted using the same statement, saving multiple calls to Db2.
Again, we call this ‘Relational’ programming instead of ‘Procedural’ programming. The statement can
retrieve the following information.
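A minimal sketch of selecting from an insert is below. The ORDERS table, its identity column ORDER_ID, and its defaulted timestamp column ORDER_TS are all hypothetical.

```sql
-- One call inserts the row and returns the values Db2 generated for it.
SELECT ORDER_ID, ORDER_TS
  FROM FINAL TABLE
       (INSERT INTO ORDERS (CUST_ID, AMOUNT)
        VALUES (1001, 250.00));
```

Without this form the program would need a second call to read back the generated key.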
For example:
SELECT EMPNO, SALARY
FROM OLD TABLE
(UPDATE EMP
SET SALARY = SALARY * 1.1
WHERE DEPTNO = 'C11'
)

SELECT EMPNO, LASTNAME
FROM OLD TABLE
(DELETE FROM EMP
WHERE DEPTNO = 'C11'
)
Example 1:
SELECT [Link], [Link]
FROM EMP E LEFT JOIN
DEPT ON [Link] = [Link]
WHERE [Link] IS NULL

Example 2:
SELECT [Link], [Link]
FROM EMP E
WHERE NOT EXISTS
(SELECT 1 FROM DEPT
WHERE [Link] = [Link])

Example 3:
SELECT [Link], [Link]
FROM EMP E
WHERE [Link] NOT IN
(SELECT MGRNO FROM DEPT
WHERE MGRNO IS NOT NULL )

Example 4:
SELECT [Link]
FROM EMP E
EXCEPT
SELECT MGRNO
FROM DEPT
The ‘Not In’ logic is typically the worst performing. If it is used and the column being
selected in the subquery is nullable, then nulls need to be eliminated from the ‘Not In’
list; otherwise Db2 returns nothing.
This requires an extra call to Db2. Use the MERGE (Sometimes called ‘Upsert’ processing)
Example :
MERGE INTO EMPLOYEE E
USING (VALUES ('000999', 'TONY', 'ANDREWS', 'A00') )
AS NEWITEM (EMPNO, FIRSTNAME, LASTNAME,
DEPARTMENT)
ON [Link] = [Link]
WHEN MATCHED THEN
UPDATE SET FIRSTNAME = [Link],
LASTNAME = [Link]
WHEN NOT MATCHED THEN
INSERT (EMPNO, FIRSTNAME, LASTNAME, DEPARTMENT)
VALUES ([Link], [Link],
[Link], [Link])
The SQL MERGE statement was introduced to better handle this exact situation.
The MERGE statement tells Db2 what to do on a matched condition (execute an Update)
or a non-matched condition (execute an Insert), handling either condition within the same SQL
statement. This is sometimes called an ‘Upsert’ statement.
This can also be performed using rowsets and arrays. Testing results have often shown that
MERGE is considerably more efficient than:
- SELECT followed by INSERT or UPDATE and INSERT
- INSERT first (if duplicate SQLCODE -803), then UPDATE.
The size of the arrays has not shown as much difference as it does in fetching. Executing multiple
MERGEs with arrays of 100 or 1000 at a time versus fewer executions with 10,000 made
little difference if the number of executions was not dramatically different. Of course you
need to do your own independent testing.
When checking for multiple conditions, you must code a SELECT for the USING. VALUES does
not work.
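A sketch of a MERGE with more than one matched condition, driven by a SELECT in the USING clause, is below. It assumes a Db2 release that supports the extended MERGE syntax; the EMP_UPDATES staging table and its ACTION_CD column are hypothetical.

```sql
MERGE INTO EMP E
USING (SELECT EMPNO, LASTNAME, DEPTNO, ACTION_CD
         FROM EMP_UPDATES) AS U
   ON E.EMPNO = U.EMPNO
 -- First matched condition: staged row asks for a delete.
 WHEN MATCHED AND U.ACTION_CD = 'D' THEN
      DELETE
 -- Any other match is treated as an update.
 WHEN MATCHED THEN
      UPDATE SET LASTNAME = U.LASTNAME,
                 DEPTNO   = U.DEPTNO
 -- No match and not a delete request: insert the new row.
 WHEN NOT MATCHED AND U.ACTION_CD <> 'D' THEN
      INSERT (EMPNO, LASTNAME, DEPTNO)
      VALUES (U.EMPNO, U.LASTNAME, U.DEPTNO);
```

The WHEN clauses are evaluated in the order coded, so the more specific condition has to come first.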
- Fetch the ROWID (RID) for each row being processed in the
‘Read Only’ cursor, and execute all update or delete statements using the
ROWID/RID value in place of the key fields for better performance.
This allows the developer to have a cursor that is ‘Read Only’ and still have the option to
execute updates much like ‘where current of cursor’.
Update EMP
Set Salary = Salary * 1.1
Where Rid(EMP) = ……
When coding outer join logic, it does not matter whether the developer codes a
‘Left Outer Join’ or a ‘Right Outer Join’ in order to get the logic correct, as long
as they have the starting ‘Driver’ table coded correctly. There is no difference
between a Left and Right outer join other than where the starting ‘Driver’ is
coded. This is not really a tuning tip, but rather a tip to help all developers
understand that left outer joins are more readable.
Developers in Db2 should only code ‘Left Outer Joins’. It is more
straightforward because the starting ‘Driver’ table is always coded first, and all
subsequent tables being joined to have ‘Left Outer Join’ coded beside them,
making it more understandable and readable.
Db2 optimization always converts right outer joins to left outer joins. See explain output.
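For illustration, the two statements below return the same rows (every department, with its employees where any exist). The column names are assumptions based on the EMP and DEPT tables used elsewhere in these notes.

```sql
-- Left outer join: the driver table DEPT is coded first.
SELECT D.DEPTNO, E.EMPNO, E.LASTNAME
  FROM DEPT D
       LEFT OUTER JOIN EMP E
         ON E.DEPTNO = D.DEPTNO;

-- Right outer join: same logic, but the driver table is coded last.
SELECT D.DEPTNO, E.EMPNO, E.LASTNAME
  FROM EMP E
       RIGHT OUTER JOIN DEPT D
         ON E.DEPTNO = D.DEPTNO;
```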
should be rewritten as
If the optimizer knows exactly what you intend to retrieve it can make decisions based on
that fact, and often times optimization will be different based on this known fact than if it
was not coded, and the program just quit processing after the first 25.
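A short sketch of stating the intent in the SQL itself, using the 25-row figure from the text above; the query is hypothetical.

```sql
-- The optimizer is told up front that only 25 rows will be retrieved.
SELECT EMPNO, LASTNAME, SALARY
  FROM EMP
 ORDER BY SALARY DESC
 FETCH FIRST 25 ROWS ONLY;
```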
28. Take advantage of and promote ‘Index Only’ processing whenever possible.
There is no need for Db2 to leave the index file when everything it needs to process the result set
is contained within the index file.
Often times, columns are added to an index specifically to get ‘Index Only’ processing.
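As a sketch, assuming a hypothetical index XEMP_DEPT_NAME on EMP: because the query below references only columns held in the index, Explain should be able to report index-only access.

```sql
-- LASTNAME is added to the index purely to cover the query.
CREATE INDEX XEMP_DEPT_NAME
    ON EMP (DEPTNO, LASTNAME);

-- Every referenced column is in the index, so no data pages are needed.
SELECT LASTNAME
  FROM EMP
 WHERE DEPTNO = 'C11';
```

Each added index column increases index size and maintenance cost on insert and update, so this is a trade-off to weigh per query.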
1) When there are multiple subqueries of the same type, always code in the
order of most restrictive to least restrictive.
As of V11, transitive closure takes place for all predicates except ‘LIKE’, so for LIKE predicates developers should code
their own transitive closure to provide the optimizer more information.
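A sketch of hand-coded transitive closure for a LIKE predicate, with assumed column names on the EMP and DEPT tables:

```sql
-- The LIKE predicate is coded against both sides of the join, since
-- Db2 will not generate the second one on its own.
SELECT E.EMPNO, E.LASTNAME, D.DEPTNAME
  FROM EMP E
       INNER JOIN DEPT D
          ON D.DEPTNO = E.DEPTNO
 WHERE E.DEPTNO LIKE 'C%'
   AND D.DEPTNO LIKE 'C%';
```

The extra predicate is logically redundant, but it gives the optimizer local filtering on both tables when it chooses the join order and access paths.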
If you do this ….
1) Get developers educated in SQL programming
2) Get developers educated in Db2 explains
3) Get developers educated in data statistics
4) Have a set of SQL Standards and Guidelines and enforce them
5) Have code walkthroughs
6) Document predicate rewrite examples
7) Document query rewrite examples
Tony Andrews
Themis Inc.
tandrews@[Link]