本文适用:AZURE SQL数据库,AZURE SQL托管实例
如果把数据库性能比作一名学生,而各项围绕性能的参数比作各门课程的考试成绩,那联想木桶原理,我们不禁会问,哪门课程是该学生最薄弱的环节(哪项参数是数据库性能的短板?)?显而易见,作为整个计算机体系中的磁盘存储的速度是非常容易拉夸的,即数据库的IO方面对于性能影响最为致命。
识别 IO 性能问题时,与 IO 问题最相关的等待类型包括:
-
PAGEIOLATCH_*
数据文件 IO 问题(包括
PAGEIOLATCH_SH
、PAGEIOLATCH_EX
、PAGEIOLATCH_UP
)。 如果等待类型名称中包含 IO,则此类型与某个 IO 问题相关。 如果页面闩锁等待名称中不包含 IO,则此类型与不同类型的问题(例如 tempdb 争用)相关。 -
WRITE_LOG
事务日志 IO 问题。
如果目前已经出现了 IO 问题
使用 sys.dm_exec_requests 或 sys.dm_os_waiting_tasks 查看 wait_type
和 wait_time
。
select * from sys.dm_exec_requests
select * from sys.dm_os_waiting_tasks
识别数据和日志 IO 用量
使用以下查询来识别数据和日志 IO 用量。 如果数据或日志 IO 用量超过 80%,则表示用户对 SQL 数据库服务层级使用了可用 IO。
SELECT end_time, avg_data_io_percent, avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
如果已达到 IO 限制,可使用两种选项:
- 选项 1:升级计算大小或服务层级
- 选项 2:识别并优化 IO 消耗量最大的查询。
使用查询存储查看缓冲区相关的 IO
对于选项 2,可以针对缓冲区相关 IO 的查询存储使用以下查询,以查看过去两个小时的跟踪活动:
-- top queries that waited on buffer
-- note these are finished queries
WITH Aggregated AS (SELECT q.query_hash, SUM(total_query_wait_time_ms) total_wait_time_ms, SUM(total_query_wait_time_ms / avg_query_wait_time_ms) AS total_executions, MIN(qt.query_sql_text) AS sampled_query_text, MIN(wait_category_desc) AS wait_category_desc
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON qt.query_text_id=q.query_text_id
JOIN sys.query_store_plan AS p ON q.query_id=p.query_id
JOIN sys.query_store_wait_stats AS waits ON waits.plan_id=p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi ON rsi.runtime_stats_interval_id=waits.runtime_stats_interval_id
WHERE wait_category_desc='Buffer IO' AND rsi.start_time>=DATEADD(HOUR, -2, GETUTCDATE())
GROUP BY q.query_hash), Ordered AS (SELECT query_hash, total_executions, total_wait_time_ms, sampled_query_text, wait_category_desc, ROW_NUMBER() OVER (ORDER BY total_wait_time_ms DESC, query_hash ASC) AS RN
FROM Aggregated)
SELECT OD.query_hash, OD.total_executions, OD.total_wait_time_ms, OD.sampled_query_text, OD.wait_category_desc, OD.RN
FROM Ordered AS OD
WHERE OD.RN<=15
ORDER BY total_wait_time_ms DESC;
GO
主要参数说明 :
1. query_hash :缓存中HASH ID(同前文)。
2. total_executions:总执行次数。
3. total_wait_time_ms:总执行时间。
4. sampled_query_text:SQL文本。
5. wait_category_desc :等待类型描述。
查看 WRITELOG 等待类型的日志 IO 总计
如果等待类型为 WRITELOG
,请使用以下查询按语句查看日志 IO 总计:
-- Top transaction log consumers
-- Adjust the time window by changing
-- rsi.start_time >= DATEADD(hour, -2, GETUTCDATE())
WITH AggregatedLogUsed
AS (SELECT q.query_hash,
SUM(count_executions * avg_cpu_time / 1000.0) AS total_cpu_millisec,
SUM(count_executions * avg_cpu_time / 1000.0) / SUM(count_executions) AS avg_cpu_millisec,
SUM(count_executions * avg_log_bytes_used) AS total_log_bytes_used,
MAX(rs.max_cpu_time / 1000.00) AS max_cpu_millisec,
MAX(max_logical_io_reads) max_logical_reads,
COUNT(DISTINCT p.plan_id) AS number_of_distinct_plans,
COUNT(DISTINCT p.query_id) AS number_of_distinct_query_ids,
SUM( CASE
WHEN rs.execution_type_desc = 'Aborted' THEN
count_executions
ELSE
0
END
) AS Aborted_Execution_Count,
SUM( CASE
WHEN rs.execution_type_desc = 'Regular' THEN
count_executions
ELSE
0
END
) AS Regular_Execution_Count,
SUM( CASE
WHEN rs.execution_type_desc = 'Exception' THEN
count_executions
ELSE
0
END
) AS Exception_Execution_Count,
SUM(count_executions) AS total_executions,
MIN(qt.query_sql_text) AS sampled_query_text
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi
ON rsi.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE rs.execution_type_desc IN ( 'Regular', 'Aborted', 'Exception' )
AND rsi.start_time >= DATEADD(HOUR, -2, GETUTCDATE())
GROUP BY q.query_hash),
OrderedLogUsed
AS (SELECT query_hash,
total_log_bytes_used,
number_of_distinct_plans,
number_of_distinct_query_ids,
total_executions,
Aborted_Execution_Count,
Regular_Execution_Count,
Exception_Execution_Count,
sampled_query_text,
ROW_NUMBER() OVER (ORDER BY total_log_bytes_used DESC, query_hash ASC) AS RN
FROM AggregatedLogUsed)
SELECT OD.total_log_bytes_used,
OD.number_of_distinct_plans,
OD.number_of_distinct_query_ids,
OD.total_executions,
OD.Aborted_Execution_Count,
OD.Regular_Execution_Count,
OD.Exception_Execution_Count,
OD.sampled_query_text,
OD.RN
FROM OrderedLogUsed AS OD
WHERE OD.RN <= 15
ORDER BY total_log_bytes_used DESC;
GO
主要字段说明:
1. total_log_bytes_used : 总日志使用大小。
2. number_of_distinct_plans:执行计划数量
3. number_of_distinct_query_ids:query_id数量
4. total_executions,、:总执行次数
5. Aborted_Execution_Count:已经终止执行次数。
6. Regular_Execution_Count:常规(正常)执行次数。
7. Exception_Execution_Count:异常执行次数。
8. sampled_query_text:简化SQL本体文本。