SQL Server 2016/2017: Availability group secondary replica redo model and performance

Microsoft

Mar 24, 2019

First published on MSDN on Jun 01, 2018

When availability group was initially released with SQL Server 2012, the transaction log redo was handled by a single redo thread for each database in an AG secondary replica. This redo model is also called as serial redo . In SQL Server 2016, the redo model was enhanced with multiple parallel redo worker threads per database to share the redo workload. In addition, each database has a new helper worker thread for handling the dirty page disk flush IO. This new redo model is called parallel redo . With the new parallel redo model that is the default setting since SQL Server 2016, workloads with highly concurrent small transactions are expected to achieve better redo performance. When the transaction redo operation is CPU intensive, such as when data encryption and/or data compression are enabled, parallel redo has even higher redo throughput ( Redone Bytes/sec ) compared to serial redo. Moreover, indirect checkpoint allows parallel redo to offload more disk IO (and IO waits for slow disk) to its helper worker thread and frees main redo thread to enumerate more received log records in secondary replica. It further speeds up the redo performance. However parallel redo, which enables multi-threading model, has an associated cost.

Although the main redo thread stops executing individual transaction log redo operation, it is responsible for enumerating and dispatching each transaction log to its parallel redo worker threads. The cost for dispatching these logs can be substantially high in scenarios where the log redo operation is not CPU intensive, such as redo of a DML transaction on a narrow table (with small row sizes) row.
System transactions for page splits caused by new record inserts can introduce the PARALLEL_REDO_TRAN_TURN waits across parallel redo worker threads when a secondary replica is configured as readable replica. Depending on the frequency of insert operations, this can significantly slow down parallel redo performance.
When read-only queries are running on a readable secondary replica, query threads attempt to apply pending log redo operations and need to collaborate with redo worker threads with DIRTY_PAGE_TABLE_LOCK waits, which can be frequently generated and slow down both redo and query performance if there are concurrent redo workloads. The performance issue associated with DIRTY_PAGE_TABLE_LOCK wait is addressed in the cumulative update release for SQL Server 2016 SP and SQL Server 2017 mentioned in this article .

Based on a current performance study, below types of transaction workload or SQL configuration will generally perform better with the default parallel redo model

High concurrent small size transactions.
Expensive log redo operation (such as data encryption or data compression) with indirect checkpoint
Non-readable secondary replica, or only occasional read-only queries to a readable secondary replica when there is busy transaction log redo traffic

For the following scenarios, switching to serial redo is expected to achieve better redo throughput:

Long running transaction(s) with predominant inserts and limited concurrency – a typical example is online index rebuild of a clustered index of a large table
- Symptoms in parallel redo model:
  - The frequency of PARALLEL_REDO_TRAN_TURN wait generally is proportional to the volume of insert operations. More inserts can trigger more page splits, which maps to more PARALLEL_REDO_TRAN_TURN waits.
  - Workload concurrency can be monitored by perf counter (Object – SQLServer:General Statistics , Counter – User Connections ) or DMV "sys.dm_exec_connection" & "sys.dm_exec_sessions" in primary replica
Frequent and/or long-running read-only queries need to be executed on a database of a secondary replica which has concurrent transaction log redo workloads. Query and redo do not have to run against the same set of tables in the database.
- Symptoms in parallel redo model:
  - Frequent DIRTY_PAGE_TABLE_LOCK waits
  - Monitor perf counter (Object – SQLServer:Database Replica , Instance – [DBName], Counter- Redone Bytes/sec ) to compare the redo throughput when query is running and not
A small set of data records whose data pages are scanned by high concurrent read-only queries on a secondary replica while frequent changes are made to the same set of data records on primary replica. Query and redo threads may dramatically slow down each other in this situation.
- Symptoms in parallel redo model:
  - High volume of DPT_ENTRY_LOCK waits
  - For the worst case, a number of error message "Timeout occurred while waiting for latch" are logged in SQL Error Log followed by a "Latch timeout" dump

To switch to serial redo model, TF 3459 needs to be enabled. But after the SQL Server instance is running in serial redo model, the only way to change it back to parallel redo is to restart the SQL Server service. Multiple factors impact redo performance, some of them are only applicable to the new redo model - parallel redo, such as PARALLEL_REDO_TRAN_TURN and DIRTY_PAGE_TABLE_LOCK waits. For different transaction workloads and hosted machine configurations, it is not always certain which subset of factors has stronger impact to redo performance. If the redo performance of your workload does not align with the explanation in this document, please share your workload and host machine details to Microsoft Customer Service and Support (CSS) . It will help us build a more complete view and evaluate future improvements with redo performance.

New wait types

A few new thread wait types were added with the new Parallel Redo model in SQL Server 2016. This information can be queried from sys.dm_os_wait_stats. Some of these wait types indicate performance impact, while the others can be considered benign.

Wait type with performance impact:

Type	Description	Comment
PARALLEL_REDO_FLOW_CONTROL	Occurs when the main redo thread cannot dispatch more transaction log records when log cache array for dispatched transaction log is full.	Indicates that one or more parallel redo worker threads cannot keep up with main redo thread transaction log dispatching speed or are blocked by some resources such as other type of waits. When this wait occurs frequently, parallel redo worker threads does not function efficiently.
PARALLEL_REDO_TRAN_TURN	Occurs when a parallel redo worker thread needs to wait for a dependent transaction log be redone before starting to redo the current transaction log. The dependent transaction log may be assigned to a different parallel redo worker, which means a cross-thread wait.	Only happens in a readable secondary replica when new insert triggers page-split system transaction, or record update in a heap table generates a forwarded record ( More Information on forwarded records ).
DIRTY_PAGE_TABLE_LOCK	Occurs when there is a wait on a lock that control access to dirty page table. Dirty page table contains information of modified data pages that need to be flushed to disk. With parallel redo model, dispatched transaction logs that are pending for redo are associated with dirty page entries.	When there are read-only queries in a readable secondary replica. Queries try to process pending transaction logs during data page scan. It means query threads need to collaborate with main redo thread and parallel redo work threads for this dirty page table lock access and can slow down both query and redo performance when they are running concurrently. This wait will not be generated anymore after the performance fix for concurrent read-only query and log redo is released.
DPT_ENTRY_LOCK	Occurs when there is a wait on a lock that control access to a dirty page entry which has redo to catch up from user query thread.	Only occurs when parallel redo worker thread and a user query thread concurrently process redo operations for the same dirty page entry.

Wait types without performance impact:

Type	Description	Comment
PARALLEL_REDO_WORKER_WAIT_WORK	Occurs when any parallel redo worker threads or redo helper thread has nothing to do.	It is an idle wait and expect to occur regularly. It indicates either main redo worker thread cannot dispatch transaction logs as fast as the total redo speed of all parallel redo workers, or more commonly, the corresponding database in AG primary replica is idle and does not generate many transaction logs for secondary replica to redo.
PARALLEL_REDO_DRAIN_WORKER	Occurs when main redo thread needs to drain out all outstanding redo operations, such as when redoing file size change or checkpoint, etc.	Expect to occur regularly
PARALLEL_REDO_LOG_CACHE	Occurs when there is a wait on a lock that controls update to log cache array from parallel redo worker threads.	Infrequent waits. Occurs when parallel redo worker threads help signal redo steps after a flow control happens, most time main redo thread does it without lock wait.
PARALLEL_REDO_TRAN_LIST	Occurs when there is a wait on a lock that controls access to a special cache. The cache contains information about transactions that are fully redone and ready for release.	Infrequent waits. Occurs when main redo worker needs to access this transaction list when draining outstanding redo operations. Most time the helper worker access the transaction list without lock wait.
PARALLEL_REDO_WORKER_SYNC	Occurs when waiting for all parallel redo worker threads to stop.	When parallel redo threads are not successfully created or are stopped (such as enabling TF 3459)

Parallel redo thread usage and redo model control

Parallel redo thread usage is well covered in "Thread usage by Availability Groups" here .

A SQL Server instance uses up to 100 threads for parallel redo for secondary replicas. Each database uses up to one-half of the total number of CPU cores, but not more than 16 threads per database. If the total number of required threads for a single instance exceeds 100, SQL Server uses a single redo thread for every remaining database.

When the host server has 32 or more CPU cores, each database will occupy 16 parallel redo worker threads and one helper worker thread. It means that up to 6 availability group databases can be in parallel redo, and other availability group databases will be in single thread redo or serial redo irrespective which database has actual redo workload. Because that multiple databases are started up in parallel, there is no deterministic sequence to decide which 6 databases will be in parallel redo after a SQL Server service startup or an availability group failover. Again, in SQL Server instance level, the way to switch between parallel redo and serial redo is the TF 3459. All databases in the same SQL Server instance will be switched together. Also, to switch from serial redo to parallel redo by disable TF 3459, a SQL Server service restart is required (not required anymore since the cumulative update release for SQL Server 2016 SP and SQL Server 2017 mentioned in this article ).

Dong Cao
Principal Software Engineer
Microsoft Azure Data Group

Updated Oct 18, 2019

Version 4.0

SQLServerTiger

dongcao

Microsoft

Joined February 28, 2019

View Profile

SQL Server Blog

Follow this blog board to get notified when there's new activity

4 Comments

BPODFWPonder
Copper Contributor
Sep 19, 2023
I ran into the same issue and have SQL 2019 (40 core) on Win 2022. When I turned on TF 3459 on the secondaries it seemed that the parallel redo threads would not release on the small databases. It made me a little peeved and I thought it did not work due to TF 8002 on these servers to limit CPU or maybe CU version CU18. I have other servers in another site which are Win 2019 and SQL 2019 (32 core CU18 plus Security update) and it worked there but still some smaller databases seemed to hold onto the parallel threads. So I thought about it and the smaller databases were not always active and so did not send data across the AG so they were static for the most part. So I decided to try and create a dummy table in one of these databases and then drop the table, that would send traffic across the AG for that database. Once I did that when I had TF 3459 enabled that database dropped it's parallel threads and became single thread redo playback. So I was able to get all the small non-active databases to release their parallel threads by creating and dropping a dummy table in each database with the issue on the primary replica for that database I then disabled TF 3459 and when I checked the large active databases that I wanted to have parallel threads now had them.

So a partial solution to the issue in SQL 2019 (maybe 2016 and 2017 with updates mentioned in the article) you can turn on and off TF 3459 without a SQL Server service restart on SQL 2019 CU18. That means turn on TF 3459 and set to single thread for all databases. Then in on the primary for the affected databases that will not release their parallel threads create and delete a dummy table if there is no AG traffic otherwise it will not release the parallel threads from the database on secondary. If there is traffic on the AG for all databases then disable TF 3459 and you should see the most active databases pick up parallel redo threads.

Hope this helps people with SQL 2016, 2017 and 2019 with the 100 thread parallel redo issues. On SQL 2022 the issue is fixed with a different algorithm that takes load into account. On my servers it seemed that it databases with traffic picked up the parallel redo threads but not always the way I wanted if there were times when traffic was down the small databases again could pick up the parallel threads so there may be process needed to find the right time or generate data to make a database more active.

Thanks Brian
sajidur07
Copper Contributor
Jun 22, 2021
cn hi, We are facing same issue with 1TB ram and 56 core processors. I didn't even find Microsoft support systems email. Can you pls give me their support systems address?
cn
Copper Contributor
May 22, 2021
jmilner, I have exactly the same issue with 2019. It’s ridiculous. I am going to raise a support ticket.
jmilner
Copper Contributor
May 21, 2020
Regarding the assignment of parallel or single redo to databases, is this still the case in SQL 2019?

In our AG, we have 24 cores and 13 databases. So 8 of them get parallel redo (12 parallel redo threads per database * 8 databases = 96 of the available 100 threads) and the other 5 get single. In the instance we have a mix of very and active databases and small ones. Typically the small ones get parallel redo, persumably because they start up faster.

To work around this we drop the five smallest databases from the AG, restart the replica to get the 8 remaining databases parallel redo, then add those five databases back.

Having the ability to specify the number of parallel redo threads per database would address this for us.

Blog Post

SQL Server 2016/2017: Availability group secondary replica redo model and performance