If you’re running SQL Server with Always On Availability Groups, you’ve likely come across the PARALLEL_REDO_FLOW_CONTROL wait type. It shows up when your secondary replica’s redo thread is being throttled ā essentially pausing to keep things balanced and prevent overloading the system.
But when should you worry about it? And what can you do about it? Letās walk through it.
š What is PARALLEL_REDO_FLOW_CONTROL?
This wait type happens when SQL Serverās redo thread on a secondary replica is intentionally paused to avoid overwhelming I/O or CPU resources. While itās normal to see some of this, high or growing wait times could point to:
- Slow disk on the secondary
- CPU pressure
- A redo queue backlog
- Poor Always On architecture design (e.g., misaligned replica roles or slow replication networks)
š Query to Monitor It
Run this to check your current wait stats:
SELECT
wait_type,
waiting_tasks_count,
wait_time_ms,
max_wait_time_ms,
signal_wait_time_ms,
CAST(signal_wait_time_ms * 100.0 / NULLIF(wait_time_ms,0) AS DECIMAL(5,2)) AS signal_wait_percent
FROM sys.dm_os_wait_stats
WHERE wait_type = 'PARALLEL_REDO_FLOW_CONTROL'
ORDER BY wait_time_ms DESC;
What to look at:
wait_time_ms: Total accumulated waitsignal_wait_time_ms: Time spent waiting for CPUsignal_wait_percent: A high percentage (over 30%) suggests CPU pressure
ā How to Interpret the Results
| Metric | Good | Concerning |
|---|---|---|
wait_time_ms | < 10,000 | > 100,000 over time |
max_wait_time_ms | < 1000 | > 5000 |
signal_wait_percent | < 5% | > 30% = look into CPU |
š How to Check Redo Queue Size
This query checks the redo queue size in KB:
SELECT
ag.name AS ag_name,
ar.replica_server_name,
db.database_name,
drs.redo_queue_size,
drs.redo_rate,
drs.redo_queue_size / NULLIF(drs.redo_rate, 0) AS estimated_seconds_to_clear
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_databases_cluster db ON db.group_id = drs.group_id AND db.database_id = drs.database_id
JOIN sys.availability_groups ag ON ag.group_id = drs.group_id
JOIN sys.availability_replicas ar ON ar.replica_id = drs.replica_id
WHERE drs.is_primary_replica = 0;
āļø Recommended Trace Flags
If youāre seeing consistently high redo waits, these trace flags can help enable more aggressive parallel redo:
| Trace Flag | Description |
|---|---|
9567 | Enables parallel redo |
7451 | Improves redo parallelism by relaxing blocking rules |
3449 | Ensures redo starts immediately after crash recovery (SQL 2016+) |
To enable:
DBCC TRACEON (9567, -1);
DBCC TRACEON (7451, -1);
DBCC TRACEON (3449, -1);
Caution: Test in a non-production environment first, and confirm with your DBA or Microsoft support if youāre on a critical system.
š§° Extra Tips
- Ensure secondary replicas have fast storage (ideally SSDs or Azure Premium Disks)
- Monitor CPU usage ā especially if signal waits are creeping up
- Check that replicas are connected with low-latency networking
- Donāt overcommit the replicaās resources ā Always On is not “free”
š§¾ Final Thoughts
A little PARALLEL_REDO_FLOW_CONTROL is normal. But if it’s consistently high, itās a sign your secondary canāt keep up. The good news is: with the right monitoring, tuning, and maybe a few trace flags, you can bring things under control.
Discover more from SQLYARD
Subscribe to get the latest posts sent to your email.


