Automatic Plan Correction: How SQL Server Detects and Fixes Query Regressions

Automatic Plan Correction: How SQL Server Detects and Fixes Query Regressions – SQLYARD

Automatic Plan Correction: How SQL Server Detects and Fixes Query Regressions


Query plan regressions are one of the most frustrating performance problems a DBA deals with. Everything is running fine, then statistics update, a recompile fires, the optimizer picks a different plan, and suddenly a query that ran in 200 milliseconds takes 45 seconds. The root cause can take hours to track down through Query Store, and by the time you find it the business has already felt the impact.

Automatic Plan Correction (APC) is SQL Server’s answer to this problem. It monitors query plans through Query Store, detects when a new plan performs significantly worse than a previously known good plan, and automatically forces the good plan back without DBA intervention. It has been available since SQL Server 2017, but the regression detection logic underneath it has been significantly improved in SQL Server 2022 CU4 and SQL Server 2025. This article explains how it works, what changed, and how to configure and monitor it on your instances.

Prerequisite: Automatic Plan Correction requires Query Store to be enabled on the database. APC uses Query Store data to compare plan performance. If Query Store is not running, APC has no data to work with. Confirm with SELECT is_query_store_on FROM sys.databases WHERE name = DB_NAME().

1 What Automatic Plan Correction Does Beginner

When the query optimizer compiles a query it picks a plan based on statistics and cost estimates. If statistics change or the data distribution shifts, a recompile may produce a different plan that is actually worse than the previous one. This is a plan regression.

APC watches for this pattern by continuously comparing the CPU performance of the current execution plan against the last known good plan for the same query, using data captured by Query Store. When it determines a statistically significant regression has occurred, it forces the previous good plan and logs the action. If that forced plan later stops being the best option, APC can release the force and re-evaluate.

APC is part of the Automatic Tuning feature family and is controlled at the database level. Enabling it is a single command:

-- Enable Automatic Plan Correction on a database
ALTER DATABASE [YourDatabase]
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- Confirm the current state
SELECT
    name,
    desired_state_desc,
    actual_state_desc,
    reason_desc
FROM sys.database_automatic_tuning_options
WHERE name = 'FORCE_LAST_GOOD_PLAN';

APC is recommendation-only by default on some editions. On SQL Server Standard and Express editions, APC may log recommendations without automatically applying them. Verify that actual_state_desc shows ON after enabling, not OFF with a reason like QUERY_STORE_OFF or NOT_SUPPORTED.

2 How APC Detects a Regression: The Original Sigma Model Beginner

Before SQL Server 2022 CU4, APC used a sigma-based comparison to decide whether a regression had occurred. The method compares the average CPU time of the current plan against the last known good plan. Specifically, it checks whether the difference in mean CPU time exceeds three standard deviations, which is the basis of the commonly referenced “three-sigma rule.”

The logic is: if the current plan’s average CPU time is more than three standard deviations above the baseline plan’s mean, the probability that this difference is just random noise is very low, and APC concludes a regression has occurred and forces the good plan back.

This model requires a minimum of 15 executions of the current plan before it makes any determination. It collects enough execution data to compute a meaningful average and standard deviation, then applies the threshold test once.

3 The Limitations of Sigma-Based Detection Intermediate

The sigma-based model works well when execution times are consistent and normally distributed, but real production workloads often are not. Three specific problems emerge in practice.

High Variance Masks Regressions

If a query’s execution time is naturally variable, the standard deviation is large. A plan that doubles average CPU time may still fall within the three-sigma band simply because the baseline was noisy. APC sees no regression even though performance has clearly degraded.

Equal Variance Assumption

The sigma model treats the variance of the current plan and the baseline plan as comparable. When a good plan ran consistently at 50ms and the regressed plan swings between 10ms and 8 seconds, the comparison loses precision because the spread of the two populations is fundamentally different. The model was not designed for that scenario.

Slow Reaction Time

Waiting for 15 executions before making a determination means APC is slow to react. On queries that run infrequently or on systems where regressions are severe and immediate, 15 executions of a bad plan causes real user impact before APC even begins its evaluation.

4 The Welch’s t-Test Model: What Changed in SQL Server 2022 CU4 and 2025 Intermediate

Starting with SQL Server 2022 CU4 and enabled by default in SQL Server 2025, APC can use a different statistical model called the Welch’s t-test. It is a well-established hypothesis test specifically designed for comparing two populations that may have unequal variances and unequal sample sizes, which describes exactly the situation APC faces when comparing a regressed plan against a known good plan.

The practical framing for a DBA: instead of asking “is the mean CPU time more than three standard deviations higher?”, the Welch’s t-test asks “given the execution statistics from both plans, are these two populations genuinely different, or is the difference likely just noise?” It accounts for how variable each plan is independently rather than assuming they behave similarly.

Key Improvements

  • Starts evaluating after 3 executions, not 15. APC can make an initial determination much sooner. When the performance difference is large and statistically clear, it can detect and correct a regression after only 3 executions of the new plan instead of waiting for 15.
  • Adaptive sample sizing. If 3 executions produce an uncertain result, APC does not give up. It progressively increases the required execution count and re-evaluates as more data arrives, using an exponential backoff approach until confidence is reached.
  • Multiple re-evaluations. Rather than a single check after the minimum execution count is met, the Welch’s t-test model re-evaluates multiple times as new execution statistics accumulate. This reduces both false positives (declaring a regression that is not real) and false negatives (missing a genuine regression).
  • Handles unequal variance correctly. A plan that sometimes runs fast and sometimes runs very slow is no longer able to hide a genuine performance shift. The model explicitly accounts for different spread in the two plan populations.
Aspect Original Model (Sigma) New Model (Welch’s t-Test)
Statistical method 3-sigma threshold on mean CPU difference Welch’s t-test (unequal variance t-test)
Handles unequal variance No, assumes similar variance between plans Yes, explicitly models different variances
Minimum executions to evaluate 15 executions of current plan As few as 3, adapts progressively
Adaptive sample sizing No Yes, increases required executions on uncertain results
Multiple re-evaluations Single check only Multiple checks as data accumulates
Default on Azure SQL / Managed Instance No, sigma is the fallback Yes, enabled by default
Default on SQL Server 2025 No Yes, enabled by default
Available on SQL Server 2022 Default when TF 12618 is not set Requires trace flag 12618 starting at CU4

5 Enabling the Welch’s t-Test Model Beginner

The Welch’s t-test model is active by default in SQL Server 2025, Azure SQL Database, SQL Database in Microsoft Fabric, and Azure SQL Managed Instance. No action is required on those platforms.

On SQL Server 2022 CU4 and later, enable it with trace flag 12618. Make sure you are on CU4 or later before enabling this flag as it is not available on earlier cumulative updates.

-- Confirm SQL Server 2022 version and CU level before enabling
SELECT @@VERSION;
-- Must be SQL Server 2022 CU4 or later

-- Enable globally for the current session (testing):
DBCC TRACEON (12618, -1);

-- Verify trace flag is active:
DBCC TRACESTATUS (12618);

-- Enable as a startup parameter for persistence across restarts:
-- Add -T12618 to the SQL Server service startup parameters in
-- SQL Server Configuration Manager under the Advanced tab

Apply the latest Cumulative Update. Both trace flags covered in this article (12618 and 12656) require SQL Server 2022 CU4 as a minimum. Staying current on cumulative updates gives you access to these improvements and the full Intelligent Query Processing feature set. Check your current CU with SELECT @@VERSION and compare against the latest release on the Microsoft Update Catalog.

6 The Blind Spot: Long-Running and Timed-Out Queries Intermediate

Both regression detection models share a limitation worth understanding clearly. APC evaluates plans after a certain number of query executions complete and report statistics back to Query Store. This means APC sees the fastest-finishing executions first.

Here is the scenario that exposes the problem. Server CPU utilization goes from 10% to 100% unexpectedly. You investigate and find a plan change occurred for a specific query. Extended Events show the query executed 15 times quickly, completing in milliseconds each time. But there were also many other executions running for minutes, some timing out and never completing, others finishing late.

By the time those slow executions finish and their statistics reach Query Store, APC has already evaluated the query based on the fast-completing executions and concluded there was no regression. The fast executions looked fine statistically. The slow and timed-out executions told the real story but arrived too late to influence the decision.

This is the scenario where APC can miss a severe regression entirely. If a plan change causes most executions to time out but a small percentage complete quickly, the quickly-completing executions are what APC uses for its evaluation. The timed-out executions only appear in Query Store after they finish or after the timeout threshold, by which point the regression decision has already been made.

7 Trace Flag 12656 and the Per-Query Extended Check Intermediate

Trace flag 12656 addresses the long-running query blind spot by enabling a time-based delayed recheck. When this flag is active, APC schedules an additional evaluation 5 minutes after a plan change is first detected. This second check uses execution statistics that now include the slower executions and any timed-out queries that have since reported back to Query Store. It gives APC a much more complete picture of what the plan change actually did to the workload.

-- Enable delayed recheck globally (SQL Server 2022 CU4 and later)
DBCC TRACEON (12656, -1);

-- Both trace flags together: Welch's t-test model + delayed recheck
DBCC TRACEON (12618, -1);
DBCC TRACEON (12656, -1);

-- Verify both are active:
DBCC TRACESTATUS (12618);
DBCC TRACESTATUS (12656);

Per-Query Control with sp_configure_automatic_tuning

The global trace flag applies the delayed recheck to every query on the instance. If you want more surgical control, the sp_configure_automatic_tuning stored procedure lets you enable the same 5-minute delayed recheck on individual queries by query ID. This is also the only option on Azure SQL Database and Managed Instance where global trace flags are not available.

-- Enable the delayed recheck for a specific query (identified by query_id from Query Store)
EXEC sp_configure_automatic_tuning
    @option       = 'FORCE_LAST_GOOD_PLAN_EXTENDED_CHECK',
    @option_value = 'ON',
    @type         = 'QUERY',
    @type_value   = 1234;    -- replace with your actual query_id

-- Disable the delayed recheck for a specific query
EXEC sp_configure_automatic_tuning
    @option       = 'FORCE_LAST_GOOD_PLAN_EXTENDED_CHECK',
    @option_value = 'OFF',
    @type         = 'QUERY',
    @type_value   = 1234;

-- Find query IDs for your slow queries in Query Store:
SELECT TOP 20
    q.query_id,
    q.query_hash,
    SUM(rs.avg_cpu_time * rs.count_executions) AS total_cpu,
    SUM(rs.count_executions)                   AS execution_count,
    qt.query_sql_text
FROM sys.query_store_query           q
JOIN sys.query_store_query_text      qt ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan            p  ON q.query_id      = p.query_id
JOIN sys.query_store_runtime_stats   rs ON p.plan_id       = rs.plan_id
GROUP BY q.query_id, q.query_hash, qt.query_sql_text
ORDER BY total_cpu DESC;
Method Scope Where Available
Trace flag 12656 All queries on the instance SQL Server 2022 CU4 and later, SQL Server 2025
sp_configure_automatic_tuning FORCE_LAST_GOOD_PLAN_EXTENDED_CHECK Individual queries by query_id SQL Server 2022 CU4 and later, SQL Server 2025, Azure SQL Database, SQL Database in Fabric, Azure SQL Managed Instance

8 Monitoring APC with sys.dm_db_tuning_recommendations Intermediate

The sys.dm_db_tuning_recommendations DMV records every plan correction APC has made or considered making. It is populated whether APC is enabled or disabled, so you can review what APC would have done even if automatic forcing is turned off. The results are stored as JSON, so you need to parse them to get the useful columns.

Data in sys.dm_db_tuning_recommendations does not persist across restarts. The DMV is in-memory only. If you need a persistent record of APC decisions, capture the data on a schedule into a history table, or use Extended Events (Section 9) which write to files.

-- Query sys.dm_db_tuning_recommendations with JSON parsing
-- Returns one row per APC recommendation with plan performance details

SELECT
    JSON_VALUE([state], '$.currentValue')             AS recommendation_state,
    JSON_VALUE(details, '$.implementationDetails.script') AS fix_script,
    pfd.query_id,
    pfd.regressedPlanId,
    pfd.recommendedPlanId,
    pfd.regressedPlanErrorCount,
    pfd.recommendedPlanErrorCount,
    pfd.regressedPlanExecutionCount,
    pfd.regressedPlanCpuTimeAverage,
    pfd.recommendedPlanExecutionCount,
    pfd.recommendedPlanCpuTimeAverage,
    -- Calculate how much worse the regressed plan is
    ROUND(
        (pfd.regressedPlanCpuTimeAverage - pfd.recommendedPlanCpuTimeAverage)
        / NULLIF(pfd.recommendedPlanCpuTimeAverage, 0) * 100, 1
    )                                                 AS cpu_pct_worse
FROM sys.dm_db_tuning_recommendations
CROSS APPLY OPENJSON(Details, '$.planForceDetails')
    WITH (
        query_id                    INT,
        regressedPlanId             INT,
        recommendedPlanId           INT,
        regressedPlanErrorCount     INT,
        recommendedPlanErrorCount   INT,
        regressedPlanExecutionCount INT,
        regressedPlanCpuTimeAverage FLOAT,
        recommendedPlanExecutionCount INT,
        recommendedPlanCpuTimeAverage FLOAT
    ) AS pfd
ORDER BY regressedPlanCpuTimeAverage DESC;

The recommendation_state column tells you what APC did with each detection. Common values are Active (plan is currently forced), Verifying (APC is monitoring the forced plan to confirm it is better), Success (forced plan confirmed as better), and Expired (APC backed off because the forced plan stopped being beneficial).

9 Extended Events for Deeper Visibility Advanced

The DMV gives you a point-in-time snapshot. Extended Events give you a time-series record of every decision APC makes, including the statistical details behind each evaluation. This is the right tool for investigating why APC did or did not fire on a specific query, and for comparing the sigma model against the Welch’s t-test model on your workload.

Key APC Extended Events

EventWhen It Fires
automatic_tuning_plan_regression_detection_check_completed APC finishes evaluating a plan for regression. Shows whether a regression was detected and what action was taken.
automatic_tuning_plan_regression_verification_check_completed APC finishes validating whether a previously forced plan is still performing well.
automatic_tuning_check_abandoned APC abandons a check, for example because only one plan exists for the query.
automatic_tuning_recommendation_expired APC backs off from a forced plan that is no longer helping.
automatic_tuning_diagnostics Periodic health summary showing check counts, detection counts, and correction counts across all queries.
automatic_tuning_wtest_details Fires during Welch’s t-test evaluation with detailed statistical output. Useful for comparing old and new models. Not available in Azure SQL Database or SQL Database in Fabric.

The automatic_tuning_wtest_details event can be high volume. It fires during every Welch’s t-test evaluation with detailed statistical data. Useful for diagnostics and model comparison but consider the overhead before leaving it running in production. Use it for targeted investigation, then disable it when done.

-- Extended Event session: monitor APC regression detection and Welch's t-test details
-- Captures to file for persistent analysis across restarts

CREATE EVENT SESSION [APC_Monitor] ON SERVER
ADD EVENT qds.automatic_tuning_plan_regression_detection_check_completed (
    ACTION (sqlserver.database_id, sqlserver.database_name, sqlserver.sql_text)
),
ADD EVENT qds.automatic_tuning_plan_regression_verification_check_completed (
    ACTION (sqlserver.database_id, sqlserver.database_name)
),
ADD EVENT qds.automatic_tuning_check_abandoned (
    ACTION (sqlserver.database_id, sqlserver.database_name)
),
ADD EVENT qds.automatic_tuning_recommendation_expired (
    ACTION (sqlserver.database_id, sqlserver.database_name)
),
ADD EVENT qds.automatic_tuning_diagnostics (
    ACTION (sqlserver.database_id, sqlserver.database_name)
),
-- Include Welch's t-test details for SQL Server 2022/2025 (remove if on Azure SQL)
ADD EVENT qds.automatic_tuning_wtest_details (
    ACTION (sqlserver.database_id, sqlserver.database_name)
)
ADD TARGET package0.event_file (
    SET filename          = N'APC_Monitor.xel',
        max_file_size     = 50,
        max_rollover_files = 5
)
WITH (
    MAX_MEMORY               = 4096 KB,
    EVENT_RETENTION_MODE     = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY     = 5 SECONDS,
    STARTUP_STATE            = OFF     -- set to ON if you want it to start with the instance
);
GO

-- Start the session
ALTER EVENT SESSION [APC_Monitor] ON SERVER STATE = START;

-- Stop the session when done
-- ALTER EVENT SESSION [APC_Monitor] ON SERVER STATE = STOP;

-- Read the captured events:
SELECT
    event_data.value('(event/@name)[1]',               'varchar(100)') AS event_name,
    event_data.value('(event/@timestamp)[1]',           'datetime2')    AS event_time,
    event_data.value('(event/data[@name="database_name"]/value)[1]', 'varchar(200)') AS database_name,
    event_data.value('(event/data[@name="query_id"]/value)[1]',      'int')          AS query_id,
    event_data.value('(event/data[@name="regression_detected"]/value)[1]', 'varchar(10)') AS regression_detected
FROM (
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file('APC_Monitor*.xel', NULL, NULL, NULL)
) AS events
ORDER BY event_time DESC;

10 Model Comparison Summary Beginner

The table below summarizes which model is active by default on each platform and what you need to do to get the improved behavior on SQL Server 2022.

PlatformDefault ModelTo Enable Welch’s t-Test
SQL Server 2025 Welch’s t-test (default) No action required
Azure SQL Database Welch’s t-test (default) No action required
SQL Database in Microsoft Fabric Welch’s t-test (default) No action required
Azure SQL Managed Instance Welch’s t-test (default) No action required
SQL Server 2022 CU4 and later Sigma (default without TF) DBCC TRACEON (12618, -1) or startup parameter -T12618
SQL Server 2022 before CU4 Sigma only Apply CU4 or later, then enable TF 12618
SQL Server 2019 and earlier Sigma only Not available on these versions

Recommended configuration for SQL Server 2022 CU4 and later: Enable both trace flags together. Trace flag 12618 activates the Welch’s t-test model for faster and more accurate regression detection. Trace flag 12656 adds the 5-minute delayed recheck that catches regressions showing up as timeouts rather than completed executions. Both are available as startup parameters for persistence across restarts: add -T12618 and -T12656 to the startup parameters in SQL Server Configuration Manager.

References


Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading