SQL Server Missing Index Analysis: Per-Database and Server-Wide Scripts

SQL Server Missing Index Analysis: Per-Database and Server-Wide Scripts – SQLYARD

SQL Server Missing Index Analysis: Per-Database and Server-Wide Scripts


SQL Server’s missing index DMVs record every time the query optimizer compiled a plan and identified an index that would have improved performance. The engine tracks the estimated impact of each missing index suggestion, accumulating statistics across all query executions since the last service restart. These DMVs are one of the fastest ways to identify high-value indexing opportunities on a running production server.

This article covers how the missing index DMVs work, their limitations, the per-database analysis script, two server-wide options for multi-database environments, and the safe workflow for acting on the results.

Related SQLYARD articles: Missing index DMVs are one input to index tuning. For the complete index tuning methodology see the Index Tuning Guide. For a persistent logging framework that captures missing index suggestions over time and survives restarts see the Server-Wide Index Usage and Missing Index Logging article.

1 How the Missing Index DMVs Work Beginner

Every time SQL Server compiles an execution plan, the query optimizer evaluates whether an index that does not currently exist would have produced a better plan. When it identifies such an index, it records the suggestion in three DMVs that work together:

  • sys.dm_db_missing_index_details: the table, schema, and column details of each suggested index including equality columns, inequality columns, and included columns.
  • sys.dm_db_missing_index_groups: groups related missing index suggestions together. One group can contain multiple index suggestions for the same query.
  • sys.dm_db_missing_index_group_stats: the accumulated statistics for each group: seek counts, scan counts, average user impact percentage, and average total user cost. This is where the impact data lives.

The data in these DMVs accumulates from the moment SQL Server starts until it restarts. A server that has been running for 90 days under production load contains 90 days of accumulated missing index intelligence. A server that restarted six hours ago has only six hours of data. Always check server uptime before interpreting missing index results.

-- Check SQL Server uptime before interpreting missing index data
SELECT
    sqlserver_start_time,
    DATEDIFF(DAY, sqlserver_start_time, GETDATE())  AS UptimeDays,
    DATEDIFF(HOUR, sqlserver_start_time, GETDATE()) AS UptimeHours
FROM sys.dm_os_sys_info;

2 Understanding the Impact Factor Beginner

The scripts in this article calculate an ImpactFactor for each missing index suggestion. This is not a Microsoft-defined metric. It is a calculated composite score used widely in the SQL Server community to rank suggestions by their combined frequency and estimated benefit.

ImpactFactor = (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans)
ComponentWhat It MeasuresWhat High Values Mean
avg_total_user_costAverage estimated cost of the query plan without the indexThe queries affected are expensive
avg_user_impactEstimated percentage improvement if the index existed (0-100)The index would make a large difference
user_seeks + user_scansTotal number of times queries triggered this suggestionThe need is frequent, not a one-off

A suggestion with ImpactFactor of 50,000 means the optimizer has repeatedly encountered an expensive query where the missing index would provide significant improvement. A suggestion with ImpactFactor of 10 may be a one-time query that ran once and never will again. Sort by ImpactFactor descending and focus investigation on the top of the list.

3 Critical Limitations to Understand Before Acting Beginner

Missing index suggestions are hints from a single-query perspective. Never create an index directly from this output without validation. The optimizer generates each suggestion by looking at one query in isolation. It does not consider the full workload, existing indexes that could be modified to cover the need, write overhead the new index adds to every INSERT, UPDATE, and DELETE, or whether two suggestions could be merged into one index with an additional include column. Blindly creating every suggestion produces index sprawl that degrades write performance and wastes storage.

  • Data resets on every restart. Missing index DMVs are entirely in-memory. A failover, patch, or restart wipes all history. Low uptime produces misleading results.
  • No knowledge of existing indexes. The suggestion may already be partially covered by an existing index that needs a column added rather than a new index created.
  • No cross-workload awareness. Two suggestions on the same table may be served by one composite index. Creating both wastes space and doubles write overhead.
  • Estimated impact is not measured impact. The avg_user_impact percentage is the optimizer’s estimate at compile time. Actual improvement after creation may differ, especially with parameter sniffing or skewed data distributions.
  • Generated index names may conflict. The CREATE INDEX statement generated by the script uses a naming pattern based on table and column names. Always check for existing indexes with the same name before executing.

4 Script 1: Per-Database Missing Index Analysis Beginner

Run this script while connected to the specific database to analyze. It returns the top 500 missing index suggestions for that database ranked by ImpactFactor, with a ready-to-review CREATE INDEX statement for each suggestion.

-- ============================================================
-- Missing Index Analysis: Current Database
-- Returns top 500 suggestions ranked by ImpactFactor
-- Review each CREATE INDEX statement before executing
-- ============================================================
SELECT
    CAST(
        (migs.avg_total_user_cost * migs.avg_user_impact)
        * (migs.user_seeks + migs.user_scans)
    AS DECIMAL(20,3))                                           AS ImpactFactor,

    mid.statement                                               AS TableName,

    -- Ready-to-review CREATE INDEX statement with clean naming
    N'CREATE NONCLUSTERED INDEX ix_'
        + sys.objects.name COLLATE DATABASE_DEFAULT
        + N'_'
        + REPLACE(REPLACE(REPLACE(
            ISNULL(mid.equality_columns,   N'')
            + ISNULL(mid.inequality_columns, N''),
            N'[', N''), N']', N''), N', ', N'_')
        + N' ON ' + mid.[statement]
        + N' ( '
        + ISNULL(mid.equality_columns, N'')
        + CASE
            WHEN mid.inequality_columns IS NULL THEN N''
            ELSE
                CASE WHEN mid.equality_columns IS NULL THEN N'' ELSE N', ' END
                + mid.inequality_columns
          END
        + N' )'
        + CASE
            WHEN mid.included_columns IS NULL THEN N''
            ELSE N' INCLUDE (' + mid.included_columns + N')'
          END
        + N';'                                                  AS CreateIndexStatement,

    migs.avg_user_impact                                        AS [UserImprovement(%)],
    migs.avg_system_impact                                      AS [SystemImprovement(%)],
    migs.user_seeks                                             AS UserSeeks,
    migs.user_scans                                             AS UserScans,
    migs.system_seeks                                           AS SystemSeeks,
    migs.system_scans                                           AS SystemScans,
    mid.equality_columns                                        AS EqualityColumns,
    mid.inequality_columns                                      AS InequalityColumns,
    mid.included_columns                                        AS IncludedColumns

FROM sys.dm_db_missing_index_group_stats                        migs
INNER JOIN sys.dm_db_missing_index_groups                       mig
    ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details                      mid
    ON mig.index_handle = mid.index_handle
INNER JOIN sys.objects WITH (NOLOCK)
    ON mid.object_id = sys.objects.object_id

WHERE migs.group_handle IN (
    SELECT TOP 500 group_handle
    FROM sys.dm_db_missing_index_group_stats WITH (NOLOCK)
    ORDER BY (avg_total_user_cost * avg_user_impact)
             * (user_seeks + user_scans) DESC
)
AND OBJECTPROPERTY(sys.objects.object_id, 'IsUserTable') = 1

ORDER BY ImpactFactor DESC, CreateIndexStatement DESC;

5 Script 2: Server-Wide Using sp_MSforeachdb Intermediate

sp_MSforeachdb is an undocumented system stored procedure that iterates through all databases on an instance and executes a T-SQL command in each one. It is convenient for quick ad-hoc analysis across many databases from a single execution.

sp_MSforeachdb is undocumented and has a known reliability issue. It can silently skip databases under certain conditions, particularly when database names contain special characters or when the instance has many databases. For a quick manual sweep it is acceptable. For scheduled jobs or scripted analysis where completeness matters, use Script 3 (the cursor version) instead.

-- ============================================================
-- Missing Index Analysis: All Databases via sp_MSforeachdb
-- Quick option for manual ad-hoc analysis
-- Note: may silently skip databases in edge cases
-- ============================================================

-- Create a staging table to collect results from all databases
IF OBJECT_ID('tempdb..#MissingIndexesAllDbs', 'U') IS NOT NULL
    DROP TABLE #MissingIndexesAllDbs;

CREATE TABLE #MissingIndexesAllDbs
(
    DatabaseName            NVARCHAR(128),
    ImpactFactor            DECIMAL(20,3),
    TableName               NVARCHAR(600),
    CreateIndexStatement    NVARCHAR(MAX),
    [UserImprovement(%)]    FLOAT,
    [SystemImprovement(%)]  FLOAT,
    UserSeeks               BIGINT,
    UserScans               BIGINT,
    SystemSeeks             BIGINT,
    SystemScans             BIGINT,
    EqualityColumns         NVARCHAR(4000),
    InequalityColumns       NVARCHAR(4000),
    IncludedColumns         NVARCHAR(4000)
);

EXEC sp_MSforeachdb '
USE [?];
IF DB_ID() > 4  -- skip system databases
BEGIN
    INSERT INTO #MissingIndexesAllDbs
    SELECT
        DB_NAME()                                               AS DatabaseName,
        CAST(
            (migs.avg_total_user_cost * migs.avg_user_impact)
            * (migs.user_seeks + migs.user_scans)
        AS DECIMAL(20,3))                                       AS ImpactFactor,
        mid.statement                                           AS TableName,
        N''CREATE NONCLUSTERED INDEX ix_''
            + sys.objects.name COLLATE DATABASE_DEFAULT
            + N''_''
            + REPLACE(REPLACE(REPLACE(
                ISNULL(mid.equality_columns,   N'''')
                + ISNULL(mid.inequality_columns, N''''),
                N''['', N''''), N'']'', N''''), N'', '', N''_'')
            + N'' ON '' + mid.[statement]
            + N'' ( ''
            + ISNULL(mid.equality_columns, N'''')
            + CASE
                WHEN mid.inequality_columns IS NULL THEN N''''
                ELSE
                    CASE WHEN mid.equality_columns IS NULL THEN N'''' ELSE N'', '' END
                    + mid.inequality_columns
              END
            + N'' )''
            + CASE
                WHEN mid.included_columns IS NULL THEN N''''
                ELSE N'' INCLUDE ('' + mid.included_columns + N'')''
              END
            + N'';''                                            AS CreateIndexStatement,
        migs.avg_user_impact,
        migs.avg_system_impact,
        migs.user_seeks,
        migs.user_scans,
        migs.system_seeks,
        migs.system_scans,
        mid.equality_columns,
        mid.inequality_columns,
        mid.included_columns
    FROM sys.dm_db_missing_index_group_stats                    migs
    INNER JOIN sys.dm_db_missing_index_groups                   mig
        ON migs.group_handle = mig.index_group_handle
    INNER JOIN sys.dm_db_missing_index_details                  mid
        ON mig.index_handle = mid.index_handle
    INNER JOIN sys.objects WITH (NOLOCK)
        ON mid.object_id = sys.objects.object_id
    WHERE migs.group_handle IN (
        SELECT TOP 500 group_handle
        FROM sys.dm_db_missing_index_group_stats WITH (NOLOCK)
        ORDER BY (avg_total_user_cost * avg_user_impact)
                 * (user_seeks + user_scans) DESC
    )
    AND OBJECTPROPERTY(sys.objects.object_id, ''IsUserTable'') = 1;
END
';

-- Return results sorted by highest impact across all databases
SELECT *
FROM #MissingIndexesAllDbs
ORDER BY ImpactFactor DESC, DatabaseName, CreateIndexStatement;

DROP TABLE #MissingIndexesAllDbs;

6 Script 3: Server-Wide Using a Reliable Cursor (Recommended) Advanced

This version iterates through all online user databases using a cursor with explicit conditions. It is more verbose than the sp_MSforeachdb version but is fully documented, will not silently skip databases, and is appropriate for scheduled jobs, automated reporting, and production health check scripts.

-- ============================================================
-- Missing Index Analysis: All Databases via Cursor
-- Reliable server-wide version for scheduled jobs and automation
-- Skips: system databases, offline databases, read-only databases,
--        database snapshots, and databases being restored
-- ============================================================

IF OBJECT_ID('tempdb..#MissingIndexAllDBsCursor', 'U') IS NOT NULL
    DROP TABLE #MissingIndexAllDBsCursor;

CREATE TABLE #MissingIndexAllDBsCursor
(
    DatabaseName            NVARCHAR(128)   NOT NULL,
    ImpactFactor            DECIMAL(20,3)   NOT NULL,
    TableName               NVARCHAR(600)   NULL,
    CreateIndexStatement    NVARCHAR(MAX)   NULL,
    [UserImprovement(%)]    FLOAT           NULL,
    [SystemImprovement(%)]  FLOAT           NULL,
    UserSeeks               BIGINT          NULL,
    UserScans               BIGINT          NULL,
    SystemSeeks             BIGINT          NULL,
    SystemScans             BIGINT          NULL,
    EqualityColumns         NVARCHAR(4000)  NULL,
    InequalityColumns       NVARCHAR(4000)  NULL,
    IncludedColumns         NVARCHAR(4000)  NULL
);

DECLARE @DBName     SYSNAME;
DECLARE @SQL        NVARCHAR(MAX);

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM sys.databases
    WHERE database_id > 4               -- exclude master, tempdb, model, msdb
      AND state_desc    = 'ONLINE'      -- only online databases
      AND is_read_only  = 0             -- exclude read-only databases
      AND source_database_id IS NULL    -- exclude database snapshots
    ORDER BY name;

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @DBName;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @SQL = N'
    USE ' + QUOTENAME(@DBName) + N';

    INSERT INTO #MissingIndexAllDBsCursor
    (
        DatabaseName, ImpactFactor, TableName, CreateIndexStatement,
        [UserImprovement(%)], [SystemImprovement(%)],
        UserSeeks, UserScans, SystemSeeks, SystemScans,
        EqualityColumns, InequalityColumns, IncludedColumns
    )
    SELECT
        N' + QUOTENAME(@DBName, N'''') + N'             AS DatabaseName,
        CAST(
            (migs.avg_total_user_cost * migs.avg_user_impact)
            * (migs.user_seeks + migs.user_scans)
        AS DECIMAL(20,3))                               AS ImpactFactor,
        mid.statement                                   AS TableName,
        N''CREATE NONCLUSTERED INDEX ix_''
            + sys.objects.name COLLATE DATABASE_DEFAULT
            + N''_''
            + REPLACE(REPLACE(REPLACE(
                ISNULL(mid.equality_columns,   N'''')
                + ISNULL(mid.inequality_columns, N''''),
                N''['', N''''), N'']'', N''''), N'', '', N''_'')
            + N'' ON '' + mid.[statement]
            + N'' ( ''
            + ISNULL(mid.equality_columns, N'''')
            + CASE
                WHEN mid.inequality_columns IS NULL THEN N''''
                ELSE CASE WHEN mid.equality_columns IS NULL
                          THEN N'''' ELSE N'', '' END
                     + mid.inequality_columns
              END
            + N'' )''
            + CASE
                WHEN mid.included_columns IS NULL THEN N''''
                ELSE N'' INCLUDE ('' + mid.included_columns + N'')''
              END
            + N'';''                                    AS CreateIndexStatement,
        migs.avg_user_impact,
        migs.avg_system_impact,
        migs.user_seeks,
        migs.user_scans,
        migs.system_seeks,
        migs.system_scans,
        mid.equality_columns,
        mid.inequality_columns,
        mid.included_columns
    FROM sys.dm_db_missing_index_group_stats            migs
    INNER JOIN sys.dm_db_missing_index_groups           mig
        ON migs.group_handle = mig.index_group_handle
    INNER JOIN sys.dm_db_missing_index_details          mid
        ON mig.index_handle = mid.index_handle
    INNER JOIN sys.objects WITH (NOLOCK)
        ON mid.object_id = sys.objects.object_id
    WHERE migs.group_handle IN (
        SELECT TOP 500 group_handle
        FROM sys.dm_db_missing_index_group_stats WITH (NOLOCK)
        ORDER BY (avg_total_user_cost * avg_user_impact)
                 * (user_seeks + user_scans) DESC
    )
    AND OBJECTPROPERTY(sys.objects.object_id, ''IsUserTable'') = 1;
    ';

    BEGIN TRY
        EXEC sp_executesql @SQL;
    END TRY
    BEGIN CATCH
        -- Log the error and continue to the next database
        PRINT N'Error in database ' + @DBName + N': ' + ERROR_MESSAGE();
    END CATCH;

    FETCH NEXT FROM db_cursor INTO @DBName;
END;

CLOSE db_cursor;
DEALLOCATE db_cursor;

-- Return all results ranked by impact across the entire server
SELECT
    DatabaseName,
    ImpactFactor,
    TableName,
    CreateIndexStatement,
    [UserImprovement(%)],
    [SystemImprovement(%)],
    UserSeeks,
    UserScans,
    EqualityColumns,
    InequalityColumns,
    IncludedColumns
FROM #MissingIndexAllDBsCursor
ORDER BY ImpactFactor DESC, DatabaseName, TableName;

DROP TABLE #MissingIndexAllDBsCursor;

Use this version in SQL Agent jobs. The cursor version handles errors per-database without aborting the entire run. If one database causes an error, the PRINT statement logs the message and the cursor moves to the next database. All successfully processed databases still appear in the results. The sp_MSforeachdb version does not have this error isolation.

7 Reading and Interpreting the Output Intermediate

ColumnWhat It Tells YouWhat to Look For
ImpactFactorComposite score combining cost, impact, and frequencyHigher is more important. Focus on the top 10 to 20.
UserImprovement(%)Estimated query performance gain if index existed80%+ means the optimizer considered this index a major improvement
UserSeeks + UserScansHow many times this index was neededHigh counts mean frequent need, not a rare query
EqualityColumnsColumns used in WHERE col = value predicatesThese become the leading key columns of the suggested index
InequalityColumnsColumns used in range predicates (BETWEEN, >, <)These follow equality columns in the key
IncludedColumnsColumns to include in the leaf level for coveringCheck whether these are already in an existing index before creating a new one
CreateIndexStatementReady-to-review T-SQL for the suggested indexAlways validate against existing indexes before executing

8 Safe Workflow Before Creating Any Index Intermediate

The script output is the starting point for investigation, not a list of actions to execute. Follow this workflow for every high-impact suggestion before creating anything in production.

  1. Check existing indexes on the table first. An existing index may already cover some of the suggested columns. Adding a column to an existing index or adding included columns is often better than creating a new index.
  2. Check for duplicate suggestions. Multiple suggestions on the same table may be served by one composite index. Group them by table and compare the column lists before deciding how many indexes to create.
  3. Validate the CREATE INDEX statement. Check that the generated index name does not conflict with an existing index name. The naming pattern is not guaranteed to be unique.
  4. Test in a non-production environment first. Create the index in dev or staging, run the relevant queries, and measure actual logical reads and duration before and after. Estimated impact from the DMV is not always what materializes in practice.
  5. Create one index at a time in production. Monitor for a week after each index creation. Confirm query performance improved. Confirm write performance on the table did not degrade unacceptably.
  6. Document every index created and why. Future DBAs need to know whether an index was created to address a specific workload, or whether it can safely be removed if the workload changes.
-- Before creating a suggested index: check what already exists on the table
SELECT
    i.name                              AS IndexName,
    i.type_desc,
    i.is_unique,
    STRING_AGG(c.name, ', ')
        WITHIN GROUP (ORDER BY ic.key_ordinal)
                                        AS KeyColumns,
    ic.is_included_column
FROM sys.indexes                        i
JOIN sys.index_columns                  ic ON ic.object_id = i.object_id
                                          AND ic.index_id  = i.index_id
JOIN sys.columns                        c  ON c.object_id  = ic.object_id
                                          AND c.column_id  = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.YourTableName')
GROUP BY i.name, i.type_desc, i.is_unique, ic.is_included_column
ORDER BY i.index_id, ic.is_included_column;

-- Check for name conflicts before creating the suggested index
SELECT name FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.YourTableName')
  AND name = 'ix_YourGeneratedIndexName';

References


Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading