The Ultimate SQL Server 2019 and 2022 Health Check Guide

A Complete Playbook for Modern DBAs

SQL Server health checks are one of the few things that never go away in a DBA’s career. It does not matter if you inherit a system, support a system that has been running for years, or walk into a new environment where the last DBA left suddenly. Someone will eventually ask the same question:

Is our SQL Server healthy, and what should we fix first?

Most DBAs know how to run DMV queries. That part is easy. What most DBAs struggle with is delivering a health check that actually answers the question. Management does not want screenshots, random graphs, or a spreadsheet full of numbers. They want a clear diagnosis, evidence, and a clean recommendation that reduces risk and improves stability.

This guide pulls together the full SQLYard way of running a modern health check. Everything here is safe to run on production servers, including real-time OLTP workloads, and everything includes an explanation of why the check matters and how to explain your findings in plain English.

This is not a quick list of DMVs. This is a full professional health assessment that walks through instance-level configuration, storage, TempDB, MAXDOP, parallelism, memory, backups, corruption checks, statistics health, index health, security posture, job health, waits, latch contention, I/O diagnostics, and Availability Group state. All scripts are inline and all Microsoft Docs references are included for further reading.

Run this entire playbook and you will have everything you need to deliver a full health assessment the same day, even on servers you have never touched before.


Part 1 – Why This Health Check Exists

(And What Management Actually Wants)

Sooner or later someone will ask you for a health check. It usually comes in one of these forms:

  • “The system feels slow. Can you check SQL Server?”
  • “We upgraded hardware. Is SQL using it correctly?”
  • “A consultant said our settings were wrong. Can you confirm?”
  • “Can you validate the server before we migrate it?”

What management is really asking for is much simpler:

  1. A clear status
  2. Evidence to back it up
  3. What to fix first

This playbook includes all three. You do not need any paid tools. Everything runs on SQL Server 2019 and 2022 and is safe to run in production.

When you finish, you should be able to summarize the instance in one page:

  • Overall health: Healthy, Needs Attention, or At Risk
  • High-priority findings
  • Medium-priority findings
  • Low-priority clean-up items
  • Next recommended actions

Everything in this guide supports that one-page summary.


Part 2 – Instance Overview, Memory, Settings, Trace Flags, and Startup Health

2.1 Instance Overview

Always start with the basics. This is the header for every health check.

SELECT  
    SERVERPROPERTY('MachineName') AS MachineName,
    SERVERPROPERTY('ServerName') AS ServerName,
    SERVERPROPERTY('Edition') AS Edition,
    SERVERPROPERTY('ProductVersion') AS ProductVersion,
    SERVERPROPERTY('ProductLevel') AS ProductLevel,
    SERVERPROPERTY('EngineEdition') AS EngineEdition,
    sqlserver_start_time AS SqlServerStartTime
FROM sys.dm_os_sys_info;

Reference:
https://learn.microsoft.com/sql/t-sql/functions/serverproperty-transact-sql

What you report:
SQL Server version, edition, and uptime. Note whether it is patched, outdated, or running a discontinued servicing branch.


2.2 Memory Configuration and Pressure

2.2.1 Max Server Memory vs Physical RAM

WITH mem AS
(
    SELECT 
        total_physical_memory_kb / 1024.0 AS physical_memory_MB,
        available_physical_memory_kb / 1024.0 AS available_memory_MB
    FROM sys.dm_os_sys_memory
),
cfg AS
(
    SELECT value_in_use AS max_server_memory_MB
    FROM sys.configurations
    WHERE name = 'max server memory (MB)'
)
SELECT 
    m.physical_memory_MB,
    c.max_server_memory_MB,
    CAST(m.physical_memory_MB * 0.50 AS int) AS recommended_min_MB,
    CAST(m.physical_memory_MB * 0.80 AS int) AS recommended_max_MB,
    m.available_memory_MB AS current_free_OS_memory_MB,
    CASE 
        WHEN c.max_server_memory_MB IS NULL THEN 'Not configured'
        WHEN c.max_server_memory_MB > m.physical_memory_MB * 0.85 THEN 'Likely too high'
        WHEN c.max_server_memory_MB < m.physical_memory_MB * 0.40 THEN 'Likely too low'
        ELSE 'Within reasonable range'
    END AS recommendation
FROM mem m
CROSS JOIN cfg c;

Reference:
https://learn.microsoft.com/sql/database-engine/configure-windows/server-memory-server-configuration-options

What you report:
SQL Server should get about 50 to 80 percent of RAM depending on workload.


2.2.2 Memory Pressure Indicators

SELECT 
    'OS memory' AS source,
    total_physical_memory_kb / 1024.0 AS total_physical_MB,
    available_physical_memory_kb / 1024.0 AS available_physical_MB,
    system_memory_state_desc
FROM sys.dm_os_sys_memory

UNION ALL

SELECT 
    'SQL process' AS source,
    physical_memory_in_use_kb / 1024.0 AS total_physical_MB,
    large_page_allocations_kb / 1024.0 AS available_physical_MB,
    CASE 
        WHEN process_physical_memory_low = 1 THEN 'SQL under memory pressure'
        ELSE 'SQL memory state OK'
    END AS system_memory_state_desc
FROM sys.dm_os_process_memory;

Reference:
https://learn.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-os-process-memory-transact-sql


2.3 Dangerous Instance-Level Settings

SELECT  
    c.name,
    c.value_in_use,
    CASE 
        WHEN c.name IN (
            'priority boost',
            'lightweight pooling',
            'affinity mask',
            'affinity64 mask',
            'affinity I/O mask',
            'affinity64 I/O mask'
        ) THEN 'HIGH risk'
        WHEN c.name IN ('max worker threads', 'min memory per query (KB)') THEN 'MEDIUM risk'
        ELSE 'NORMAL'
    END AS risk_level
FROM sys.configurations AS c
WHERE c.name IN (
    'affinity mask',
    'affinity64 mask',
    'affinity I/O mask',
    'affinity64 I/O mask',
    'lightweight pooling',
    'priority boost',
    'max worker threads',
    'min memory per query (KB)'
)
ORDER BY risk_level DESC;

Reference:
https://learn.microsoft.com/sql/database-engine/configure-windows/server-configuration-options-sql-server


2.4 Trace Flag Inventory and Risk Assessment

DBCC TRACESTATUS(-1);

Reference:
https://learn.microsoft.com/sql/t-sql/database-console-commands/dbcc-tracestatus-transact-sql

Classify trace flags into:

  • Common: 3226
  • Legacy on modern builds: 1117, 1118, 2371
  • Diagnostic spam: 3605, 1204, 1222
  • Advanced tuning: 8048, 7457, 4199, 834

Do not remove flags without documented reasoning.


2.5 OS-Level Best Practices

2.5.1 NTFS Allocation Unit Size

SQL Server data, log, TempDB drives should use 64 KB allocation units.

Reference:
https://learn.microsoft.com/sql/sql-server/install/hardware-and-software-requirements-for-installing-sql-server


2.5.2 Windows Indexing

Ensure “Allow files on this drive to have contents indexed” is disabled unless FILESTREAM is used.


2.5.3 Instant File Initialization (IFI)

SQL confirms IFI at startup. Verify using:

EXEC xp_readerrorlog 0, 1, 'Database Instant File Initialization';

Reference:
https://learn.microsoft.com/sql/relational-databases/databases/instant-file-initialization

IFI must be enabled at Windows level using the privilege
Perform volume maintenance tasks.


2.5.4 Optimize for Ad Hoc Workloads

EXEC sp_configure 'optimize for ad hoc workloads';

Reference:
https://learn.microsoft.com/sql/relational-databases/performance/optimize-for-ad-hoc-workloads


2.5.5 Soft NUMA

Check Soft NUMA with:

SELECT node_id, memory_node_id, cpu_count, online_scheduler_count
FROM sys.dm_os_nodes
WHERE node_state_desc = 'ONLINE';

Reference:
https://learn.microsoft.com/sql/database-engine/configure-windows/soft-numa


2.5.6 TLS, Certificates, and Drivers

Check the clients:

SELECT 
    session_id,
    client_interface_name,
    client_version,
    program_name,
    host_name
FROM sys.dm_exec_connections;

If you see ODBC Driver 18 or SqlClient 5.x errors:

Search error log:

EXEC xp_readerrorlog 0, 1, 'TLS';
EXEC xp_readerrorlog 0, 1, 'certificate';
EXEC xp_readerrorlog 0, 1, 'failed to initialize encryption';

Reference:
https://learn.microsoft.com/sql/connect/odbc/download-odbc-driver-for-sql-server


Part 3 – CPU, MAXDOP, Parallelism, and TempDB Health

3.1 NUMA and Scheduler Layout

SELECT 
    parent_node_id,
    scheduler_id,
    cpu_id,
    is_online,
    is_idle
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE'
ORDER BY parent_node_id, scheduler_id;

3.2 MAXDOP and Cost Threshold for Parallelism

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max degree of parallelism';
EXEC sp_configure 'cost threshold for parallelism';

Reference:
https://learn.microsoft.com/sql/database-engine/configure-windows/configure-the-max-degree-of-parallelism-server-configuration-option

Use Query Store or cached plan analysis to validate real parallelism usage:

SELECT 
    COUNT(*) AS total_plans,
    SUM(CASE WHEN qp.query_plan.exist('//RelOp[@Parallel = 1]') = 1 THEN 1 ELSE 0 END) AS parallel_plans
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
WHERE cp.cacheobjtype = 'Compiled Plan';

3.3 TempDB Configuration

3.3.1 File Layout

USE tempdb;
SELECT 
    name,
    file_id,
    size * 8 / 1024 AS size_MB,
    growth * 8 / 1024 AS growth_MB,
    physical_name
FROM sys.database_files;

Reference:
https://learn.microsoft.com/sql/relational-databases/databases/tempdb-database


3.3.2 Version Store

SELECT 
    SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_MB
FROM sys.dm_db_file_space_usage;

3.3.3 Latch Contention

SELECT TOP 20 
    wait_type,
    wait_time_ms,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGELATCH_%'
ORDER BY wait_time_ms DESC;

Reference:
https://learn.microsoft.com/sql/relational-databases/diagnose/tempdb-contention


Part 4 – Database Safety, Backups, Integrity, VLFs, Growth, Recovery

4.1 Database Safety Configuration

SELECT 
    name,
    state_desc,
    recovery_model_desc,
    compatibility_level,
    page_verify_option_desc,
    is_auto_close_on,
    is_auto_shrink_on,
    is_encrypted,
    is_read_committed_snapshot_on
FROM sys.databases
WHERE database_id > 4;

4.2 Backup Health

SELECT 
    bs.database_name,
    MAX(CASE WHEN bs.type = 'D' THEN backup_finish_date END) AS last_full_backup,
    MAX(CASE WHEN bs.type = 'I' THEN backup_finish_date END) AS last_diff_backup,
    MAX(CASE WHEN bs.type = 'L' THEN backup_finish_date END) AS last_log_backup
FROM msdb.dbo.backupset bs
GROUP BY bs.database_name;

4.2.1 DBCC CHECKDB History

EXEC xp_readerrorlog 0, 1, 'DBCC CHECKDB';

Reference:
https://learn.microsoft.com/sql/t-sql/database-console-commands/dbcc-checkdb-transact-sql


4.3 VLF Health

SELECT 
    DB_NAME(database_id) AS database_name,
    COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(NULL)
GROUP BY database_id
ORDER BY vlf_count DESC;

Reference:
https://learn.microsoft.com/sql/relational-databases/troubleshoot/troubleshoot-transaction-log-vlfs


Part 5 – Index Health, Fragmentation, Stats, Heaps, Design Issues

5.1 Heaps

SELECT  
    s.name AS schema_name,
    t.name AS table_name,
    p.rows
FROM sys.tables t
JOIN sys.schemas s ON t.schema_id = s.schema_id
JOIN sys.partitions p ON t.object_id = p.object_id AND p.index_id = 0
WHERE t.is_ms_shipped = 0;

5.2 Heaps with Forwarded Records

SELECT 
    OBJECT_NAME(object_id) AS table_name,
    forwarded_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED')
WHERE index_id = 0 AND forwarded_record_count > 0;

5.3 Fragmentation Snapshot (All Databases)

IF OBJECT_ID('tempdb..#frag') IS NOT NULL DROP TABLE #frag;

CREATE TABLE #frag (
    database_name sysname,
    object_name   sysname,
    index_name    sysname,
    index_type    nvarchar(60),
    avg_fragmentation_in_percent float,
    page_count    int
);

DECLARE @db sysname, @sql nvarchar(max);

DECLARE dbs CURSOR FAST_FORWARD FOR
    SELECT name FROM sys.databases
    WHERE database_id > 4 AND state = 0;  -- online user dbs

OPEN dbs;

FETCH NEXT FROM dbs INTO @db;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'
    USE ' + QUOTENAME(@db) + ';
    INSERT INTO #frag
    SELECT 
        DB_NAME(),
        OBJECT_NAME(ips.object_id),
        i.name,
        i.type_desc,
        ips.avg_fragmentation_in_percent,
        ips.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, ''SAMPLED'') ips
    JOIN sys.indexes i
        ON ips.object_id = i.object_id
       AND ips.index_id  = i.index_id
    WHERE ips.index_id > 0
      AND ips.page_count >= 1000;';

    EXEC (@sql);
    FETCH NEXT FROM dbs INTO @db;
END

CLOSE dbs; DEALLOCATE dbs;

SELECT *
FROM #frag
ORDER BY avg_fragmentation_in_percent DESC;

5.4 Missing Indexes

SELECT TOP 25
    user_seeks, user_scans, avg_total_user_cost, avg_user_impact,
    DB_NAME(database_id) AS dbname,
    OBJECT_NAME(object_id) AS table_name,
    equality_columns, inequality_columns, included_columns
FROM sys.dm_db_missing_index_details;

Reference:
https://learn.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-db-missing-index-details-transact-sql


5.5 Overlapping Indexes

/* Overlapping / duplicate index detection */

;WITH index_cols AS
(
    SELECT 
        i.object_id,
        i.index_id,
        i.name AS index_name,
        i.is_unique,
        i.type_desc,
        ic.key_ordinal,
        c.name AS column_name,
        ic.is_included_column,
        ic.is_descending_key
    FROM sys.indexes i
    JOIN sys.index_columns ic
        ON i.object_id = ic.object_id
       AND i.index_id  = ic.index_id
    JOIN sys.columns c
        ON ic.object_id = c.object_id
       AND ic.column_id = c.column_id
    WHERE i.is_hypothetical = 0
      AND i.type_desc = 'NONCLUSTERED'
),
agg AS
(
    SELECT 
        c1.object_id,
        c1.index_id,
        c1.index_name,
        c1.is_unique,
        c1.type_desc,
        key_columns =
            STUFF((SELECT ',' + c2.column_name
                   FROM index_cols c2
                   WHERE c2.object_id = c1.object_id
                     AND c2.index_id = c1.index_id
                     AND c2.is_included_column = 0
                   ORDER BY c2.key_ordinal
                   FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, ''),
        include_columns =
            STUFF((SELECT ',' + c2.column_name
                   FROM index_cols c2
                   WHERE c2.object_id = c1.object_id
                     AND c2.index_id = c1.index_id
                     AND c2.is_included_column = 1
                   ORDER BY c2.column_name
                   FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '')
    FROM index_cols c1
    GROUP BY 
        c1.object_id,
        c1.index_id,
        c1.index_name,
        c1.is_unique,
        c1.type_desc
)
SELECT
    s.name AS schema_name,
    t.name AS table_name,
    a1.index_name AS index_name_1,
    a1.key_columns AS key_columns_1,
    a1.include_columns AS include_columns_1,
    a2.index_name AS index_name_2,
    a2.key_columns AS key_columns_2,
    a2.include_columns AS include_columns_2
FROM agg a1
JOIN agg a2
    ON a1.object_id = a2.object_id
   AND a1.index_id < a2.index_id
   AND a1.key_columns = a2.key_columns
JOIN sys.tables t
    ON a1.object_id = t.object_id
JOIN sys.schemas s
    ON t.schema_id = s.schema_id
WHERE t.is_ms_shipped = 0
ORDER BY 
    s.name,
    t.name,
    a1.key_columns,
    a1.index_name,
    a2.index_name;

5.6 Statistics Health

SELECT 
    OBJECT_NAME(sp.object_id) AS table_name,
    s.name AS stat_name,
    sp.last_updated,
    sp.modification_counter
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE sp.last_updated IS NOT NULL
ORDER BY sp.last_updated ASC;

5.7 Top 10 Largest Tables

SELECT TOP 10
    s.name AS schema_name,
    t.name AS table_name,
    SUM(p.rows) AS row_count,
    SUM(a.total_pages) * 8 / 1024 AS total_size_MB
FROM sys.tables t
JOIN sys.schemas s ON t.schema_id = s.schema_id
JOIN sys.partitions p ON t.object_id = p.object_id
JOIN sys.allocation_units a ON p.partition_id = a.container_id
GROUP BY s.name, t.name
ORDER BY total_size_MB DESC;

Part 6 – Security, Jobs, Waits, I/O, Latches, Blocking, Top Queries

6.1 Server Roles

SELECT 
    l.name AS principal_name,
    r.name AS role_name
FROM sys.server_principals l
JOIN sys.server_role_members rm ON l.principal_id = rm.member_principal_id
JOIN sys.server_principals r ON rm.role_principal_id = r.principal_id;

6.2 Database Roles

/* Database-level powerful role membership across all databases */

IF OBJECT_ID('tempdb..#db_roles') IS NOT NULL
    DROP TABLE #db_roles;

CREATE TABLE #db_roles
(
    database_name sysname,
    principal_name sysname,
    principal_type nvarchar(60),
    role_name sysname
);

DECLARE @db sysname;
DECLARE @sql nvarchar(max);

DECLARE db_cursor CURSOR FAST_FORWARD FOR
SELECT name
FROM sys.databases
WHERE database_id > 4      -- skip system DBs
  AND state = 0;           -- ONLINE

OPEN db_cursor;

FETCH NEXT FROM db_cursor INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'
    INSERT INTO #db_roles (database_name, principal_name, principal_type, role_name)
    SELECT 
        N''' + @db + ''' AS database_name,
        dp.name AS principal_name,
        dp.type_desc AS principal_type,
        dr.name AS role_name
    FROM ' + QUOTENAME(@db) + '.sys.database_principals dp
    JOIN ' + QUOTENAME(@db) + '.sys.database_role_members drm
        ON dp.principal_id = drm.member_principal_id
    JOIN ' + QUOTENAME(@db) + '.sys.database_principals dr
        ON drm.role_principal_id = dr.principal_id
    WHERE dr.name IN (''db_owner'', ''db_securityadmin'', ''db_ddladmin'');
    ';

    EXEC (@sql);

    FETCH NEXT FROM db_cursor INTO @db;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;

SELECT *
FROM #db_roles
ORDER BY database_name, role_name, principal_name;

6.3 SQL Agent Job Health

SELECT 
    j.name,
    MAX(h.run_date) AS last_run_date,
    SUM(CASE WHEN h.run_status = 0 THEN 1 ELSE 0 END) AS failure_count
FROM msdb.dbo.sysjobs j
LEFT JOIN msdb.dbo.sysjobhistory h ON j.job_id = h.job_id
GROUP BY j.name;

6.4 Wait Stats

SELECT TOP 20
    wait_type, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT LIKE 'SLEEP%'
ORDER BY wait_time_ms DESC;

Reference:
https://learn.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-os-wait-stats-transact-sql


6.5 I/O Health

SELECT 
    DB_NAME(vfs.database_id) AS dbname,
    vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS read_latency_ms,
    vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf ON vfs.database_id = mf.database_id
ORDER BY read_latency_ms DESC;

Reference:
https://learn.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-io-virtual-file-stats-transact-sql


6.6 Latch Diagnostics

SELECT TOP 20 
    wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGELATCH%'
ORDER BY wait_time_ms DESC;

6.7 Top CPU Queries

SELECT TOP 20
    total_worker_time / 1000 AS total_cpu_ms,
    execution_count,
    total_worker_time / execution_count / 1000 AS avg_cpu_ms,
    text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle);

Part 7 – Availability Groups and HADR Health

7.1 Replica State

SELECT 
    ag.name,
    ar.replica_server_name,
    rs.role_desc,
    rs.synchronization_health_desc,
    rs.connected_state_desc
FROM sys.availability_groups ag
JOIN sys.availability_replicas ar ON ag.group_id = ar.group_id
JOIN sys.dm_hadr_availability_replica_states rs ON ar.replica_id = rs.replica_id;

7.2 Database Replica State

SELECT
    ag.name,
    dbcs.database_name,
    dbcs.is_failover_ready,
    dbcs.synchronization_state_desc,
    dbcs.synchronization_health_desc,
    dbcs.log_send_queue_size,
    dbcs.redo_queue_size
FROM sys.dm_hadr_database_replica_cluster_states dbcs
JOIN sys.availability_groups ag ON dbcs.group_id = ag.group_id;

7.3 AG Redo / Send Queue Diagnostics

SELECT 
    ar.replica_server_name,
    drs.database_id,
    DB_NAME(drs.database_id) AS database_name,
    drs.redo_queue_size,
    drs.redo_rate,
    drs.log_send_queue_size,
    drs.log_send_rate
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_replicas ar ON drs.replica_id = ar.replica_id;

Reference:
https://learn.microsoft.com/sql/database-engine/availability-groups/windows/monitor-availability-groups


Part 8 – Workshop Flow and Final Executive Summary

Run Order for a Same-Day Health Check

  1. Instance overview
  2. Memory / CPU / settings
  3. TempDB
  4. Database safety
  5. Backup and CHECKDB
  6. Index and statistics health
  7. Security
  8. SQL Agent jobs
  9. Waits, latches, I/O
  10. AG state

Executive Summary Template

SQL Server Health Summary – <Instance>

Overall Status: Needs Attention

High Priority

  • Issues that impact stability or recoverability

Medium Priority

  • Issues that impact performance

Low Priority

  • Cleanup or tuning improvements

Next Steps

  • Recommended actions with timelines

Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading