SQL Server Database Not Online: The 2 AM Diagnostic Guide – SQLYARD

Leave a Comment / Articles, DBA Checklist / By SQLYARD

SQL Server Database Not Online: The 2 AM Diagnostic Guide

By SQLYARD · SQLYARD.com · May 2026 · Estimated read: 25–30 min

It is 2 AM. You get an alert or a phone call. A database is not accessible. You open SSMS and see a state you were not expecting next to the database name. The instinct is to act immediately. That instinct is wrong about half the time.

The most important rule in database emergency recovery is this: check before you touch anything. Several database states that look catastrophic resolve themselves without intervention. Others look identical on the surface but require completely different responses. Acting on the wrong diagnosis can make a recoverable situation unrecoverable.

This article is a decision tree for every state SQL Server uses when a database is not online. For each state you get a plain-English explanation of what it means, how to confirm the diagnosis, whether to wait or act, and the exact steps in the right order. Bookmark this page. You will want it when it is 2 AM and you cannot think straight.

Applies to: SQL Server 2016 through SQL Server 2025, all editions. All T-SQL in this article works on all versions in that range. Run everything as a member of the sysadmin fixed server role.

Table of Contents

Start Here

Step Zero: Check the Database State Before Doing Anything
The Six States and What They Mean

State by State

RECOVERING: Wait. It Might Fix Itself.
RECOVERY PENDING: Recovery Stalled, Here Is Why
SUSPECT: Recovery Failed, Careful Steps Required
EMERGENCY: Read-Only Access to a Broken Database
OFFLINE: Taken Down, Bring It Back
RESTORING: Restore in Progress or Orphaned

Reference

Reading the Error Log for All States
The 2 AM Decision Summary
Prevention: Stop This Happening Again

References

1 Step Zero: Check the Database State Before Doing Anything Beginner

Before running any recovery commands, run this query. It tells you the exact state of every database on the instance and gives you the information needed to pick the right diagnostic path.

-- Run this first. Every time. Before anything else.
-- Gives you the state of every database on the instance

SELECT
    name                                        AS DatabaseName,
    state_desc                                  AS CurrentState,
    recovery_model_desc                         AS RecoveryModel,
    log_reuse_wait_desc                         AS LogReuseWait,
    is_read_only,
    is_single_user,
    create_date,
    -- How long since the database was last known online
    CASE
        WHEN state_desc = 'ONLINE' THEN 'Online now'
        ELSE 'NOT ONLINE - investigate'
    END                                         AS Status
FROM sys.databases
WHERE database_id > 4   -- skip system databases
ORDER BY state_desc, name;

-- If you need more detail on a specific database:
SELECT
    name,
    state_desc,
    user_access_desc,
    is_read_only,
    page_verify_option_desc,
    log_reuse_wait_desc
FROM sys.databases
WHERE name = 'YourDatabaseName';

The state_desc column tells you exactly which section of this article to go to. Write down the state before doing anything else. If you need to escalate the issue later, knowing the initial state is important context.

Also check the SQL Server Error Log immediately. Every state transition is logged. The error log entry written at the moment the database changed state usually tells you exactly what caused it. Read that entry before taking any action. Section 9 covers how to read the error log efficiently during an incident.

2 The Six States and What They Mean Beginner

SQL Server databases move through defined states. Understanding what each state actually means is the difference between making the right call and making the situation worse.

State	What It Means	User Access	First Action
ONLINE	Normal operating state. Database is accessible.	Full access	No action needed
RECOVERING	SQL Server is actively running crash recovery after a restart. This is normal. Large databases take longer.	No access yet	Wait and monitor progress
RECOVERY PENDING	Recovery started but stalled. SQL Server cannot proceed without intervention. Usually disk space, missing files, or resource issues.	No access	Check error log, fix the cause
SUSPECT	Recovery was attempted and failed. Database is damaged or corrupt. Most serious state.	No access	Read error log, restore from backup first
EMERGENCY	Manually set by DBA for troubleshooting. Read-only, single-user, bypasses transaction log. Not a failure state by itself.	Read-only, sysadmin only	Check why it was set, proceed with repair
OFFLINE	Database was taken offline, either intentionally or due to an error. Files still exist.	No access	Check error log, bring online
RESTORING	A restore is in progress, or a restore sequence was started but not completed.	No access	Check if restore is actively running

3 RECOVERING: Wait. It Might Fix Itself. Beginner

RECOVERING

Post-Restart Recovery in Progress

Wait First

This is the most important state to understand correctly because it is the most common false alarm. When SQL Server restarts for any reason (patching, server reboot, power cycle, crash), it must run crash recovery on every database before bringing it online. This process replays committed transactions from the log that had not yet been written to the data files, and rolls back any uncommitted transactions that were open when the server stopped.

On a small database this takes seconds. On a large database with a heavy write workload or a long-running transaction that was open at shutdown time, it can take 10, 20, or 30 minutes or more. The database shows as RECOVERING the entire time. This is completely normal. Do not intervene.

The 2 AM rule for RECOVERING state: check the progress query first. If progress is moving and estimated completion is within a reasonable window, wait it out. Restarting SQL Server to try to speed up recovery actually resets the recovery process and makes it take longer.

How to Check Recovery Progress and Estimated Completion Time

-- Check active recovery progress for all databases currently recovering
-- Run this repeatedly every 30 seconds to confirm progress is moving

SELECT
    db_name(database_id)                        AS DatabaseName,
    command,
    percent_complete,
    estimated_completion_time / 1000            AS est_seconds_remaining,
    estimated_completion_time / 60000           AS est_minutes_remaining,
    total_elapsed_time / 1000                   AS elapsed_seconds,
    wait_type,
    status
FROM sys.dm_exec_requests
WHERE command LIKE '%DB%'
AND   command LIKE '%RECOVERY%'
ORDER BY percent_complete DESC;

-- Also check sys.dm_os_tasks for recovery worker threads
SELECT
    t.session_id,
    t.task_state,
    t.context_switches_count,
    s.login_time,
    s.status
FROM sys.dm_os_tasks    t
JOIN sys.dm_exec_sessions s ON s.session_id = t.session_id
WHERE s.program_name LIKE '%recovery%'
OR    s.status = 'background';

What to Look For in the Results

percent_complete is increasing and est_minutes_remaining is a manageable number: wait it out. Recovery is progressing normally. Check back every few minutes.
percent_complete is 0 and not moving after several minutes: the database may have transitioned from RECOVERING to RECOVERY PENDING. Re-run the state check query from Section 1.
est_minutes_remaining shows a very large number (hours): this is normal for databases with very large transaction logs or long-running transactions that were open at shutdown. Recovery time is proportional to log work. Check the error log for any indication of problems while you wait.
No rows returned for your database: recovery completed. Run the state check query to confirm the database is now ONLINE.

-- Check the error log for recovery progress messages
-- SQL Server writes regular progress updates during long recoveries
EXEC sp_readerrorlog 0, 1, 'Recovery of database';
-- Look for messages like:
-- "Recovery of database 'YourDB' (7) is 45% complete (approximately 8 minutes remain)."
-- These messages confirm recovery is progressing and estimate completion

-- Check how much transaction log needs to be processed
SELECT
    name                                        AS DatabaseName,
    log_size_mb     = size * 8.0 / 1024,
    log_used_pct    = FILEPROPERTY(name, 'SpaceUsed') * 100.0 / size
FROM sys.databases d
CROSS APPLY (
    SELECT size FROM sys.master_files
    WHERE database_id = d.database_id AND type = 1
) lf
WHERE d.name = 'YourDatabaseName';

4 RECOVERY PENDING: Recovery Stalled, Here Is Why Intermediate

RECOVERY PENDING

Recovery Started But Cannot Proceed

Investigate and Fix Cause

RECOVERY PENDING is different from RECOVERING. RECOVERING means SQL Server is actively working through the process. RECOVERY PENDING means SQL Server tried to start recovery and hit a problem it cannot work around on its own. The database is not corrupt. Recovery simply cannot proceed until the underlying issue is fixed.

The three most common causes are disk space exhaustion, a missing or inaccessible database file, and insufficient system resources at startup time. The error log always contains the specific reason.

Common Causes and Fixes

Cause 1: Disk Space Exhausted

SQL Server needs free space to write the transaction log during recovery. If the drive hosting the log file has no free space, recovery stalls immediately. This is the most common cause.

-- Check available disk space on the SQL Server host
-- Run in PowerShell on the SQL Server machine
Get-PSDrive -PSProvider FileSystem |
    Select-Object Name, Used, Free,
    @{N='FreePct';E={[math]::Round($_.Free / ($_.Used + $_.Free) * 100, 1)}} |
    Format-Table -AutoSize

-- In T-SQL: check file locations to identify which drives are involved
SELECT
    name,
    physical_name,
    type_desc,
    size * 8 / 1024     AS size_mb,
    is_percent_growth,
    growth
FROM sys.master_files
WHERE database_id = DB_ID('YourDatabaseName');

-- Fix: free up disk space on the affected drive
-- Options: delete old backup files, move other files, expand the volume
-- Then: SQL Server will automatically retry recovery

Cause 2: Missing or Inaccessible Database File

-- Check whether all files for the database are accessible
-- The error log will name the specific file that cannot be opened

EXEC sp_readerrorlog 0, 1, 'YourDatabaseName';
-- Look for: "FCB::Open failed: Could not open file..."
-- or: "Error: 17204, Severity: 16, State: 1"
-- The message will name the exact file path

-- If a file was moved: update the file path in the catalog
-- (database must be in EMERGENCY mode first for this)
ALTER DATABASE YourDatabaseName MODIFY FILE (
    NAME = 'YourDatabaseName_log',
    FILENAME = 'D:\NewPath\YourDatabaseName_log.ldf'
);
-- Then restart SQL Server or take the database offline and bring it back online

Cause 3: Transaction Log Corruption

-- If the log file is corrupt, recovery cannot replay transactions
-- This is more serious -- check error log for log corruption messages

EXEC sp_readerrorlog 0, 1, 'corrupt';
EXEC sp_readerrorlog 0, 1, 'error 9001';
EXEC sp_readerrorlog 0, 1, 'error 9002';

-- If the log is corrupt and you have a recent backup:
-- Restore from backup -- this is always the preferred path

-- If no backup is available:
-- You must rebuild the log, which risks data loss
-- See Section 5 (SUSPECT) for the repair sequence
-- Rebuilding the log puts the database in SUSPECT first

Do not immediately run DBCC CHECKDB or repair commands on a RECOVERY PENDING database. First identify and fix the cause. If the cause is disk space or a missing file, fixing the cause is enough and SQL Server will complete recovery on its own. Jumping to CHECKDB repair before fixing the root cause wastes time and risks unnecessary data loss.

5 SUSPECT: Recovery Failed, Careful Steps Required Intermediate

SUSPECT

Recovery Attempted and Failed

Act Carefully, Backup First

SUSPECT is the most serious non-online state. SQL Server attempted recovery, found damage it could not repair automatically, and marked the database as SUSPECT to prevent further corruption. The database is completely inaccessible to users.

The most important thing to know before starting any repair: always try a restore from backup first. If you have a good backup from before the problem occurred, restoring it is faster, safer, and guarantees no data loss compared to any repair procedure.

Never run REPAIR_ALLOW_DATA_LOSS without exhausting all other options first. As the name states, this command may delete corrupted data permanently. It is a last resort when no backup is available, not a first response to SUSPECT mode.

Step 1: Check What Happened

-- Read the error log for the specific failure that caused SUSPECT
EXEC sp_readerrorlog 0, 1, 'suspect';
EXEC sp_readerrorlog 0, 1, 'error 823';
EXEC sp_readerrorlog 0, 1, 'error 824';
EXEC sp_readerrorlog 0, 1, 'error 825';

-- Error 823: I/O error during read/write -- usually hardware or storage
-- Error 824: SQL Server detected a logical consistency error
-- Error 825: Read succeeded after retry -- warning of impending storage failure

Step 2: Restore From Backup (Always Try This First)

-- Restore from your most recent clean backup
-- This is always the preferred path -- no risk of data loss

RESTORE DATABASE YourDatabaseName
FROM DISK = 'D:\Backups\YourDatabaseName_Full.bak'
WITH REPLACE, NORECOVERY;
GO

-- Apply differential backup if available
RESTORE DATABASE YourDatabaseName
FROM DISK = 'D:\Backups\YourDatabaseName_Diff.bak'
WITH NORECOVERY;
GO

-- Apply transaction log backups to minimize data loss
RESTORE LOG YourDatabaseName
FROM DISK = 'D:\Backups\YourDatabaseName_Log.trn'
WITH NORECOVERY;
GO

-- Bring the database online
RESTORE DATABASE YourDatabaseName WITH RECOVERY;
GO

Step 3: If No Backup Is Available, Use Emergency Repair

-- Last resort only -- data loss is possible
-- Document every step before you start

-- Step 3a: Reset the suspect flag and set Emergency mode
EXEC sp_resetstatus 'YourDatabaseName';
GO
ALTER DATABASE YourDatabaseName SET EMERGENCY;
GO

-- Step 3b: Run CHECKDB to assess damage (read-only, no repair yet)
DBCC CHECKDB ('YourDatabaseName') WITH NO_INFOMSGS, ALL_ERRORMSGS;
-- Read the output carefully before proceeding
-- If CHECKDB shows no errors, the database may recover without repair

-- Step 3c: Set single user mode and run repair
ALTER DATABASE YourDatabaseName
    SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO

DBCC CHECKDB ('YourDatabaseName', REPAIR_ALLOW_DATA_LOSS)
    WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

-- Step 3d: Return to multi-user and bring online
ALTER DATABASE YourDatabaseName SET MULTI_USER;
GO
ALTER DATABASE YourDatabaseName SET ONLINE;
GO

-- Step 3e: Verify database integrity after repair
DBCC CHECKDB ('YourDatabaseName') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

For a deeper walkthrough of SUSPECT mode including lab exercises and prevention strategies, see the dedicated SQLYARD article: When Your SQL Server Database Goes SUSPECT and How to Prevent It.

6 EMERGENCY: Read-Only Access to a Broken Database Intermediate

EMERGENCY

Manually Set for Troubleshooting

Assess Then Act

EMERGENCY is not a failure state that SQL Server sets automatically. It is a mode that a DBA sets manually, either as part of a repair procedure or to gain read-only access to a database that is in SUSPECT or RECOVERY PENDING. If you find a database in EMERGENCY mode, the first question is who set it and why.

In EMERGENCY mode the database is read-only, accessible only to sysadmin role members, and bypasses the transaction log. This allows you to query system catalog views and export data from a damaged database before attempting repair or restore.

-- Check if a database is in EMERGENCY mode and who last accessed it
SELECT
    name,
    state_desc,
    user_access_desc,
    is_read_only
FROM sys.databases
WHERE name = 'YourDatabaseName';

-- Check the error log for context on why EMERGENCY was set
EXEC sp_readerrorlog 0, 1, 'EMERGENCY';
EXEC sp_readerrorlog 0, 1, 'YourDatabaseName';

-- If you need to read data from the database in EMERGENCY mode
-- (to export before repair or restore):
USE YourDatabaseName;
SELECT * FROM dbo.ImportantTable;   -- read-only access works
-- INSERT, UPDATE, DELETE will fail in EMERGENCY mode

-- Export critical data before any repair operation
SELECT *
INTO TemporaryExportDB.dbo.ImportantTable_Export
FROM YourDatabaseName.dbo.ImportantTable;

Exiting EMERGENCY Mode

-- If EMERGENCY was set as part of a repair process:
-- Complete the repair steps (Section 5, Step 3) then bring online

-- If EMERGENCY was set accidentally and the database was healthy before:
-- Simply set it back to MULTI_USER and ONLINE
ALTER DATABASE YourDatabaseName SET MULTI_USER;
GO
ALTER DATABASE YourDatabaseName SET ONLINE;
GO

-- Verify state after:
SELECT name, state_desc FROM sys.databases
WHERE name = 'YourDatabaseName';

7 OFFLINE: Taken Down, Bring It Back Beginner

OFFLINE

Database Is Down

Usually Quick to Fix

OFFLINE is the simplest state to deal with. A database goes OFFLINE either because a DBA explicitly took it offline, because SQL Server encountered repeated I/O errors against the database files and took protective action, or because a detach operation was started. The database files are intact and present on disk.

-- Check why the database went offline
-- Look for the ALTER DATABASE OFFLINE command or I/O error messages
EXEC sp_readerrorlog 0, 1, 'offline';
EXEC sp_readerrorlog 0, 1, 'YourDatabaseName';

-- Verify the database files are still accessible
-- Check the file paths from sys.master_files
SELECT
    name,
    physical_name,
    state_desc          AS FileState,
    type_desc
FROM sys.master_files
WHERE database_id = DB_ID('YourDatabaseName');

-- If files are present and accessible, bring the database online:
ALTER DATABASE YourDatabaseName SET ONLINE;
GO

-- Verify:
SELECT name, state_desc FROM sys.databases
WHERE name = 'YourDatabaseName';

If the Database Will Not Come Online

-- If SET ONLINE fails, SQL Server will report the reason
-- Check the error message carefully -- it usually points to a missing file
-- or a permissions problem on the file

-- Check that the SQL Server service account has access to the file path
-- Run in PowerShell on the SQL Server host:
$path = 'D:\SQLData\YourDatabaseName.mdf'
$acl  = Get-Acl $path
$acl.Access | Select-Object IdentityReference, FileSystemRights

-- If the service account is missing from the ACL:
$rule = New-Object System.Security.AccessControl.FileSystemAccessRule(
    "DOMAIN\SQLServiceAccount", "FullControl", "Allow")
$acl.SetAccessRule($rule)
Set-Acl $path $acl

8 RESTORING: Restore in Progress or Orphaned Intermediate

RESTORING

Restore Running or Incomplete

Check Before Acting

A database shows RESTORING for two different reasons that look identical but require opposite responses. Either a restore is actively running and the database will come online when it finishes, or a restore sequence was started with NORECOVERY, the remaining backups were never applied, and the database is stuck in RESTORING permanently until someone either completes or cancels the restore.

-- Check whether a restore is actively running
SELECT
    session_id,
    command,
    percent_complete,
    estimated_completion_time / 60000   AS est_minutes_remaining,
    total_elapsed_time / 1000           AS elapsed_seconds,
    DB_NAME(database_id)                AS DatabaseName
FROM sys.dm_exec_requests
WHERE command LIKE '%RESTORE%';

-- If rows are returned: a restore is actively running
-- Do NOT interrupt it unless absolutely necessary
-- Let it complete and the database will come online automatically

-- If no rows returned: restore is not running
-- The database was left in RESTORING state from a previous NORECOVERY restore
-- You must either complete the restore or bring it online manually

Completing an Orphaned NORECOVERY Restore

-- Option 1: Apply remaining log backups and bring online
-- Use this when you have additional log backups to apply

-- Apply each log backup in sequence:
RESTORE LOG YourDatabaseName
FROM DISK = 'D:\Backups\YourDB_Log_20260525_0200.trn'
WITH NORECOVERY;

RESTORE LOG YourDatabaseName
FROM DISK = 'D:\Backups\YourDB_Log_20260525_0300.trn'
WITH NORECOVERY;

-- Final step: recover the database (no more backups to apply)
RESTORE DATABASE YourDatabaseName WITH RECOVERY;
GO

-- Option 2: Bring online immediately accepting potential data gap
-- Use ONLY when you have no more log backups to apply
-- and need the database online now
-- This will roll back any uncommitted transactions from the last backup
RESTORE DATABASE YourDatabaseName WITH RECOVERY;
GO

-- Verify:
SELECT name, state_desc FROM sys.databases
WHERE name = 'YourDatabaseName';

Never run RESTORE DATABASE WITH RECOVERY if you still have log backups to apply. Once you issue WITH RECOVERY the database comes online and the restore chain is broken. You cannot apply further log backups after recovery. Apply all available logs in sequence first, then issue WITH RECOVERY as the final step.

9 Reading the Error Log for All States Beginner

Every state transition is recorded in the SQL Server Error Log with the reason for the change. Reading the error log first takes 60 seconds and almost always tells you exactly what caused the problem. This is the step most DBAs skip under pressure, and skipping it is why recoveries take longer than they should.

-- Read the most recent error log entries for a specific database
EXEC sp_readerrorlog 0, 1, 'YourDatabaseName';

-- Read for specific error numbers that commonly cause state changes
EXEC sp_readerrorlog 0, 1, 'error 823';   -- I/O error
EXEC sp_readerrorlog 0, 1, 'error 824';   -- logical consistency error
EXEC sp_readerrorlog 0, 1, 'error 825';   -- read retry (storage warning)
EXEC sp_readerrorlog 0, 1, 'error 9001';  -- log not available
EXEC sp_readerrorlog 0, 1, 'error 9002';  -- log file full
EXEC sp_readerrorlog 0, 1, 'error 17204'; -- file open failed
EXEC sp_readerrorlog 0, 1, 'error 17207'; -- file open failed (startup)
EXEC sp_readerrorlog 0, 1, 'suspect';

-- If the current error log does not show the event
-- it may have rolled to an older log file
-- SQL Server keeps up to 6 error log files by default (numbered 0-6)
EXEC sp_readerrorlog 1, 1, 'YourDatabaseName';  -- previous log
EXEC sp_readerrorlog 2, 1, 'YourDatabaseName';  -- two logs back

-- Read error log entries from a specific time window
EXEC sp_readerrorlog 0, 1, 'YourDatabaseName', NULL,
    '2026-05-25 01:00:00', '2026-05-25 03:00:00';

Key Error Log Messages and What They Mean

Error	Message Pattern	What It Means
823	“operating system returned error…”	Hardware or storage I/O failure. Check disk health immediately.
824	“SQL Server detected a logical consistency-based I/O error”	Page checksum failure. Data page is corrupt on disk.
825	“read of the file…succeeded after failing”	Read succeeded on retry. Warning of deteriorating storage reliability.
9001	“The log for database…is not available”	Transaction log file missing, damaged, or inaccessible.
9002	“The transaction log for database…is full”	Log file filled up. Recovery cannot write undo/redo records.
17204	“FCB::Open failed: Could not open file”	A database file cannot be opened. Check path and permissions.
3414	“An error occurred during recovery, preventing the database from restarting”	Recovery failed. Database moved to SUSPECT.
1813	“Could not open new database…CREATE DATABASE is aborted”	File creation failed during database creation or restore.

10 The 2 AM Decision Summary Beginner

When you are woken up at 2 AM, this is the sequence to follow every single time regardless of what the database state shows.

DATABASE IS NOT ONLINE │ ▼ Step 1: Run the state check query (Section 1) │ ├─ RECOVERING ──────► Check progress query. Is percent_complete moving? │ YES: Wait. Check back in 5 minutes. │ NO after 10 min: May have moved to RECOVERY PENDING. │ Re-run state check. │ ├─ RECOVERY PENDING ─► Read error log immediately. │ Disk full? Free space, SQL Server auto-retries. │ Missing file? Fix path, restart DB. │ Log corrupt? Restore from backup. │ ├─ SUSPECT ──────────► Read error log. Do you have a clean backup? │ YES: Restore. Always restore first. │ NO: Emergency repair (last resort, data loss risk). │ ├─ EMERGENCY ────────► Who set this and why? Check error log. │ Part of repair process: continue repair steps. │ Set accidentally: SET ONLINE to recover. │ ├─ OFFLINE ──────────► Read error log. Files accessible? │ YES: ALTER DATABASE SET ONLINE. │ NO: Fix file access, then SET ONLINE. │ └─ RESTORING ────────► Is a restore actively running? YES (rows in dm_exec_requests): Wait, do not interrupt. NO: Apply remaining log backups then WITH RECOVERY. Or WITH RECOVERY alone if no more logs to apply.

11 Prevention: Stop This Happening Again Beginner

Most database state emergencies have the same root causes: no disk space monitoring, no regular CHECKDB, no tested backups, and no alerting on error 823/824/825. Fixing those four things prevents the vast majority of 2 AM calls.

-- 1. Monitor disk space before it causes RECOVERY PENDING
-- Add this as a SQL Agent job running hourly

DECLARE @threshold_pct FLOAT = 15.0;  -- alert when less than 15% free

EXEC xp_fixeddrives;
-- Returns free MB per drive. Alert when any drive drops below threshold.

-- 2. Schedule weekly DBCC CHECKDB on all user databases
-- Catches corruption before it escalates to SUSPECT
-- Use Ola Hallengren's maintenance solution for a production-ready implementation

-- 3. Verify backups are valid and restorable
-- A backup that was never tested is not a backup
RESTORE VERIFYONLY
FROM DISK = 'D:\Backups\YourDatabaseName_Full.bak';
-- Run this after every backup job to confirm the file is readable

-- 4. Enable alerts for storage errors 823, 824, and 825
-- These errors always precede SUSPECT mode if ignored
-- See the SQLYARD severity alerts article for the full setup script:
-- https://sqlyard.com/2024/10/04/sql-server-severity-alerts/

-- 5. Monitor for RECOVERY PENDING and SUSPECT states automatically
-- Add this check to your morning health check script

SELECT
    name,
    state_desc
FROM sys.databases
WHERE state_desc IN ('SUSPECT', 'RECOVERY_PENDING', 'EMERGENCY')
AND   database_id > 4;
-- If this returns any rows, your monitoring should alert immediately

The databases that go SUSPECT at 2 AM and stay down for hours are almost always the ones that had no current tested backup, no disk space monitoring, and no CHECKDB schedule. The databases that recover in 20 minutes are the ones where the DBA already knows the backup is good, the restore procedure is documented, and the error log told them exactly what happened. Build those habits before the 2 AM call comes.

References

Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

SQL Server Database Not Online: The 2 AM Diagnostic Guide

SQL Server Database Not Online: The 2 AM Diagnostic Guide

1 Step Zero: Check the Database State Before Doing Anything Beginner

2 The Six States and What They Mean Beginner

3 RECOVERING: Wait. It Might Fix Itself. Beginner

Post-Restart Recovery in Progress

How to Check Recovery Progress and Estimated Completion Time

What to Look For in the Results

4 RECOVERY PENDING: Recovery Stalled, Here Is Why Intermediate

Recovery Started But Cannot Proceed

Common Causes and Fixes

Cause 1: Disk Space Exhausted

Cause 2: Missing or Inaccessible Database File

Cause 3: Transaction Log Corruption

5 SUSPECT: Recovery Failed, Careful Steps Required Intermediate

Recovery Attempted and Failed

Step 1: Check What Happened

Step 2: Restore From Backup (Always Try This First)

Step 3: If No Backup Is Available, Use Emergency Repair

6 EMERGENCY: Read-Only Access to a Broken Database Intermediate

Manually Set for Troubleshooting

Exiting EMERGENCY Mode

7 OFFLINE: Taken Down, Bring It Back Beginner

Database Is Down

If the Database Will Not Come Online

8 RESTORING: Restore in Progress or Orphaned Intermediate

Restore Running or Incomplete

Completing an Orphaned NORECOVERY Restore

9 Reading the Error Log for All States Beginner

Key Error Log Messages and What They Mean

10 The 2 AM Decision Summary Beginner

11 Prevention: Stop This Happening Again Beginner

References

Like this:

Related

Discover more from SQLYARD

Leave a ReplyCancel reply

Sign up to receive email updates, fresh news and more!

SQL Server Database Not Online: The 2 AM Diagnostic Guide

1 Step Zero: Check the Database State Before Doing Anything Beginner

2 The Six States and What They Mean Beginner

3 RECOVERING: Wait. It Might Fix Itself. Beginner

Post-Restart Recovery in Progress

How to Check Recovery Progress and Estimated Completion Time

What to Look For in the Results

4 RECOVERY PENDING: Recovery Stalled, Here Is Why Intermediate

Recovery Started But Cannot Proceed

Common Causes and Fixes

Cause 1: Disk Space Exhausted

Cause 2: Missing or Inaccessible Database File

Cause 3: Transaction Log Corruption

5 SUSPECT: Recovery Failed, Careful Steps Required Intermediate

Recovery Attempted and Failed

Step 1: Check What Happened

Step 2: Restore From Backup (Always Try This First)

Step 3: If No Backup Is Available, Use Emergency Repair

6 EMERGENCY: Read-Only Access to a Broken Database Intermediate

Manually Set for Troubleshooting

Exiting EMERGENCY Mode

7 OFFLINE: Taken Down, Bring It Back Beginner

Database Is Down

If the Database Will Not Come Online

8 RESTORING: Restore in Progress or Orphaned Intermediate

Restore Running or Incomplete

Completing an Orphaned NORECOVERY Restore

9 Reading the Error Log for All States Beginner

Key Error Log Messages and What They Mean

10 The 2 AM Decision Summary Beginner

11 Prevention: Stop This Happening Again Beginner

References

Share this:

Like this:

Related

Discover more from SQLYARD

Related Posts

Leave a ReplyCancel reply

Discover more from SQLYARD