Part 2 — Automate & Prioritize the SQL Server Error Log

Group counts, suppress noise, and email a clean daily report

This follow-up builds on Part 1. Instead of reading the entire error log by hand, we’ll scan the last N hours, classify messages as Errors or Warnings, suppress known noise, group duplicates with counts, and email a compact HTML report. There’s also a section for SQL Agent job failures so you get everything in one place.

What this adds (compared to Part 1)

  • ✅ Reads both SQL Server and SQL Agent logs
  • Classifies messages (Error vs Warning) with high-signal patterns
  • Suppresses noise (you control the list)
  • Groups identical messages and shows counts (top offenders first)
  • ✅ Includes Agent job failures in the last 24h
  • ✅ Sends HTML email only when findings exist (optional: always send)

How to use it

  1. Make sure Database Mail is set up (use the profile from Part 1).
  2. Paste the script into a utility database (e.g., YourUtilityDB).
  3. Update the parameters at the top: hours, mail profile, recipients.
  4. Create a SQL Agent job to run daily (or hourly for critical systems).

Copy-paste script (edit the placeholders)

/* ============================
   Part 2: Error Log Automation
   ============================ */

USE [YourUtilityDB];
GO

DECLARE 
    @HoursBack           INT           = 24,             -- window to scan
    @OnlyWhenFindings    BIT           = 1,              -- 1 = email only if issues found
    @ProfileName         SYSNAME       = N'YourDatabaseMailProfile',
    @Recipients          NVARCHAR(400) = N'ops@example.com',
    @Now                 DATETIME2(0)  = SYSDATETIME(),
    @Start               DATETIME2(0)  = DATEADD(HOUR, -@HoursBack, SYSDATETIME());

/* 1) Collect logs: SQL Server (p2=1) and SQL Agent (p2=2) */

IF OBJECT_ID('tempdb..#Log') IS NOT NULL DROP TABLE #Log;
CREATE TABLE #Log
(
    Source       NVARCHAR(20),        -- 'SQL' or 'Agent'
    LogDate      DATETIME,
    ProcessInfo  NVARCHAR(50),
    [Text]       NVARCHAR(MAX)
);

INSERT INTO #Log (Source, LogDate, ProcessInfo, [Text])
EXEC master.dbo.xp_readerrorlog 0, 1, NULL, NULL, @Start, @Now, N'desc';  -- SQL Server log
UPDATE #Log SET Source = 'SQL' WHERE Source IS NULL;

INSERT INTO #Log (Source, LogDate, ProcessInfo, [Text])
EXEC master.dbo.xp_readerrorlog 0, 2, NULL, NULL, @Start, @Now, N'desc';  -- SQL Agent log
UPDATE #Log SET Source = 'Agent' WHERE Source IS NULL AND ProcessInfo IS NOT NULL;

/* 2) Rules: classification and noise suppression
   - Edit @Rules to change what counts as ERROR vs WARNING
   - Edit @Noise to suppress known-noise messages unique to your shop
*/
DECLARE @Rules TABLE
(
    Pattern NVARCHAR(200) NOT NULL,
    Class   VARCHAR(10)   NOT NULL,   -- 'ERROR' or 'WARNING'
    Weight  INT           NOT NULL,   -- precedence (ERROR=2, WARNING=1)
    Tag     NVARCHAR(50)  NULL
);

-- Critical errors (ERROR, high precedence)
INSERT INTO @Rules (Pattern, Class, Weight, Tag) VALUES
(N'%Error: 823%', 'ERROR', 2, N'I/O'),
(N'%Error: 824%', 'ERROR', 2, N'I/O'),
(N'%Error: 825%', 'ERROR', 2, N'I/O-Retry'),
(N'%Error: 9002%', 'ERROR', 2, N'LogFull'),
(N'%Error: 1101%', 'ERROR', 2, N'FileFull'),
(N'%Error: 1105%', 'ERROR', 2, N'FileFull'),
(N'%Error: 3624%', 'ERROR', 2, N'Assertion'),
(N'%Error: 3041%', 'ERROR', 2, N'BackupFail'),
(N'%backup%failed%', 'ERROR', 2, N'BackupFail'),
(N'%assertion%', 'ERROR', 2, N'Assertion'),
(N'%stack dump%', 'ERROR', 2, N'Dump'),
(N'%is marked SUSPECT%', 'ERROR', 2, N'DBState'),
(N'%failed to start%', 'ERROR', 2, N'Service');

-- Warnings with high signal
INSERT INTO @Rules (Pattern, Class, Weight, Tag) VALUES
(N'%Autogrow of file%', 'WARNING', 1, N'Autogrow'),
(N'%Login failed for user%', 'WARNING', 1, N'LoginFailed'),
(N'%SSPI handshake%', 'WARNING', 1, N'Kerberos'),
(N'%I/O requests taking longer%', 'WARNING', 1, N'LongIO'),
(N'%availability group%', 'WARNING', 1, N'AG'),
(N'%availability replica%', 'WARNING', 1, N'AG'),
(N'%redo queue%', 'WARNING', 1, N'AG'),
(N'Warning:%', 'WARNING', 1, N'Generic');

-- Noise to suppress (safe to send later as INFO). Edit for your environment.
DECLARE @Noise TABLE (Pattern NVARCHAR(200) NOT NULL);
INSERT INTO @Noise (Pattern) VALUES
(N'%Using%backup%compression%'),           -- environment/feature info
(N'%registry%is%configured%for%memory%'),  -- informational
(N'%The%Service%Broker%'),                 -- routine broker chatter
(N'%Database%option%SET%'),                -- database options toggled during deploy
(N'%successfully%opened%'),                -- startup noise
(N'%Login%successful%'),                   -- successes
(N'%Detected%aligned%IO%'),                -- benign storage notes
(N'%Error log has been reinitialized%');   -- log cycling notice

/* 3) Classify, suppress noise, and group with counts */

;WITH Base AS
(
    SELECT L.Source, L.LogDate, L.ProcessInfo, L.[Text]
    FROM #Log AS L
    WHERE L.[Text] IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM @Noise N WHERE L.[Text] LIKE N.Pattern)
),
Tagged AS
(
    SELECT B.*,
           R.Class,
           R.Weight,
           R.Tag
    FROM Base B
    OUTER APPLY
    (
        SELECT TOP (1) *
        FROM @Rules R
        WHERE B.[Text] LIKE R.Pattern
        ORDER BY R.Weight DESC
    ) AS R
),
Classified AS
(
    SELECT 
        Source,
        LogDate,
        ProcessInfo,
        [Text],
        COALESCE(Class, 'INFO') AS SeverityClass,
        COALESCE(Tag,   'General') AS Tag
    FROM Tagged
)
SELECT *
INTO #Findings
FROM Classified
WHERE SeverityClass IN ('ERROR','WARNING');

-- Normalized message for grouping (trim to keep HTML light)
ALTER TABLE #Findings ADD NormalizedText AS LEFT([Text], 400);

-- Summary groups
SELECT 
    SeverityClass,
    Source,
    Tag,
    NormalizedText,
    COUNT(*)           AS Occurrences,
    MIN(LogDate)       AS FirstSeen,
    MAX(LogDate)       AS LastSeen
INTO #Groups
FROM #Findings
GROUP BY SeverityClass, Source, Tag, NormalizedText;

/* 4) Agent job failures (last @HoursBack) */

IF OBJECT_ID('tempdb..#AgentFails') IS NOT NULL DROP TABLE #AgentFails;
CREATE TABLE #AgentFails
(
    JobName    NVARCHAR(512),
    StepName   NVARCHAR(512),
    RunDate    DATETIME,
    Message    NVARCHAR(MAX)
);

;WITH J AS
(
    SELECT job_id, name AS JobName FROM msdb.dbo.sysjobs
),
H AS
(
    SELECT 
        j.job_id,
        j.JobName,
        h.step_name,
        msdb.dbo.agent_datetime(h.run_date, h.run_time) AS RunDate,
        h.sql_message_id,
        h.message,
        h.run_status
    FROM msdb.dbo.sysjobhistory h
    JOIN J ON h.job_id = J.job_id
    WHERE h.run_status = 0  -- failed
      AND msdb.dbo.agent_datetime(h.run_date, h.run_time) BETWEEN @Start AND @Now
)
INSERT INTO #AgentFails (JobName, StepName, RunDate, Message)
SELECT JobName, step_name, RunDate, message FROM H;

/* 5) Build HTML */

DECLARE @html NVARCHAR(MAX) = 
N'<html><body style="font-family:Segoe UI,Arial,sans-serif;">
<h2>SQL Server Error/Warning Summary (last ' + CAST(@HoursBack AS NVARCHAR(10)) + N' hours)</h2>
<p>Instance: ' + CAST(SERVERPROPERTY('ServerName') AS NVARCHAR(128)) + N'</p>';

-- Critical Errors first
IF EXISTS (SELECT 1 FROM #Groups WHERE SeverityClass='ERROR')
BEGIN
    SET @html += N'<h3 style="color:#b00020;">Errors</h3>
    <table style="border-collapse:collapse;width:100%;" border="1" cellpadding="6">
      <tr style="background:#f6e5e5;">
        <th>Src</th><th>Tag</th><th>Occurrences</th><th>First Seen</th><th>Last Seen</th><th>Example Message</th>
      </tr>' +
      CAST((
        SELECT 
            td = Source, '',
            td = Tag, '',
            td = Occurrences, '',
            td = CONVERT(VARCHAR(19), FirstSeen, 120), '',
            td = CONVERT(VARCHAR(19), LastSeen, 120), '',
            td = NormalizedText, ''
        FROM #Groups
        WHERE SeverityClass='ERROR'
        ORDER BY Occurrences DESC, LastSeen DESC
        FOR XML PATH('tr'), TYPE
      ) AS NVARCHAR(MAX)) + N'</table>';
END

-- Warnings
IF EXISTS (SELECT 1 FROM #Groups WHERE SeverityClass='WARNING')
BEGIN
    SET @html += N'<h3 style="color:#8a6d3b;">Warnings</h3>
    <table style="border-collapse:collapse;width:100%;" border="1" cellpadding="6">
      <tr style="background:#fff6e5;">
        <th>Src</th><th>Tag</th><th>Occurrences</th><th>First Seen</th><th>Last Seen</th><th>Example Message</th>
      </tr>' +
      CAST((
        SELECT 
            td = Source, '',
            td = Tag, '',
            td = Occurrences, '',
            td = CONVERT(VARCHAR(19), FirstSeen, 120), '',
            td = CONVERT(VARCHAR(19), LastSeen, 120), '',
            td = NormalizedText, ''
        FROM #Groups
        WHERE SeverityClass='WARNING'
        ORDER BY Occurrences DESC, LastSeen DESC
        FOR XML PATH('tr'), TYPE
      ) AS NVARCHAR(MAX)) + N'</table>';
END

-- Agent job failures
IF EXISTS (SELECT 1 FROM #AgentFails)
BEGIN
    SET @html += N'<h3 style="color:#b00020;">SQL Agent Job Failures</h3>
    <table style="border-collapse:collapse;width:100%;" border="1" cellpadding="6">
      <tr style="background:#f6e5e5;">
        <th>When</th><th>Job</th><th>Step</th><th>Message</th>
      </tr>' +
      CAST((
        SELECT 
            td = CONVERT(VARCHAR(19), RunDate, 120), '',
            td = JobName, '',
            td = StepName, '',
            td = Message, ''
        FROM #AgentFails
        ORDER BY RunDate DESC
        FOR XML PATH('tr'), TYPE
      ) AS NVARCHAR(MAX)) + N'</table>';
END

IF NOT EXISTS (SELECT 1 FROM #Groups) 
   AND NOT EXISTS (SELECT 1 FROM #AgentFails)
BEGIN
    SET @html += N'<p>No errors or warnings found in the last ' 
              + CAST(@HoursBack AS NVARCHAR(10)) + N' hours.</p>';
END

SET @html += N'</body></html>';

-- Email logic
IF (@OnlyWhenFindings = 0)
   OR EXISTS (SELECT 1 FROM #Groups)
   OR EXISTS (SELECT 1 FROM #AgentFails)
BEGIN
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = @ProfileName,
        @recipients   = @Recipients,
        @subject      = N'SQL Error/Warning Summary - ' + CAST(SERVERPROPERTY('ServerName') AS NVARCHAR(128)),
        @body         = @html,
        @body_format  = 'HTML';
END

What to treat as “investigate now”

  • Error 823/824/825 (I/O issues), 9002 (log full), 1101/1105 (file full), 3624 (assertion), 3041/backup failed, suspect database, unplanned restarts/dumps.
  • Repeated warnings: many autogrows, spikes in login failures (18456), SSPI handshake failures (17806), long I/O warnings, AG disconnects or large redo queues.
  • Agent failures that impact backups, CHECKDB, index maintenance, ETL, or anything SLO/SLA-related.

How often to run it

  • Daily for most environments (7:00 AM before stand-up works well).
  • Every 15–60 minutes on mission-critical systems, with @OnlyWhenFindings = 1 so you only get emails when there’s something to see.

Tuning tips

  • Adjust @HoursBack to fit your cadence.
  • Edit @Noise to mute harmless, recurring messages unique to your build/deploy pipeline.
  • Add your own patterns to @Rules (set Class, Weight, and a friendly Tag).
  • Keep sp_cycle_errorlog on a weekly schedule so current logs don’t get huge:

EXEC sp_cycle_errorlog;
EXEC sp_cycle_agent_errorlog;

Now lets move onto Step 3


Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading