Monitoring Row Count Drift Between Source and Remote Destination

Monitoring Row Count Drift Between Source and Remote Destination

When you run imports, ETL, or replication, one of the simplest early-warning checks is: “do the row counts match?”
It isn’t a full validation, but it catches common problems like “replication is behind” or “ETL stopped loading” before end users notice.

This blog gives you a full solution:

  • T-SQL stored procedures with configuration tables
  • Logging mismatches and differences
  • Optional lag estimate based on a timestamp column
  • Email alerts
  • Fast all-tables row count query
  • SSIS snapshot + compare workflow
  • PowerShell one-liner
  • Alternatives and upgrades
  • A step-by-step hands-on workshop (with screenshots guidance)

Assumptions

  • You have a linked server to the destination (e.g. LS_DEST).
  • Database Mail is configured.
  • You have a DBA utility database (e.g. DBA).

1) Configuration and Logging Tables

USE [DBA];
GO

-- What pairs to compare
IF OBJECT_ID('dbo.RowCountPairs','U') IS NULL
CREATE TABLE dbo.RowCountPairs
(
    PairID            int IDENTITY(1,1) PRIMARY KEY,
    SourceDB          sysname NOT NULL,
    SourceSchema      sysname NOT NULL,
    SourceTable       sysname NOT NULL,
    DestLinkedServer  sysname NOT NULL,
    DestDB            sysname NOT NULL,
    DestSchema        sysname NOT NULL,
    DestTable         sysname NOT NULL,
    SourceFilter      nvarchar(1000) NULL,
    DestFilter        nvarchar(1000) NULL,
    WatermarkColumn   sysname NULL,
    IsEnabled         bit NOT NULL DEFAULT(1)
);

-- Results
IF OBJECT_ID('dbo.RowCountResults','U') IS NULL
CREATE TABLE dbo.RowCountResults
(
    ResultID          bigint IDENTITY(1,1) PRIMARY KEY,
    RunID             uniqueidentifier NOT NULL,
    PairID            int NULL,
    CheckedAtUtc      datetime2(3) NOT NULL DEFAULT (SYSUTCDATETIME()),
    SourceCount       bigint NULL,
    DestCount         bigint NULL,
    CountDiff         bigint NULL,
    IsMatch           bit NULL,
    EstimatedLagSec   int NULL,
    Notes             nvarchar(4000) NULL
);

-- Run headers
IF OBJECT_ID('dbo.RowCountRuns','U') IS NULL
CREATE TABLE dbo.RowCountRuns
(
    RunID             uniqueidentifier NOT NULL PRIMARY KEY,
    StartedAtUtc      datetime2(3) NOT NULL DEFAULT (SYSUTCDATETIME()),
    FinishedAtUtc     datetime2(3) NULL,
    TotalPairs        int NULL,
    TotalMismatches   int NULL
);

Seed pairs:

INSERT dbo.RowCountPairs
    (SourceDB, SourceSchema, SourceTable, DestLinkedServer, DestDB, DestSchema, DestTable, SourceFilter, DestFilter, WatermarkColumn)
VALUES
    ('Sales', 'dbo', 'Orders',   'LS_DEST', 'Sales', 'dbo', 'Orders',   NULL, NULL, 'ModifiedDate'),
    ('Sales', 'dbo', 'Customers','LS_DEST', 'Sales', 'dbo', 'Customers', 'IsActive = 1', 'IsActive = 1', NULL);

2) Stored Procedure to Compare and Log

The main procedure loops through enabled pairs, compares counts, and logs differences.
👉 Use the full code block from the original blog (unchanged).


3) Row Count Query for All Tables

Quick count of every table in a database (fast alternative to COUNT(*)):

SELECT
    s.name  AS SchemaName,
    t.name  AS TableName,
    SUM(p.row_count) AS RowCount
FROM sys.dm_db_partition_stats AS p
JOIN sys.tables AS t ON t.object_id = p.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE p.index_id IN (0,1)
  AND t.is_ms_shipped = 0
GROUP BY s.name, t.name
ORDER BY s.name, t.name;

4) Snapshot + Compare Tables (SSIS-friendly)

Snapshot both sides, then compare:

USE [DBA];
GO

IF OBJECT_ID('dbo.RowCountSnapshot','U') IS NULL
CREATE TABLE dbo.RowCountSnapshot
(
    SnapshotID    bigint IDENTITY(1,1) PRIMARY KEY,
    RunID         uniqueidentifier NOT NULL,
    Side          varchar(16) NOT NULL,      -- 'Source' or 'Dest'
    DBName        sysname NOT NULL,
    SchemaName    sysname NOT NULL,
    TableName     sysname NOT NULL,
    RowCount      bigint NOT NULL,
    CapturedAtUtc datetime2(3) NOT NULL DEFAULT (SYSUTCDATETIME())
);
GO

CREATE OR ALTER PROCEDURE dbo.Compare_RowCountSnapshots
    @RunID uniqueidentifier
AS
BEGIN
    SET NOCOUNT ON;

    ;WITH src AS (
        SELECT SchemaName, TableName, RowCount
        FROM dbo.RowCountSnapshot
        WHERE RunID = @RunID AND Side = 'Source'
    ),
    dst AS (
        SELECT SchemaName, TableName, RowCount
        FROM dbo.RowCountSnapshot
        WHERE RunID = @RunID AND Side = 'Dest'
    )
    INSERT dbo.RowCountResults
        (RunID, PairID, SourceCount, DestCount, CountDiff, IsMatch, Notes)
    SELECT
        @RunID,
        NULL,
        s.RowCount,
        d.RowCount,
        s.RowCount - ISNULL(d.RowCount,0),
        CASE WHEN s.RowCount = ISNULL(d.RowCount,-1) THEN 1 ELSE 0 END,
        CONCAT(s.SchemaName,'.',s.TableName)
    FROM src s
    FULL OUTER JOIN dst d
      ON d.SchemaName = s.SchemaName
     AND d.TableName  = s.TableName;
END
GO

5) Running Manually

EXEC dbo.Run_RowCountChecks
    @AlertOnDiff = 1,
    @DiffThreshold = 0,
    @MailProfile = 'DBA-Mail',
    @MailTo = 'dba-team@yourorg.com';

6) Scheduling with SQL Agent

  • Create a job called Row Count Drift Check.
  • Step 1: run Run_RowCountChecks.
  • Schedule: every 10 minutes (replication) or hourly (batch loads).

7) Reading Results

Recent mismatches:

SELECT TOP (100)
    R.CheckedAtUtc,
    P.SourceDB, P.SourceSchema, P.SourceTable,
    P.DestLinkedServer, P.DestDB, P.DestSchema, P.DestTable,
    R.SourceCount, R.DestCount, R.CountDiff, R.EstimatedLagSec
FROM dbo.RowCountResults R
JOIN dbo.RowCountPairs P ON P.PairID = R.PairID
WHERE R.IsMatch = 0
ORDER BY R.CheckedAtUtc DESC;

8) Sampling Keys to Find Missing Rows

DECLARE @Sample int = 10000;

WITH src AS (
    SELECT TOP (@Sample) PK = Id
    FROM Sales.dbo.Orders
    ORDER BY NEWID()
),
dst AS (
    SELECT Id FROM [LS_DEST].Sales.dbo.Orders
)
SELECT MissingOnDest = COUNT(*)
FROM src
EXCEPT
SELECT PK FROM dst;

9) Performance Notes

  • COUNT_BIG(*) scans; sys.dm_db_partition_stats is faster.
  • Apply watermark filters for “is replication behind?” checks.
  • Keep filters identical on both sides.

10) SSIS Approach

Option A — Classic

  • OLE DB Source → SELECT COUNT_BIG(*)
  • Row Count transform → Script → compare
  • Insert into RowCountResults

Option B — Snapshot + Compare (recommended)

  • Run all-tables query on Source and Dest.
  • Insert into RowCountSnapshot.
  • Call Compare_RowCountSnapshots.

11) Alternatives Simpler Than SSIS

PowerShell

$srcCnt = (Invoke-Sqlcmd -ServerInstance "SRC-SQL" -Database "Sales" -Query "SELECT SUM(row_count) FROM sys.dm_db_partition_stats WHERE object_id = OBJECT_ID('Sales.dbo.Orders') AND index_id IN (0,1)").Column1

$dstCnt = (Invoke-Sqlcmd -ServerInstance "DEST-SQL" -Database "Sales" -Query "SELECT SUM(row_count) FROM sys.dm_db_partition_stats WHERE object_id = OBJECT_ID('Sales.dbo.Orders') AND index_id IN (0,1)").Column1

$diff = $srcCnt - $dstCnt

Agent Job

Schedule Run_RowCountChecks — lightest footprint.


12) Why This Helps

  • Quick signal for replication lag.
  • Lag awareness with watermark.
  • Low overhead (counts are cheap).
  • Actionable alerts → DBAs know where to check.

13) Nice Upgrades

  • Partition-level checksums
  • Per-day windows
  • Dashboards (SSRS / Power BI)
  • Maintenance window suppression

14) Hands-On Workshop (Detailed for Jr DBA)

Goal

Set up SSIS monitoring, log mismatches, and test drift detection.


Part 1. Setup

  1. Create DBA DB + run table scripts.
  2. Verify LS_DEST linked server.
  3. Configure Database Mail.

Part 2. Build SSIS Package

Control Flow (top to bottom):

  1. Script Task → Generate RunID.
  2. Execute SQL Task → Insert into RowCountRuns.
  3. Data Flow Task → “Snapshot Source”.
  4. Data Flow Task → “Snapshot Dest”.
  5. Execute SQL Task → Compare snapshots.
  6. Execute SQL Task → Finalize run.
  7. (Optional) Execute SQL Task → Send email if mismatches.

Data Flow for Source/Dest:

  • OLE DB Source → All-tables row count query.
  • Derived Column → add RunID, Side (“Source”/“Dest”), DBName.
  • OLE DB Destination → DBA.dbo.RowCountSnapshot.

Part 3. Test

  1. Run once → all matches.
  2. Insert drift on destination:
INSERT INTO Sales.dbo.Customers(Name) VALUES ('DriftTest1'), ('DriftTest2');
  1. Rerun → mismatches logged in RowCountResults.
  2. Confirm email alert (if configured).

Part 4. Report

Check latest mismatches:

SELECT CheckedAtUtc, Notes AS [Schema.Table], SourceCount, DestCount, CountDiff
FROM dbo.RowCountResults
ORDER BY CheckedAtUtc DESC;

Screenshot Guidance

Control Flow screenshot should show:

  • +——————-+
  • | Script Task | –> Generates RunID (Guid.NewGuid)
  • +——————-+
  • |
  • v
  • +——————-+
  • | Execute SQL Task | –> Insert into RowCountRuns
  • +——————-+
  • |
  • v
  • +——————-+
  • | Data Flow Task | –> Snapshot Source (counts from source DB)
  • +——————-+
  • |
  • v
  • +——————-+
  • | Data Flow Task | –> Snapshot Dest (counts from dest DB)
  • +——————-+
  • |
  • v
  • +——————-+
  • | Execute SQL Task | –> Compare_RowCountSnapshots
  • +——————-+
  • |
  • v
  • +——————-+
  • | Execute SQL Task | –> Finalize Run (counts + mismatches)
  • +——————-+
  • |
  • v
  • +——————-+
  • | Execute SQL Task | –> Send Email Alert (optional)
  • +——————-+

Data Flow screenshot for Snapshot Source/Dest should show:

Snapshot Source: Side = "Source", DBName = Sales (or your source).

Snapshot Dest: Side = "Dest", DBName = destination DB.

Even a beginner DBA can recognize boxes/arrows and replicate.


Final Thoughts

  • Start with stored proc + SQL Agent.
  • Use SSIS snapshot workflow if packages are standard in your org.
  • PowerShell is great for quick checks.
  • Row counts = early-warning system; drill deeper only when mismatches are found.

Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading