SQL Server: How to Partition a Large Table in a 2-Node Always On AG with Minimal Downtime
Compatibility: SQL Server 2014 and later with Always On Availability Groups. Table partitioning requires Enterprise Edition prior to SQL Server 2016 SP1; from 2016 SP1 onward it is available in every edition. The AG Failover method described here works on any synchronous-commit 2-node AG configuration.
- Scenario Overview
- Approach Comparison — Why AG Failover Is Recommended
- Impact Summary
- Environment Reference
- Step 1 — Pre-Flight Checks
- Step 2 — Fail Over AG to Node B
- Step 3 — Remove Target Database from AG on Node A
- Step 4 — Run Partition Work on Node A (Isolated)
- Step 5 — Backup Partitioned Database on Node A
- Step 6 — Fail Over AG Back to Node A
- Step 7 — Reseed Node B
- Step 8 — Post-Completion Validation
- Rollback Plan
- Partition Work Phases Reference
- References
Scenario Overview
You have a large production table — in this example approximately 350 GB and 3.4 million rows — that needs to be partitioned by date. The database runs in a 2-node synchronous Always On Availability Group with automatic failover. All application traffic connects through the AG Listener.
The challenge is that partitioning a 350 GB table requires rebuilding the clustered index, dropping and recreating the primary key, and adding nonclustered indexes — all destructive, I/O-intensive operations that take 4–9 hours. Running them on the primary node while users are connected is technically possible, but in practice it degrades the application severely for the entire window.
The AG Failover method solves this by performing all destructive work on an isolated node with zero application connections. The application experiences two brief failovers (~10–30 seconds each) — everything else is zero impact.
Note — Partition Elimination and Query Performance: Partitioning is primarily a data management tool, not a performance tool. Queries against the partitioned table must filter on the partition column (TransactionDate) for partition elimination to work — meaning the query only scans the relevant partition(s) rather than the whole table. If your application queries do not filter on the partition key, SQL Server will scan all partitions, which can make those queries slower than before partitioning. Confirm that your key queries include a WHERE TransactionDate BETWEEN... or equivalent filter before committing to this change.
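As a quick illustration — a sketch where the date range is a placeholder — a query shaped like this qualifies for partition elimination, and the execution plan confirms whether it happened:
-- Sargable date filter — eligible for partition elimination
SELECT SUM(Amount)
FROM dbo.TransactionDetail
WHERE TransactionDate >= '2025-01-01'
  AND TransactionDate <  '2026-01-01';
-- In the actual execution plan, inspect the Actual Partition Count /
-- Partitions Accessed properties on the seek or scan operator — they
-- should cover only the partition(s) spanned by the date range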
Approach Comparison — Why AG Failover Is Recommended
Two approaches were evaluated. Option A (run directly on the primary) was initially proposed. Option B (AG Failover method) is the recommended approach after analysis.
Option A — Run on Primary
- All work done on the live primary node
- Application remains connected throughout
- No failovers required
- Lower complexity
Option B — AG Failover (Recommended)
- Fail to Node B, isolate Node A
- All destructive work on isolated node
- Two brief failovers (~10–30 sec each)
- Zero impact during partition work
| Factor | Option A — Run on Primary | Option B — AG Failover |
|---|---|---|
| Application downtime | None officially — but 4–9 hours of degraded performance | Two failovers (~10–30 seconds each) |
| Query performance during work | Severely impacted — clustered index rebuild on 350 GB causes heavy I/O, CPU contention, and lock escalation | Zero impact — all heavy work on isolated node with no users |
| Index availability during work | PK and NC indexes dropped — queries fall back to full table scans on 350 GB until recreated | Indexes on active node untouched until clean cutover |
| Blocking risk | High — CREATE CLUSTERED INDEX on 350 GB competes with application queries for locks | None — no application queries on the node being modified |
| AG resync after work | ~350 GB of transaction log must ship to secondary — saturates network and secondary I/O for hours | Clean backup/restore — no log shipping overhead |
| Impact to secondary (read-only) | Secondary unusable during resync — reporting queries impacted | Secondary gets a clean partitioned copy via restore |
| HA during work | Full HA maintained throughout | Single node during partition work + reseed (4–12 hours) |
| Complexity | Lower — fewer steps | Medium — more steps, but each step is safer |
| Rollback | Restore from backup — 4–9 hour window with partial changes on primary | App on untouched node until cutover — abort at any point before Step 6 with zero impact |
Key risk with Option A: While the database technically stays online, rebuilding the clustered index and dropping/recreating PK + NC indexes on a 350 GB table while users are actively querying it causes 3–6 hours of heavy I/O contention, a period where any query using those indexes hits a full table scan, and approximately 350 GB of transaction log shipping to the secondary. This is a functional degradation, not a no-impact change. If Option A is chosen, it must be treated as a full maintenance window with application teams notified.
Impact Summary
| What | Duration | When |
|---|---|---|
| AG Failover #1 — app moves to Node B | ~10–30 seconds | Start of maintenance window |
| Partition work on isolated Node A | 4–9 hours — ZERO app impact | During maintenance window |
| AG Failover #2 — app moves back to Node A | ~10–30 seconds | After partition work completes |
| Node B reseed (backup/restore) | 1–3 hours — ZERO app impact | After failover #2 |
Environment Reference
The examples throughout this article use the following example names. Substitute your actual values:
| Item | Example Value |
|---|---|
| Node A (initial primary) | PROD-DBNODE1 |
| Node B (initial secondary) | PROD-DBNODE2 |
| AG Name | PROD-AG01 |
| AG Listener | PROD-AGLISTEN (port 1433) |
| Target Database | AppDatabase |
| Target Table | dbo.TransactionDetail (~350 GB) |
| Partition Column | TransactionDate (DATE) |
| Other DBs in AG | DB1, DB2, DB3, DB4 (remain protected throughout) |
| Backup Share | \\backup-share\AppDatabase\ |
This failover moves all databases in the AG — not just the target database. Confirm all application teams are aware that all AG databases will experience the brief failover disruptions at Steps 2 and 6.
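To see exactly which databases will move with the failover, a quick check (using this article's example AG name):
-- List all databases in the AG — all of these fail over together
SELECT adc.database_name
FROM sys.availability_databases_cluster adc
JOIN sys.availability_groups ag ON adc.group_id = ag.group_id
WHERE ag.name = 'PROD-AG01';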
Step 1 — Pre-Flight Checks — HA status: Full HA
Step 1a: Confirm AG Health
Both replicas must show SYNCHRONIZED and CONNECTED before proceeding. Do not start the maintenance window on a degraded AG.
-- Run on either node
SELECT
ag.name AS ag_name,
ar.replica_server_name,
ars.role_desc,
ars.synchronization_health_desc,
ars.connected_state_desc
FROM sys.dm_hadr_availability_replica_states ars
JOIN sys.availability_replicas ar
ON ars.replica_id = ar.replica_id
JOIN sys.availability_groups ag
ON ar.group_id = ag.group_id;
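While connected to each node, it is also worth confirming edition and version, since the compatibility notes above depend on them:
-- Run on both nodes
SELECT SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('ProductLevel')   AS product_level;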
Step 1b: Confirm Node Roles
-- Confirm before starting:
-- Node A (PROD-DBNODE1) = PRIMARY
-- Node B (PROD-DBNODE2) = SECONDARY
-- AG Name: PROD-AG01
-- Listener: PROD-AGLISTEN (port 1433)
SELECT replica_server_name, role_desc
FROM sys.dm_hadr_availability_replica_states ars
JOIN sys.availability_replicas ar
ON ars.replica_id = ar.replica_id;
Step 1c: Full Backup
-- Run on PROD-DBNODE1 (current primary)
BACKUP DATABASE [AppDatabase]
TO DISK = N'\\backup-share\AppDatabase\AppDatabase_PrePartition_FULL.bak'
WITH COMPRESSION, CHECKSUM, STATS = 5;
Step 1d: Disable Transaction Log Backups
CRITICAL — Do this before any work begins. Disable any scheduled transaction log backup jobs for AppDatabase. If a scheduled log backup runs between your manual backups and the restore on Node B (Step 7), that backup becomes a required link in the log chain, your manual backups no longer form a contiguous sequence, and the restore will fail.
-- Identify and disable log backup jobs that include AppDatabase
-- Record job names here before disabling:
-- Job Name: ____________________________
-- To disable a job:
EXEC msdb.dbo.sp_update_job
@job_name = N'YourLogBackupJobName',
@enabled = 0;
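If you are not sure which jobs take log backups of AppDatabase, a sketch like the following can help locate candidates. The LIKE patterns are assumptions about how your job steps are written — maintenance solutions often back up all databases from a single job — so adapt them to your environment:
-- Find job steps whose command text looks like a log backup
SELECT j.name AS job_name, s.step_name
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobsteps s ON s.job_id = j.job_id
WHERE s.command LIKE '%BACKUP LOG%'
   OR s.command LIKE '%LogBackup%';  -- adjust patterns to your naming scheme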
Step 1e: Record Row Count
-- Record exact count — must match post-partition total in Step 8
SELECT COUNT(*) AS total_rows FROM dbo.TransactionDetail;
Step 2 — Fail Over AG to Node B — HA status: Full HA
Step 2a: Execute Failover
-- Run on PROD-DBNODE2 (the current secondary)
ALTER AVAILABILITY GROUP [PROD-AG01] FAILOVER;
This failover moves ALL databases in the AG to Node B — not just AppDatabase. All other databases (DB1, DB2, DB3, DB4) will also move. Confirm this is understood and accepted by all application owners before proceeding.
Step 2b: Verify Roles Have Swapped
SELECT replica_server_name, role_desc
FROM sys.dm_hadr_availability_replica_states ars
JOIN sys.availability_replicas ar
ON ars.replica_id = ar.replica_id;
Step 2c: Confirm App Connectivity Through Listener
Verify the application is connected and functioning through PROD-AGLISTEN on PROD-DBNODE2 before proceeding to Step 3.
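One way to confirm, run on PROD-DBNODE2 — the expected session count is application-specific:
-- Confirm user sessions are arriving for AppDatabase on this node
SELECT COUNT(*) AS app_sessions
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID('AppDatabase')
  AND is_user_process = 1;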
Step 3 — Remove AppDatabase from AG on Node A — HA status: Single Node, Node B Serving
Step 3a: Remove AppDatabase from AG on Node A
-- Run on PROD-DBNODE1 (now the secondary)
ALTER DATABASE [AppDatabase] SET HADR OFF;
This only removes AppDatabase from the AG on this node. The other databases (DB1, DB2, DB3, DB4) remain in the AG and remain fully protected.
Step 3b: Recover the Database on Node A
-- The database will be in RESTORING state after removal — recover it
-- Run on PROD-DBNODE1
RESTORE DATABASE [AppDatabase] WITH RECOVERY;
-- Run on PROD-DBNODE1
SELECT name, state_desc FROM sys.databases WHERE name = 'AppDatabase';
Should show ONLINE.
From this point: AppDatabase runs on PROD-DBNODE2 only via AG Listener. No HA for AppDatabase. The other databases remain fully protected in the AG. This is the expected single-node window — proceed promptly.
Step 4 — Run Partition Work on Node A (Isolated) — HA status: Single Node, Node B Serving
Execute all partition phases on PROD-DBNODE1. Because no application connections exist on this node, all operations run at full I/O speed with no contention, no blocking, and no lock escalation concerns.
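A quick sanity check before starting, run on PROD-DBNODE1 — expect no rows other than your own session:
-- Confirm the node is truly isolated for AppDatabase
SELECT session_id, login_name, host_name, program_name
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID('AppDatabase')
  AND is_user_process = 1
  AND session_id <> @@SPID;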
| Phase | Description | Est. Duration | User Impact |
|---|---|---|---|
| Phase 0 | Fix nullable columns + verify prerequisites | Minutes | None — isolated node |
| Phase 1 | Create partition function + partition scheme | Minutes | None — isolated node |
| Phase 2 | Rebuild clustered index with partition scheme (~350 GB rebuild) | 3–6 hours | None — isolated node |
| Phase 3 | Rebuild primary key + nonclustered indexes partition-aligned | 1–3 hours | None — isolated node |
| Phase 4 | Validate partitions + row counts | Minutes | None — isolated node |
| Phase 5 | Archive setup + SWITCH old partitions to archive table | Minutes | None — isolated node |
| Phase 6 | Create SQL Agent maintenance job for ongoing partition management | Minutes | None — isolated node |
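Phase 0's script is environment-specific and not reproduced in full here. A minimal sketch, assuming TransactionDate is currently NULLable, contains no NULL rows, and is not yet part of any index (drop any index that includes it first):
-- Phase 0 sketch — the PK in Phase 3 requires TransactionDate NOT NULL
SELECT COUNT(*) AS null_dates
FROM dbo.TransactionDetail
WHERE TransactionDate IS NULL;   -- must return 0 before altering
ALTER TABLE dbo.TransactionDetail
ALTER COLUMN TransactionDate DATE NOT NULL;  -- full validation scan on 350 GB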
Phase 1: Create Partition Function and Scheme
-- Run on PROD-DBNODE1 (isolated)
-- Adjust boundary dates to match your retention requirements
CREATE PARTITION FUNCTION pf_TransDetail_Date (DATE)
AS RANGE RIGHT FOR VALUES
(
'2020-01-01', '2021-01-01', '2022-01-01',
'2023-01-01', '2024-01-01', '2025-01-01',
'2026-01-01', '2027-01-01'
);
CREATE PARTITION SCHEME ps_TransDetail_Date
AS PARTITION pf_TransDetail_Date
ALL TO ([PRIMARY]); -- Adjust to target filegroups as needed
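An optional sanity check on the new function — $PARTITION maps a value to its partition number, confirming the RANGE RIGHT boundary behavior (boundary values belong to the partition on their right):
-- Expect 1 (pre-2020), 2 (boundary value goes right), 7 (year 2025)
SELECT
    $PARTITION.pf_TransDetail_Date('2019-06-15') AS pre_2020,
    $PARTITION.pf_TransDetail_Date('2020-01-01') AS on_boundary,
    $PARTITION.pf_TransDetail_Date('2025-07-04') AS year_2025;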
Phase 2: Rebuild Clustered Index on the Partition Scheme
-- Rebuild the existing clustered index onto the partition scheme
-- DROP_EXISTING = ON requires the new index to keep the SAME name as the
-- existing clustered index — rename CIX_TransactionDetail_Date to match yours
-- If the existing clustered index backs the PK constraint instead, drop the
-- PK constraint first (see Phase 3), then create this index
-- This is the longest-running operation (3–6 hours on 350 GB)
-- No users on this node — runs at full speed
CREATE CLUSTERED INDEX CIX_TransactionDetail_Date
ON dbo.TransactionDetail (TransactionDate, TransactionID)
WITH (DROP_EXISTING = ON, ONLINE = OFF, SORT_IN_TEMPDB = ON)
ON ps_TransDetail_Date (TransactionDate);
Phase 3: Rebuild Primary Key Partition-Aligned
-- Drop the existing PK (non-partitioned)
ALTER TABLE dbo.TransactionDetail
DROP CONSTRAINT PK_TransactionDetail;
-- Recreate the PK as NONCLUSTERED — the table already has the partitioned
-- clustered index from Phase 2, and a table can have only one clustered
-- index. Keeping TransactionDate in the key lets this unique index be
-- partition-aligned.
ALTER TABLE dbo.TransactionDetail
ADD CONSTRAINT PK_TransactionDetail
PRIMARY KEY NONCLUSTERED (TransactionDate, TransactionID)
ON ps_TransDetail_Date (TransactionDate);
-- Rebuild any nonclustered indexes partition-aligned
CREATE NONCLUSTERED INDEX IX_TransactionDetail_CustomerID
ON dbo.TransactionDetail (CustomerID)
INCLUDE (TransactionDate, Amount)
ON ps_TransDetail_Date (TransactionDate);
Phase 4: Validate Partitions
-- Confirm 9 partitions are listed (8 boundary values + 1 under RANGE RIGHT)
SELECT
p.partition_number,
p.rows,
prv.value AS boundary_value
FROM sys.partitions p
JOIN sys.tables t ON p.object_id = t.object_id
JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
JOIN sys.partition_schemes ps ON i.data_space_id = ps.data_space_id
LEFT JOIN sys.partition_range_values prv
ON prv.function_id = ps.function_id
AND prv.boundary_id = p.partition_number - 1
WHERE t.name = 'TransactionDetail'
AND p.index_id IN (0, 1)
ORDER BY p.partition_number;
Phase 5: Archive Setup and SWITCH
-- Create archive table (must match the main table's columns exactly)
-- Create it on the SAME partition scheme: a SWITCH into a non-partitioned
-- table requires the target to be empty, so a non-partitioned archive could
-- only ever receive one partition. Partition-to-partition SWITCH avoids that.
CREATE TABLE dbo.TransactionDetail_Archive
(
TransactionID INT NOT NULL,
TransactionDate DATE NOT NULL,
CustomerID INT NULL,
Amount DECIMAL(18,2) NULL
-- match all columns from main table
) ON ps_TransDetail_Date (TransactionDate);
-- The archive must carry the same indexes as the main table,
-- partition-aligned, or the SWITCH will fail
CREATE CLUSTERED INDEX CIX_TransactionDetail_Archive
ON dbo.TransactionDetail_Archive (TransactionDate, TransactionID)
ON ps_TransDetail_Date (TransactionDate);
ALTER TABLE dbo.TransactionDetail_Archive
ADD CONSTRAINT PK_TransactionDetail_Archive
PRIMARY KEY NONCLUSTERED (TransactionDate, TransactionID)
ON ps_TransDetail_Date (TransactionDate);
CREATE NONCLUSTERED INDEX IX_TransactionDetail_Archive_CustomerID
ON dbo.TransactionDetail_Archive (CustomerID)
INCLUDE (TransactionDate, Amount)
ON ps_TransDetail_Date (TransactionDate);
-- Switch old partitions to the matching archive partitions
-- RANGE RIGHT: Partition 1 holds data before 2020, Partition 2 = 2020, etc.
ALTER TABLE dbo.TransactionDetail
SWITCH PARTITION 1 TO dbo.TransactionDetail_Archive PARTITION 1;
ALTER TABLE dbo.TransactionDetail
SWITCH PARTITION 2 TO dbo.TransactionDetail_Archive PARTITION 2;
Note — Modern Alternative for Ongoing Partition Purging (SQL Server 2016+): The SWITCH-to-archive pattern above is the classic approach and remains fully valid — it gives you a recoverable archive copy of the data before removal. Since SQL Server 2016, you can also use TRUNCATE TABLE...WITH (PARTITIONS(N)) to purge old partitions directly without needing a staging archive table. Both methods work. Use SWITCH when you need to retain an accessible archive copy. Use TRUNCATE with partition number when you simply want to purge old data with the least overhead.
-- SQL Server 2016+ syntax — purge partitions 1 and 2 directly
-- Minimal logging, no archive table required, no SWITCH prerequisites
TRUNCATE TABLE dbo.TransactionDetail WITH (PARTITIONS (1, 2));
This is the recommended approach for routine sliding-window partition maintenance on SQL Server 2016 and later. SWITCH remains the better choice when a recoverable archive copy is required.
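For reference, a hedged sketch of what the Phase 6 maintenance job body could look like under a yearly sliding window — the boundary dates are illustrative, and the SPLIT should always be done against an empty range at the top end so it stays a metadata-only operation:
-- 1) Add next year's boundary before any 2028 data exists (empty SPLIT)
ALTER PARTITION SCHEME ps_TransDetail_Date NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_TransDetail_Date() SPLIT RANGE ('2028-01-01');
-- 2) Purge the oldest partition (SQL Server 2016+), then remove its
--    now-empty range so the partition count stays constant
TRUNCATE TABLE dbo.TransactionDetail WITH (PARTITIONS (1));
ALTER PARTITION FUNCTION pf_TransDetail_Date() MERGE RANGE ('2020-01-01');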
Step 4 completion checklist — confirm before moving to Step 5:
- All indexes show ps_TransDetail_Date / PARTITION_SCHEME
- Expected number of partitions listed (9 in this example)
- Archive SWITCH verified — switched main-table partitions = 0 rows, archive has the data
- Total row count (main + archive) matches the pre-partition count from Step 1e
Step 5 — Backup Partitioned Database on Node A — HA status: Single Node, Node B Serving
Step 5a: Full Backup
-- Run on PROD-DBNODE1 (isolated, partitioned database)
BACKUP DATABASE [AppDatabase]
TO DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_FULL.bak'
WITH COMPRESSION, CHECKSUM, STATS = 5;
Step 5b: Log Backup
-- Run on PROD-DBNODE1
BACKUP LOG [AppDatabase]
TO DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_LOG.trn'
WITH COMPRESSION, CHECKSUM, STATS = 5;
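Optionally verify both backup files are readable before depending on them in Step 7:
-- Run on PROD-DBNODE1
RESTORE VERIFYONLY
FROM DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_FULL.bak'
WITH CHECKSUM;
RESTORE VERIFYONLY
FROM DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_LOG.trn'
WITH CHECKSUM;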
Step 6 — Fail Over AG Back to Node A — HA status: Single Node, Node A Serving After Failover
Step 6a: Remove AppDatabase from AG on Node B
-- Run on PROD-DBNODE2 (current primary)
-- Node B still has the OLD unpartitioned AppDatabase in the AG
ALTER AVAILABILITY GROUP [PROD-AG01] REMOVE DATABASE [AppDatabase];
AppDatabase is now out of the AG on both nodes. The other databases are unaffected. The AG Listener will no longer route AppDatabase connections until it is re-added in Step 6d.
Step 6b: Drop Old AppDatabase on Node B
-- Run on PROD-DBNODE2
-- This drops the old unpartitioned copy — confirmed it was removed from AG in 6a first
DROP DATABASE [AppDatabase];
Step 6c: Fail Over AG to Node A
-- Run on PROD-DBNODE1 (the node with the partitioned database)
ALTER AVAILABILITY GROUP [PROD-AG01] FAILOVER;
This failover moves all remaining databases (DB1, DB2, DB3, DB4) back to PROD-DBNODE1. AppDatabase is not in the AG yet — it will be added in Step 6d.
Step 6d: Add AppDatabase Back to AG from Node A
-- Run on PROD-DBNODE1 (now primary, has the partitioned AppDatabase)
ALTER AVAILABILITY GROUP [PROD-AG01] ADD DATABASE [AppDatabase];
-- Confirm AG roles
SELECT replica_server_name, role_desc
FROM sys.dm_hadr_availability_replica_states ars
JOIN sys.availability_replicas ar ON ars.replica_id = ar.replica_id;
-- Confirm AppDatabase synchronization state
SELECT database_name, synchronization_state_desc
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_databases_cluster adc
ON drs.group_database_id = adc.group_database_id;
PROD-DBNODE1 = PRIMARY. AppDatabase should be listed in the AG.
Step 6e: Confirm App Connectivity Through Listener
Verify the application is connected and functioning through PROD-AGLISTEN, now pointing to PROD-DBNODE1 with the partitioned database.
Step 7 — Reseed Node B — HA status: Single Node, Node A Serving
Two options to reseed PROD-DBNODE2. Automatic seeding is simpler. Manual seeding (backup/restore) gives more control and is required if automatic seeding is not configured.
Option 1: Automatic Seeding (Recommended)
-- Step 7a: Verify automatic seeding is configured on Node A (primary)
SELECT
ar.replica_server_name,
ar.seeding_mode_desc
FROM sys.availability_replicas ar
JOIN sys.availability_groups ag ON ar.group_id = ag.group_id
WHERE ag.name = 'PROD-AG01';
-- If seeding_mode_desc shows MANUAL, change it to AUTOMATIC:
ALTER AVAILABILITY GROUP [PROD-AG01]
MODIFY REPLICA ON N'PROD-DBNODE2'
WITH (SEEDING_MODE = AUTOMATIC);
-- Step 7b: Grant CREATE DATABASE permission on Node B
-- Run on PROD-DBNODE2
ALTER AVAILABILITY GROUP [PROD-AG01] GRANT CREATE ANY DATABASE;
-- Monitor seeding progress from Node A
-- (sys.dm_hadr_automatic_seeding has no database_name column — join
--  sys.availability_databases_cluster to get it)
SELECT
ag.name AS ag_name,
adc.database_name,
asd.current_state,
asd.performed_seeding,
asd.start_time,
asd.completion_time,
asd.failure_state_desc
FROM sys.dm_hadr_automatic_seeding asd
JOIN sys.availability_groups ag ON asd.ag_id = ag.group_id
JOIN sys.availability_databases_cluster adc
ON asd.ag_db_id = adc.group_database_id
WHERE adc.database_name = 'AppDatabase';
Option 2: Manual Seeding (Backup/Restore)
Use this if automatic seeding is unavailable or not preferred. Log backup jobs for AppDatabase should already be disabled from Step 1d.
-- Run on PROD-DBNODE1 (primary)
BACKUP DATABASE [AppDatabase]
TO DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_Reseed_FULL.bak'
WITH COMPRESSION, CHECKSUM, STATS = 5;
BACKUP LOG [AppDatabase]
TO DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_Reseed_LOG.trn'
WITH COMPRESSION, CHECKSUM, STATS = 5;
-- Run on PROD-DBNODE2
RESTORE DATABASE [AppDatabase]
FROM DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_Reseed_FULL.bak'
WITH NORECOVERY, STATS = 5;
RESTORE LOG [AppDatabase]
FROM DISK = N'\\backup-share\AppDatabase\AppDatabase_Partitioned_Reseed_LOG.trn'
WITH NORECOVERY, STATS = 5;
-- Join to AG on Node B
ALTER DATABASE [AppDatabase]
SET HADR AVAILABILITY GROUP = [PROD-AG01];
Step 7c: Re-enable Transaction Log Backups
Do not forget: Re-enable the transaction log backup jobs for AppDatabase that were disabled in Step 1d.
EXEC msdb.dbo.sp_update_job
@job_name = N'YourLogBackupJobName',
@enabled = 1;
Step 7d: Verify Synchronization
SELECT
ar.replica_server_name,
ars.role_desc,
drs.synchronization_state_desc,
drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_replicas ar ON drs.replica_id = ar.replica_id
JOIN sys.dm_hadr_availability_replica_states ars ON ars.replica_id = ar.replica_id
WHERE drs.database_id = DB_ID('AppDatabase');
Both PROD-DBNODE1 and PROD-DBNODE2 should show SYNCHRONIZED and HEALTHY.
✅ Full HA Restored — All databases in PROD-AG01 synchronized on both nodes.
Step 8 — Post-Completion Validation — HA status: Full HA Restored
Step 8a: Verify Partitions on Both Nodes
-- Run on BOTH PROD-DBNODE1 and PROD-DBNODE2
SELECT
p.partition_number,
p.rows,
prv.value AS boundary_value
FROM sys.partitions p
JOIN sys.tables t ON p.object_id = t.object_id
JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
JOIN sys.partition_schemes ps ON i.data_space_id = ps.data_space_id
LEFT JOIN sys.partition_range_values prv
ON prv.function_id = ps.function_id
AND prv.boundary_id = p.partition_number - 1
WHERE t.name = 'TransactionDetail'
AND p.index_id IN (0, 1)
ORDER BY p.partition_number;
Step 8b: Verify Total Row Count Matches
SELECT
(SELECT COUNT(*) FROM dbo.TransactionDetail)
+ (SELECT COUNT(*) FROM dbo.TransactionDetail_Archive)
AS total_rows;
-- Must match pre-partition count recorded in Step 1e
Step 8c: Verify All Indexes Are Partition-Aligned
SELECT
i.name AS index_name,
ds.name AS data_space_name,
ds.type_desc
FROM sys.indexes i
JOIN sys.data_spaces ds ON i.data_space_id = ds.data_space_id
WHERE i.object_id = OBJECT_ID('dbo.TransactionDetail')
ORDER BY i.index_id;
-- Every index should show ps_TransDetail_Date / PARTITION_SCHEME
Step 8d: Verify SQL Agent Maintenance Job on Both Nodes
SELECT name, enabled, date_created
FROM msdb.dbo.sysjobs
WHERE name = 'AppDatabase_TransDetail_Partition_Maintenance';
-- Note: if job only exists on Node A, run Phase 6 from the partition runbook on Node B
Step 8e: Application Smoke Test
Confirm the application is functioning correctly through the AG Listener against the partitioned database. Run representative queries and confirm expected response times.
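A quick smoke check through the listener — confirm which node is serving and that the partitioned table is in place (expect PROD-DBNODE1 and 9 partitions, given the boundaries above):
-- Connect via PROD-AGLISTEN
SELECT
@@SERVERNAME AS serving_node,
(SELECT COUNT(*) FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.TransactionDetail')
AND index_id IN (0, 1)) AS partition_count;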
Rollback Plan
| Point in Process | Rollback Action | App Impact |
|---|---|---|
| Before Step 2 (before any failover) | Nothing to roll back — no changes made yet | None |
| After Step 2, before Step 4 (partition work) | Fail back to Node A (reverse Step 2). If AppDatabase was already removed from the AG on Node A (Step 3), reseed it from Node B and rejoin the AG before failing back. No data has been modified on either node. | ~10–30 sec failover |
| During Step 4 (partition work in progress) | Stop work on Node A and discard its copy — the app is on Node B and stays unaffected. Reseed Node A from Node B (backup/restore or automatic seeding) and rejoin the AG. Do not fail over to Node A while its copy is partially partitioned. | None |
| After Step 6 (partitioned DB is live) | Restore from pre-partition backup (AppDatabase_PrePartition_FULL.bak). This is a longer recovery window — confirm backup is accessible before starting. | Maintenance window required |
The key rollback advantage of Option B: At any point before Step 6, the application is running on the untouched node with the original unmodified database. You can abort with zero application impact and zero data risk.
Partition Work Phases Reference
All phases run on the isolated Node A with no application connections. Estimated total duration: 4–9 hours depending on hardware and data volume.
| Phase | Description | Key Operations | Duration |
|---|---|---|---|
| Phase 0 | Prerequisites | Fix nullable partition column, verify no replication conflicts | Minutes |
| Phase 1 | Partition infrastructure | CREATE PARTITION FUNCTION, CREATE PARTITION SCHEME | Minutes |
| Phase 2 | Clustered index rebuild | DROP + CREATE CLUSTERED INDEX on partition scheme (~350 GB) | 3–6 hours |
| Phase 3 | PK + NC index rebuild | DROP + recreate PRIMARY KEY partition-aligned, rebuild NC indexes | 1–3 hours |
| Phase 4 | Validation | Verify partition count, row counts, index alignment | Minutes |
| Phase 5 | Archive setup + SWITCH | Create archive table, SWITCH old partitions, verify data | Minutes |
| Phase 6 | Maintenance job | Create SQL Agent job for ongoing partition maintenance | Minutes |
References
- Microsoft Docs – Always On Availability Groups Overview
- Microsoft Docs – Planned Manual Failover of an Availability Group
- Microsoft Docs – Partitioned Tables and Indexes
- Microsoft Docs – ALTER TABLE SWITCH
- Microsoft Docs – Automatic Seeding for Secondary Replicas
- Microsoft Docs – ALTER AVAILABILITY GROUP