Top 20 SQL Server Interview Questions for Senior DBAs and Data Engineers in 2026
The interview questions for senior SQL Server roles have changed significantly in the last two years. The foundational questions about indexes, blocking, and stored procedures are still there. But 2026 interviews now include questions about AI integration, vector data, MCP architecture, SQL Server 2025 features, and cloud platform trade-offs that simply did not exist at the senior level two years ago.
This article covers 20 questions you should be prepared to answer at the senior level. Ten are focused on traditional DBA responsibilities. Ten cover the data engineering and AI-adjacent skills that senior roles increasingly require. For each question we give the complete senior-level answer, explain what a strong answer demonstrates, and flag the weak answers that junior candidates give that you need to avoid.
How to use this article: Read through each question and answer before practicing your verbal response. The written answers here are complete. Your verbal answer in an interview should hit the same key points in two to three minutes. The strongest answers demonstrate not just what something is, but why it works that way, when you would and would not use it, and what you have seen go wrong in production.
- Stored Procedure vs Function: What Is the Real Difference?
- How Do You Investigate a Sudden Query Performance Regression?
- Blocking vs Deadlock: How Do You Diagnose and Fix Each?
- What Is RCSI and When Would You Not Enable It?
- How Does NUMA Affect SQL Server on a Virtual Machine?
- What Changed With Optimized Locking in SQL Server 2025?
- How Do You Approach Fragmentation on SSD vs Spinning Disk?
- Synchronous vs Asynchronous Commit in Always On: When Does the Choice Matter?
- What Is Delayed Durability and What Are the Risks?
- How Do You Find a Long-Running Query That Is No Longer Running?
- ROW vs PAGE vs COLUMNSTORE: How Do You Choose?
- What Is a Vector Embedding and How Does SQL Server 2025 Store One?
- What Is RAG and What Does SQL Server Have to Do With It?
- What Is MCP and Why Does the DBA Own the Security Layer?
- Partitioning vs Compression: When Do You Use Both?
- Why Does the Community Warn Against MERGE and What Do You Use Instead?
- How Do You Design a Schema for an AI Application in SQL Server?
- What Is the Difference Between Clustered and Non-Clustered Indexes and How Does Fill Factor Affect Each?
- How Would You Migrate a 10TB SQL Server Database to Azure With Minimal Downtime?
- What Does Good Metadata Quality Mean in 2026 and Why Does It Matter More Than It Used To?
The surface-level difference is that functions return a value and can be used in SELECT statements, while stored procedures return result sets and are called with EXEC. But the senior answer goes deeper than that.
Functions in SQL Server are divided into scalar functions, inline table-valued functions (iTVF), and multi-statement table-valued functions (msTVF). The performance characteristics differ dramatically across these types. Scalar UDFs have historically been some of the worst performance offenders in SQL Server because they execute once per row, prevent parallelism, and cause serial execution plans. SQL Server 2019 introduced scalar UDF inlining which allows the optimizer to inline eligible scalar functions directly into the query plan, eliminating the per-row execution penalty. But the inlining has eligibility requirements and not all UDFs qualify.
Inline table-valued functions are a different story entirely. They are essentially parameterized views and the optimizer treats them as such, folding them into the outer query plan. They can use parallelism, benefit from statistics, and produce efficient plans. When a scalar UDF can be rewritten as an iTVF, performance almost always improves.
Stored procedures cannot be used in SELECT, WHERE, or JOIN clauses. Functions can. But functions have significant restrictions: they cannot modify data (except for some table-valued functions with SCHEMABINDING), cannot call non-deterministic functions like GETDATE() in certain contexts like indexed views, cannot use TRY/CATCH internally, and cannot execute dynamic SQL.
-- Scalar UDF: executes once per row, prevents parallelism
CREATE FUNCTION dbo.GetCustomerStatus (@CustomerID INT)
RETURNS NVARCHAR(20)
AS BEGIN
DECLARE @Status NVARCHAR(20);
SELECT @Status = Status FROM dbo.Customers WHERE CustomerID = @CustomerID;
RETURN @Status;
END
-- Inline TVF: folded into query plan, can use parallelism
CREATE FUNCTION dbo.GetCustomerStatusTVF (@CustomerID INT)
RETURNS TABLE AS RETURN (
SELECT Status FROM dbo.Customers WHERE CustomerID = @CustomerID
);
-- Check if a scalar UDF is being inlined (SQL Server 2019+)
SELECT name, is_inlineable, inline_type
FROM sys.sql_modules m
JOIN sys.objects o ON o.object_id = m.object_id
WHERE o.type = 'FN';
Knowledge of scalar UDF inlining in 2019+, the iTVF vs msTVF performance difference, and a real example of when you replaced a scalar UDF with an iTVF and what the plan change looked like.
“A function returns a value and a stored procedure doesn’t.” This is the textbook answer. It tells the interviewer you know the definition but not the operational implications.
The first thing to determine is whether the plan changed or the data changed. These are different problems with different solutions and you cannot tell from query duration alone which one you are dealing with.
Start with Query Store. Query Store tracks execution plans and their runtime statistics over time. If the query has a new plan, Query Store will show you the old plan, the new plan, and when the change happened. If the old plan was faster you can force it immediately while investigating the root cause.
-- Query Store: find queries with plan changes and performance regression
SELECT TOP 20
qsq.query_id,
qsqt.query_sql_text,
qsp.plan_id,
CAST(qsp.query_plan AS XML) AS QueryPlan,
qsrs.avg_duration / 1000.0 AS AvgDurationMs,
qsrs.avg_logical_io_reads,
qsrs.count_executions,
qsrs.last_execution_time
FROM sys.query_store_query qsq
JOIN sys.query_store_query_text qsqt ON qsqt.query_text_id = qsq.query_text_id
JOIN sys.query_store_plan qsp ON qsp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats qsrs ON qsrs.plan_id = qsp.plan_id
ORDER BY qsrs.avg_duration DESC;
-- Force the better plan if you find a regression
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;
If the plan did not change, the next question is whether the data volume changed. A query with a good plan on one million rows may perform poorly when the table has grown to fifty million rows because the plan was compiled with statistics that no longer reflect the actual data distribution. Update statistics and recompile.
If neither explains it, look at wait statistics for the specific query: is it waiting on I/O, CPU, locking, or memory grants? The wait type narrows the investigation significantly.
Methodical diagnostic process starting with Query Store, distinguishing plan regression from data growth, and knowing how to force a plan as an immediate mitigation while the root cause is investigated.
Blocking is one session waiting for another to release a lock. It resolves on its own when the blocking session commits or rolls back. No error is thrown. The waiting session simply queues. Blocking becomes a problem when the wait is long, which usually means a transaction is holding locks longer than necessary.
A deadlock is a circular blocking condition that can never resolve itself. Session 1 holds a lock on Resource A and wants Resource B. Session 2 holds a lock on Resource B and wants Resource A. SQL Server’s lock monitor detects the cycle within five seconds, kills one session as the deadlock victim, rolls back its transaction, and raises error 1205 to that session.
-- Diagnose BLOCKING in real time
SELECT r.session_id, r.blocking_session_id,
r.wait_type, r.wait_time / 1000 AS WaitSeconds,
LEFT(qt.text, 300) AS BlockedQuery
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) qt
WHERE r.blocking_session_id > 0;
-- Diagnose DEADLOCKS from system_health (no setup required)
SELECT xdr.value('@timestamp','datetime2') AS DeadlockTime,
xdr.query('.') AS DeadlockXML
FROM (SELECT CAST(xst.target_data AS XML) AS td
FROM sys.dm_xe_session_targets xst
JOIN sys.dm_xe_sessions xs ON xs.address = xst.event_session_address
WHERE xs.name = 'system_health' AND xst.target_name = 'ring_buffer') d
CROSS APPLY td.nodes('//RingBufferTarget/event[@name="xml_deadlock_report"]') t(xdr)
ORDER BY 1 DESC;
The key operational difference: blocking is visible in DMVs while it is happening. Deadlocks are over in seconds and must be captured proactively through Extended Events before they disappear.
Understanding that blocking resolves itself while deadlocks require SQL Server intervention, the specific DMVs and XE session for each, and knowing that deadlocks must be captured proactively.
Read Committed Snapshot Isolation eliminates reader-writer blocking by having readers access a consistent snapshot of data from the version store in tempdb rather than taking shared locks. Writers do not block readers and readers do not block writers. For OLTP databases with high read-write concurrency it dramatically reduces blocking.
The cases where you would not enable it or should be cautious:
- TempDB capacity constraints. RCSI stores row versions in tempdb. On a high-write workload the version store can grow significantly. If tempdb is undersized or on slow storage, RCSI adds overhead rather than removing it.
- HOLDLOCK combinations. If the database has MERGE statements or other code using HOLDLOCK hints, enabling RCSI can introduce deadlocks that did not exist before. HOLDLOCK forces SERIALIZABLE isolation which takes range locks, and those range locks interact poorly with RCSI in some concurrency patterns.
- Workloads that need to see uncommitted data intentionally. Some reporting workloads use NOLOCK hints to accept dirty reads in exchange for avoiding blocking. Under RCSI the behavior changes subtly and assumptions about data freshness may not hold.
- Writer-writer blocking is not addressed. RCSI only eliminates reader-writer conflicts. If the blocking problem is two writers contending on the same rows, RCSI provides no benefit there.
The ability to articulate not just what RCSI does but where it introduces new problems. Most candidates say “enable RCSI to fix blocking.” Senior candidates know the tempdb trade-off, the HOLDLOCK interaction, and that writer-writer blocking is unaffected.
NUMA stands for Non-Uniform Memory Access. In modern multi-socket servers each physical CPU socket has its own bank of memory directly attached. A CPU can access its local memory very fast and can also access memory on a different socket, but that remote access is significantly slower because it must cross the inter-socket interconnect.
SQL Server is NUMA-aware. At startup it detects the NUMA topology, creates a memory manager per NUMA node, and assigns worker threads to nodes to keep memory access local. This optimization is critical for large multi-socket servers where remote memory access overhead is measurable.
On a virtual machine the problem is that the VM’s socket configuration may not reflect the physical NUMA topology of the host. A VM configured as 1 socket with 8 cores running on a dual-socket physical host means SQL Server sees one NUMA node. But the hypervisor may schedule the VM’s vCPUs across both physical sockets to balance load. SQL Server’s memory allocation is based on one NUMA node but the actual CPU work is happening on two physical NUMA nodes, causing remote memory access that SQL Server cannot detect or compensate for.
-- Check what SQL Server sees about the hardware topology
SELECT cpu_count, socket_count, cores_per_socket,
numa_node_count, softnuma_configuration_desc
FROM sys.dm_os_sys_info;
-- Check for remote NUMA memory access (sign of misconfiguration)
SELECT memory_node_id, pages_kb / 1024 AS LocalMemoryMB,
foreign_committed_kb / 1024 AS RemoteMemoryMB
FROM sys.dm_os_memory_nodes
WHERE memory_node_id < 64;
-- High foreign_committed_kb = SQL Server accessing remote NUMA memory
The fix is to configure the VM with 2 virtual sockets aligned to the physical NUMA topology. For a VM with 8 vCPUs on a dual-socket host: 2 sockets x 4 cores, not 1 socket x 8 cores.
Understanding why the VM socket configuration matters, the foreign_committed_kb diagnostic, and the specific fix of aligning virtual sockets to physical NUMA nodes.
Traditional SQL Server locking acquires row locks for the duration of a transaction. Under high concurrency this creates blocking chains as transactions hold locks while doing other work. Even with RCSI reducing reader-writer conflicts, writer-writer conflicts remain.
Optimized Locking, introduced in SQL Server 2025 and available on Azure SQL Database, changes how row-level locks work. Rather than holding row locks for the full transaction duration, the engine uses a Transaction ID (TID) lock mechanism. A writer acquires a short-duration row lock to make the modification, then converts it to a TID lock. Other transactions trying to modify the same row check the TID lock to determine if the modifying transaction has committed or rolled back rather than waiting for the row lock to be released. This dramatically reduces lock hold times and contention on hot rows.
The two key requirements for Optimized Locking to be active are that RCSI must be enabled and the database must be at the appropriate compatibility level. You can verify it is active through sys.dm_tran_locks and observe significantly reduced lock counts on high-concurrency workloads.
Knowing this feature exists in SQL Server 2025, understanding it requires RCSI, and being able to explain the TID lock mechanism rather than just saying "it reduces blocking."
Not knowing this feature exists at all. In 2026 a senior DBA interview should demonstrate awareness of SQL Server 2025 features even if you have not used them in production yet.
The traditional thresholds (5% reorganize, 30% rebuild) were designed for spinning disk. On HDD, fragmentation causes the read head to seek to scattered locations which is physically slow and measurable. The maintenance thresholds were calibrated to the point where the I/O scatter cost exceeded the maintenance overhead cost.
On SSD and NVMe there is no read head and no seek time. Random page access costs the same as sequential access. Page scatter from fragmentation has minimal impact on read performance. Microsoft's own documentation states that fragmentation has minimal impact on query performance on solid state drives.
The revised approach for SSD: raise the reorganize threshold significantly (30% or higher) or eliminate reorganize entirely and only rebuild when fragmentation exceeds 60% or when page fullness has dropped enough to increase logical read counts meaningfully. The goal on SSD is page fullness, not page order. A table with 35% fragmentation but good page fullness on SSD is not a maintenance priority.
What does still matter on SSD: write amplification. Frequent unnecessary index rebuilds generate write I/O that consumes SSD endurance without improving performance. Aggressive maintenance on SSD actually harms the storage without benefiting queries.
Understanding why the traditional thresholds exist and that they do not apply equally to flash storage, knowing that page fullness still matters on SSD while page order mostly does not, and awareness of write amplification as an SSD concern.
Synchronous commit means the primary does not acknowledge a commit to the client until the secondary has hardened the log record to disk. Zero data loss on failover. The cost is that every commit waits for the round-trip to the secondary, which adds latency proportional to the network distance between primary and secondary.
Asynchronous commit means the primary acknowledges commits without waiting for the secondary. No per-commit latency penalty. The cost is potential data loss on failover if the secondary lags behind the primary at the moment of failure.
The choice matters most in three scenarios. First, for financial or compliance workloads where zero data loss is a regulatory requirement, synchronous commit is mandatory and the latency cost must be accepted and sized for. Second, for geo-distributed secondaries where the network round-trip is hundreds of milliseconds, synchronous commit would make that secondary unusable as a commit target because every commit would pay hundreds of milliseconds of latency. Asynchronous is the only practical choice for distant secondaries. Third, for high-frequency OLTP workloads where commit rate is thousands per second, even a few milliseconds of synchronous latency compounds into a meaningful throughput ceiling.
A common architecture uses both: a synchronous secondary in the same data center for zero-data-loss failover with fast round-trips, and an asynchronous secondary in a remote data center for disaster recovery without paying the geo-latency on every commit.
Understanding the latency implications of synchronous commit and being able to describe a real-world architecture that uses both modes for different secondaries serving different purposes.
Delayed Durability changes when SQL Server acknowledges a commit to the client. Normally SQL Server writes the log record to disk before acknowledging the commit. With Delayed Durability enabled, SQL Server writes the log record to the in-memory log buffer and immediately acknowledges the commit. The actual disk write happens asynchronously when the buffer fills to 60KB, on a checkpoint, or when a fully durable transaction commits.
The performance gain is real: eliminating the synchronous log write removes WRITELOG wait time and can significantly increase throughput on commit-bound workloads. The risk is equally real: any transactions that were acknowledged but not yet flushed to disk are lost if the server crashes. The application received a successful commit response. The data does not exist after the crash. There is no recovery path for those transactions.
The workloads where this trade-off is acceptable are logging and audit tables, telemetry pipelines where the authoritative source is elsewhere, staging tables that can be reloaded, and session state tables. The workloads that must never use it are financial transactions, compliance data, and anything where the database is the authoritative record.
The FORCED database setting is particularly dangerous because it applies Delayed Durability to every transaction in the database regardless of how the transaction was coded, including transactions that explicitly expected full durability.
Explaining the exact mechanism (log buffer, 60KB flush threshold), naming specific appropriate and inappropriate workloads, and distinguishing ALLOWED (opt-in per transaction) from FORCED (applies to everything).
This is a common production scenario where a problem has resolved by the time you investigate it. The live DMVs show nothing because the query is gone. The tools that help here are Query Store, the default trace, Extended Events history, and wait statistics.
Query Store is the primary tool. It stores execution plan history and runtime statistics including average and max duration, logical reads, and CPU time. Filter by the time window when the problem occurred and sort by average duration or total CPU to find the candidates.
-- Find top resource consumers in a specific time window via Query Store
SELECT TOP 20
qsqt.query_sql_text,
qsrs.avg_duration / 1000.0 AS AvgDurationMs,
qsrs.max_duration / 1000.0 AS MaxDurationMs,
qsrs.avg_logical_io_reads,
qsrs.count_executions,
CAST(qsp.query_plan AS XML) AS Plan,
qsrs.last_execution_time
FROM sys.query_store_runtime_stats qsrs
JOIN sys.query_store_plan qsp ON qsp.plan_id = qsrs.plan_id
JOIN sys.query_store_query qsq ON qsq.query_id = qsp.query_id
JOIN sys.query_store_query_text qsqt ON qsqt.query_text_id = qsq.query_text_id
WHERE qsrs.last_execution_time BETWEEN '2026-06-04 09:00' AND '2026-06-04 10:00'
ORDER BY qsrs.avg_duration DESC;
The default trace captures auto-grow events, errors, and other notable events with timestamps. If the problem was caused by a disk growth event or a lock timeout, the default trace may have it. Wait statistics show cumulative waits since last restart and while they do not isolate a specific query, a spike in WRITELOG or PAGEIOLATCH waits during the problem window confirms the category of issue.
Query Store as the primary post-mortem tool with the time-window filter, combined with awareness of default trace and wait statistics for corroborating evidence.
ROW compression stores fixed-length data types in variable-length format, removing trailing zeros and padding. Lower CPU cost, moderate storage savings. Good for tables with many fixed-length columns where actual values are shorter than the declared type.
PAGE compression is a superset of ROW. It adds prefix compression and dictionary compression, storing common column prefixes and repeated values once per page. Significantly better storage savings but higher CPU cost on every read and write. Best for read-heavy tables with repetitive data patterns: status codes, category IDs, date ranges, reference data.
COLUMNSTORE is a fundamentally different storage model. Data is stored by column rather than by row. It achieves 10x or more compression on analytical workloads and dramatically improves performance for aggregation queries that touch only a few columns from wide tables. It is not appropriate for OLTP tables with frequent single-row updates and inserts.
The decision framework: write-heavy OLTP tables generally should not use PAGE compression because the CPU cost on every write outweighs the storage benefit. Read-heavy tables with repetitive values are PAGE candidates. Large fact tables or historical data tables with analytical access patterns are COLUMNSTORE candidates. Always validate the projected savings with sp_estimate_data_compression_savings before applying to production.
Understanding the CPU vs storage trade-off for each type, knowing COLUMNSTORE is a different storage model not just better compression, and mentioning sp_estimate_data_compression_savings as the validation step.
A vector embedding is a numerical representation of text, an image, or any other content as a list of floating-point numbers. An embedding model converts a piece of text into a dense vector, typically 1,536 or 3,072 numbers, where the position of the vector in the high-dimensional space reflects the semantic meaning of the content. Documents that are semantically similar produce vectors that are close together in that space. This makes it possible to search for semantically similar content by finding vectors that are nearest to a query vector, rather than searching for exact keyword matches.
SQL Server 2025 introduces a native VECTOR data type that stores embeddings as a typed column. The DiskANN vector index enables approximate nearest neighbor search directly in SQL Server without requiring a separate vector database.
-- SQL Server 2025: create a table with a vector column
CREATE TABLE dbo.DocumentChunks (
ChunkID INT IDENTITY(1,1) PRIMARY KEY,
ChunkText NVARCHAR(MAX) NOT NULL,
Embedding VECTOR(1536) NOT NULL -- 1536-dimension embedding
);
-- Create DiskANN vector index for similarity search
CREATE VECTOR INDEX IX_Chunks_Embedding
ON dbo.DocumentChunks (Embedding)
WITH (METRIC = 'cosine', TYPE = 'diskann');
-- Semantic similarity search using cosine distance
SELECT TOP 5
ChunkID, ChunkText,
VECTOR_DISTANCE('cosine', Embedding, CAST(@QueryVector AS VECTOR(1536))) AS Distance
FROM dbo.DocumentChunks
ORDER BY Distance ASC;
Explaining what a vector embedding represents semantically, not just technically, and knowing that SQL Server 2025 supports this natively with DiskANN indexing rather than requiring a separate vector database.
RAG stands for Retrieval Augmented Generation. It is the pattern that connects an LLM to a company's own knowledge base. An LLM out of the box knows nothing about your organization. RAG solves this by retrieving relevant documents from your knowledge base before generating a response. The LLM answers based on the retrieved context rather than from general training data alone, dramatically reducing hallucinations on company-specific questions.
SQL Server 2025 is well-positioned as the knowledge base storage layer for RAG. Documents are chunked, converted to vector embeddings using an embedding model API, and stored in a VECTOR table with a DiskANN index. At query time the user's question is converted to a vector, a similarity search finds the most relevant document chunks, and those chunks are passed to the LLM as context alongside the question.
The DBA's role in RAG is significant. Schema design for the vector table, classification columns for row-level security on retrieved content, freshness monitoring for stale documents, and vector index maintenance are all database responsibilities. The quality of the LLM's answers is directly tied to the quality of the data in the knowledge base.
Explaining the retrieval step (not just "it connects LLMs to data"), understanding SQL Server 2025's native role in the architecture, and articulating the DBA's specific responsibilities in the RAG pipeline.
MCP stands for Model Context Protocol. It is the open standard that allows AI clients (Claude, Amazon Q, custom AI applications) to connect to external systems including databases in a structured, governed way. An MCP server sits between the AI and your SQL Server instance, defines which operations the AI is allowed to perform, and logs every action.
The DBA owns MCP security because it is a database access problem dressed in new clothes. Configuring an MCP server for SQL Server requires creating a dedicated read-only SQL account with least-privilege permissions, defining which tables and schemas the AI can query, blocking dangerous keywords like DROP, DELETE, and TRUNCATE at the protocol level, setting row caps to prevent runaway queries from returning millions of rows, and implementing audit logging for every tool call.
These are exactly the skills DBAs already have. The difference from traditional access control is that the MCP client is an AI that generates SQL dynamically based on natural language queries rather than a fixed application sending known parameterized queries. The governance requirements are actually higher, not lower, because the query surface is open-ended.
Understanding MCP as a database access layer rather than just an AI technology, and being able to articulate the specific security controls (least-privilege accounts, keyword blocking, row caps, audit logging) that the DBA is responsible for.
Partitioning and compression solve different problems and combining them is more powerful than either alone for large growing tables.
Partitioning by date divides the table into independently maintainable units. Historical partitions are read-heavy and rarely modified. Active partitions receive constant inserts and updates. The key insight is that you can apply different compression to different partitions, which is impossible on an unpartitioned table.
The combined strategy: partition the table by date first. Apply PAGE compression to historical partitions where the data is read-heavy, compressible, and the CPU cost of maintaining compression is low because writes are rare. Apply ROW compression or no compression to the active partition where write frequency is high and PAGE compression CPU cost is harder to justify. This gives you maintenance efficiency from partitioning and optimal storage savings from per-partition compression calibrated to the actual workload of each partition.
-- Apply different compression to different partitions
-- Historical data: PAGE compression (read-heavy, rarely modified)
ALTER TABLE dbo.Orders REBUILD PARTITION = 1
WITH (DATA_COMPRESSION = PAGE);
-- Active partition: ROW compression (high write rate)
ALTER TABLE dbo.Orders REBUILD PARTITION = 12
WITH (DATA_COMPRESSION = ROW);
The order matters: partition first, then compress. Compressing a large unpartitioned table and then partitioning requires a full rebuild cycle that could have been avoided.
Understanding that different partitions can have different compression, the reasoning for why historical and active partitions warrant different compression types, and the importance of partitioning before compressing.
MERGE has a documented history of bugs and race conditions. The primary operational concern is that MERGE without the HOLDLOCK hint on the target is vulnerable to race conditions under concurrent load. Two sessions can both determine that a row does not exist, both proceed to the INSERT branch, and both attempt to insert the same key, causing a primary key violation. This is not a theoretical edge case. It appears in production systems as intermittent key violations under load.
The fix is to always use HOLDLOCK: MERGE dbo.Table WITH (HOLDLOCK) AS target. But HOLDLOCK forces SERIALIZABLE isolation which increases lock contention and on databases with RCSI enabled, HOLDLOCK can introduce deadlocks that did not exist before.
For single-row upsert operations the safer alternative is the update-first pattern: attempt the UPDATE, check @@ROWCOUNT, if zero then INSERT with TRY/CATCH to handle the rare concurrent insert case. This is more readable, more testable, and has no dependence on remembering a hint. MERGE with HOLDLOCK remains appropriate for batch synchronization ETL loads where the multi-branch syntax genuinely simplifies the logic.
Knowing the specific race condition mechanism, the HOLDLOCK requirement and its cost, the RCSI interaction, and being able to write the alternative pattern from memory.
An AI application in 2026 has different schema requirements than a traditional OLTP application. The key additions are vector storage for embeddings, metadata columns for AI retrieval quality, and agent memory patterns for stateful multi-step workflows.
For the knowledge base: a VECTOR(1536) column for embeddings, a DiskANN vector index, classification columns for row-level security on retrieval, document metadata (owner, last verified date, document type, department), and an IsActive flag for logical deletion rather than physical deletion which preserves audit history.
Column naming and extended property documentation are now first-class design requirements. The LLM generating SQL against your schema reads your table and column names to understand your data. A column named c4 teaches the LLM nothing. A column named OrderStatusID with an extended property describing the valid values and their business meanings produces correct SQL on the first attempt. Schema documentation quality directly determines AI output quality.
For agent memory: a conversation history table storing session context with TTL-based cleanup, a tool call log for audit purposes, and a separate schema for AI-specific objects to keep them organizationally distinct from application data.
Connecting metadata quality and extended properties to AI output quality. Most candidates think about vector storage but miss the metadata documentation angle which is the most impactful design decision for AI application performance.
A clustered index determines the physical storage order of the data rows. There can only be one clustered index per table because the data can only be physically ordered one way. The leaf level of a clustered index contains the actual data rows. A non-clustered index is a separate structure from the data. Its leaf level contains the index key columns and a pointer back to the data row (either the clustered index key or a row identifier for heaps).
The clustered index key choice is critical for insert performance. A sequential key (identity integer, sequential GUID) means new rows always insert at the right end of the index with no page splits. A random key (NEWID() GUID) means new rows insert at random positions throughout the index, causing page splits on almost every insert and generating fragmentation rapidly.
Fill factor determines what percentage of each index page is filled when the index is built or rebuilt. A fill factor of 80 leaves 20% of each page empty as reserved space for future inserts and updates. This reserved space delays page splits. On a clustered index with random inserts, a lower fill factor (70-80) reduces page split frequency. On a clustered index with sequential inserts, a higher fill factor (95-100) is appropriate because pages fill from the right end without splits.
The key nuance is that fill factor only takes effect at build or rebuild time. It does not continuously maintain page fullness. After a rebuild pages start at the fill factor level and then fill naturally as data changes. The right rebuild frequency depends on how quickly the reserved space is consumed by normal write activity.
The relationship between clustered key choice and fill factor requirements, understanding that fill factor is applied at rebuild not maintained continuously, and the sequential vs random key performance difference.
A 10TB database makes any approach that requires copying all the data before cutover expensive in time. The goal is to get to a state where the delta between source and target is small enough to close in a short maintenance window.
The standard approach for minimal downtime is transactional replication or log shipping to the Azure target running in parallel with production, then a final cutover. Azure Database Migration Service supports online migration for SQL Server to Azure SQL Database and SQL Managed Instance, keeping the target in sync via continuous change tracking and only requiring the application to point at the new endpoint at cutover.
The pre-migration steps matter as much as the migration itself: assess the source for incompatibilities with the target service tier (unsupported features, linked servers, agent jobs, cross-database queries), size the Azure target appropriately, and run the migration in a staging environment first to validate the process and measure actual cutover time.
For the cutover window specifically: drain connections to the source, allow replication or log shipping to fully catch up, verify data consistency, update connection strings, and open the Azure endpoint. The actual downtime is the drain plus sync plus DNS propagation time, which for a well-prepared migration can be under 15 minutes regardless of database size because you are only moving the delta not all 10TB.
Understanding that the key is minimizing the delta at cutover, not avoiding a full copy. Mentioning Azure DMS for online migration, the importance of pre-migration assessment, and being specific about what happens during the cutover window itself.
Metadata quality has always mattered for developer productivity and maintainability. In 2026 it directly determines the quality of AI-generated SQL against your database, which makes it a first-class operational concern rather than a housekeeping task.
When an LLM generates SQL against your schema it reads your table names, column names, and extended property descriptions to understand what your data means. A table named tbl_OrdHdr with columns named c1, c4, and flg1 gives the LLM almost nothing to work with. It guesses. It generates plausible-looking SQL that may be completely wrong. It will not tell you it is guessing.
Good metadata in 2026 means: meaningful table and column names that communicate business intent, extended properties on every table exposed to AI tools describing what the table contains and how to use it, extended properties on status and type columns documenting the valid values and their business meanings, and documentation of business rules embedded in the data (soft deletes, tenant filters, required filters for correct results). This documentation is loaded into the AI knowledge base as context the LLM uses to generate correct SQL.
-- Extended property: document valid values for AI and human readers
EXEC sys.sp_addextendedproperty
@name = N'MS_Description',
@value = N'Order status. Values: 1=Draft, 2=Submitted, 3=Processing,
4=Shipped, 5=Delivered, 6=Cancelled. For revenue reporting
include only status 5 (Delivered).',
@level0type = N'SCHEMA', @level0name = N'dbo',
@level1type = N'TABLE', @level1name = N'OrderHeader',
@level2type = N'COLUMN', @level2name = N'OrderStatusID';
Connecting metadata quality to AI output quality, knowing that the LLM reads extended properties, and being able to give a concrete example of what good extended property documentation looks like. This is a question that separates 2024 thinking from 2026 thinking.
"Good column names and documentation." True but it misses the AI context entirely. The interviewer asking this question in 2026 wants to know you understand how metadata quality connects to AI system quality.
Final preparation tip: For each question, prepare a short version (30 seconds, the key point), a medium version (2 minutes, the full answer), and a production story if you have one. Interviewers who like your answer will ask follow-up questions. Interviewers who are short on time want the 30-second version first. Have both ready.
References
- SQLYARD: SQL Server Blocking vs Deadlocks
- SQLYARD: SQL Server on VMware and EC2 Socket and NUMA Guide
- SQLYARD: SQL Server Delayed Durability
- SQLYARD: SQL Server Fragmentation Complete Guide
- SQLYARD: SQL Server MERGE Guide
- SQLYARD: SQL Server Compression and Partitioning Guide
- SQLYARD: How AI Connects to Your Company Data
- SQLYARD: Your Column Names Now Determine How Smart Your AI Is
- SQLYARD: MCP and SQL Server
- SQLYARD: From Rows to Reasoning: Designing SQL Server Databases for AI Apps
Discover more from SQLYARD
Subscribe to get the latest posts sent to your email.


