SQL Server 2025 (17.x) is the first release in a long time that feels like more than “2022 with patches.” It is aimed at AI-heavy, hybrid, and event driven workloads while still giving DBAs and developers practical wins around JSON, performance, security, and HA.
At the time of writing, SQL Server 2025 is still in preview, so features and syntax can change before GA. Always validate against the latest documentation before using anything in production. [1]
If you are coming from SQL Server 2019 or 2022, this is the first version that really expects you to:
- Store and search vector embeddings for semantic and AI scenarios
- Treat JSON as a first class type, not just
nvarchar(max)with string hacks - Rely more on Intelligent Query Processing for plan quality and feedback
- Use built in features for security, compression, and hybrid analytics instead of rolling your own
This post walks through the major new features you should actually care about and ends with a hands on workshop you can run in a lab.
1. Big picture: What is SQL Server 2025 trying to be
SQL Server 2025 is positioned as an AI ready, hybrid, event driven database engine. In practical terms that means:
- Built in vector search and AI integration
- Native JSON and JSON indexes
- Better query performance with minimal code changes
- Stronger security and hybrid / Fabric integration [1][2]
Official messaging focuses on:
- Vector capabilities and external AI models
- Native JSON storage and indexing
- Change event streaming and better integration with cloud platforms
- A big batch of Intelligent Query Processing (IQP) enhancements [1][3]
If you are used to bolting AI or semi structured workloads around SQL Server, 2025 is the first release that encourages you to bring a lot of that logic into the engine instead.
2. AI and vector features
2.1 Vector data type and vector search
What it is
SQL Server 2025 adds a VECTOR data type designed to store embeddings for semantic search and AI workloads. Vectors are stored in a compact binary format but exposed as JSON arrays for convenience. [3]
Example table:
CREATE TABLE dbo.Documents
(
DocumentId int IDENTITY(1,1) PRIMARY KEY,
Title nvarchar(200),
Content nvarchar(max),
Embedding VECTOR(1536) NOT NULL
);
You can insert embeddings as JSON arrays:
INSERT INTO dbo.Documents (Title, Content, Embedding)
VALUES
(
N'Bike policy',
N'Document that explains bike and scooter policy...',
'[0.0123, -0.22, 0.88, ...]' -- embedding from an external model
);
Vectors support float32 by default and optional float16 for more compact storage. The float16 option is gated behind PREVIEW_FEATURES. [3]
ALTER DATABASE SCOPED CONFIGURATION
SET PREVIEW_FEATURES = ON;
GO
ALTER TABLE dbo.Documents
ADD Embedding16 VECTOR(1536, float16) NULL;
On top of the data type, SQL Server 2025 adds:
VECTOR_DISTANCE,VECTOR_NORM,VECTOR_NORMALIZE,VECTORPROPERTY- Vector indexes (approximate nearest neighbor, DiskANN under the covers)
VECTOR_SEARCHto run similarity lookups efficiently [3][4]
Why it matters
- You can build semantic search, recommendations, or RAG style apps without a separate vector database
- Your AI data stays under the same security, backup, and HA story as your existing relational data
- Drivers like
Microsoft.Data.SqlClientand the TDS protocol are being updated to handle vector data efficiently [3]
Why not to overuse it
- Embeddings are large. High dimensional vectors can bloat tables, backups, and log shipping if you are not careful
- Similarity search is approximate. It is not a strict replacement for equality predicates when you need exact results
- Dimension counts and base types are part of your schema. Changing them later is painful
Best practices
- Keep embeddings in dedicated tables (for example
dbo.DocumentEmbeddings) instead of mixing them into lots of business tables - Start with
VECTOR(n)usingfloat32and move tofloat16only after you understand the precision and quality trade offs - Use separate filegroups for large vector data and indexes so you can manage growth and IO more predictably
2.2 External AI models and AI T-SQL functions
SQL Server 2025 lets you register AI models as first class objects and call them from T SQL by exposing external REST endpoints. [4][5]
Key pieces:
CREATE EXTERNAL MODELstores model metadata (location, provider, credentials)AI_GENERATE_EMBEDDINGScreates embeddings from textAI_GENERATE_CHUNKSsplits content into chunks for RAG style scenarios
Example:
CREATE EXTERNAL MODEL MyEmbeddingsModel
WITH
(
LOCATION = 'https://my-openai-endpoint.azure.com',
API_FORMAT = 'Azure OpenAI',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'text-embedding-3-large',
CREDENTIAL = MyAiCredential
);
GO
SELECT AI_GENERATE_EMBEDDINGS(
MODEL = 'MyEmbeddingsModel',
INPUT_TEXT = Content
) AS Embedding
FROM dbo.Documents;
Why it matters
- You can keep AI calls inside the database, close to the data
- T SQL can orchestrate a full RAG pipeline: chunking, embeddings, vector storage, and vector search in the same place [5]
Why not to use it blindly
- You still pay for the external AI endpoint. SQL Server is just fronting it
- Calling external REST endpoints from a hot OLTP path adds latency, failure modes, and rate limiting concerns
Best practices
- Use external models in offline or batch pipelines to pre compute embeddings rather than calling them for every user request
- If you self host models (Ollama, ONNX, custom containers), size CPU or GPU and network correctly, and monitor latency and errors separately from SQL Server [5]
3. JSON gets serious: native JSON type and JSON index
3.1 Native JSON type
SQL Server 2025 adds a native JSON data type instead of treating JSON as plain nvarchar(max). It integrates with existing JSON functions and gives you better storage and indexing options. [6]
Example:
CREATE TABLE Sales.Orders
(
OrderId int IDENTITY(1,1) PRIMARY KEY,
CustomerId int NOT NULL,
OrderData JSON NOT NULL -- native type
);
You still use JSON_VALUE, JSON_QUERY, JSON_MODIFY, and OPENJSON against this column.
3.2 JSON index
On top of that, SQL Server 2025 introduces a JSON index on JSON columns that accelerates common JSON queries without forcing you to create a pile of computed columns. [7][8]
Basic syntax (simplified):
CREATE JSON INDEX IX_Orders_OrderData
ON Sales.Orders(OrderData)
FOR ('$.Customer.Name', '$.Items[*].ProductId');
JSON indexes support JSON functions like JSON_VALUE, JSON_PATH_EXISTS, and JSON_CONTAINS.
Why it matters
- You can treat JSON as a first class citizen without giving up query performance
- You avoid “computed column hell” for every path you want to query
- It matches what modern APIs and message payloads usually look like
Why not to replace everything with JSON
- JSON is still semi structured. Classic relational modeling is easier to enforce and reason about for core entities
- JSON indexes have limitations. For example, you need a clustered primary key, one JSON index per JSON column, and there are restrictions with views and memory optimized tables [8]
- Deep and heavily nested JSON can still be expensive to query and maintain
Best practices
- Use JSON for:
- Flexible attributes
- Integration payloads
- Things that change shape often
- Keep core entities relational and use JSON as an extension, not a default
- Index only the paths you actually query. Each additional path has a space and maintenance cost
4. Developer productivity: regex, fuzzy matching, REST, and more
SQL Server 2025 brings a nice set of developer centric T SQL features that a lot of people have wanted for years. [2][3]
4.1 Regular expressions
You finally get native regex support:
REGEXP_LIKEREGEXP_REPLACEREGEXP_SUBSTRREGEXP_INSTRREGEXP_COUNTREGEXP_MATCHESREGEXP_SPLIT_TO_TABLE[2]
Example:
SELECT *
FROM dbo.Customers
WHERE REGEXP_LIKE(EmailAddress, '^[A-Za-z0-9._%+-]+@example\.com$');
Splitting a string into rows:
SELECT value
FROM REGEXP_SPLIT_TO_TABLE('red, green;blue', '[,;]\s*');
4.2 Fuzzy string matching
New built in functions remove the need for custom CLR or awkward T SQL hacks:
EDIT_DISTANCEEDIT_DISTANCE_SIMILARITYJARO_WINKLER_DISTANCEJARO_WINKLER_SIMILARITY[2]
Example:
SELECT TOP (10)
c1.Name,
c2.Name,
EDIT_DISTANCE_SIMILARITY(c1.Name, c2.Name) AS Similarity
FROM dbo.Customers c1
JOIN dbo.Customers c2
ON c1.CustomerId < c2.CustomerId
WHERE EDIT_DISTANCE_SIMILARITY(c1.Name, c2.Name) >= 90;
4.3 REST endpoint invocation
sys.sp_invoke_external_rest_endpoint lets SQL Server call REST or GraphQL endpoints directly: [3]
- Trigger Azure Functions
- Update Power BI
- Call Azure OpenAI or internal APIs
Example:
EXEC sys.sp_invoke_external_rest_endpoint
@method = 'POST',
@url = N'https://example.com/api/notify',
@headers = N'{"Content-Type": "application/json"}',
@payload = JSON_OBJECT('orderId': 123, 'status': 'Shipped');
Why this is useful
- Eliminates simple “middle tier glue” for small integration tasks
- Lets you prototype modern event driven flows in T SQL
Why not to overdo it
- SQL Server is not an ESB. Heavy fan out or complex workflows still belong in app code or dedicated services
- Network latency and external failures can slow or break queries if you call REST endpoints in hot paths
5. Query performance: new Intelligent Query Processing (IQP) tricks
SQL Server 2025 continues the Intelligent Query Processing story and adds several interesting improvements. [1][9]
Key additions:
- Optional Parameter Plan Optimization (OPPO)
- Targets the common pattern
WHERE Column = @p OR @p IS NULL - Can choose different plans based on whether a parameter is NULL or not [9]
- Targets the common pattern
- Cardinality estimation feedback for expressions
- Learns from repeated expressions and adjusts CE models over time [9]
- Optimized
sp_executesql- Reduces compilation storms from dynamic SQL [9]
- DOP feedback on by default, Query Store support for readable secondaries, and an
ABORT_QUERY_EXECUTIONhint to block obviously bad queries [1][9]
Example of a pattern OPPO can help with:
CREATE PROCEDURE Sales.GetOrders
@CustomerId int = NULL
AS
BEGIN
SELECT *
FROM Sales.Orders
WHERE (@CustomerId IS NULL OR CustomerId = @CustomerId);
END;
Historically, this can create a single cached plan that works poorly for some parameter combinations. OPPO tries to reduce that problem by handling the NULL and non NULL cases more intelligently.
Best practices
- Set database compatibility level to 170 for databases where you want the SQL Server 2025 IQP features
- Keep Query Store enabled and in
READ_WRITEmode so feedback features can store and reuse what they learn [9] - Still fix obviously bad queries and indexes. IQP is a safety net, not a replacement for tuning
6. Engine, HA, and backup improvements that matter to DBAs
Beyond AI and JSON, SQL Server 2025 has several practical wins that working DBAs will care about. [1][8][10]
Highlights:
- Optimized locking
- TID locking and lock after qualification reduce blocking and lock memory usage
- Tempdb space governance and Accelerated Database Recovery in tempdb
- Helps prevent runaway tempdb usage from taking down the instance
- Persisted statistics on readable secondaries
- Stats survive failover and support read only workloads better
- ZSTD backup compression
- Faster and more effective backup compression, especially for large databases [10]
- Backups to immutable blob storage
- Better ransomware story and long term backup integrity
- Always On improvements
- Async page requests, commit wait tuning, TLS 1.3 with TDS 8.0, and better routing controls [1][10]
Why it matters
- You get better concurrency and fewer tempdb incidents without changing application code
- HA and DR setups become more resilient and more secure out of the box
- Backup performance and storage costs improve without third party tools
Best practices
- Test optimized locking under your workload. Locking behavior changes can have edge cases
- Put tempdb governance in place in lower environments first so developers see problems early
- Move backup targets to immutable storage where possible to improve your ransomware recovery posture
7. Security and hybrid integration
Security and hybrid stories continue to evolve in SQL Server 2025. [1][8][10]
Key changes:
- Password hashing now uses PBKDF2 by default with NIST aligned settings
- More fine grained security cache invalidation
- TLS 1.3 with TDS 8.0 in more paths (Always On, log shipping, tools, PolyBase)
- Support for Entra based managed identity for outbound connections, Azure Key Vault, and blob backups when Arc enabled
On the hybrid and analytics side:
- SQL Server 2025 can mirror databases into Microsoft Fabric for near real time analytics with a “zero ETL” style workflow [8][10]
Why it matters
- Better default security posture and fewer secrets to manage manually
- Cleaner story for “operational data here, analytics over there” without building a lot of ETL plumbing yourself
8. How to adopt SQL Server 2025 safely
8.1 What to try early
If you are already on 2019 or 2022, these are relatively low risk, high value areas to explore first:
- IQP enhancements (OPPO, CE feedback for expressions, optimized
sp_executesql) - Persisted stats and Query Store on readable secondaries
- ZSTD backup compression and backups to immutable blob
- Regex and fuzzy matching for data quality, deduping, and string heavy logic
If you are exploring AI:
VECTORtype and basic vector search on a small, non critical workload- External models and
AI_*functions in a lab or proof of concept
8.2 When to hold back
- Do not move mission critical OLTP workloads to preview builds without a clear rollback plan
- Avoid hard coupling to features gated behind
PREVIEW_FEATURES = ONuntil GA cements syntax and behavior - Be cautious before putting vector search or REST calls into latency sensitive code paths until you fully understand performance and failure modes
9. Step by step workshop: exploring key SQL Server 2025 features
Here is a hands on lab you can run on a dev box or VM with SQL Server 2025 Preview. The goal is to touch the big areas: JSON, vectors, AI integration, IQP, and backups.
Prerequisites
- SQL Server 2025 Preview (RC1 or later) installed [1]
- SSMS 21 or Azure Data Studio updated for TDS 8.0 and new syntax [2]
- A database called
Sandbox2025(create it if needed)
Step 1 – Enable preview features where needed
USE Sandbox2025;
GO
ALTER DATABASE SCOPED CONFIGURATION
SET PREVIEW_FEATURES = ON;
GO
Why
Some features such as float16 vectors and certain AI capabilities are gated behind PREVIEW_FEATURES. [3]
Why not always on in production
Preview features can change or even be removed. You do not want that surprise in a production environment.
Step 2 – Create a JSON backed table and JSON index
DROP TABLE IF EXISTS Sales.Orders;
GO
CREATE SCHEMA Sales AUTHORIZATION dbo;
GO
CREATE TABLE Sales.Orders
(
OrderId int IDENTITY(1,1) PRIMARY KEY,
CustomerId int NOT NULL,
OrderJson JSON NOT NULL
);
GO
INSERT INTO Sales.Orders (CustomerId, OrderJson)
VALUES
(1, N'{"OrderNumber":"SO-001","Total":120.50,"Status":"Completed"}'),
(2, N'{"OrderNumber":"SO-002","Total":87.99,"Status":"Pending"}'),
(1, N'{"OrderNumber":"SO-003","Total":340.10,"Status":"Completed"}');
GO
CREATE JSON INDEX IX_Orders_OrderJson
ON Sales.Orders(OrderJson)
FOR ('$.OrderNumber', '$.Status');
GO
Query it:
SELECT OrderId,
JSON_VALUE(OrderJson, '$.OrderNumber') AS OrderNumber,
JSON_VALUE(OrderJson, '$.Status') AS Status
FROM Sales.Orders
WHERE JSON_VALUE(OrderJson, '$.Status') = 'Completed';
Why
You see how the JSON type plus JSON index behaves as a semi structured extension to a relational table. [7][8]
Why not
Check the execution plan. JSON indexes still have a cost. In some workloads, classic normalization and relational indexes will perform better.
Step 3 – Play with regex and fuzzy matching
DROP TABLE IF EXISTS dbo.Contacts;
GO
CREATE TABLE dbo.Contacts
(
ContactId int IDENTITY(1,1) PRIMARY KEY,
FullName nvarchar(200),
Email nvarchar(320)
);
INSERT INTO dbo.Contacts (FullName, Email)
VALUES
(N'Jane Smith', N'jane.smith@example.com'),
(N'Jane Smyth', N'jane.smyth@example.com'),
(N'John Smith', N'john.smith@other.org'),
(N'Invalid User',N'bad-email');
GO
-- Find invalid emails using regex
SELECT *
FROM dbo.Contacts
WHERE NOT REGEXP_LIKE(
Email,
'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
);
-- Find similar names using fuzzy matching
SELECT c1.FullName, c2.FullName,
EDIT_DISTANCE_SIMILARITY(c1.FullName, c2.FullName) AS Similarity
FROM dbo.Contacts c1
JOIN dbo.Contacts c2
ON c1.ContactId < c2.ContactId
WHERE EDIT_DISTANCE_SIMILARITY(c1.FullName, c2.FullName) >= 85;
Why
This replaces a lot of custom CLR or app side matching you may have been doing. [2]
Why not
Do not run expensive fuzzy checks on very large tables inside your OLTP path without proper indexing, batching, and testing.
Step 4 – Create a vector table and store embeddings
For this step, assume you are getting embeddings from an external process (Python, Azure Function, background job).
DROP TABLE IF EXISTS dbo.HelpArticles;
GO
CREATE TABLE dbo.HelpArticles
(
ArticleId int IDENTITY(1,1) PRIMARY KEY,
Title nvarchar(200),
Body nvarchar(max),
Embedding VECTOR(1536) NOT NULL
);
GO
INSERT INTO dbo.HelpArticles (Title, Body, Embedding)
VALUES
(
N'How to reset your password',
N'If you forgot your password, click Forgot Password on the sign in page...',
'[0.001, -0.23, 0.98, ... ]'
),
(
N'How to change your email address',
N'You can update your email from the profile settings page...',
'[0.004, -0.19, 0.87, ... ]'
);
Replace the ellipsis with real embeddings from your model.
Later, you can create a vector index and use VECTOR_SEARCH to find nearest neighbors.
Why
You get a feel for schema design and storage footprint of vector data in your own environment. [3][4]
Why not
Until you have a real embedding pipeline and model strategy, treat this as a lab only. Do not start storing fake vectors in production.
Step 5 – Simulate OPPO and IQP behavior
Create a classic “optional filter” pattern:
DROP TABLE IF EXISTS Sales.OrderLines;
GO
CREATE TABLE Sales.OrderLines
(
OrderLineId int IDENTITY(1,1) PRIMARY KEY,
ProductId int NOT NULL,
Quantity int NOT NULL,
Price money NOT NULL,
CreatedDate datetime2 NOT NULL DEFAULT sysutcdatetime()
);
-- Seed sample data
INSERT INTO Sales.OrderLines (ProductId, Quantity, Price)
SELECT TOP (50000)
ABS(CHECKSUM(NEWID())) % 5000 + 1,
1 + ABS(CHECKSUM(NEWID())) % 5,
10.00 + ABS(CHECKSUM(NEWID())) % 100
FROM sys.all_objects a CROSS JOIN sys.all_objects b;
GO
CREATE OR ALTER PROCEDURE Sales.GetOrderLines
@ProductId int = NULL
AS
BEGIN
SELECT *
FROM Sales.OrderLines
WHERE (@ProductId IS NULL OR ProductId = @ProductId);
END;
GO
Now:
- Set database compatibility level to 170 for
Sandbox2025 - Call the procedure many times with a mix of NULL and non NULL
@ProductIdvalues - Inspect Query Store and plans to see how OPPO behaves
Why
You see how SQL Server 2025 tries to fix the “one bad plan for everyone” parameter pattern. [9]
Why not
You still want to refactor obviously problematic procedures. Sometimes splitting procedures or using different patterns is cleaner than relying entirely on OPPO.
Step 6 – Test ZSTD backup compression
BACKUP DATABASE Sandbox2025
TO DISK = 'C:\Backups\Sandbox2025_zstd.bak'
WITH COMPRESSION,
COMPRESSION_ALGORITHM = 'ZSTD';
Compare file size and backup time against:
COMPRESSION_ALGORITHM = 'MS_XPRESS'- A backup without compression
Why
ZSTD often gives better compression ratios with solid throughput, which matters a lot for large databases and constrained storage. [10]
Why not
On very CPU bound instances, backup compression can be a noticeable CPU consumer. Test before you standardize and monitor CPU impact during backups.
10. Final thoughts
SQL Server 2025 is the first release in a while that is hard to ignore if you care about:
- AI and semantic search directly on your operational data
- JSON heavy or event driven applications
- Squeezing more throughput and stability out of existing workloads without rewriting everything
For DBAs and developers, a sensible approach is:
- Adopt conservatively. Start with dev and test, then greenfield workloads, then selective migrations
- Focus on “no code change” wins first. Take advantage of IQP, backup, HA, and security improvements
- Treat AI and vector features as a separate track. Bring in app teams, security, and architecture early and plan around costs
Used well, SQL Server 2025 is a strong step toward an AI ready, hybrid data platform. Used carelessly, it is just another preview build with sharp edges. The difference is in how thoughtfully you choose your first experiments.
References
[1] Microsoft Learn, “What is new in SQL Server 2025 (Preview)”
[2] Microsoft Learn, “T SQL language enhancements in SQL Server 2025”
[3] Microsoft Learn, “Vector data type and vector search in SQL Server 2025”
[4] Microsoft Learn, “AI integration and external models in SQL Server”
[5] Microsoft Learn, “AI functions for embeddings and chunking in SQL Server”
[6] Microsoft Learn, “Native JSON type in SQL Server 2025”
[7] Microsoft Tech Community, “Introducing JSON indexes in SQL Server 2025”
[8] MSSQLTips, “Working with JSON type and JSON indexes in SQL Server 2025”
[9] Microsoft Learn, “Intelligent Query Processing enhancements in SQL Server 2025”
[10] MSSQLTips, “Backup, compression, and HA improvements in SQL Server 2025”
Discover more from SQLYARD
Subscribe to get the latest posts sent to your email.


