SQL Server 2025: New Features You Actually Need To Care About

SQL Server 2025 (17.x) is the first release in a long time that feels like more than “2022 with patches.” It is aimed at AI-heavy, hybrid, and event driven workloads while still giving DBAs and developers practical wins around JSON, performance, security, and HA.

At the time of writing, SQL Server 2025 is still in preview, so features and syntax can change before GA. Always validate against the latest documentation before using anything in production. [1]

If you are coming from SQL Server 2019 or 2022, this is the first version that really expects you to:

Store and search vector embeddings for semantic and AI scenarios
Treat JSON as a first class type, not just nvarchar(max) with string hacks
Rely more on Intelligent Query Processing for plan quality and feedback
Use built in features for security, compression, and hybrid analytics instead of rolling your own

This post walks through the major new features you should actually care about and ends with a hands on workshop you can run in a lab.

1. Big picture: What is SQL Server 2025 trying to be

SQL Server 2025 is positioned as an AI ready, hybrid, event driven database engine. In practical terms that means:

Built in vector search and AI integration
Native JSON and JSON indexes
Better query performance with minimal code changes
Stronger security and hybrid / Fabric integration [1][2]

Official messaging focuses on:

Vector capabilities and external AI models
Native JSON storage and indexing
Change event streaming and better integration with cloud platforms
A big batch of Intelligent Query Processing (IQP) enhancements [1][3]

If you are used to bolting AI or semi structured workloads around SQL Server, 2025 is the first release that encourages you to bring a lot of that logic into the engine instead.

2. AI and vector features

2.1 Vector data type and vector search

What it is

SQL Server 2025 adds a VECTOR data type designed to store embeddings for semantic search and AI workloads. Vectors are stored in a compact binary format but exposed as JSON arrays for convenience. [3]

Example table:

CREATE TABLE dbo.Documents
(
    DocumentId     int IDENTITY(1,1) PRIMARY KEY,
    Title          nvarchar(200),
    Content        nvarchar(max),
    Embedding      VECTOR(1536) NOT NULL
);

You can insert embeddings as JSON arrays:

INSERT INTO dbo.Documents (Title, Content, Embedding)
VALUES
(
    N'Bike policy',
    N'Document that explains bike and scooter policy...',
    '[0.0123, -0.22, 0.88, ...]'  -- embedding from an external model
);

Vectors support float32 by default and optional float16 for more compact storage. The float16 option is gated behind PREVIEW_FEATURES. [3]

ALTER DATABASE SCOPED CONFIGURATION
SET PREVIEW_FEATURES = ON;
GO

ALTER TABLE dbo.Documents
ADD Embedding16 VECTOR(1536, float16) NULL;

On top of the data type, SQL Server 2025 adds:

VECTOR_DISTANCE, VECTOR_NORM, VECTOR_NORMALIZE, VECTORPROPERTY
Vector indexes (approximate nearest neighbor, DiskANN under the covers)
VECTOR_SEARCH to run similarity lookups efficiently [3][4]

Why it matters

You can build semantic search, recommendations, or RAG style apps without a separate vector database
Your AI data stays under the same security, backup, and HA story as your existing relational data
Drivers like Microsoft.Data.SqlClient and the TDS protocol are being updated to handle vector data efficiently [3]

Why not to overuse it

Embeddings are large. High dimensional vectors can bloat tables, backups, and log shipping if you are not careful
Similarity search is approximate. It is not a strict replacement for equality predicates when you need exact results
Dimension counts and base types are part of your schema. Changing them later is painful

Best practices

Keep embeddings in dedicated tables (for example dbo.DocumentEmbeddings) instead of mixing them into lots of business tables
Start with VECTOR(n) using float32 and move to float16 only after you understand the precision and quality trade offs
Use separate filegroups for large vector data and indexes so you can manage growth and IO more predictably

2.2 External AI models and AI T-SQL functions

SQL Server 2025 lets you register AI models as first class objects and call them from T SQL by exposing external REST endpoints. [4][5]

Key pieces:

CREATE EXTERNAL MODEL stores model metadata (location, provider, credentials)
AI_GENERATE_EMBEDDINGS creates embeddings from text
AI_GENERATE_CHUNKS splits content into chunks for RAG style scenarios

Example:

CREATE EXTERNAL MODEL MyEmbeddingsModel
WITH
(
    LOCATION   = 'https://my-openai-endpoint.azure.com',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = EMBEDDINGS,
    MODEL      = 'text-embedding-3-large',
    CREDENTIAL = MyAiCredential
);
GO

SELECT AI_GENERATE_EMBEDDINGS(
           MODEL = 'MyEmbeddingsModel',
           INPUT_TEXT = Content
       ) AS Embedding
FROM dbo.Documents;

Why it matters

You can keep AI calls inside the database, close to the data
T SQL can orchestrate a full RAG pipeline: chunking, embeddings, vector storage, and vector search in the same place [5]

Why not to use it blindly

You still pay for the external AI endpoint. SQL Server is just fronting it
Calling external REST endpoints from a hot OLTP path adds latency, failure modes, and rate limiting concerns

Best practices

Use external models in offline or batch pipelines to pre compute embeddings rather than calling them for every user request
If you self host models (Ollama, ONNX, custom containers), size CPU or GPU and network correctly, and monitor latency and errors separately from SQL Server [5]

3. JSON gets serious: native JSON type and JSON index

3.1 Native JSON type

SQL Server 2025 adds a native JSON data type instead of treating JSON as plain nvarchar(max). It integrates with existing JSON functions and gives you better storage and indexing options. [6]

Example:

CREATE TABLE Sales.Orders
(
    OrderId    int IDENTITY(1,1) PRIMARY KEY,
    CustomerId int NOT NULL,
    OrderData  JSON NOT NULL  -- native type
);

You still use JSON_VALUE, JSON_QUERY, JSON_MODIFY, and OPENJSON against this column.

3.2 JSON index

On top of that, SQL Server 2025 introduces a JSON index on JSON columns that accelerates common JSON queries without forcing you to create a pile of computed columns. [7][8]

Basic syntax (simplified):

CREATE JSON INDEX IX_Orders_OrderData
ON Sales.Orders(OrderData)
FOR ('$.Customer.Name', '$.Items[*].ProductId');

JSON indexes support JSON functions like JSON_VALUE, JSON_PATH_EXISTS, and JSON_CONTAINS.

Why it matters

You can treat JSON as a first class citizen without giving up query performance
You avoid “computed column hell” for every path you want to query
It matches what modern APIs and message payloads usually look like

Why not to replace everything with JSON

JSON is still semi structured. Classic relational modeling is easier to enforce and reason about for core entities
JSON indexes have limitations. For example, you need a clustered primary key, one JSON index per JSON column, and there are restrictions with views and memory optimized tables [8]
Deep and heavily nested JSON can still be expensive to query and maintain

Best practices

Use JSON for:
- Flexible attributes
- Integration payloads
- Things that change shape often
Keep core entities relational and use JSON as an extension, not a default
Index only the paths you actually query. Each additional path has a space and maintenance cost

4. Developer productivity: regex, fuzzy matching, REST, and more

SQL Server 2025 brings a nice set of developer centric T SQL features that a lot of people have wanted for years. [2][3]

4.1 Regular expressions

You finally get native regex support:

REGEXP_LIKE
REGEXP_REPLACE
REGEXP_SUBSTR
REGEXP_INSTR
REGEXP_COUNT
REGEXP_MATCHES
REGEXP_SPLIT_TO_TABLE [2]

Example:

SELECT *
FROM dbo.Customers
WHERE REGEXP_LIKE(EmailAddress, '^[A-Za-z0-9._%+-]+@example\.com$');

Splitting a string into rows:

SELECT value
FROM REGEXP_SPLIT_TO_TABLE('red, green;blue', '[,;]\s*');

4.2 Fuzzy string matching

New built in functions remove the need for custom CLR or awkward T SQL hacks:

EDIT_DISTANCE
EDIT_DISTANCE_SIMILARITY
JARO_WINKLER_DISTANCE
JARO_WINKLER_SIMILARITY [2]

Example:

SELECT TOP (10)
       c1.Name,
       c2.Name,
       EDIT_DISTANCE_SIMILARITY(c1.Name, c2.Name) AS Similarity
FROM dbo.Customers c1
JOIN dbo.Customers c2
  ON c1.CustomerId < c2.CustomerId
WHERE EDIT_DISTANCE_SIMILARITY(c1.Name, c2.Name) >= 90;

4.3 REST endpoint invocation

sys.sp_invoke_external_rest_endpoint lets SQL Server call REST or GraphQL endpoints directly: [3]

Trigger Azure Functions
Update Power BI
Call Azure OpenAI or internal APIs

Example:

EXEC sys.sp_invoke_external_rest_endpoint
    @method  = 'POST',
    @url     = N'https://example.com/api/notify',
    @headers = N'{"Content-Type": "application/json"}',
    @payload = JSON_OBJECT('orderId': 123, 'status': 'Shipped');

Why this is useful

Eliminates simple “middle tier glue” for small integration tasks
Lets you prototype modern event driven flows in T SQL

Why not to overdo it

SQL Server is not an ESB. Heavy fan out or complex workflows still belong in app code or dedicated services
Network latency and external failures can slow or break queries if you call REST endpoints in hot paths

5. Query performance: new Intelligent Query Processing (IQP) tricks

SQL Server 2025 continues the Intelligent Query Processing story and adds several interesting improvements. [1][9]

Key additions:

Optional Parameter Plan Optimization (OPPO)
- Targets the common pattern WHERE Column = @p OR @p IS NULL
- Can choose different plans based on whether a parameter is NULL or not [9]
Cardinality estimation feedback for expressions
- Learns from repeated expressions and adjusts CE models over time [9]
Optimized sp_executesql
- Reduces compilation storms from dynamic SQL [9]
DOP feedback on by default, Query Store support for readable secondaries, and an ABORT_QUERY_EXECUTION hint to block obviously bad queries [1][9]

Example of a pattern OPPO can help with:

CREATE PROCEDURE Sales.GetOrders
    @CustomerId int = NULL
AS
BEGIN
    SELECT *
    FROM Sales.Orders
    WHERE (@CustomerId IS NULL OR CustomerId = @CustomerId);
END;

Historically, this can create a single cached plan that works poorly for some parameter combinations. OPPO tries to reduce that problem by handling the NULL and non NULL cases more intelligently.

Best practices

Set database compatibility level to 170 for databases where you want the SQL Server 2025 IQP features
Keep Query Store enabled and in READ_WRITE mode so feedback features can store and reuse what they learn [9]
Still fix obviously bad queries and indexes. IQP is a safety net, not a replacement for tuning

6. Engine, HA, and backup improvements that matter to DBAs

Beyond AI and JSON, SQL Server 2025 has several practical wins that working DBAs will care about. [1][8][10]

Highlights:

Optimized locking
- TID locking and lock after qualification reduce blocking and lock memory usage
Tempdb space governance and Accelerated Database Recovery in tempdb
- Helps prevent runaway tempdb usage from taking down the instance
Persisted statistics on readable secondaries
- Stats survive failover and support read only workloads better
ZSTD backup compression
- Faster and more effective backup compression, especially for large databases [10]
Backups to immutable blob storage
- Better ransomware story and long term backup integrity
Always On improvements
- Async page requests, commit wait tuning, TLS 1.3 with TDS 8.0, and better routing controls [1][10]

Why it matters

You get better concurrency and fewer tempdb incidents without changing application code
HA and DR setups become more resilient and more secure out of the box
Backup performance and storage costs improve without third party tools

Best practices

Test optimized locking under your workload. Locking behavior changes can have edge cases
Put tempdb governance in place in lower environments first so developers see problems early
Move backup targets to immutable storage where possible to improve your ransomware recovery posture

7. Security and hybrid integration

Security and hybrid stories continue to evolve in SQL Server 2025. [1][8][10]

Key changes:

Password hashing now uses PBKDF2 by default with NIST aligned settings
More fine grained security cache invalidation
TLS 1.3 with TDS 8.0 in more paths (Always On, log shipping, tools, PolyBase)
Support for Entra based managed identity for outbound connections, Azure Key Vault, and blob backups when Arc enabled

On the hybrid and analytics side:

SQL Server 2025 can mirror databases into Microsoft Fabric for near real time analytics with a “zero ETL” style workflow [8][10]

Why it matters

Better default security posture and fewer secrets to manage manually
Cleaner story for “operational data here, analytics over there” without building a lot of ETL plumbing yourself

8. How to adopt SQL Server 2025 safely

8.1 What to try early

If you are already on 2019 or 2022, these are relatively low risk, high value areas to explore first:

IQP enhancements (OPPO, CE feedback for expressions, optimized sp_executesql)
Persisted stats and Query Store on readable secondaries
ZSTD backup compression and backups to immutable blob
Regex and fuzzy matching for data quality, deduping, and string heavy logic

If you are exploring AI:

VECTOR type and basic vector search on a small, non critical workload
External models and AI_* functions in a lab or proof of concept

8.2 When to hold back

Do not move mission critical OLTP workloads to preview builds without a clear rollback plan
Avoid hard coupling to features gated behind PREVIEW_FEATURES = ON until GA cements syntax and behavior
Be cautious before putting vector search or REST calls into latency sensitive code paths until you fully understand performance and failure modes

9. Step by step workshop: exploring key SQL Server 2025 features

Here is a hands on lab you can run on a dev box or VM with SQL Server 2025 Preview. The goal is to touch the big areas: JSON, vectors, AI integration, IQP, and backups.

Prerequisites

SQL Server 2025 Preview (RC1 or later) installed [1]
SSMS 21 or Azure Data Studio updated for TDS 8.0 and new syntax [2]
A database called Sandbox2025 (create it if needed)

Step 1 – Enable preview features where needed

USE Sandbox2025;
GO

ALTER DATABASE SCOPED CONFIGURATION
SET PREVIEW_FEATURES = ON;
GO

Why

Some features such as float16 vectors and certain AI capabilities are gated behind PREVIEW_FEATURES. [3]

Why not always on in production

Preview features can change or even be removed. You do not want that surprise in a production environment.

Step 2 – Create a JSON backed table and JSON index

DROP TABLE IF EXISTS Sales.Orders;
GO

CREATE SCHEMA Sales AUTHORIZATION dbo;
GO

CREATE TABLE Sales.Orders
(
    OrderId     int IDENTITY(1,1) PRIMARY KEY,
    CustomerId  int NOT NULL,
    OrderJson   JSON NOT NULL
);
GO

INSERT INTO Sales.Orders (CustomerId, OrderJson)
VALUES
(1, N'{"OrderNumber":"SO-001","Total":120.50,"Status":"Completed"}'),
(2, N'{"OrderNumber":"SO-002","Total":87.99,"Status":"Pending"}'),
(1, N'{"OrderNumber":"SO-003","Total":340.10,"Status":"Completed"}');
GO

CREATE JSON INDEX IX_Orders_OrderJson
ON Sales.Orders(OrderJson)
FOR ('$.OrderNumber', '$.Status');
GO

Query it:

SELECT OrderId,
       JSON_VALUE(OrderJson, '$.OrderNumber') AS OrderNumber,
       JSON_VALUE(OrderJson, '$.Status')      AS Status
FROM Sales.Orders
WHERE JSON_VALUE(OrderJson, '$.Status') = 'Completed';

Why

You see how the JSON type plus JSON index behaves as a semi structured extension to a relational table. [7][8]

Why not

Check the execution plan. JSON indexes still have a cost. In some workloads, classic normalization and relational indexes will perform better.

Step 3 – Play with regex and fuzzy matching

DROP TABLE IF EXISTS dbo.Contacts;
GO

CREATE TABLE dbo.Contacts
(
    ContactId   int IDENTITY(1,1) PRIMARY KEY,
    FullName    nvarchar(200),
    Email       nvarchar(320)
);

INSERT INTO dbo.Contacts (FullName, Email)
VALUES
(N'Jane Smith',  N'jane.smith@example.com'),
(N'Jane Smyth',  N'jane.smyth@example.com'),
(N'John Smith',  N'john.smith@other.org'),
(N'Invalid User',N'bad-email');
GO

-- Find invalid emails using regex
SELECT *
FROM dbo.Contacts
WHERE NOT REGEXP_LIKE(
    Email,
    '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
);

-- Find similar names using fuzzy matching
SELECT c1.FullName, c2.FullName,
       EDIT_DISTANCE_SIMILARITY(c1.FullName, c2.FullName) AS Similarity
FROM dbo.Contacts c1
JOIN dbo.Contacts c2
  ON c1.ContactId < c2.ContactId
WHERE EDIT_DISTANCE_SIMILARITY(c1.FullName, c2.FullName) >= 85;

Why

This replaces a lot of custom CLR or app side matching you may have been doing. [2]

Why not

Do not run expensive fuzzy checks on very large tables inside your OLTP path without proper indexing, batching, and testing.

Step 4 – Create a vector table and store embeddings

For this step, assume you are getting embeddings from an external process (Python, Azure Function, background job).

DROP TABLE IF EXISTS dbo.HelpArticles;
GO

CREATE TABLE dbo.HelpArticles
(
    ArticleId   int IDENTITY(1,1) PRIMARY KEY,
    Title       nvarchar(200),
    Body        nvarchar(max),
    Embedding   VECTOR(1536) NOT NULL
);
GO

INSERT INTO dbo.HelpArticles (Title, Body, Embedding)
VALUES
(
    N'How to reset your password',
    N'If you forgot your password, click Forgot Password on the sign in page...',
    '[0.001, -0.23, 0.98, ... ]'
),
(
    N'How to change your email address',
    N'You can update your email from the profile settings page...',
    '[0.004, -0.19, 0.87, ... ]'
);

Replace the ellipsis with real embeddings from your model.

Later, you can create a vector index and use VECTOR_SEARCH to find nearest neighbors.

Why

You get a feel for schema design and storage footprint of vector data in your own environment. [3][4]

Why not

Until you have a real embedding pipeline and model strategy, treat this as a lab only. Do not start storing fake vectors in production.

Step 5 – Simulate OPPO and IQP behavior

Create a classic “optional filter” pattern:

DROP TABLE IF EXISTS Sales.OrderLines;
GO

CREATE TABLE Sales.OrderLines
(
    OrderLineId  int IDENTITY(1,1) PRIMARY KEY,
    ProductId    int NOT NULL,
    Quantity     int NOT NULL,
    Price        money NOT NULL,
    CreatedDate  datetime2 NOT NULL DEFAULT sysutcdatetime()
);

-- Seed sample data
INSERT INTO Sales.OrderLines (ProductId, Quantity, Price)
SELECT TOP (50000)
       ABS(CHECKSUM(NEWID())) % 5000 + 1,
       1 + ABS(CHECKSUM(NEWID())) % 5,
       10.00 + ABS(CHECKSUM(NEWID())) % 100
FROM sys.all_objects a CROSS JOIN sys.all_objects b;
GO

CREATE OR ALTER PROCEDURE Sales.GetOrderLines
    @ProductId int = NULL
AS
BEGIN
    SELECT *
    FROM Sales.OrderLines
    WHERE (@ProductId IS NULL OR ProductId = @ProductId);
END;
GO

Now:

Set database compatibility level to 170 for Sandbox2025
Call the procedure many times with a mix of NULL and non NULL @ProductId values
Inspect Query Store and plans to see how OPPO behaves

Why

You see how SQL Server 2025 tries to fix the “one bad plan for everyone” parameter pattern. [9]

Why not

You still want to refactor obviously problematic procedures. Sometimes splitting procedures or using different patterns is cleaner than relying entirely on OPPO.

Step 6 – Test ZSTD backup compression

BACKUP DATABASE Sandbox2025
TO DISK = 'C:\Backups\Sandbox2025_zstd.bak'
WITH COMPRESSION,
     COMPRESSION_ALGORITHM = 'ZSTD';

Compare file size and backup time against:

COMPRESSION_ALGORITHM = 'MS_XPRESS'
A backup without compression

Why

ZSTD often gives better compression ratios with solid throughput, which matters a lot for large databases and constrained storage. [10]

Why not

On very CPU bound instances, backup compression can be a noticeable CPU consumer. Test before you standardize and monitor CPU impact during backups.

10. Final thoughts

SQL Server 2025 is the first release in a while that is hard to ignore if you care about:

AI and semantic search directly on your operational data
JSON heavy or event driven applications
Squeezing more throughput and stability out of existing workloads without rewriting everything

For DBAs and developers, a sensible approach is:

Adopt conservatively. Start with dev and test, then greenfield workloads, then selective migrations
Focus on “no code change” wins first. Take advantage of IQP, backup, HA, and security improvements
Treat AI and vector features as a separate track. Bring in app teams, security, and architecture early and plan around costs

Used well, SQL Server 2025 is a strong step toward an AI ready, hybrid data platform. Used carelessly, it is just another preview build with sharp edges. The difference is in how thoughtfully you choose your first experiments.

References

[1] Microsoft Learn, “What is new in SQL Server 2025 (Preview)”
[2] Microsoft Learn, “T SQL language enhancements in SQL Server 2025”
[3] Microsoft Learn, “Vector data type and vector search in SQL Server 2025”
[4] Microsoft Learn, “AI integration and external models in SQL Server”
[5] Microsoft Learn, “AI functions for embeddings and chunking in SQL Server”
[6] Microsoft Learn, “Native JSON type in SQL Server 2025”
[7] Microsoft Tech Community, “Introducing JSON indexes in SQL Server 2025”
[8] MSSQLTips, “Working with JSON type and JSON indexes in SQL Server 2025”
[9] Microsoft Learn, “Intelligent Query Processing enhancements in SQL Server 2025”
[10] MSSQLTips, “Backup, compression, and HA improvements in SQL Server 2025”

Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

SQL Server 2025: New Features You Actually Need To Care About

1. Big picture: What is SQL Server 2025 trying to be

2. AI and vector features

2.1 Vector data type and vector search

2.2 External AI models and AI T-SQL functions

3. JSON gets serious: native JSON type and JSON index

3.1 Native JSON type

3.2 JSON index

4. Developer productivity: regex, fuzzy matching, REST, and more

4.1 Regular expressions

4.2 Fuzzy string matching

4.3 REST endpoint invocation

5. Query performance: new Intelligent Query Processing (IQP) tricks

6. Engine, HA, and backup improvements that matter to DBAs

7. Security and hybrid integration

8. How to adopt SQL Server 2025 safely

8.1 What to try early

8.2 When to hold back

9. Step by step workshop: exploring key SQL Server 2025 features

Prerequisites

Step 1 – Enable preview features where needed

Step 2 – Create a JSON backed table and JSON index

Step 3 – Play with regex and fuzzy matching

Step 4 – Create a vector table and store embeddings

Step 5 – Simulate OPPO and IQP behavior

Step 6 – Test ZSTD backup compression

10. Final thoughts

References

Like this:

Related

Discover more from SQLYARD

Leave a ReplyCancel reply

Sign up to receive email updates, fresh news and more!

1. Big picture: What is SQL Server 2025 trying to be

2. AI and vector features

2.1 Vector data type and vector search

2.2 External AI models and AI T-SQL functions

3. JSON gets serious: native JSON type and JSON index

3.1 Native JSON type

3.2 JSON index

4. Developer productivity: regex, fuzzy matching, REST, and more

4.1 Regular expressions

4.2 Fuzzy string matching

4.3 REST endpoint invocation

5. Query performance: new Intelligent Query Processing (IQP) tricks

6. Engine, HA, and backup improvements that matter to DBAs

7. Security and hybrid integration

8. How to adopt SQL Server 2025 safely

8.1 What to try early

8.2 When to hold back

9. Step by step workshop: exploring key SQL Server 2025 features

Prerequisites

Step 1 – Enable preview features where needed

Step 2 – Create a JSON backed table and JSON index

Step 3 – Play with regex and fuzzy matching

Step 4 – Create a vector table and store embeddings

Step 5 – Simulate OPPO and IQP behavior

Step 6 – Test ZSTD backup compression

10. Final thoughts

References

Share this:

Like this:

Related

Discover more from SQLYARD

Related Posts

Leave a ReplyCancel reply

Discover more from SQLYARD