SQL Server 2025 makes “AI next to the data” practical by adding native vector storage, vector indexing and search, and first-class hooks to embedding endpoints such as Azure OpenAI through CREATE EXTERNAL MODEL. This lets you build semantic search and Retrieval-Augmented Generation (RAG) without standing up a separate vector database. [1] [2]
Below is an end-to-end walkthrough you can adapt for your own apps.
Prereqs (keep it simple)
You will need:
- SQL Server 2025 (17.x) and SSMS (optionally SSMS 21 for Copilot features). [3] [4]
- An embeddings endpoint:
  - Azure OpenAI (recommended for production) using Managed Identity, or
  - A local dev endpoint like Ollama. [5]
- Basic familiarity with JSON, REST concepts, and T-SQL. (Matches the module’s prereqs.) [1]
Architecture you are building
- Store documents (or rows of text) in a table.
- Chunk that text for better retrieval using AI_GENERATE_CHUNKS. [6]
- Generate embeddings into a VECTOR(n) column using AI_GENERATE_EMBEDDINGS backed by an EXTERNAL MODEL. [7] [5]
- Create a vector index using CREATE VECTOR INDEX. [8]
- Run semantic search:
  - Exact scoring with VECTOR_DISTANCE (no index), or
  - Approximate nearest neighbor search with VECTOR_SEARCH (uses the vector index). [9] [10]
- For RAG, retrieve top chunks in SQL, then pass those chunks to your app framework (LangChain or Semantic Kernel) to call the chat model. [1]
Step 1: Create a demo database and tables
This example uses a simple “knowledge base” table. You can swap in product docs, tickets, SOPs, student curriculum notes, or anything else.
CREATE DATABASE AIDemo;
GO
USE AIDemo;
GO
CREATE TABLE dbo.Documents
(
DocumentId int IDENTITY(1,1) PRIMARY KEY,
Title nvarchar(200) NOT NULL,
Body nvarchar(max) NOT NULL
);
CREATE TABLE dbo.DocumentChunks
(
ChunkId bigint IDENTITY(1,1) PRIMARY KEY,
DocumentId int NOT NULL,
ChunkText nvarchar(max) NOT NULL,
-- Use the embedding dimension your model returns (example: 1536, 1024, 768, etc.)
Embedding vector(1536) NULL,
CONSTRAINT FK_DocumentChunks_Documents
FOREIGN KEY (DocumentId) REFERENCES dbo.Documents(DocumentId)
);
GO
Tip: The vector datatype is native in SQL Server 2025 and stored efficiently while still being easy to work with. [2] [11]
Step 2: Load a little sample data
INSERT dbo.Documents (Title, Body)
VALUES
(N'League Rules', N'Teams register in advance. Match check in closes 30 minutes prior. Good sportsmanship is required.'),
(N'Coaching Notes', N'Focus on crosshair placement, movement discipline, and callouts. Review VODs weekly.'),
(N'Parent FAQ', N'Programs are coach led. Players improve teamwork, leadership, and communication skills.');
GO
Step 3: Chunk your text with AI_GENERATE_CHUNKS
Chunking improves retrieval quality because you search smaller, coherent passages instead of entire documents. SQL Server 2025 provides AI_GENERATE_CHUNKS as a table valued function. [6]
INSERT dbo.DocumentChunks (DocumentId, ChunkText)
SELECT
d.DocumentId,
c.chunk
FROM dbo.Documents AS d
CROSS APPLY AI_GENERATE_CHUNKS
(
source = d.Body,
chunk_type = N'FIXED',
chunk_size = 500,
overlap = 50
) AS c;
GO
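To build intuition for what fixed-size chunking with overlap produces, here is a minimal Python sketch of the same idea. It is a conceptual illustration only; the actual boundaries and options AI_GENERATE_CHUNKS uses may differ, so treat the splitting logic as an assumption.

```python
def fixed_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows that overlap by `overlap` characters,
    mirroring the chunk_size/overlap parameters used in the T-SQL above."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Each window starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters of context.
    return [text[i:i + chunk_size] for i in range(0, len(text), step) if text[i:i + chunk_size]]

body = "Teams register in advance. Match check in closes 30 minutes prior. " * 20
chunks = fixed_chunks(body, chunk_size=500, overlap=50)
```

The overlap matters for retrieval: a sentence that would otherwise be cut at a chunk boundary still appears whole in at least one chunk.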
Step 4: Register your embeddings model with CREATE EXTERNAL MODEL
This is the “secure integration” step. SQL Server 2025 can store an external model definition and then reference it from T-SQL AI functions. [5]
Option A (recommended): Azure OpenAI with Managed Identity
The docs note that SQL Server 2025 requires Azure Arc connection and primary managed identity enabled for this approach. [5]
Use the Microsoft Learn pattern from CREATE EXTERNAL MODEL for an embeddings model. [5]
Option B (easy dev): Local Ollama embeddings
This is straight from the docs example pattern (adjust the URL and model name to your setup; note that all-minilm returns 384-dimensional embeddings, so the dimension of your VECTOR column must match whatever model you actually use). [5]
-- Example dev setup with Ollama embedding endpoint
CREATE EXTERNAL MODEL MyOllamaEmbeddings
AUTHORIZATION dbo
WITH
(
LOCATION = 'https://localhost:11435/api/embed',
API_FORMAT = 'Ollama',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'all-minilm'
);
GO
Step 5: Generate embeddings into your VECTOR column
Now you can generate embeddings directly in SQL with AI_GENERATE_EMBEDDINGS ... USE MODEL .... [7]
UPDATE c
SET Embedding = AI_GENERATE_EMBEDDINGS(c.ChunkText USE MODEL MyOllamaEmbeddings)
FROM dbo.DocumentChunks AS c
WHERE c.Embedding IS NULL;
GO
If your endpoint needs parameters (like dimensions), the function supports passing JSON parameters as shown in the docs. [7]
Step 6: Create a vector index for fast ANN search
CREATE VECTOR INDEX builds an approximate nearest neighbor index so VECTOR_SEARCH can avoid scanning everything. The official docs show DiskANN and metric options like COSINE and DOT. [8]
CREATE VECTOR INDEX IX_DocumentChunks_Embedding
ON dbo.DocumentChunks (Embedding)
WITH (METRIC = 'COSINE', TYPE = 'DISKANN');
GO
To verify the index exists:
SELECT *
FROM sys.vector_indexes
WHERE object_id = OBJECT_ID('dbo.DocumentChunks');
GO
sys.vector_indexes is the catalog view for vector indexes. [12]
Step 7: Semantic search (exact vs indexed)
Exact scoring (no index): VECTOR_DISTANCE
VECTOR_DISTANCE is exact and does not use the vector index. It is great for validation, small datasets, or reranking a small candidate set. [9]
DECLARE @queryText nvarchar(max) = N'How do teams check in for matches?';
DECLARE @queryEmbedding vector(1536) =
AI_GENERATE_EMBEDDINGS(@queryText USE MODEL MyOllamaEmbeddings);
SELECT TOP (5)
c.DocumentId,
c.ChunkText,
VECTOR_DISTANCE('COSINE', c.Embedding, @queryEmbedding) AS Distance
FROM dbo.DocumentChunks AS c
ORDER BY Distance ASC;
GO
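The cosine metric behind that query is simple to verify by hand. This Python sketch computes the same quantity VECTOR_DISTANCE('COSINE', ...) scores with (1 minus cosine similarity) and shows exact search as a full scan followed by an ascending sort; the tiny vectors are made-up sample data.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0.0 for identical directions, up to 2.0 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Exact search = score every row, then order ascending (smaller distance = closer).
chunks = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.9, 0.1]}
query = [1.0, 0.0]
ranking = sorted(chunks, key=lambda k: cosine_distance(chunks[k], query))
```

This is exactly why VECTOR_DISTANCE scales linearly with row count: every embedding is scored. The vector index in the next section exists to avoid that full scan.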
Fast ANN search (uses index): VECTOR_SEARCH
VECTOR_SEARCH is the function designed to use vector indexes for approximate nearest neighbor search. [10] [9]
DECLARE @queryText nvarchar(max) = N'How do teams check in for matches?';
DECLARE @queryEmbedding vector(1536) =
AI_GENERATE_EMBEDDINGS(@queryText USE MODEL MyOllamaEmbeddings);
SELECT
c.DocumentId,
c.ChunkText,
s.distance
FROM VECTOR_SEARCH
(
TABLE = dbo.DocumentChunks AS c,
COLUMN = Embedding,
SIMILAR_TO = @queryEmbedding,
METRIC = 'cosine',
TOP_N = 5
) AS s
ORDER BY s.distance ASC;
GO
The exact argument names can vary by doc version, so always align your call to the current VECTOR_SEARCH reference page. [10]
Step 8: Turn semantic search into RAG
RAG is just two steps:
- Retrieve the best chunks (you already did that with VECTOR_SEARCH).
- Provide those chunks as context to a chat model in your app layer (LangChain or Semantic Kernel), then ask your question. The Microsoft Learn module explicitly calls out building RAG apps using T-SQL plus frameworks like LangChain and Semantic Kernel. [1]
Pattern: a stored procedure that returns “context”
CREATE OR ALTER PROCEDURE dbo.GetRagContext
(
@Question nvarchar(max),
@TopN int = 5
)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @q vector(1536) =
AI_GENERATE_EMBEDDINGS(@Question USE MODEL MyOllamaEmbeddings);
SELECT
c.DocumentId,
c.ChunkText,
s.distance
FROM VECTOR_SEARCH
(
TABLE = dbo.DocumentChunks AS c,
COLUMN = Embedding,
SIMILAR_TO = @q,
METRIC = 'cosine',
TOP_N = @TopN
) AS s
ORDER BY s.distance ASC;
END
GO
Then in your app (LangChain or Semantic Kernel)
High level flow:
- Call dbo.GetRagContext with the user question.
- Concatenate the returned ChunkText values into a context block.
- Send a prompt like: “Answer using only this context…”.
- Call Azure OpenAI chat completion (or your chosen model).
That keeps SQL Server focused on retrieval and your app focused on generation, which is typically the cleanest production split. [1]
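The context-assembly step is plain string work in the app layer. Here is a minimal Python sketch; the function name and prompt wording are illustrative, not from LangChain or Semantic Kernel, and the chunks would come back from dbo.GetRagContext in a real app.

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate retrieved chunks into a grounded prompt for a chat model."""
    # Number the chunks so the model (and you, when debugging) can see
    # which passage each piece of context came from.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only this context. If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "How do teams check in for matches?",
    ["Match check in closes 30 minutes prior.", "Teams register in advance."],
)
```

The “say so if it's not in the context” instruction is the cheap, standard guard against the model answering from its own training data instead of your retrieved chunks.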
Step 9: Build “intelligent apps” with streaming changes (Change Event Streaming)
If your knowledge base changes frequently, you can stream table changes out to event consumers (for example, to trigger re-embedding, refresh caches, or update downstream systems). SQL Server 2025 introduces Change Event Streaming (preview), which can be configured to publish change events to Azure Event Hubs. [13] [14]
At a high level, Microsoft’s configuration steps are:
- Create an Event Hubs namespace and event hub.
- Enable CES on the database.
- Create an event stream group that defines destination and credentials.
- Add tables to the group. [14]
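On the consumer side, the re-embedding trigger reduces to a small dispatch decision per event. This Python sketch is purely illustrative: the event field names ("table", "operation") are assumptions, so align them with the actual CES event payload schema before relying on them.

```python
def handle_change_event(event: dict) -> str:
    """Decide what a downstream consumer should do with a change event.

    Field names are illustrative assumptions, not the real CES schema."""
    if event.get("table") != "dbo.Documents":
        return "ignore"
    op = event.get("operation")
    if op in ("INSERT", "UPDATE"):
        # Body text changed: regenerate chunks and embeddings for this document.
        return "re-embed"
    if op == "DELETE":
        # Source document gone: remove its chunks so stale text stops matching.
        return "purge-chunks"
    return "ignore"
```

The key design point is that the embedding refresh becomes event-driven instead of a scheduled full re-scan of the Documents table.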
Step 10: Optional analytics path with Microsoft Fabric mirroring
If you want near real time analytics without building fragile ETL, Fabric Mirroring can continuously replicate SQL Server 2025 data into Fabric. It requires Azure Arc for SQL Server 2025, and Microsoft provides a tutorial with the required permissions and setup flow. [15] [16]
Step 11: Use Copilot in SSMS (optional but helpful)
Copilot in SSMS can help you draft and refactor T-SQL, and answer questions about your database inside SSMS. The overview and install steps are documented by Microsoft. [3] [4]
References
[1] Microsoft Learn training module: “Build AI powered solutions using SQL Server 2025”.
[2] Microsoft Learn: Vector data type (overview).
[3] Microsoft Learn: Copilot in SQL Server Management Studio overview.
[4] Microsoft Learn: Install Copilot in SSMS.
[5] Microsoft Learn: CREATE EXTERNAL MODEL (Transact-SQL) examples including Azure OpenAI (Managed Identity) and Ollama.
[6] Microsoft Learn: AI_GENERATE_CHUNKS (Transact-SQL).
[7] Microsoft Learn: AI_GENERATE_EMBEDDINGS (Transact-SQL) examples and patterns.
[8] Microsoft Learn: CREATE VECTOR INDEX (Transact-SQL).
[9] Microsoft Learn: VECTOR_DISTANCE (Transact-SQL) behavior and index note.
[10] Microsoft Learn: VECTOR_SEARCH (Transact-SQL) purpose and ANN behavior.
[11] Microsoft Learn: Vector search and vector indexes overview.
[12] Microsoft Learn: sys.vector_indexes catalog view.
[13] Microsoft Learn: Change event streaming overview (preview).
[14] Microsoft Learn: Configure change event streaming steps.
[15] Microsoft Learn: Fabric mirroring from SQL Server (requirements and notes).
[16] Microsoft Learn: Tutorial to configure Fabric Mirroring from SQL Server.