Retrieval, Feedback, and Real Improvement
The Problem Nobody Solves
In Part 1, we built a controlled layer on top of SQL Server — read-only enforced, queries validated before execution, workload isolated. That gave us something most systems never address: a setup that is safe to run in production.
But safety alone is not enough. The real question is simple:
How does the system improve over time without retraining a model?
If every request generates SQL from scratch, you get inconsistent query patterns, repeated mistakes, unpredictable performance, and unnecessary load on SQL Server. That does not scale.
What SQL Server Does Not Do
SQL Server does not train your AI. It does not update model weights, does not learn from execution, and does not improve query generation. Even when queries succeed, nothing is learned by the system itself.
This is why most AI and database systems plateau quickly. The improvement has to come from the layer you build around SQL Server — not from SQL Server itself.
What actually improves the system: structured metadata, query history, approved query patterns, and controlled reuse. This is not machine learning — it is reuse of validated work.
The Missing Layer: Retrieval
Most implementations follow a simple pattern on every single request. Compare that to the correct approach:
Most Implementations
Question ↓ Generate SQL ↓ Execute ↓ Repeat every timeThe Correct Approach
Question ↓ Check existing patterns ↓ Reuse or generate ↓ Validate → Execute → LearnBefore generating SQL, check if the problem has already been solved. This is the core principle behind Retrieval-Augmented Generation — but in SQL it is more direct: reuse validated SQL instead of generating new queries. See Microsoft’s RAG architecture overview for broader context.
What You Need to Store
Query History
- User question
- Generated SQL
- Execution time
- Row count returned
- Success or failure
Approved Query Library
- Normalized question
- Validated SQL text
- Tables used
- Joins, aggregations, filters
- Tags: revenue, users, retention…
The Query History table is your audit layer — it captures everything that runs. The Approved Query Library is your production SQL playbook — it only contains queries that are validated, correct, performant, and safe.
This separation is important. History captures everything. The library contains only what has earned its place. See Microsoft’s observability guidance for how to think about logging and audit layers in production systems.
How Retrieval Works
Every request follows the same five-step path:
- Normalize the question — Strip filler phrases so that “monthly revenue CA” and “revenue in California by month” resolve to the same normalized form.
- Search for similar queries — Start with keywords and tags. Add vector similarity later as the library grows.
- Rank similarity — Score candidates based on how closely they match the normalized input.
- Decide: reuse or generate — High match score means reuse or adapt the existing SQL. Low match score means generate new SQL and add it to history.
- Always validate — No exceptions. Read-only enforcement, table restrictions, row limits, and execution safety checks must run regardless of the source.
Validation is covered in depth in the Microsoft SQL injection prevention guide.
End-to-End Example
A user asks: monthly revenue for California.
The system finds an existing query: monthly revenue by state.
Instead of generating a new query from scratch, the system reuses the existing joins and aggregations and applies a California filter. The result is a consistent query pattern, predictable performance, and reduced risk — with no generation overhead.
Why This Matters for SQL Server
SQL Server performance depends heavily on query shape — how joins, aggregations, and indexes are used. Reusing known-good queries avoids plan instability, excessive table scans, and unpredictable execution behavior.
This is not just about AI consistency. It is about protecting your SQL Server from poorly-shaped queries that erode performance over time. The SQL Server Query Processing Architecture Guide explains how query plans are cached and why shape matters.
Retrieval vs Generation
| Approach | Behavior | Risk Level |
|---|---|---|
| Generate every time | New SQL on every request | High — no consistency guarantee |
| Retrieval first | Reuse validated SQL patterns | Controlled — known-good queries |
Without retrieval, every query is new, performance varies, validation load increases, and trust drops. With retrieval, patterns stabilize, performance improves, and results become consistent.
Important: Retrieval does not replace validation. Even approved queries must still pass through validation, execution controls, and resource limits — because context changes, filters change, and data volumes change.
Your retrieval layer does not replace SQL Server’s own protections either. You still rely on read-only replicas, workload isolation, and execution limits. See Microsoft’s guidance on read-scale availability groups and Resource Governor.
What You Should Not Do
- Attempt to retrain models inside SQL Server
- Skip validation at any point in the pipeline
- Execute generated SQL directly without a safety check
- Assume the system improves automatically without a retrieval layer
The Learning Loop
The full cycle that drives improvement over time:
Key principle: The system improves by reusing validated knowledge, not by retraining the model. Every successful execution makes the next one faster and more reliable.
Closing
Part 1 gave you control. Part 2 gives you consistency. You now have safe execution, repeatable query patterns, and a system that improves over time.
References
Discover more from SQLYARD
Subscribe to get the latest posts sent to your email.


