SQL Server 2025: REGEXP_INSTR() and REGEXP_COUNT() (Part 8)

Introduction

Continuing SQL Server 2025’s expansion of native regular expression support, today we cover two functions that focus on location and frequency:

REGEXP_INSTR() – returns the starting or ending position of a matched substring
REGEXP_COUNT() – returns the number of times a pattern occurs in a string

Together, these functions let you analyze text patterns with precision, enabling validation, auditing, ETL transformations, log parsing, pattern frequency analysis, and detailed input diagnostics.

These capabilities align SQL Server with other major RDBMS engines and eliminate the need for CLR helpers or external scripting languages.


REGEXP_INSTR(): Finding Match Positions

Syntax

REGEXP_INSTR(expression, pattern 
             [, position [, occurrence [, return_option [, flags]]]])

Parameter Breakdown

expression – text to evaluate
pattern – regular expression
position – starting character (default 1)
occurrence – which match to return (default 1)
return_option
0 – return the start position
1 – return the position of the character after the match
flags – optional RegEx flags (i, m, etc.)


Basic Example

Find the starting position of the first numeric sequence:

SELECT REGEXP_INSTR('Invoice 4529 approved', '\d+') AS Position;

Result:

9

Find the Ending Position

Set return_option to 1:

SELECT REGEXP_INSTR('Invoice 4529 approved', '\d+', 1, 1, 1) AS EndPosition;

If the digits span positions 9–12, the result is:

13

Find the Second Occurrence

SELECT REGEXP_INSTR('A12B34C56', '\d+', 1, 2) AS Position2;

Output:

4

Case-Insensitive Locate

SELECT REGEXP_INSTR('DepartmentSales', 'sales', 1, 1, 0, 'i');

Example: Locate Delimiters

Find the position of the first hyphen in a product code:

SELECT REGEXP_INSTR('ABC-2025-XYZ', '-', 1, 1);

Result:

4

REGEXP_COUNT(): Counting Occurrences

Syntax

REGEXP_COUNT(expression, pattern [, flags])

Purpose

Returns how many times a pattern appears in the string.


Basic Counting Example

SELECT REGEXP_COUNT('A12B34C56', '\d+') AS CountOfNumericGroups;

Output:

3

Count Email Usernames with Digits

SELECT REGEXP_COUNT(Email, '\d') AS DigitCount
FROM Employees;

Count Specific Word Occurrences

SELECT REGEXP_COUNT(Notes, '\berror\b') AS ErrorCount
FROM SystemLogs;

Count Uppercase Characters

SELECT REGEXP_COUNT('ABcDEf', '[A-Z]');

Result:

4

Advanced Enterprise Uses

1. Detect multiple ID sections

SELECT REGEXP_COUNT(DocumentId, '[A-Z]{2}[0-9]{4}') AS SegmentCount
FROM Documents;

2. Auditing repeated delimiters

Detect extra commas:

SELECT REGEXP_COUNT(AddressLine, ',') AS CommaCount
FROM CustomerAddresses;

3. Count tagging or markers in logs

SELECT REGEXP_COUNT(LogEntry, '\[INFO\]') AS InfoTags
FROM AuditLog;

4. Identify multiple decimal points (invalid numbers)

SELECT *
FROM DataImport
WHERE REGEXP_COUNT(ValueText, '\.') > 1;

Performance Considerations

REGEXP_INSTR and REGEXP_COUNT are scalar operations:

• Not SARGable
• Best used for validation, ETL, cleansing, or analytics
• Avoid per-row execution in high-frequency OLTP paths
• Flags add overhead, especially multi-line scanning
• Best performance is achieved when evaluating VARCHAR(n) instead of NVARCHAR(MAX)

Both functions benefit from SQL Server 2025’s:

• pattern caching
• optimized character scanning
• reduced backtracking logic


Workshop: Hands-On With REGEXP_INSTR and REGEXP_COUNT

Step 1. Create a sample table

CREATE TABLE TextAnalysis2025 (
    Id INT IDENTITY,
    InputValue VARCHAR(200)
);

INSERT INTO TextAnalysis2025 (InputValue)
VALUES
('Invoice 4529 approved'),
('Order 2025-11-17 processed'),
('ABC-2025-XYZ'),
('Line item: 12, 45, 78'),
('No digits present here');

Step 2. Find the first numeric position

SELECT Id,
       REGEXP_INSTR(InputValue, '\d+') AS NumberPosition
FROM TextAnalysis2025;

Step 3. Find the second numeric occurrence

SELECT Id,
       REGEXP_INSTR(InputValue, '\d+', 1, 2) AS SecondNumericPosition
FROM TextAnalysis2025;

Step 4. Find first hyphen

SELECT Id,
       REGEXP_INSTR(InputValue, '-', 1, 1) AS HyphenPos
FROM TextAnalysis2025;

Step 5. Count numeric groups

SELECT Id,
       REGEXP_COUNT(InputValue, '\d+') AS NumericGroups
FROM TextAnalysis2025;

Step 6. Count individual digits

SELECT Id,
       REGEXP_COUNT(InputValue, '\d') AS DigitCount
FROM TextAnalysis2025;

Step 7. Identify rows without digits

SELECT Id, InputValue
FROM TextAnalysis2025
WHERE REGEXP_COUNT(InputValue, '\d') = 0;

Final Thoughts

REGEXP_INSTR and REGEXP_COUNT give SQL Server the ability to precisely locate and quantify patterns inside text. These functions are essential for validation rules, ETL pipelines, error detection, and log parsing. Together with REGEXP_LIKE, SUBSTR, and REPLACE, SQL Server 2025 now includes a comprehensive, native RegEx toolkit that dramatically improves text processing inside the database engine.

Tomorrow’s article, Day 9, covers REGEXP_MATCHES() and REGEXP_SPLIT_TO_TABLE(), two powerful functions for extracting all matches and splitting strings using RegEx logic.


References

• REGEXP_INSTR Documentation
https://learn.microsoft.com/sql/t-sql/functions/regexp-instr-transact-sql

• REGEXP_COUNT Documentation
https://learn.microsoft.com/sql/t-sql/functions/regexp-count-transact-sql

• Regular Expression Reference
https://learn.microsoft.com/sql/t-sql/functions/regular-expression-reference

• SQL Server 2025 New Features
https://learn.microsoft.com/sql/sql-server/what-s-new-in-sql-server-2025


Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading