
How to Validate SQL Query With Regular Expression?

Last Updated : 08 Jan, 2025

Regular expressions (RegEx) are a powerful tool for pattern matching and are commonly used to validate data. In SQL Server (T-SQL), which the examples in this article use, the LIKE operator and the PATINDEX function accept wildcard patterns with character classes rather than full regular expressions, but they cover many of the same validation tasks for email addresses, phone numbers, alphanumeric strings, and numeric values. While front-end validation is essential, back-end validation in SQL ensures that only valid data enters the system.

This article explains how we can use these patterns in SQL queries to validate and extract data such as email addresses, phone numbers, and alphabetic characters.

Why Use Regular Expressions in SQL?

Regular expressions in SQL allow for validating user input directly within SQL queries, which is important for maintaining data integrity. For example, when validating an email address or checking for numeric values within a string, regular expressions can quickly identify errors and inform users whether the input is valid or not. This ensures that only correctly formatted data gets processed in the backend.
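As a minimal sketch of back-end validation, the pattern check can be attached to a column as a CHECK constraint so that badly shaped values are rejected at insert time. The table dbo.DemoSubscribers and its columns are hypothetical names chosen for illustration, not part of the GeekAuthors table, and the LIKE pattern is the one discussed in Step 1.

```sql
-- Hypothetical table for illustration; not the GeekAuthors schema.
CREATE TABLE dbo.DemoSubscribers
(
    SubscriberId INT IDENTITY(1,1) PRIMARY KEY,
    Email        VARCHAR(255) NOT NULL,
    -- Reject rows whose Email lacks the basic shape x@yy.zz
    CONSTRAINT CK_DemoSubscribers_Email
        CHECK (Email LIKE '%_@__%.__%')
);

-- Accepted: the value matches the pattern.
INSERT INTO dbo.DemoSubscribers (Email) VALUES ('author@example.com');

-- Rejected: no @ and no period, so the CHECK constraint raises an error.
INSERT INTO dbo.DemoSubscribers (Email) VALUES ('phoebegmailcom');
```

A constraint like this only enforces a rough shape. It is a guard against obviously malformed input, not a full email validator.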

Let’s use the GeekAuthors table to demonstrate various regular expression validation techniques. The table stores author information, including names, email addresses, age, skill sets, and the number of posts.

GeekAuthors

Step 1: Validating Email Addresses Using Regular Expressions

One of the most common validations is checking if an email address is correctly formatted. In SQL Server, we can combine the LIKE operator with the PATINDEX function, both of which accept wildcard patterns with character classes, to check if the email follows a valid pattern.

SQL Query to Validate Emails:

-- Get all email addresses
SELECT AuthorEmail FROM GeekAuthors;

-- Get valid email addresses
SELECT AuthorEmail AS ValidEmail
FROM GeekAuthors
WHERE AuthorEmail LIKE '%_@__%.__%'
-- No character outside letters, digits, @, ., _ and - (hyphen placed last so it is literal)
AND PATINDEX('%[^a-z0-9@._-]%', AuthorEmail) = 0;

Explanation:

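  • The LIKE pattern '%_@__%.__%' checks that the email has at least one character before the “@” symbol, at least two characters between the “@” and a period, and at least two characters after that period.
  • The PATINDEX function returns 0 only when no character outside the allowed set (letters, numbers, @, ., _, -) is present.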
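To see how the two checks behave separately, the sketch below runs them against a few made-up values. The table variable @SampleEmails and its rows are hypothetical, and the character class is written without separators on the assumption that the intended set is exactly letters, digits, @, ., _ and -. Case sensitivity of the a-z range depends on the column or database collation.

```sql
-- Hypothetical sample values, held in a table variable.
DECLARE @SampleEmails TABLE (Email VARCHAR(255));

INSERT INTO @SampleEmails (Email)
VALUES ('author@example.com'),       -- well formed
       ('phoebegmailcom'),           -- no @ and no period
       ('a@b.c'),                    -- too few characters around the period
       ('first last@example.com');   -- right shape, but contains a space

SELECT Email,
       -- 1 when the value has the shape x@yy.zz, otherwise 0
       CASE WHEN Email LIKE '%_@__%.__%' THEN 1 ELSE 0 END AS PassesShapeCheck,
       -- 0 when every character is allowed, otherwise the position
       -- of the first disallowed character
       PATINDEX('%[^a-z0-9@._-]%', Email) AS FirstDisallowedCharPosition
FROM @SampleEmails;
```

The last sample passes the LIKE check but is caught by PATINDEX, which reports the space at position 6. This is why the validation query needs both conditions.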

SQL Query to Identify Invalid Emails:

-- Get invalid email addresses: wrong shape OR a disallowed character
SELECT AuthorEmail AS NotValidEmail
FROM GeekAuthors
WHERE AuthorEmail NOT LIKE '%_@__%.__%'
OR PATINDEX('%[^a-z0-9@._-]%', AuthorEmail) > 0;

Explanation:

This query selects all email addresses that do not match the valid pattern, helping to catch and report invalid entries like "phoebegmailcom" or "[email protected]".

Step 2: Locating Numeric Values in Alphanumeric Strings

Often, we may need to identify the position of numeric values within a string. Regular expressions can help detect the first numeric character in a string.

Way 1: Detecting Non-Alphabetic Characters

SELECT 'GeekPremierLeague2022' as ContestName,
PATINDEX('%[^A-Za-z]%', 'GeekPremierLeague2022') as
NumericCharacterPosition;

Explanation:

This query returns the position of the first non-alphabetic character in the string. In the example, it returns 18, the position where the numeric part (2022) begins.

Way 2: Detecting Numeric Values Directly

SELECT 'GeekPremierLeague2022' as ContestName, 
PATINDEX('%[0-9]%', 'GeekPremierLeague2022') as
NumericCharacterPosition;

If the string contains no numeric character, PATINDEX returns 0.
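The zero result can be checked with a string that has no digits. The input below is a hypothetical variation of the contest name used above.

```sql
-- Hypothetical input with no digits: PATINDEX finds no match and returns 0.
SELECT 'GeekPremierLeague' AS ContestName,
       PATINDEX('%[0-9]%', 'GeekPremierLeague') AS NumericCharacterPosition;
```

Because valid positions start at 1, a result of 0 can be used directly as a "not found" test, for example in a WHERE clause or a CASE expression.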

Step 3: Extracting Only Alphabets from a String

In some cases, we might need to remove non-alphabetic characters and retain only the letters. We can achieve this with the PATINDEX and STUFF functions. PATINDEX with the pattern '%[^A-Za-z]%' returns the position of the first non-alphabetic character, as in the query above, and STUFF deletes the character at that position. Repeating both steps in a loop removes every digit, space, and punctuation mark.

Query:

DECLARE @inputData NVARCHAR(MAX) = 'GEEKS PREMIER LEAGUE 2022'; -- input string
DECLARE @charPosition INT;
-- Position of the first non-alphabetic character (digit, space, punctuation)
SET @charPosition = PATINDEX('%[^A-Za-z]%', @inputData);
-- Use PATINDEX('%[0-9]%', @inputData) instead to remove digits only
PRINT @charPosition;
-- Loop until no non-alphabetic character remains
WHILE @charPosition > 0
BEGIN
    -- Delete the single character at that position
    SET @inputData = STUFF(@inputData, @charPosition, 1, '');
    -- Find the next non-alphabetic character
    SET @charPosition = PATINDEX('%[^A-Za-z]%', @inputData);
END
-- Returns GEEKSPREMIERLEAGUE: the spaces are removed along with the digits
SELECT 'GEEKS PREMIER LEAGUE 2022' AS InputString, @inputData AS AlphabetsOnly;
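If the same cleanup is needed in several queries, the loop can be wrapped in a scalar function. This is only a sketch: the name dbo.KeepLettersOnly is hypothetical, CREATE OR ALTER assumes SQL Server 2016 SP1 or later, and GO is the batch separator used by tools such as SSMS and sqlcmd.

```sql
-- Hypothetical helper; the name dbo.KeepLettersOnly is an assumption.
CREATE OR ALTER FUNCTION dbo.KeepLettersOnly (@input NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    -- Position of the first character that is not a letter
    DECLARE @pos INT = PATINDEX('%[^A-Za-z]%', @input);

    WHILE @pos > 0
    BEGIN
        -- Delete that character, then look for the next one
        SET @input = STUFF(@input, @pos, 1, '');
        SET @pos = PATINDEX('%[^A-Za-z]%', @input);
    END;

    RETURN @input;
END;
GO

-- Usage: strips the spaces and digits from the sample string.
SELECT dbo.KeepLettersOnly(N'GEEKS PREMIER LEAGUE 2022') AS AlphabetsOnly;
```

A scalar function that loops character by character can be slow on large tables, so it suits occasional cleanup better than bulk processing.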

Step 4: Extracting Numeric Values from a String

Suppose an incoming data feed contains an employee name, phone number, address, and email ID in a single string. To retrieve only the phone number (i.e., the numeric value), we can use the approach below.

Query:

DECLARE @inputString VARCHAR(MAX);
DECLARE @validchars VARCHAR(MAX);

SET @inputString = 'GeekAuthor1 123456789 Address1 [email protected]';
-- Look for a continuous run of nine digits with no spaces in between
SET @validchars = '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]';
DECLARE @idx INT;
SET @idx = PATINDEX('%' + @validchars + '%', @inputString);
-- Accept the run only if it ends the string or is not followed by a tenth digit
-- (PATINDEX takes the pattern first, then the expression to search)
IF @idx > 0 AND
   (@idx = LEN(@inputString) - 8
    OR PATINDEX('[0-9]', SUBSTRING(@inputString, @idx + 9, 1)) = 0)
    SET @inputString = SUBSTRING(@inputString, @idx, 9);
ELSE
    SET @inputString = '';
SELECT @inputString AS NumericAlone;
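The same idea can be tried on several inputs at once. The sketch below builds the nine-digit pattern with REPLICATE instead of typing it out and applies the boundary check per row. The table variable @Samples, its rows, and the variable names are hypothetical.

```sql
DECLARE @digits INT = 9;  -- length of the digit run, as in the example above
DECLARE @pattern VARCHAR(200) = '%' + REPLICATE('[0-9]', @digits) + '%';

-- Hypothetical feed strings for illustration.
DECLARE @Samples TABLE (Feed VARCHAR(200));
INSERT INTO @Samples (Feed)
VALUES ('GeekAuthor1 123456789 Address1'),   -- exactly nine digits
       ('GeekAuthor2 1234567890 Address2'),  -- ten digits: run is too long
       ('GeekAuthor3 no phone here');        -- no digit run at all

SELECT Feed,
       CASE
           -- A nine-digit run exists and the character after it is not a digit
           -- (SUBSTRING past the end of the string yields '', which also passes)
           WHEN PATINDEX(@pattern, Feed) > 0
            AND SUBSTRING(Feed, PATINDEX(@pattern, Feed) + @digits, 1) NOT LIKE '[0-9]'
           THEN SUBSTRING(Feed, PATINDEX(@pattern, Feed), @digits)
           ELSE ''
       END AS NumericAlone
FROM @Samples;
```

Only the first row yields a value. The ten-digit row returns an empty string because the first nine-digit window is followed by another digit, and, like the original query, the sketch inspects only the first match that PATINDEX finds.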

Conclusion

Pattern matching in SQL provides a powerful mechanism for validating and extracting specific patterns from data. By using PATINDEX, LIKE, STUFF, and other SQL functions, we can validate email addresses, extract numeric values, and filter out unwanted characters. These techniques support back-end validation, ensuring that only clean, well-formed data is processed and that errors are caught early.
