SQL Query to Get Distinct Records Without Using Distinct Keyword
Last Updated :
31 Dec, 2024
Retrieving distinct records is a common task when working with databases. While the DISTINCT
clause is the standard approach to fetch unique rows, there are scenarios where you may need to achieve the same result without using it.
In this article, we explain various alternative methods to retrieve distinct records from a Microsoft SQL Server database table. These methods include using GROUP BY
, UNION
, INTERSECT
, and CTE
with the ROW_NUMBER()
function, accompanied by detailed explanations and practical examples.
Example of Finding Distinct Records Without Using Distinct Keyword
Let’s use the following dup_table
in the geeks
database to demonstrate the various methods for retrieving distinct records. This table contains duplicate rows that we aim to filter out using alternative techniques, ensuring a clear and unique dataset. These methods are practical and flexible for handling deduplication in real-world scenarios.

dup_table
Method 1: Using GROUP BY Clause
The GROUP BY
clause groups rows with the same values into aggregated rows, effectively removing duplicates when no aggregation function is used.
Query:
SELECT dup_id, dup_name FROM dup_table
GROUP BY dup_id, dup_name;
Output

Using Group By
Method 2: Using UNION Operator
The UNION
operator combines the result sets of two queries and removes duplicates. It ensures that the final output contains unique rows from both result sets, making it a powerful tool for deduplication and merging datasets.
Query:
SELECT dup_id, dup_name FROM dup_table
UNION
SELECT dup_id, dup_name FROM dup_table;
Output

Using UNION Operator
Method 3: Using INTERSECT Operator
The INTERSECT
operator returns only the rows that appear in both result sets, ensuring distinct records. It is particularly useful when comparing two datasets or when deduplication needs to be achieved based on overlapping values.
Query:
SELECT dup_id, dup_name FROM dup_table
INTERSECT
SELECT dup_id, dup_name FROM dup_table;
Output
dup_id |
dup_name |
1 |
yogesh |
2 |
ashish |
3 |
ajit |
4 |
vishal |
Method 4: Using CTE and ROW_NUMBER() Function
A Common Table Expression
(CTE) with the ROW_NUMBER()
function can be used to assign a unique row number to each duplicate group. Filtering for rows with ROW_NUMBER
= 1
eliminates duplicates.
Query:
WITH cte (dup_id, dup_name, dup_count)
AS
(SELECT dup_id, dup_name,
row_number() over (partition BY dup_id,
dup_name ORDER BY dup_id) AS dup_count
FROM dup_table)
SELECT * FROM cte WHERE dup_count = 1;
Output
dup_id |
dup_name |
1 |
yogesh |
2 |
ashish |
3 |
ajit |
4 |
vishal |
Conclusion
Retrieving distinct records without using the DISTINCT
clause is both possible and practical in SQL. Techniques like GROUP BY
, UNION
, INTERSECT
, and CTE
with ROW_NUMBER()
offer flexibility and cater to different use cases. By understanding these methods, we can optimise queries, handle database limitations, and perform advanced data manipulation tasks. Use the method that best suits our specific requirements and database environment to ensure accurate and efficient results.
Similar Reads
SQL | Remove Duplicates without Distinct
In SQL, removing duplicate records is a common task, but the DISTINCT keyword can sometimes lead to performance issues, especially with large datasets. The DISTINCT clause requires sorting and comparing records, which can increase the processing load on the query engine. In this article, weâll expla
4 min read
How to Use SQL DISTINCT and TOP in Same Query?
Structured Query Language (SQL) is a computer language used to interact with relational databases. It allows us to organize, manage, and retrieve data efficiently. In this article, we will explain how to use the DISTINCT keyword and the TOP clause together in a query, explaining their purpose, usage
4 min read
How to get distinct rows in dataframe using PySpark?
In this article we are going to get the distinct data from pyspark dataframe in Python, So we are going to create the dataframe using a nested list and get the distinct data. We are going to create a dataframe from pyspark list bypassing the list to the createDataFrame() method from pyspark, then by
2 min read
SQL Query to Get the Latest Record from the Table
Fetching the latest record from a table is a frequent and essential task when managing relational databases. Whether you want to retrieve all columns or a specific subset of them, SQL provides a variety of techniques to accomplish this efficiently. In this article, we will explain how to retrieve th
5 min read
SQL Query to Display First 50% Records from Employee Table
Here, we are going to see how to display the first 50% of records from an Employee Table in MS SQL server's databases. For the purpose of the demonstration, we will be creating an Employee table in a database called "geeks". Creating a Database : Use the below SQL statement to create a database call
2 min read
SQL Query to Exclude Records if it Matches an Entry in Another Table
In this article, we will see, how to write the SQL Query to exclude records if it matches an entry in another table. We can perform the above function using the NOT IN operator in SQL. For obtaining the list of values we can write the subquery. NOT IN operators acts as a negation of In operator and
3 min read
SQL Query to Print Name of Distinct Employee Whose DOB is Between a Given Range
Query in SQL is like a statement that performs a task. Here, we need to write a query that will print the name of the distinct employee whose DOB is in the given range. We will first create a database named âgeeksâ then we will create a table âdepartmentâ in that database. Creating a Database : Use
2 min read
SQL Query to Display Last 50% Records from Employee Table
Here, we are going to see how to display the last 50% of records from an Employee Table in MySQL and MS SQL server's databases. For the purpose of demonstration, we will be creating an Employee table in a database called "geeks". Creating a Database : Use the below SQL statement to create a database
2 min read
How to Get Multiple Counts With Single Query in SQL Server
In SQL Server, obtaining multiple counts with a single query is a common requirement, especially when we are analyzing data across different conditions. Whether we are tallying the number of active and inactive users or counting orders based on their status by using a single query can speed our data
4 min read
SQL Query to Add Unique key Constraints Using ALTER Command
Here we will see how to add unique key constraint to a column(s) of a MS SQL Server's database with the help of a SQL query using ALTER clause. For the demonstration purpose, we will be creating a demo table in a database called "geeks". Creating the Database : Use the below SQL statement to create
2 min read