Open In App

SQL | GROUP BY

Last Updated : 20 Jun, 2025
Comments
Improve
Suggest changes
Like Article
Like
Report

The SQL GROUP BY clause is a powerful tool used to organize data into groups based on shared values in one or more columns. It is most often used with aggregate functions like SUM, COUNT, AVG, MIN and MAX to perform summary operations on each group, helping us extract meaningful analysis from large datasets.

How to use GROUP BY in SQL

The GROUP BY statement in SQL is used to arrange identical data into groups based on specified columns. If a particular column has the same values in multiple rows, the GROUP BY clause will group these rows together. Whether we are analyzing sales by region, users by age group or orders by product category, the GROUP BY clause helps transform raw data into structured reports.

Key Points About GROUP BY

  • GROUP BY clause is used with the SELECT statement.
  • In a SQL query, the GROUP BY clause is placed after the WHERE clause, but before HAVING Clause and ORDER BY clause
  • WHERE clause filters rows before grouping, the HAVING clause is used to apply conditions on the grouped data
  • Order of Execution in a SQL query is SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.

Syntax:

SELECT column1, function_name(column2)
FROM table_name
GROUP BY column1, column2

Parameters:

  1. function_name: Name of the function used for example, SUM() , AVG().
  2. table_name: Name of the table.
  3. condition: Condition used.

Examples of GROUP BY in SQL

Let's assume that we have two tables Employee and Student as follows. After adding two tables we will do some specific operations to learn about GROUP BY. Insert some random data into a table and then we will perform some operations in GROUP BY.

Employee Table

CREATE TABLE emp (
  emp_no INT PRIMARY KEY,
  name VARCHAR(50),
  sal DECIMAL(10,2),
  age INT
);

INSERT INTO emp (emp_no, name, sal, age) VALUES
(1, 'Aarav', 50000.00, 25),
(2, 'Aditi', 60000.50, 30),
(3, 'Aarav', 75000.75, 35),
(4, 'Anjali', 45000.25, 28),
(5, 'Chetan', 80000.00, 32),
(6, 'Divya', 65000.00, 27),
(7, 'Gaurav', 55000.50, 29),
(8, 'Divya', 72000.75, 31),
(9, 'Gaurav', 48000.25, 26),
(10, 'Divya', 83000.00, 33);
SELECT * from emp;

Output:

Screenshot-2024-06-11-213404
Emp TABLE

Student Table

CREATE TABLE student (
  name VARCHAR(50),
  year INT,
  subject VARCHAR(50)
);

INSERT INTO student (name, year, subject) VALUES
('Alice', 1, 'Mathematics'),
('Bob', 2, 'English'),
('Charlie', 3, 'Science'),
('David', 1, 'Mathematics'),
('Emily', 2, 'English'),
('Frank', 3, 'Science');

Output:

Screenshot-2024-06-11-214050
Student TABLE

Example 1 : Group By Single Column

Group By single column means, placing all the rows with the same value of only that particular column in one group. Consider the query for Calculating the Total Salary of each Employee by their name in the emp table.

Query:

SELECT name, SUM(sal) FROM emp 
GROUP BY name;

Output:

Screenshot-2024-06-11-213528
Output

Explanations:

As you can see in the above output, the rows with duplicate NAMEs are grouped under the same NAME and their corresponding SALARY is the sum of the SALARY of duplicate rows. The SUM() function of SQL is used here to calculate the sum. The NAMES that are added are Aarav, Divya and Gaurav.

Example 2 : Group By Multiple Columns

Group by multiple columns is say, for example, GROUP BY column1, column2. This means placing all the rows with the same values of columns column 1 and column 2 in one group. This SQL query groups student records by both SUBJECT and YEAR and then counts the number of records (i.e., students) in each of those groups.

Query:

SELECT SUBJECT, YEAR, Count(*)
FROM Student
GROUP BY SUBJECT, YEAR;

Output:

Screenshot-2024-06-11-214212
Output

Explantion:

  • As we can see in the above output students with same SUBJECT and YEAR are placed in the same group.
  • Those whose only SUBJECT is same but YEAR belong to different groups. So here we have grouped the table according to two columns or more than one column.
  • The Grouped subject and years are (English,2) , (Mathematics,1) and (Science,3). The above mentioned all groups and years are repeated twice.

HAVING Clause in GROUP BY Clause

We know that WHERE clause is used to place conditions on columns but what if we want to place conditions on groups? This is where the HAVING clause comes into use. We can use the HAVING clause to filter results based on the conditions applied to grouped data, such as sums, averages or counts.

One important point to note is that we cannot use aggregate functions like SUM(), COUNT(), AVG(), etc., directly in the WHERE clause. Instead, we use the HAVING clause when we need to filter data based on the result of these aggregate functions.

Syntax:

SELECT column1, function_name(column2)
FROM table_name
WHERE condition
GROUP BY column1, column2
HAVING condition
ORDER BY column1, column2;

Key Terms

  • function_name: Name of the function used for example, SUM() , AVG().
  • table_name: Name of the table.
  • condition: Condition used.

Example:

SELECT NAME, SUM(sal) FROM Emp
GROUP BY name
HAVING SUM(sal)>50000; 

Output:

Screenshot-2024-06-11-214759
Output

Explanation:

In the result, only employees whose total salary (SUM(sal)) exceeds 50,000 are displayed. For example, if Anjali has a total salary less than 50,000, she will be excluded from the output.

Conclusion

The GROUP BY function in SQL organizes identical data into groups, enabling aggregate analysis on each group. It is commonly used with aggregate functions like SUM(), COUNT(), AVG(), etc., to summarize data efficiently. The HAVING clause further refines the results by applying conditions to these grouped records. GROUP BY can operate on single or multiple columns, making it a versatile tool for data retrieval and reporting.


Next Article
Article Tags :

Similar Reads