Using GROUP BY and HAVING in SQL
The GROUP BY clause groups rows into categories so that aggregate functions can be applied to the rows of each category
independently. For example, you can find the number of employees in each department by grouping on the department
column.
The HAVING clause filters grouped data using conditions calculated with aggregate functions. It is used along with the
GROUP BY clause.
Syntax of GROUP BY:
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s);
Syntax of GROUP BY with HAVING:
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition;
One thing to keep in mind when using aggregates in SQL is that any column in the SELECT list must be either an aggregate
or be listed in the GROUP BY clause.
Consider the table EMPLOYEE2 (Table 2-44).
The following query will return the number of employees in each department.
SELECT department, COUNT(*) as total FROM employee2 GROUP BY department;
The above query will return the following result:
Table 2-46
department total
Finance 2
HR 2
Sales 1
The following query will return the number of employees in each department where the number of employees in the
department is more than 1.
SELECT
department,
COUNT(*) as total
FROM employee2
GROUP BY department
HAVING COUNT(*) > 1;
The above query will return the following result:
Table 2-47
department total
Finance 2
HR 2
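The difference between filtering groups with HAVING and filtering rows with WHERE can be tried out with a small script. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory copy of EMPLOYEE2 (rows copied from Table 2-44); the `salary > 30000` condition is an illustrative assumption, not part of the lesson.

```python
import sqlite3

# Rows copied from Table 2-44 (EMPLOYEE2).
ROWS = [
    (101, "Tom", 40, 40000, "Finance"),
    (102, "Nir", 30, 30000, "Finance"),
    (103, "Rick", 25, 25000, "HR"),
    (104, "Maria", 35, 35000, "HR"),
    (105, "Lara", 50, 50000, "Sales"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee2 "
    "(id INTEGER, name TEXT, age INTEGER, salary INTEGER, department TEXT)"
)
conn.executemany("INSERT INTO employee2 VALUES (?, ?, ?, ?, ?)", ROWS)

# HAVING is evaluated after grouping: keep only departments with more than 1 employee.
having_rows = conn.execute(
    "SELECT department, COUNT(*) AS total FROM employee2 "
    "GROUP BY department HAVING COUNT(*) > 1 ORDER BY department"
).fetchall()

# WHERE is evaluated before grouping: only rows with salary > 30000 are counted.
# (The salary threshold is an assumption made for this illustration.)
where_rows = conn.execute(
    "SELECT department, COUNT(*) AS total FROM employee2 "
    "WHERE salary > 30000 GROUP BY department ORDER BY department"
).fetchall()

print(having_rows)  # [('Finance', 2), ('HR', 2)]
print(where_rows)   # [('Finance', 1), ('HR', 1), ('Sales', 1)]
```

The first result matches Table 2-47. In the second, Sales still appears because WHERE removed individual rows, not whole groups.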
The following query will return the total number of employees as well as the minimum, maximum, and average salary of
the employees in each department.
SELECT
department,
COUNT(*) as total_employee,
MIN(salary) as min_salary,
MAX(salary) as max_salary,
AVG(salary) as average_salary
FROM employee2
GROUP BY department;
The above query will return the following result:
Table 2-48
department total_employee min_salary max_salary average_salary
Finance 2 30000 40000 35000
HR 2 25000 35000 30000
Sales 1 50000 50000 50000
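A grouped result has no guaranteed row order unless an ORDER BY clause is added after GROUP BY. The sketch below, assuming Python's built-in sqlite3 module and the EMPLOYEE2 rows from Table 2-44, sorts departments by their average salary; sorting by the `average_salary` alias is one possible choice, used here only for illustration.

```python
import sqlite3

# Rows copied from Table 2-44 (EMPLOYEE2).
ROWS = [
    (101, "Tom", 40, 40000, "Finance"),
    (102, "Nir", 30, 30000, "Finance"),
    (103, "Rick", 25, 25000, "HR"),
    (104, "Maria", 35, 35000, "HR"),
    (105, "Lara", 50, 50000, "Sales"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee2 "
    "(id INTEGER, name TEXT, age INTEGER, salary INTEGER, department TEXT)"
)
conn.executemany("INSERT INTO employee2 VALUES (?, ?, ?, ?, ?)", ROWS)

# ORDER BY comes after GROUP BY and may refer to an aggregate's alias.
ordered = conn.execute(
    "SELECT department, AVG(salary) AS average_salary FROM employee2 "
    "GROUP BY department ORDER BY average_salary DESC"
).fetchall()

for department, average_salary in ordered:
    print(department, average_salary)
# Sales 50000.0
# Finance 35000.0
# HR 30000.0
```

SQLite returns AVG as a floating-point value, which is why the averages print with a decimal point.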
Watch the following videos by Ben Forta to learn about aggregation in SQL.
Videos:
Using GROUP BY to group data in SQL
And to do this, you have to add a GROUP BY to your SELECT statement, and GROUP BY takes a unique column name or
multiple names to group by. So this is a good time to once again revisit the SELECT statement. We now added additional
clauses, so let's look at the order. SELECT column comes first, FROM table, WHERE and your condition, then the GROUP BY
if there is one, and then your ORDER BY. So it has to be in that order. And if you are gonna use a GROUP BY and ORDER BY,
the GROUP BY comes first.
Copyright © 2024 Pearson Education Inc. or its affiliate(s). All rights reserved.
Using HAVING to filter items in a group of data in SQL
So what if you wanted to not just get the number of products made by a vendor but only return those vendors that make a
total number of products? It's kind of like a WHERE clause, but it's a little different. This is where HAVING comes in,
and HAVING is just like WHERE, but unlike the WHERE clause, which functions at the row level, HAVING functions at the
group level, which means once again we should revisit the SELECT statement. So now SELECT column, FROM table, WHERE
and your condition, GROUP BY and the columns you group them by
Pivoting
Suppose a large dataset of employee records is stored in a table and you want to find the number of employees in each
department. In the aggregate data example above, Table 2-46 shows the number of employees in each department. It is
summary information calculated from the employee table, Table 2-44. Producing this kind of summary is called pivoting.
In data analytics, pivoting is a process used to summarize data in a large dataset stored in a table so that an analyst can
see trends in the data.
Excel provides a powerful tool called PivotTable that can be used to calculate and summarize data for analysis. The
following are the steps to create a pivot table in Excel.
1. Click anywhere in the table and select Insert -> PivotTable.
2. The Create PivotTable dialog opens. Click OK.
Figure 2-17
3. A new worksheet opens where you can select the pivot fields. Here, the department and salary fields are selected.
By default, the PivotTable shows the sum of the salary for each department.
Figure 2-18
4. By default, the calculation is the sum, but it can be changed to the count, min, or max.
Select Value Field Settings from the VALUE dropdown.
Figure 2-19
Choose your calculation (Sum, Count, Average, Max, Min, Product) from the Value Field Settings dialog box and click OK.
Figure 2-20
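The summary that the PivotTable steps above produce (by default, the sum of salary per department) corresponds to a GROUP BY query. The following is a minimal sketch, assuming Python's built-in sqlite3 module and the EMPLOYEE2 rows from Table 2-44; the helper `pivot_salary_by_department` and its `func` parameter are hypothetical names introduced for illustration. Standard SQL has no built-in Product aggregate, so that option from the Excel dialog is left out.

```python
import sqlite3

# Rows copied from Table 2-44 (EMPLOYEE2).
ROWS = [
    (101, "Tom", 40, 40000, "Finance"),
    (102, "Nir", 30, 30000, "Finance"),
    (103, "Rick", 25, 25000, "HR"),
    (104, "Maria", 35, 35000, "HR"),
    (105, "Lara", 50, 50000, "Sales"),
]

# Calculations offered here mirror the Value Field Settings choices that SQL supports.
ALLOWED = {"SUM", "COUNT", "AVG", "MAX", "MIN"}


def pivot_salary_by_department(conn, func="SUM"):
    """Summarize salary per department (hypothetical helper, not from the lesson)."""
    if func not in ALLOWED:
        raise ValueError(f"unsupported calculation: {func}")
    # func is checked against a fixed whitelist before being placed in the query text.
    query = (
        f"SELECT department, {func}(salary) FROM employee2 "
        "GROUP BY department ORDER BY department"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee2 "
    "(id INTEGER, name TEXT, age INTEGER, salary INTEGER, department TEXT)"
)
conn.executemany("INSERT INTO employee2 VALUES (?, ?, ?, ?, ?)", ROWS)

sum_pivot = pivot_salary_by_department(conn)         # default calculation: sum
max_pivot = pivot_salary_by_department(conn, "MAX")  # changed calculation: max

print(sum_pivot)  # [('Finance', 70000), ('HR', 60000), ('Sales', 50000)]
print(max_pivot)  # [('Finance', 40000), ('HR', 35000), ('Sales', 50000)]
```

Changing `func` plays the same role as picking a different calculation in the Value Field Settings dialog.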
Data Analytics TEVTA 2024
Aggregate functions
An aggregate function is a function that performs a calculation on multiple values and returns a single value. Some of the
most common and frequently used aggregation functions are as follows:
Table 2-43
Function  Description
COUNT     Returns the number of records.
SUM       Returns the total sum of values in a numeric column.
MIN       Returns the smallest value in a column.
MAX       Returns the largest value in a column.
AVG       Returns the average of all the values in a column.
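NULL values are ignored by most aggregate functions. The exception is COUNT(*), which counts every row, including rows
that contain NULL values. COUNT(column_name), by contrast, counts only the rows where that column is not NULL.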
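How aggregates treat NULL can be checked directly. The sketch below assumes Python's built-in sqlite3 module; the `scores` table and its three values are hypothetical data invented for this illustration. It compares COUNT(*) with COUNT on a column and shows SUM and AVG skipping the NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with one NULL value among three rows.
conn.execute("CREATE TABLE scores (value INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(10,), (None,), (30,)])

count_all, count_value, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(value), SUM(value), AVG(value) FROM scores"
).fetchone()

print(count_all)    # 3    -> COUNT(*) counts every row
print(count_value)  # 2    -> COUNT(value) skips the NULL
print(total)        # 40   -> SUM ignores the NULL
print(average)      # 20.0 -> AVG divides by the 2 non-NULL values, not by 3
```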
Using aggregate functions in SQL
An aggregate function in SQL returns a single value that is calculated from the multiple values in a column. The following
are some of the frequently used aggregation functions in SQL:
COUNT()
SUM()
MIN()
MAX()
AVG()
The COUNT() function returns the number of records that satisfy a specific condition.
Syntax of COUNT function:
SELECT COUNT(column_name) FROM table_name WHERE condition;
The SUM() function returns the sum of values in a numeric column.
Syntax of SUM function:
SELECT SUM(column_name) FROM table_name WHERE condition;
The MIN() function returns the smallest value in a column.
Syntax of MIN function:
SELECT MIN(column_name) FROM table_name WHERE condition;
The MAX() function returns the largest value in a column.
Syntax of MAX function:
SELECT MAX(column_name) FROM table_name WHERE condition;
The AVG() function returns the average of all the values in a column.
Syntax of AVG function:
SELECT AVG(column_name) FROM table_name WHERE condition;
Consider the table EMPLOYEE2 below that has five columns - id, name, age, salary, and department.
Table 2-44
id name age salary department
101 Tom 40 40000 Finance
102 Nir 30 30000 Finance
103 Rick 25 25000 HR
104 Maria 35 35000 HR
105 Lara 50 50000 Sales
The following query will return these values:
The total number of employees
The sum of the salaries of all the employees
The minimum salary of the employees
The maximum salary of the employees
The average salary of the employees
SELECT
count(*) as total_employee,
sum(salary) as total_salary,
min(salary) as min_salary,
max(salary) as max_salary,
avg(salary) as average_salary
FROM employee2;
The above query will return the following result:
Table 2-45
total_employee total_salary min_salary max_salary average_salary
5 180000 25000 50000 36000
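The query above can be run end to end as a quick check. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory table loaded with the rows of Table 2-44; other database engines may format the average differently.

```python
import sqlite3

# Rows copied from Table 2-44 (EMPLOYEE2).
ROWS = [
    (101, "Tom", 40, 40000, "Finance"),
    (102, "Nir", 30, 30000, "Finance"),
    (103, "Rick", 25, 25000, "HR"),
    (104, "Maria", 35, 35000, "HR"),
    (105, "Lara", 50, 50000, "Sales"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee2 "
    "(id INTEGER, name TEXT, age INTEGER, salary INTEGER, department TEXT)"
)
conn.executemany("INSERT INTO employee2 VALUES (?, ?, ?, ?, ?)", ROWS)

# Without GROUP BY, the aggregates are computed over the whole table
# and the query returns exactly one row.
summary = conn.execute(
    """
    SELECT
        COUNT(*)    AS total_employee,
        SUM(salary) AS total_salary,
        MIN(salary) AS min_salary,
        MAX(salary) AS max_salary,
        AVG(salary) AS average_salary
    FROM employee2
    """
).fetchone()

print(summary)  # (5, 180000, 25000, 50000, 36000.0)
```

The values agree with Table 2-45; SQLite reports the average as a floating-point number.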