0% found this document useful (0 votes)
57 views

Subquery

The document discusses subqueries in SQL. It defines a subquery as a SELECT query nested inside another query. Some key points: - Subqueries allow performing operations that would otherwise require multiple queries in a single query, improving maintainability and performance. - Subqueries can be used in the WHERE clause to filter records based on the results of an inner query. - The subquery must return a single value when used with =, but can return multiple values when used with IN. - Subqueries can also be used in the FROM clause, treating the results as a derived table/inline view that can be joined to other tables.

Uploaded by

Amol Adhangale
Copyright
© © All Rights Reserved
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
57 views

Subquery

The document discusses subqueries in SQL. It defines a subquery as a SELECT query nested inside another query. Some key points: - Subqueries allow performing operations that would otherwise require multiple queries in a single query, improving maintainability and performance. - Subqueries can be used in the WHERE clause to filter records based on the results of an inner query. - The subquery must return a single value when used with =, but can return multiple values when used with IN. - Subqueries can also be used in the FROM clause, treating the results as a derived table/inline view that can be joined to other tables.

Uploaded by

Amol Adhangale
Copyright
© © All Rights Reserved
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 10

What is a Subquery?

A subquery is a SELECT query inside another query. It’s also called a nested query
in SQL.

Let’s say you had a simple query like this:

SELECT id, last_name


FROM employee;
This is a standard SELECT query that shows you some data from the employee table.

Now, let’s say you had a query like this:

SELECT id, last_name


FROM employee
WHERE salary > (
SELECT salary
FROM some_other_table
);
There are a few extra features in this query that you might not be used to. But the
main thing to point out is that there is a subquery in this query. The subquery is:

SELECT salary
FROM some_other_table
That’s the subquery because it’s a SELECT query inside another query.

Why Use SQL Subqueries?


The main reason to use a subquery is to improve the maintainability and performance
of your query and application.

Let’s say you had to display a report in your application that showed all employees
that had an above average salary. To find these records, you would need to do two
things:

Find the average salary of all employees


List all employees whose salary is larger than that average
Our employee table looks like this:
ID LAST_NAME SALARY
1 SMITH 40000
2 ANDERSON 60000
3 JONES 45000
4 LANDY 82000
5 CHARLESTON 21000
6 JOHNSON 51000
7 BROWN 93000
8 HARDEN 29000
9 CONNOR 52000
10 PIERCE 39000
First, let’s find the average salary of all employees, using an aggregate function:

SELECT AVG(salary)
FROM employee;
you should get this result:

AVG(SALARY)
51200
The average salary is 51,200. Now, we can use this value to find all of the
employees whose salary is larger than this average of 51,200.

SELECT id, last_name, salary


FROM employee
WHERE salary > 51200;
The results of this query are:

ID LAST_NAME SALARY
2 ANDERSON 60000
4 LANDY 82000
7 BROWN 93000
9 CONNOR 52000
This gives us the result we want. However, this is done in two steps.

Using a subquery will let you do this in one step. That means you won’t need to
adjust the query each time a new employee is entered to update the average salary.
It will run correctly every time.

Subqueries in a WHERE Clause


One place where you can use subqueries is in the WHERE clause.

Using a subquery in a WHERE clause means that we want to use the results of a query
as a WHERE clause for another query.

Example: Find Employees with Above-Average Salary


An example of this is finding all employees with a salary greater than the average
employee, which we saw how to do earlier in two separate steps.

We can do this in a single step using a subquery. Our query would look like this:

SELECT id, last_name, salary


FROM employee
WHERE salary > (
SELECT AVG(salary)
FROM employee
);
Let’s break this query down:

Line 1: this shows the SELECT clause. These are the columns that are displayed when
the query is run: id, last_name, and salary.
Line 2: this is the FROM clause. We are selecting data from the employee table.
Line 3: This is the WHERE clause. We want to see records where the salary is
greater than something. We open a bracket here, which is closed later in the query.
Line 4: This is another SELECT clause, which selects the AVG of the salary column.
This is the start of the subquery – a query within a query.
Line 5: Another FROM clause, which is related to the subquery. It’s also the
employee table.
Line 6: We close the brackets for the subquery, and end with a semicolon.
In this query, the bold part is the subquery, or inner query. The unbolded part is
the outer query. The subquery is also indented, to make it easier to ready and
easier to identify in your script that it’s a subquery.

SELECT id, last_name, salary


FROM employee
WHERE salary > (
SELECT AVG(salary)
FROM employee
);
If you run this query, you’ll get this result:

ID LAST_NAME SALARY
2 ANDERSON 60000
4 LANDY 82000
7 BROWN 93000
9 CONNOR 52000
It’s the same result as running it in separate queries, which we saw earlier. But
it’s run in a single query, which means it’s easier for you to run and will
properly cater to any changes in the data.

Example: Find Employee with the Highest Salary


Another way we can use a subquery is to find a record that exactly matches the
result of a subquery. An example of this would be finding the employee that has the
highest salary, which is a common type of query no matter what data you have (find
the details of the row with the highest value).

To do this using a subquery, it would look like this:

SELECT id, last_name, salary


FROM employee
WHERE salary = (
SELECT MAX(salary)
FROM employee
);
This is similar to the earlier example. However, there are some differences.

The subquery finds the maximum salary from the employee table: SELECT MAX(salary).

Also, the WHERE clause in the outer query says WHERE salary =. This means that the
salary of the employee needs to be equal to the maximum, which is from the
subquery.

If you run this query, you’ll get this result:

ID LAST_NAME SALARY
7 BROWN 93000
This shows Brown with the highest salary. You can check this by running a query
using ORDER BY:

SELECT id, last_name, salary


FROM employee
ORDER BY salary DESC;
ID LAST_NAME SALARY
7 BROWN 93000
4 LANDY 82000
2 ANDERSON 60000
9 CONNOR 52000
6 JOHNSON 51000
3 JONES 45000
1 SMITH 40000
10 PIERCE 39000
8 HARDEN 29000
5 CHARLESTON 21000
This shows that Brown does have the highest salary.

Example: Multiple Values


The previous example worked because the subquery only returned a single value:

SELECT id, last_name, salary


FROM employee
WHERE salary = (
SELECT MAX(salary)
FROM employee
);
When you specify “WHERE salary =”, the value you’re comparing the salary to needs
to be a single value. This means the subquery needs to return a single value.
Luckily, using MAX or AVG will ensure a single value is returned. However, what
happens when your subquery does not return a single value?

Look at this example:

SELECT id, last_name, salary


FROM employee
WHERE salary = (
SELECT salary
FROM employee
WHERE last_name LIKE 'C%'
);
This will look for all salaries that are the same as those employees whose
last_names start with “C”. If you run this query, you’ll get this error:

ORA-01427: single-row subquery returns more than one row

This means that your subquery has returned more than one row. The subquery can be
run by itself, which shows these results:

SELECT salary
FROM employee
WHERE last_name LIKE 'C%';
SALARY
21000
52000
If these values were used in your query instead of the subquery, it would look like
this:

SELECT id, last_name, salary


FROM employee
WHERE salary = (21000, 52000);
You can’t have a salary that is equal to two values. One of the rules of using a
subquery is that if you use an equals sign, you must return a single value.

That’s why you get the ORA-01427: single-row subquery returns more than one row.
Your subquery has returned more than one row, and the outer query expects a single
value (because of an = sign). Resolve this by using an IN operator of changing your
subquery.

Example: IN Operator
You can match several values in a subquery by using an IN operator. The IN operator
will check that the value you specify matches any of the values inside the IN
operator.

For example, the earlier query can be written like this:

SELECT id, last_name, salary


FROM employee
WHERE salary IN (21000, 52000);
Now, this isn’t a subquery, but you can write a query with a subquery in the same
way:

SELECT id, last_name, salary


FROM employee
WHERE salary IN (
SELECT salary
FROM employee
WHERE last_name LIKE 'C%'
);
All we’ve done here is replace the = sign with an IN clause, and the query will
work:

ID LAST_NAME SALARY
5 CHARLESTON 21000
9 CONNOR 52000
There are all kinds of uses for subqueries. Whenever you think you need to use the
result of one query as an input or a filter in another query, a subquery is
probably the best way to do it.

Inline Views: Subqueries in a FROM Clause


Another place you can use a subquery is in a FROM clause. The results of a subquery
are treated just like a table, which you can select values from.

Let’s say you had this query which showed you the average salary per department:
SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
FROM employee
GROUP BY dept_id;
The results are:
DEPT_ID AVG_SALARY
1 55333.33
2 51333.33
3 48000
You have two columns here: dept_id and avg_salary, with three results Let’s say you
wanted to treat these results as a table without actually creating a table. You can
do this by using this query as a subquery in the from clause.
SELECT dept_id, avg_salary
FROM (
SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
FROM employee
GROUP BY dept_id
);
Running this query will give you the same results:
DEPT_ID AVG_SALARY
1 55333.33
2 51333.33
3 48000
However, you can use the results of this subquery to join to other tables:
SELECT
sub.dept_id,
d.dept_name,
sub.avg_salary
FROM
(
SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
FROM employee
GROUP BY dept_id
) sub
INNER JOIN department d ON sub.dept_id = d.id;
This query does a few things.

First, we’re selecting columns from two tables: the sub and d tables. The sub table
is actually the result of the subquery, which is the dept_id and avg_salary
columns.

We are then joining that subquery to the department table. The subquery has been
named sub, and it is then treated just like a table or a view. This type of query
is called an inline view.

What is an inline view? It’s a subquery that’s inside the FROM clause of a query.
It’s called an inline view because it acts just like a view object, but no view
object is created on the database.

If you run this query, you’ll get this result:

DEPT_ID DEPT_NAME AVG_SALARY


1 SALES 55333.33
2 HR 51333.33
3 IT 48000
This way you can see the dept_name for each department, and their average salary.
There may be other ways to get this data, but this is just an example of using a
subquery in the FROM clause and using a join to another table.

Subqueries in a SELECT Clause


One more place you can use a subquery is in the SELECT clause.

If a subquery in the WHERE clause acts as a filter, and a subquery in the FROM
clause acts as a view, then a subquery in the SELECT clause acts like a column that
is to be shown in your output.

To use a subquery in your SELECT clause, you add it in the place of a column.

For example, let’s say you wanted to show the average salary alongside each
employee record.

You could try a query like this:

SELECT id, last_name, salary, AVG(salary)


FROM employee;
However, if you run this query, you’ll get this error:

ORA-00937: not a single-group group function

This happens because we have an aggregate function and no GROUP BY clause.


We can adjust our query to use a subquery. The subquery would get the average
salary, and that would be used as a “column” in the SELECT clause:

SELECT id, last_name, salary, (


SELECT AVG(salary)
FROM employee
)
FROM employee;
The result from the SELECT AVG(salary) subquery is used as a value on every row
that is returned. The results are:

ID LAST_NAME SALARY (SELECTAVG(SALARY)FROMEMPLOYEE)


1 SMITH 40000 51200
2 ANDERSON 60000 51200
3 JONES 45000 51200
4 LANDY 82000 51200
5 CHARLESTON 21000 51200
6 JOHNSON 51000 51200
7 BROWN 93000 51200
8 HARDEN 29000 51200
9 CONNOR 52000 51200
10 PIERCE 39000 51200
You can assign a column alias using the AS keyword to make it neater:

SELECT id, last_name, salary, (


SELECT AVG(salary)
FROM employee
) AS avg_salary
FROM employee;
This query will now show this result:

ID LAST_NAME SALARY AVG_SALARY


1 SMITH 40000 51200
2 ANDERSON 60000 51200
3 JONES 45000 51200
4 LANDY 82000 51200
5 CHARLESTON 21000 51200
6 JOHNSON 51000 51200
7 BROWN 93000 51200
8 HARDEN 29000 51200
9 CONNOR 52000 51200
10 PIERCE 39000 51200
The same value of 51200 is shown for every row. If that’s your requirement, then
you can use a subquery in this way to show the value.

This alias is required in MySQL but not in other SQL vendors.

Multiple Subqueries (Nested Subqueries)


Now, what if you wanted to use a subquery inside another subquery? This is possible
in SQL. In fact, the maximum number of subqueries inside other subqueries you can
use is 255. However, you shouldn’t even get close to that many subqueries. If
you’ve got more than about 5 subqueries then you should look to redesign your
query.

Let’s say you wanted to see each employee’s salary and the percentage of that
salary compared to the average. You can use the earlier query, which showed each
employee along with the average salary for all employees:
SELECT id, last_name, salary, (
SELECT AVG(salary)
FROM employee
) AS avg_salary
FROM employee;
ID LAST_NAME SALARY AVG_SALARY
1 SMITH 40000 51200
2 ANDERSON 60000 51200
3 JONES 45000 51200
4 LANDY 82000 51200
5 CHARLESTON 21000 51200
6 JOHNSON 51000 51200
7 BROWN 93000 51200
8 HARDEN 29000 51200
9 CONNOR 52000 51200
10 PIERCE 39000 51200
You could adjust your query like this to show the percentage of the average salary:
SELECT id, last_name, salary, (
SELECT AVG(salary)
FROM employee
) AS avg_salary,
ROUND(salary / (
SELECT AVG(salary)
FROM employee
), 2) AS pct_avg_salary
FROM employee;
However, this is running the same subquery twice, which is a waste. The results
are:

ID LAST_NAME SALARY AVG_SALARY PCT_AVG_SALARY


1 SMITH 40000 51200 .78
2 ANDERSON 60000 51200 1.17
3 JONES 45000 51200 .88
4 LANDY 82000 51200 1.6
5 CHARLESTON 21000 51200 .41
6 JOHNSON 51000 51200 1
7 BROWN 93000 51200 1.82
8 HARDEN 29000 51200 .57
9 CONNOR 52000 51200 1.02
10 PIERCE 39000 51200 .76
You could write this as a nested subquery. You use the original results as a
subquery into another query:

SELECT id,
last_name,
salary
avg_salary,
ROUND(salary/avg_salary, 2) AS pct_avg_salary
FROM (
SELECT id, last_name, salary, (
SELECT AVG(salary)
FROM employee
) AS avg_salary
FROM employee
);
This query will find all employee records, show the average salary for each of
them, and then use the results of that in the outer query.
This shows that you can use several levels of subqueries.

Correlated Subqueries
The final concept on subqueries that I’ll cover is a correlated subquery.

What’s a correlated subquery? It’s when a subquery refers to a column that exists
in the outer query. The subquery and the outer query are said to be correlated, as
they are linked to each other.

So far, our subqueries are independent queries, with the results used inside an
outer query. However, you can refer to a column in the outer query from within the
subquery.

An example of this would be a query to find the employees with a salary greater
than the average salary in their department.

To find the average salary in the department, we can use this query:

SELECT dept_id, AVG(salary)


FROM employee
GROUP BY dept_id;
The results are:

DEPT_ID AVG(SALARY)
1 55333.33333
2 51333.33333
3 48000
Now, to find the employees with a salary greater than the average salary in their
department, we can use a subquery. The subquery needs to match one of these
averages here to the employee record, based on their department ID.

SELECT e.id, e.last_name, e.salary, e.dept_id


FROM employee e
WHERE e.salary > (
SELECT AVG(s.salary)
FROM employee s
WHERE s.dept_id = e.dept_id
GROUP BY s.dept_id
);
This query is different to the others. It selects a few columns from the employee
table, with an alias of “e”. It then has a subquery that looks at the average of
the salary value.

However, this is not just a simple average. It looks for the average where the
dept_id matches the employee’s dept_id from the outer query.

This allows the query to find the average for each employee’s department, and check
if their salary is greater than this average.

The results of this query are:

ID LAST_NAME SALARY DEPT_ID


2 ANDERSON 60000 3
4 LANDY 82000 1
6 JOHNSON 51000 3
7 BROWN 93000 2
9 CONNOR 52000 3
These are the only employees that have a salary greater than their department’s
average. There may be employees with a salary greater than averages of other
departments, but the correlated subquery lets you find results based on the related
records: in this case, the average salary of a department ID.

Conclusion
Subqueries are powerful tools to use in your SQL queries. It adds to the complexity
of a query, but if your requirements say that you need a certain data set, often
the only way to do that is using a subquery.

You might also like