@DataScience Ir 13 SQL Statements for 90% of Your Data Science Tasks

Uploaded by

Farshad Asghari
Copyright
© © All Rights Reserved
13 SQL Statements for 90% of Your Data Analysis Tasks
Abhishek Saud
13 min read · Mar 23

Structured Query Language (SQL) is a language designed for managing and querying relational databases. Data scientists and analysts use it routinely to extract insights from large datasets.

SQL is a powerful tool for data manipulation: filtering, sorting, grouping, and aggregating are only a few of the operations it supports. This article covers 13 key SQL statements that handle roughly 90% of everyday data science tasks. They are simple to understand and apply, and together they give you a solid foundation for working with SQL.

This article will give you insightful knowledge and helpful advice for handling
data, whether you are new to SQL or have some experience with it.

1. SELECT
The SELECT statement retrieves data from one or more database tables. It is usually combined with clauses such as WHERE, ORDER BY, and GROUP BY to filter, sort, and group the results, so it is worth mastering first. Its basic form is:

SELECT COL1, COL2, COL3
FROM table_name
WHERE conditions;

In this example, table_name is the name of the table that contains the data, and COL1, COL2, and COL3 are the names of the columns you want to retrieve. The WHERE clause is optional; when present, it specifies a condition that each row must satisfy to be included in the result.

Here’s an example that selects all records from a table called “employees” where
the employee’s age is greater than or equal to 35:

SELECT *
FROM employees
WHERE age >= 35;

2. WHERE
The WHERE clause filters data according to a given condition, so that a query retrieves only the rows that meet specific requirements.

Here is an illustration of how to filter data from a table using a WHERE clause:

Consider a table called “employees” that contains columns for “name,” “department,” and “salary.” We can use a WHERE clause to select only the employees in the “Info_tech” department who earn more than $100,000 per year:

SELECT name, department, salary
FROM employees
WHERE salary > 100000 AND department = 'Info_tech';
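
To try the filter end to end, here is a minimal sketch that runs the same query from Python with the standard-library sqlite3 module. The in-memory database and the three sample rows are assumptions made up for illustration; they are not from the article.

```python
import sqlite3

# In-memory database with a hypothetical "employees" table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "Info_tech", 120000),
        ("Ben", "Info_tech", 90000),
        ("Cleo", "Sales", 150000),
    ],
)

# Both conditions must hold for a row to be returned.
# The ? placeholders let the driver bind the values safely.
rows = conn.execute(
    "SELECT name, department, salary FROM employees "
    "WHERE salary > ? AND department = ?",
    (100000, "Info_tech"),
).fetchall()
print(rows)
```

Only Ana satisfies both conditions: Ben earns too little and Cleo works in another department.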

3. GROUP BY
The GROUP BY clause groups rows based on one or more columns, and aggregate functions (like COUNT, SUM, and AVG) then compute a summary for each group. It is the tool to reach for whenever you want to summarize data by category.

Consider a table called “employees” that contains columns for “name,” “department,” and “salary.” To group the employees by department and determine the average salary for each department, we can use a GROUP BY clause:

SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

This query returns a list of all departments with their average salaries. The GROUP BY clause groups the employees by department, and the AVG function computes each department’s average by dividing the total of its salaries by its number of employees:

department | avg_salary
-----------------------
Sales | 65000
Marketing | 55000
Engineering| 80000
Info_tech | 130000
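
The grouping can be checked on a small dataset. The sketch below uses Python's sqlite3 module with made-up rows (an assumption for illustration) and adds COUNT(*) so the size of each group is visible next to its average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", 60000),
        ("Ben", "Sales", 70000),
        ("Cleo", "Marketing", 55000),
    ],
)

# One output row per department; AVG and COUNT are computed inside each group.
avg_by_dept = conn.execute(
    "SELECT department, AVG(salary) AS avg_salary, COUNT(*) AS headcount "
    "FROM employees "
    "GROUP BY department "
    "ORDER BY department"
).fetchall()
print(avg_by_dept)
```

The two Sales rows collapse into a single row with the average of 60000 and 70000, while Marketing keeps its single value.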

4. JOIN

To combine data from two or more tables in a database, use the JOIN statement.
To retrieve data from multiple tables, you should become proficient at using JOIN
and specifying the correct type of join (e.g., INNER, LEFT, RIGHT, FULL OUTER).

Here are a few examples of JOIN statements:

INNER JOIN

Only the rows with a match between the columns in both tables are returned by
an INNER JOIN. Here’s an illustration:

SELECT o.order_id, c.customer_name
FROM orders AS o
INNER JOIN customers AS c
ON o.customer_id = c.customer_id;

In this example, the orders and customers tables are joined on the customer_id column. The result contains the order_id and customer_name columns, and only for rows where the customer_id values match in both tables.

LEFT JOIN

With a LEFT JOIN, all of the rows from the left table are returned, along with any matching rows from the right table. Where the right table has no match, the result contains NULL values. An illustration would be:

SELECT o.order_id, c.customer_name
FROM orders AS o
LEFT JOIN customers AS c
ON o.customer_id = c.customer_id;

In this illustration, the orders table is on the left and the customers table is on the right. The tables are joined on the customer_id column. All the rows from the orders table and their corresponding rows from the customers table will be included in the final table. If there is no match in the customers table, NULL values will appear in the customer_name column.

RIGHT JOIN

All of the rows from the right table and the matching rows from the left table are
returned by a RIGHT JOIN. The result will contain NULL values if there is no
match in the left table. Here’s an illustration:

SELECT o.order_id, c.customer_name
FROM orders AS o
RIGHT JOIN customers AS c
ON o.customer_id = c.customer_id;

In this illustration, the orders table is on the left and the customers table is on the right. The tables are joined on the customer_id column. All of the rows from the customers table and their corresponding rows from the orders table will be included in the final table. If there is no match in the orders table, NULL values will appear in the order_id column.

OUTER JOIN

An OUTER JOIN returns all the rows from one or both tables, including the non-matching rows. There are three kinds: LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN. LEFT OUTER JOIN and RIGHT OUTER JOIN are the full names of LEFT JOIN and RIGHT JOIN (the OUTER keyword is optional), while a FULL OUTER JOIN keeps the non-matching rows from both tables.

Here’s an example of a LEFT OUTER JOIN:

SELECT o.order_id, c.customer_name
FROM orders AS o
LEFT OUTER JOIN customers AS c
ON o.customer_id = c.customer_id;

In this illustration, the orders table is on the left and the customers table is on the right. The tables are joined on the customer_id column. All the rows from the orders table and their corresponding rows from the customers table will be included in the final table. If there is no match in the customers table, NULL values will appear in the customer_name column.

Here’s an example of a RIGHT OUTER JOIN:

SELECT o.order_id, c.customer_name
FROM orders AS o
RIGHT OUTER JOIN customers AS c
ON o.customer_id = c.customer_id;

In this illustration, the orders table is on the left and the customers table is on the right. The tables are joined on the customer_id column. All of the rows from the customers table and their corresponding rows from the orders table will be included in the final table. If there is no match in the orders table, NULL values will appear in the order_id column.

It’s important to note that while some databases might not support RIGHT OUTER JOINs, you can still get the same outcome by using a LEFT OUTER JOIN and switching the order of the tables.
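
The difference between the join types is easiest to see on data where both tables have unmatched rows. The sketch below uses Python's sqlite3 module; the sample rows are assumptions (order 3 points at a customer that does not exist, and customer 12 has no orders). The last query applies the table-swapping trick described above, so it returns what a RIGHT JOIN would without needing RIGHT JOIN support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.execute("CREATE TABLE customers (customer_id INTEGER, customer_name TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 11), (3, 99)])
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(10, "Ann"), (11, "Bob"), (12, "Cy")]
)

# INNER JOIN: only orders that have a matching customer.
inner = conn.execute(
    "SELECT o.order_id, c.customer_name FROM orders AS o "
    "INNER JOIN customers AS c ON o.customer_id = c.customer_id "
    "ORDER BY o.order_id"
).fetchall()

# LEFT JOIN: every order; customer_name is NULL (None) when nothing matches.
left = conn.execute(
    "SELECT o.order_id, c.customer_name FROM orders AS o "
    "LEFT JOIN customers AS c ON o.customer_id = c.customer_id "
    "ORDER BY o.order_id"
).fetchall()

# RIGHT JOIN emulated by swapping the tables in a LEFT JOIN:
# every customer; order_id is NULL (None) when the customer has no orders.
right_emulated = conn.execute(
    "SELECT o.order_id, c.customer_name FROM customers AS c "
    "LEFT JOIN orders AS o ON o.customer_id = c.customer_id "
    "ORDER BY c.customer_id"
).fetchall()

print(inner)
print(left)
print(right_emulated)
```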

5. HAVING
The HAVING clause filters data after it has been grouped with GROUP BY. Where WHERE filters individual rows, HAVING filters whole groups, usually on the value of an aggregate.

Here is an illustration of how to use the SQL HAVING clause:

Consider a table called “orders” that contains the columns “order_id,” “customer_id,” “product_id,” and “quantity.” We are looking for customers who have ordered at least 50 units in total. The GROUP BY clause groups the orders by customer so that the total quantity ordered by each customer can be computed. The HAVING clause then filters the results so that they contain only customers whose orders total at least 50 units:

SELECT customer_id, SUM(quantity) AS total_quantity
FROM orders
GROUP BY customer_id
HAVING SUM(quantity) >= 50;

The query returns each qualifying customer together with the total number of units they have ordered. The SUM function computes the total per customer, the GROUP BY clause groups the orders by customer, and the HAVING clause limits the results to customers whose total is at least 50 units.

The output of the query would look something like this:

customer_id | total_quantity
---------------------------
123 | 60
456 | 70

According to this illustration, customer 123 ordered a total of 60 units of products, while customer 456 ordered a total of 70 units. Both of these customers satisfy the HAVING clause’s requirement that the total number of units purchased be at least 50.
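
A small runnable sketch of the same query, using Python's sqlite3 module. The individual order rows are assumptions chosen so that the totals match the output above; a third customer (789) is added to show a group being filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER, customer_id INTEGER, product_id INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 123, 1, 40),
        (2, 123, 2, 20),  # customer 123 totals 60 units
        (3, 456, 1, 70),  # customer 456 totals 70 units
        (4, 789, 3, 10),  # customer 789 totals 10 units and is filtered out
    ],
)

# HAVING is evaluated after grouping, so it can test the aggregate SUM(quantity).
big_customers = conn.execute(
    "SELECT customer_id, SUM(quantity) AS total_quantity "
    "FROM orders "
    "GROUP BY customer_id "
    "HAVING SUM(quantity) >= 50 "
    "ORDER BY customer_id"
).fetchall()
print(big_customers)
```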

6. Window Function
In SQL, window functions perform calculations across a set of rows that are related to the current row. They are applied over a window, a subset of a table’s rows defined by a partition or an ordering. Unlike GROUP BY, a window function does not collapse rows: every input row stays in the output. Here are a few SQL window function examples:

1. ROW_NUMBER(): This function gives each row in a partition a unique sequential number. The ROW_NUMBER() function has the following syntax:

SELECT column1, column2, ...,
ROW_NUMBER() OVER (ORDER BY column1) AS row_num
FROM table_name;

This query will return a result set with an additional column “row_num” that
contains the sequential numbers assigned to each row based on the order of
“column1”.

2. SUM(): This function calculates the sum of a column within a partition. The
syntax for the SUM() function is:

SELECT column1, column2, ...,
SUM(column3) OVER (PARTITION BY column1) AS column3_sum
FROM table_name;

This query will return a result set with an additional column “column3_sum” that
contains the sum of “column3” for each partition based on the values of
“column1”.

3. RANK(): This function assigns a rank to each row within a partition based on
the values of a specified column. The syntax for the RANK() function is:

SELECT column1, column2, ...,
RANK() OVER (PARTITION BY column1 ORDER BY column3 DESC) AS rank_num
FROM table_name;

This query will return a result set with an additional column “rank_num” that
contains the rank of each row within each partition based on the descending
order of “column3”.

4. AVG(): This function calculates the average of a column within a partition. The
syntax for the AVG() function is:

SELECT column1, column2, ...,
AVG(column3) OVER (PARTITION BY column1) AS column3_avg
FROM table_name;

This query will return a result set with an additional column “column3_avg” that
contains the average of “column3” for each partition based on the values of
“column1”.

Note that the syntax for window functions may vary depending on the specific
database management system (DBMS) being used.
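
The templates above can be combined in one query. The following sketch uses Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or newer, the first release with window functions. The table and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions assume SQLite 3.25+
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Sales", 60000), ("Ben", "Sales", 70000), ("Cleo", "IT", 90000)],
)

# Each window function adds a column; no rows are collapsed.
rows = conn.execute(
    "SELECT name, department, salary, "
    "ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num, "
    "RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank, "
    "SUM(salary) OVER (PARTITION BY department) AS dept_total "
    "FROM employees "
    "ORDER BY salary DESC"
).fetchall()
for row in rows:
    print(row)
```

All three employees remain in the output: row_num numbers them across the whole table, while dept_rank and dept_total restart for each department.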

7. UNION
To combine the output of two or more SELECT statements into a single result set in SQL, use the UNION operator. The SELECT statements must return the same number of columns, and the data types of corresponding columns must be compatible. UNION removes duplicate rows from the result automatically; UNION ALL keeps them.

Here’s an example of using the UNION operator in SQL:

Consider two tables with the names “customers” and “employees,” respectively,
and columns for “name” and “city.” We want to compile a list of everyone who
resides in New York City, including both clients and staff. Two SELECT statements,
one for each table, can be combined using the UNION operator:

SELECT name, city
FROM customers
WHERE city = 'New York'
UNION
SELECT name, city
FROM employees
WHERE city = 'New York';

Customers and employees alike would be included in the list of people who reside
in New York City as a result of this query. The first SELECT statement returns all
New York City-based clients, and the second SELECT statement returns all New
York City-based personnel. These two SELECT statements’ results are combined,
and any duplicate rows are eliminated, by the UNION operator.

The output of the query would look something like this:

name | city
-------------------
John Smith | New York
Jane Doe | New York
Bob Johnson | New York
Samantha Lee| New York

In this example, we can see that there are four people who live in New York City,
two from the “customers” table and two from the “employees” table, and the
UNION operator has combined the results of the two SELECT statements into a
single result set.
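
Duplicate removal is the part of UNION that most often surprises people, so here is a minimal sketch comparing it with UNION ALL, using Python's sqlite3 module. The rows are assumptions: "John Smith" is deliberately present in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.execute("CREATE TABLE employees (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("John Smith", "New York"), ("Ann Lee", "Boston")],
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("John Smith", "New York"), ("Bob Johnson", "New York")],
)

query = (
    "SELECT name, city FROM customers WHERE city = 'New York' "
    "{op} "
    "SELECT name, city FROM employees WHERE city = 'New York' "
    "ORDER BY name"
)

# UNION drops the duplicate (John Smith, New York) row; UNION ALL keeps it.
union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()
print(union_rows)
print(len(union_all_rows))
```

If duplicates carry meaning in your data (for example, one person who is both a customer and an employee), UNION ALL is the variant that preserves them.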

8. CREATE
The CREATE statement creates a new database table, view, or other database object. Here is an illustration of how to use the CREATE statement in SQL:

Suppose we want to create a new table called “employees” with columns for “id”, “name”, “email”, “phone” and “department”. We can use the CREATE statement to do this:

CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50),
email VARCHAR(100),
phone VARCHAR(20),
department VARCHAR(20)
);

This query creates a new table called “employees” with the columns “id,” “name,” “email,” “phone” and “department.” The “id” column, which is declared as an integer, serves as the table’s primary key. The “name” column has a maximum length of 50 characters, the “email” column 100, and the “phone” and “department” columns 20 each.

After the query is executed, we can insert new rows into the “employees” table and retrieve data from it:

INSERT INTO employees (id, name, email, phone, department)
VALUES (1, 'John Doe', '[email protected]', '555-555-1234', 'Engineering');

SELECT * FROM employees;

id | name | email | phone | department
-----------------------------------------------------------
1 | John Doe | [email protected] | 555-555-1234 | Engineering

In this example, we have used the CREATE statement to create a new table in a
database and inserted a new row into the table.
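
As a hedged sketch of what the PRIMARY KEY buys you, the following Python snippet (standard-library sqlite3) creates the table, inserts one row, and then tries to insert a second row with the same id. The email address and the second person are placeholders invented for illustration. Note that SQLite accepts the VARCHAR(n) declarations but does not enforce the length limits, which other systems may do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " id INTEGER PRIMARY KEY,"
    " name VARCHAR(50),"
    " email VARCHAR(100),"
    " phone VARCHAR(20),"
    " department VARCHAR(20))"
)

insert_sql = (
    "INSERT INTO employees (id, name, email, phone, department) "
    "VALUES (?, ?, ?, ?, ?)"
)
# 'john.doe@example.com' is a placeholder address, not taken from the article.
conn.execute(
    insert_sql,
    (1, "John Doe", "john.doe@example.com", "555-555-1234", "Engineering"),
)

# A second row with the same primary key is rejected by the database.
try:
    conn.execute(
        insert_sql,
        (1, "Jane Roe", "jane.roe@example.com", "555-555-9876", "Sales"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(duplicate_rejected, row_count)
```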

9. INSERT
Data is inserted into a database table using the INSERT statement. Learn how to
use INSERT to update a database table with new information. Here is an
illustration of how to use the SQL INSERT statement:

Consider a table called “students” that contains columns for “id,” “name,” “major,”
and “gpa.” The student with the ID 1134, name “Adam Fields,” major in “Data
Science,” and GPA of 3.7 needs to have a new row added to the table. This can be
accomplished using the INSERT statement:

INSERT INTO students (id, name, major, gpa)
VALUES (1134, 'Adam Fields', 'Data Science', 3.7);

This query would add a new row with the specified values for the “id,” “name,”
“major,” and “gpa” columns to the “students” table. The table name and the list of
columns into which values are to be inserted are both specified in the INSERT
statement. The values we want to insert into each column are then specified using
the VALUES keyword in the order that the columns were listed.

id | name | major | gpa
--------------------------------------------
1134 | Adam Fields | Data Science | 3.7

In this example, we have inserted a new row into the “students” table using the
INSERT statement.

10. UPDATE
The UPDATE statement is used to modify existing data in a database table. You
should master using UPDATE to update the values of one or more columns in a
table. Here’s an example of using the UPDATE statement in SQL:

Suppose we have a table named “students” with columns for “id”, “name”, “major”,
and “gpa”. We want to update the major and GPA of a student with an ID of 1134.
We can use the UPDATE statement to do this:

UPDATE students
SET major = 'Mathematics', gpa = 3.3
WHERE id = 1134;

This query would update the “major” and “gpa” columns of the row in the
“students” table with an ID of 1134. The UPDATE statement specifies the name of
the table we want to update, followed by the SET keyword and a list of column-
value pairs that we want to update. We then use the WHERE clause to specify
which rows we want to update. In this case, we want to update the row with an ID
of 1134, so we specify “WHERE id = 1134”.

After the query is executed, the “students” table would have the updated values for
the “major” and “gpa” columns in the row with an ID of 1134:

id | name | major | gpa
--------------------------------------------
1134 | Adam Fields | Mathematics | 3.3

In this example, we have updated the “major” and “gpa” columns of a row in the
“students” table using the UPDATE statement.

11. DELETE

A database table’s rows can be deleted using the DELETE statement. To remove
data from a table, you should become proficient with DELETE. Here is an
illustration of how to use the SQL DELETE statement:

Consider a table called “students” that contains columns for “id,” “name,” “major,”
and “gpa.” A student with the ID of 1134 needs to be removed from the table. To
accomplish this, we can use the DELETE statement.

DELETE FROM students
WHERE id = 1134;

This query deletes the row with the ID 1134 from the “students” table. The DELETE statement names the table we want to delete from, and the WHERE clause specifies which rows to delete. In this instance, we specify “WHERE id = 1134” because we want to remove the row with ID 1134. Without a WHERE clause, DELETE removes every row in the table.
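
Because UPDATE and DELETE both depend on their WHERE clause, it helps to check how many rows a statement actually touched. The sketch below uses Python's sqlite3 module, whose cursors expose that number as rowcount. The second student (id 2001) is a made-up row added to show that other rows are left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [
        (1134, "Adam Fields", "Data Science", 3.7),
        (2001, "Eve Moss", "Physics", 3.9),  # hypothetical second student
    ],
)

# UPDATE: rowcount reports how many rows matched the WHERE clause.
updated = conn.execute(
    "UPDATE students SET major = ?, gpa = ? WHERE id = ?",
    ("Mathematics", 3.3, 1134),
).rowcount
after_update = conn.execute(
    "SELECT major, gpa FROM students WHERE id = 1134"
).fetchone()

# DELETE: same idea; only the matching row is removed.
deleted = conn.execute("DELETE FROM students WHERE id = ?", (1134,)).rowcount
remaining = [row[0] for row in conn.execute("SELECT id FROM students")]

print(updated, after_update, deleted, remaining)
```

A rowcount larger than expected is an early warning that the WHERE clause is broader than intended.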

12. DROP
To remove a database table or other database object, use the DROP statement. To
remove unnecessary tables or other objects from a database, you should become
proficient with the DROP command. Depending on the type of object being
deleted, the syntax for the DROP statement varies, but some typical examples
include:

1. DROP TABLE: This statement is used to delete an existing table along with all
its data and indexes. The syntax for the DROP TABLE statement is:

DROP TABLE table_name;

2. DROP INDEX: This statement is used to delete an existing index from a table. The syntax for the DROP INDEX statement is:

DROP INDEX index_name;

3. DROP VIEW: This statement is used to delete an existing view. The syntax for the DROP VIEW statement is:

DROP VIEW view_name;

4. DROP PROCEDURE: This statement is used to delete an existing stored procedure. The syntax for the DROP PROCEDURE statement is:

DROP PROCEDURE procedure_name;

Keep in mind that depending on the particular database management system (DBMS) being used, the DROP statement’s precise syntax may change. The DROP statement should also be used with caution because it permanently deletes the specified object along with all related data and indexes. A backup of your data should be made before using the DROP statement.
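
One defensive habit worth sketching is the IF EXISTS form, which SQLite, MySQL, and PostgreSQL all accept for DROP TABLE: the statement succeeds whether or not the table is present. The example below uses Python's sqlite3 module; the table name "scratch" and the helper table_exists are hypothetical, and the existence check reads SQLite's own sqlite_master catalog, so that part does not carry over to other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scratch (id INTEGER PRIMARY KEY, note TEXT)")


def table_exists(name):
    """Return True if a table with this name is in SQLite's catalog."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


existed_before = table_exists("scratch")

# IF EXISTS makes the statement safe to repeat: the second call is a no-op.
conn.execute("DROP TABLE IF EXISTS scratch")
conn.execute("DROP TABLE IF EXISTS scratch")

exists_after = table_exists("scratch")
print(existed_before, exists_after)
```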

13. ALTER
A database table’s or another database object’s structure can be changed using the
ALTER statement. To add or remove columns, alter data types, or change other
features of a table, you should become proficient with ALTER. Depending on the
kind of object being modified, the syntax for the ALTER statement varies, but
some typical examples include:

1. Using the ALTER TABLE statement, you can change a table’s structure by
adding or removing columns, altering the data types, or imposing constraints.
The ALTER TABLE statement has the following syntax:

ALTER TABLE table_name
ADD column_name data_type [constraint],
MODIFY column_name data_type [constraint],
DROP column_name,
ADD CONSTRAINT constraint_name constraint_definition,
DROP CONSTRAINT constraint_name;

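2. ALTER INDEX: This statement changes the properties of an existing index. What it can change differs considerably between database systems; renaming an index is a common case, whereas changing the columns an index covers usually means dropping the index and creating it again. For example, in systems that support renaming (such as PostgreSQL and Oracle), the syntax is:

ALTER INDEX index_name
RENAME TO new_index_name;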

3. ALTER VIEW: This command is used to change a view’s definition, such as altering the SELECT statement that was used to create it. The ALTER VIEW statement has the following syntax:

ALTER VIEW view_name
AS select_statement;

Note that the exact syntax for the ALTER statement may vary depending on the
specific database management system (DBMS) being used.
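
As one concrete instance of that variation, the sketch below adds a column in SQLite through Python's sqlite3 module. SQLite spells the action ADD COLUMN and has no MODIFY, so only the add-column case is shown. The "email" column is a made-up addition, and PRAGMA table_info is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES (1134, 'Adam Fields')")

# Add a new column; rows that already exist get NULL in it.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(students)")]
existing_row = conn.execute("SELECT id, name, email FROM students").fetchone()

print(columns)
print(existing_row)
```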
