40 SQL Interview Questions &
Solutions for DBAs
Asfaw Gedamu
July 2, 2025
Your interviewer has 30 minutes. Your mission: write queries that show you can reason about data, not just recite syntax.
I pulled together 40 real-world SQL challenges that cover the exact patterns hiring managers probe for:
Retention & Churn: month-over-month customer stickiness, six-month “silent” customers, moving averages, and more
Salary & Rank Logic: second-highest salary, department pay gaps, in-department ranking with RANK()
Revenue Analytics: products that drive 80 percent of revenue (Pareto), product-level revenue share, YoY growth
Window Magic: rolling 3-day sales, 90th-percentile spenders, consecutive-day purchases
Everyday Ops: duplicate detection, unsold products, employees hired on weekends
Now to the questions. The solutions use SQL Server (T-SQL) syntax, for example TOP, DATEADD, and GETDATE:
1. Calculate customer retention rate month-over-month
Input (Orders):
| customer_id | order_date |
|-------------|------------|
| 101 | 2023-01-15 |
| 101 | 2023-02-20 |
| 102 | 2023-01-10 |
| 103 | 2023-02-05 |
| 101 | 2023-03-01 |
Statement:
WITH monthly_customers AS (
    -- Keep the month as a DATE so DATEADD can shift it (FORMAT would return a string)
    SELECT DISTINCT
           customer_id,
           DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1) AS month_start
    FROM Orders
),
retention AS (
    SELECT m1.month_start,
           COUNT(DISTINCT m1.customer_id) AS current_customers,
           COUNT(DISTINCT m2.customer_id) AS retained_customers
    FROM monthly_customers m1
    LEFT JOIN monthly_customers m2
           ON m1.customer_id = m2.customer_id
          AND m2.month_start = DATEADD(MONTH, 1, m1.month_start)
    GROUP BY m1.month_start
)
SELECT FORMAT(month_start, 'yyyy-MM') AS month,
       current_customers,
       retained_customers,
       CAST(100.0 * retained_customers / NULLIF(current_customers, 0) AS DECIMAL(5, 2)) AS retention_rate
FROM retention
ORDER BY month_start;
Output:
| month | current_customers | retained_customers | retention_rate |
|---------|-------------------|--------------------|----------------|
| 2023-01 | 2 | 1 | 50.00 |
| 2023-02 | 2 | 1 | 50.00 |
| 2023-03 | 1 | 0 | 0.00 |
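To sanity-check the retention logic without a SQL Server instance, here is a minimal sketch that runs the same month-to-next-month self-join on the sample rows with Python's built-in sqlite3 module. The function name `retention_by_month` is my own, and the SQL is in SQLite's dialect (`date(..., 'start of month')` and `date(..., '+1 month')` stand in for the T-SQL date functions), so treat it as an illustration of the idea rather than a drop-in answer.

```python
import sqlite3


def retention_by_month(orders):
    """orders: iterable of (customer_id, 'YYYY-MM-DD'). Returns one row per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (customer_id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", orders)
    rows = conn.execute("""
        WITH monthly_customers AS (
            -- One row per customer per month, keyed by the first day of the month
            SELECT DISTINCT customer_id,
                   date(order_date, 'start of month') AS month_start
            FROM Orders
        )
        SELECT strftime('%Y-%m', m1.month_start) AS month,
               COUNT(DISTINCT m1.customer_id) AS current_customers,
               COUNT(DISTINCT m2.customer_id) AS retained_customers,
               ROUND(100.0 * COUNT(DISTINCT m2.customer_id)
                     / COUNT(DISTINCT m1.customer_id), 2) AS retention_rate
        FROM monthly_customers m1
        LEFT JOIN monthly_customers m2
               ON m2.customer_id = m1.customer_id
              AND m2.month_start = date(m1.month_start, '+1 month')
        GROUP BY m1.month_start
        ORDER BY m1.month_start
    """).fetchall()
    conn.close()
    return rows


sample_orders = [
    (101, "2023-01-15"), (101, "2023-02-20"), (102, "2023-01-10"),
    (103, "2023-02-05"), (101, "2023-03-01"),
]
for row in retention_by_month(sample_orders):
    print(row)
```

Note that each month's rate here means "share of this month's customers who order again next month", so the latest month in the data always shows 0 retained.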
2. Retrieve the second-highest salary from the Employee table
Input (Employee):
| emp_id | name | salary |
|--------|--------|--------|
| 1 | Alice | 90000 |
| 2 | Bob | 75000 |
| 3 | Carol | 90000 |
| 4 | Dave | 60000 |
Statement:
SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary < (SELECT MAX(salary) FROM Employee);
Output:
| SecondHighestSalary |
|---------------------|
| 75000 |
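Interviewers often follow up with "what about the Nth highest?". One common way to generalize is DENSE_RANK, which gives tied salaries the same rank. The sketch below tries that on the sample salaries using Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions); `nth_highest_salary` is a hypothetical helper name, not something from the original solution.

```python
import sqlite3


def nth_highest_salary(salaries, n):
    """Return the n-th highest distinct salary, or None if there are fewer than n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (emp_id INTEGER, salary INTEGER)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                     enumerate(salaries, start=1))
    row = conn.execute("""
        SELECT MAX(salary)
        FROM (
            -- Ties share a rank, so two people at 90000 are both rank 1
            SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rk
            FROM Employee
        )
        WHERE rk = ?
    """, (n,)).fetchone()
    conn.close()
    return row[0]


sample_salaries = [90000, 75000, 90000, 60000]
print(nth_highest_salary(sample_salaries, 2))
```

Wrapping the result in MAX means the query still returns a single NULL row when no such rank exists, which matches the behavior of the subquery version above.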
3. Find employees without a department (LEFT JOIN usage)
Input (Employee):
| emp_id | name | department_id |
|--------|--------|---------------|
| 1 | Alice | 101 |
| 2 | Bob | NULL |
| 3 | Carol | 102 |
| 4 | Dave | NULL |
Input (Department):
| department_id | dept_name |
|---------------|-----------|
| 101 | Sales |
| 102 | Marketing |
Statement:
SELECT e.*
FROM Employee e
LEFT JOIN Department d ON e.department_id = d.department_id
WHERE d.department_id IS NULL;
Output:
| emp_id | name | department_id |
|--------|------|---------------|
| 2 | Bob | NULL |
| 4 | Dave | NULL |
4. Calculate the total revenue per product
Input (Sales):
| product_id | quantity | price |
|------------|----------|-------|
| 101 | 2 | 50 |
| 102 | 1 | 100 |
| 101 | 3 | 50 |
| 103 | 5 | 20 |
Statement:
SELECT product_id, SUM(quantity * price) AS total_revenue
FROM Sales
GROUP BY product_id;
Output:
| product_id | total_revenue |
|------------|---------------|
| 101 | 250 |
| 102 | 100 |
| 103 | 100 |
5. Get the top 3 highest-paid employees
Input (Employee):
| emp_id | name | salary |
|--------|--------|--------|
| 1 | Alice | 90000 |
| 2 | Bob | 75000 |
| 3 | Carol | 85000 |
| 4 | Dave | 60000 |
Statement:
SELECT TOP 3 *
FROM Employee
ORDER BY salary DESC;
Output:
| emp_id | name | salary |
|--------|-------|--------|
| 1 | Alice | 90000 |
| 3 | Carol | 85000 |
| 2 | Bob | 75000 |
6. Customers who made purchases but never returned products
Input (Customers):
| customer_id | name |
|-------------|--------|
| 101 | Alice |
| 102 | Bob |
| 103 | Carol |
Input (Orders):
| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 102 |
| 3 | 103 |
Input (Returns):
| return_id | customer_id |
|-----------|-------------|
| 1 | 101 |
Statement:
SELECT DISTINCT c.customer_id
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
WHERE c.customer_id NOT IN (SELECT customer_id FROM Returns);
Output:
| customer_id |
|-------------|
| 102 |
| 103 |
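A classic follow-up on this pattern is what happens when the subquery column contains a NULL: `NOT IN` then matches nothing, while `NOT EXISTS` keeps working. The sketch below shows the difference using Python's sqlite3 module. The second Returns row (a return with no customer recorded) is a hypothetical addition of mine to trigger the behavior; it is not part of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders  (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE Returns (return_id INTEGER, customer_id INTEGER);
    INSERT INTO Orders  VALUES (1, 101), (2, 102), (3, 103);
    -- Hypothetical second row: a return with no customer recorded
    INSERT INTO Returns VALUES (1, 101), (2, NULL);
""")

# x NOT IN (101, NULL) evaluates to NULL for every x other than 101,
# so the WHERE clause filters out every row.
not_in = conn.execute("""
    SELECT DISTINCT customer_id
    FROM Orders
    WHERE customer_id NOT IN (SELECT customer_id FROM Returns)
    ORDER BY customer_id
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs do no harm.
not_exists = conn.execute("""
    SELECT DISTINCT o.customer_id
    FROM Orders o
    WHERE NOT EXISTS (SELECT 1 FROM Returns r
                      WHERE r.customer_id = o.customer_id)
    ORDER BY o.customer_id
""").fetchall()
conn.close()

print("NOT IN:    ", not_in)
print("NOT EXISTS:", not_exists)
```

If Returns.customer_id is declared NOT NULL, both forms return the same rows; the difference only shows up when NULLs can appear.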
7. Show the count of orders per customer
Input (Orders):
| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 102 |
| 4 | 101 |
| 5 | 103 |
Statement:
SELECT customer_id, COUNT(*) AS order_count
FROM Orders
GROUP BY customer_id;
Output:
| customer_id | order_count |
|-------------|-------------|
| 101 | 3 |
| 102 | 1 |
| 103 | 1 |
8. Retrieve all employees who joined in 2023
Input (Employee):
| emp_id | name | hire_date |
|--------|--------|------------|
| 1 | Alice | 2023-01-15 |
| 2 | Bob | 2022-11-20 |
| 3 | Carol | 2023-03-10 |
| 4 | Dave | 2021-05-05 |
Statement:
SELECT *
FROM Employee
WHERE YEAR(hire_date) = 2023;
Output:
| emp_id | name | hire_date |
|--------|-------|------------|
| 1 | Alice | 2023-01-15 |
| 3 | Carol | 2023-03-10 |
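Wrapping the column in a function such as YEAR() usually prevents an index on hire_date from being used for a seek, so a half-open date range is a common alternative. Here is a small sketch of that range filter on the sample rows, run through Python's sqlite3 module; `hired_in_year` is a hypothetical helper, and dates are stored as ISO-8601 text so that string comparison matches date order.

```python
import sqlite3


def hired_in_year(employees, year):
    """employees: iterable of (emp_id, name, 'YYYY-MM-DD')."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (emp_id INTEGER, name TEXT, hire_date TEXT)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", employees)
    start = f"{year:04d}-01-01"       # inclusive lower bound
    end = f"{year + 1:04d}-01-01"     # exclusive upper bound
    rows = conn.execute("""
        SELECT emp_id, name
        FROM Employee
        WHERE hire_date >= ? AND hire_date < ?
        ORDER BY emp_id
    """, (start, end)).fetchall()
    conn.close()
    return rows


sample_employees = [
    (1, "Alice", "2023-01-15"), (2, "Bob", "2022-11-20"),
    (3, "Carol", "2023-03-10"), (4, "Dave", "2021-05-05"),
]
print(hired_in_year(sample_employees, 2023))
```

The exclusive upper bound also behaves correctly if the column ever carries a time component.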
9. Calculate the average order value per customer
Input (Orders):
| order_id | customer_id | total_amount |
|----------|-------------|--------------|
| 1 | 101 | 100 |
| 2 | 101 | 150 |
| 3 | 102 | 75 |
| 4 | 101 | 200 |
Statement:
SELECT customer_id, AVG(total_amount) AS avg_order_value
FROM Orders
GROUP BY customer_id;
Output:
| customer_id | avg_order_value |
|-------------|-----------------|
| 101 | 150 |
| 102 | 75 |
10. Get the latest order placed by each customer
Input (Orders):
| order_id | customer_id | order_date |
|----------|-------------|------------|
| 1 | 101 | 2023-01-15 |
| 2 | 101 | 2023-02-20 |
| 3 | 102 | 2023-01-10 |
| 4 | 101 | 2023-03-05 |
Statement:
SELECT customer_id, MAX(order_date) AS latest_order_date
FROM Orders
GROUP BY customer_id;
Output:
| customer_id | latest_order_date |
|-------------|-------------------|
| 101 | 2023-03-05 |
| 102 | 2023-01-10 |
11. Find products that were never sold
Input (Products):
| product_id | product_name |
|------------|--------------|
| 101 | Laptop |
| 102 | Phone |
| 103 | Tablet |
| 104 | Monitor |
Input (Sales):
| sale_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 2 |
| 2 | 102 | 1 |
| 3 | 101 | 3 |
Statement:
SELECT p.product_id
FROM Products p
LEFT JOIN Sales s ON p.product_id = s.product_id
WHERE s.product_id IS NULL;
Output:
| product_id |
|------------|
| 103 |
| 104 |
12. Identify the best-selling product
Input (Sales):
| sale_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 2 |
| 2 | 102 | 1 |
| 3 | 101 | 3 |
| 4 | 103 | 5 |
| 5 | 102 | 2 |
Statement:
SELECT TOP 1 product_id, SUM(quantity) AS total_qty
FROM Sales
GROUP BY product_id
ORDER BY total_qty DESC;
Output:
| product_id | total_qty |
|------------|-----------|
| 101 | 5 |
13. Get the total revenue and the number of orders per region
Input (Orders):
| order_id | region | total_amount |
|----------|--------|--------------|
| 1 | East | 100 |
| 2 | West | 150 |
| 3 | East | 200 |
| 4 | North | 75 |
| 5 | East | 125 |
Statement:
SELECT region,
SUM(total_amount) AS total_revenue,
COUNT(*) AS order_count
FROM Orders
GROUP BY region;
Output:
| region | total_revenue | order_count |
|--------|---------------|-------------|
| East | 425 | 3 |
| West | 150 | 1 |
| North | 75 | 1 |
14. Count how many customers placed more than 5 orders
Input (Orders):
| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 101 |
| 4 | 101 |
| 5 | 101 |
| 6 | 101 |
| 7 | 102 |
| 8 | 102 |
| 9 | 103 |
Statement:
SELECT COUNT(*) AS customer_count
FROM (
SELECT customer_id FROM Orders
GROUP BY customer_id
HAVING COUNT(*) > 5
) AS subquery;
Output:
| customer_count |
|----------------|
| 1 |
15. Retrieve customers with orders above the average order value
Input (Orders):
| order_id | customer_id | total_amount |
|----------|-------------|--------------|
| 1 | 101 | 100 |
| 2 | 102 | 200 |
| 3 | 103 | 150 |
| 4 | 104 | 50 |
| 5 | 105 | 250 |
Statement:
SELECT *
FROM Orders
WHERE total_amount > (SELECT AVG(total_amount) FROM Orders);
Output (the average order value is 150, so order 3 at exactly 150 is excluded):
| order_id | customer_id | total_amount |
|----------|-------------|--------------|
| 2 | 102 | 200 |
| 5 | 105 | 250 |
16. Find all employees hired on weekends
Input (Employee):
| emp_id | name | hire_date |
|--------|--------|------------|
| 1 | Alice | 2023-01-14 | -- Saturday
| 2 | Bob | 2023-01-16 | -- Monday
| 3 | Carol | 2023-01-15 | -- Sunday
| 4 | Dave | 2023-01-17 | -- Tuesday
Statement:
SELECT *
FROM Employee
WHERE DATENAME(WEEKDAY, hire_date) IN ('Saturday', 'Sunday');
Output:
| emp_id | name | hire_date |
|--------|-------|------------|
| 1 | Alice | 2023-01-14 |
| 3 | Carol | 2023-01-15 |
17. Find employees with salaries between $50,000 and $100,000
Input (Employee):
| emp_id | name | salary |
|--------|--------|--------|
| 1 | Alice | 90000 |
| 2 | Bob | 45000 |
| 3 | Carol | 75000 |
| 4 | Dave | 110000 |
Statement:
SELECT *
FROM Employee
WHERE salary BETWEEN 50000 AND 100000;
Output:
| emp_id | name | salary |
|--------|-------|--------|
| 1 | Alice | 90000 |
| 3 | Carol | 75000 |
18. Get monthly sales revenue and order count
Input (Orders):
| order_id | date | amount |
|----------|------------|--------|
| 1 | 2023-01-15 | 100 |
| 2 | 2023-01-20 | 150 |
| 3 | 2023-02-05 | 200 |
| 4 | 2023-02-10 | 75 |
| 5 | 2023-03-01 | 125 |
Statement:
SELECT FORMAT(date, 'yyyy-MM') AS month,
SUM(amount) AS total_revenue,
COUNT(order_id) AS order_count
FROM Orders
GROUP BY FORMAT(date, 'yyyy-MM');
Output:
| month | total_revenue | order_count |
|---------|---------------|-------------|
| 2023-01 | 250 | 2 |
| 2023-02 | 275 | 2 |
| 2023-03 | 125 | 1 |
19. Rank employees by salary within each department
Input (Employee):
| employee_id | department_id | salary |
|-------------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 85000 |
| 3 | 102 | 95000 |
| 4 | 101 | 90000 |
| 5 | 102 | 80000 |
Statement:
SELECT employee_id, department_id, salary,
       RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rk
FROM Employee;
Output:
| employee_id | department_id | salary | salary_rk |
|-------------|---------------|--------|-----------|
| 1 | 101 | 90000 | 1 |
| 4 | 101 | 90000 | 1 |
| 2 | 101 | 85000 | 3 |
| 3 | 102 | 95000 | 1 |
| 5 | 102 | 80000 | 2 |
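Since the tie in department 101 is where the ranking functions differ, it can help to see RANK, DENSE_RANK, and ROW_NUMBER side by side on the same rows. The sketch below does that with Python's sqlite3 module (SQLite 3.25 or later assumed). The `employee_id` tie-breaker in ROW_NUMBER is my own choice to make the numbering deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (employee_id INTEGER, department_id INTEGER, salary INTEGER)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", [
    (1, 101, 90000), (2, 101, 85000), (3, 102, 95000),
    (4, 101, 90000), (5, 102, 80000),
])

ranked = conn.execute("""
    SELECT employee_id, department_id, salary,
           -- RANK leaves a gap after ties (1, 1, 3)
           RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rk,
           -- DENSE_RANK does not (1, 1, 2)
           DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dense_rk,
           -- ROW_NUMBER never repeats; employee_id breaks the tie
           ROW_NUMBER() OVER (PARTITION BY department_id
                              ORDER BY salary DESC, employee_id) AS row_num
    FROM Employee
    ORDER BY department_id, salary DESC, employee_id
""").fetchall()
conn.close()

for row in ranked:
    print(row)
```

Which one to use depends on the question: "top 2 salaries" usually wants DENSE_RANK, "top 2 people" usually wants ROW_NUMBER with an explicit tie-breaker.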
20. Find customers who placed orders every month in 2023
Input (Orders):
| order_id | customer_id | order_date |
|----------|-------------|------------|
| 1 | 101 | 2023-01-15 |
| 2 | 101 | 2023-02-20 |
| 3 | 101 | 2023-03-05 |
| ... | ... | ... |
| 13 | 101 | 2023-12-10 |
| 14 | 102 | 2023-01-10 |
| 15 | 102 | 2023-02-15 |
Statement:
SELECT customer_id
FROM Orders
WHERE YEAR(order_date) = 2023
GROUP BY customer_id
HAVING COUNT(DISTINCT FORMAT(order_date,'yyyy-MM')) = 12;
Output:
| customer_id |
|-------------|
| 101 |
21. Find moving average of sales over the last 3 days
Input (Orders):
| order_id | order_date | total_amount |
|----------|------------|--------------|
| 1 | 2023-01-01 | 100 |
| 2 | 2023-01-02 | 150 |
| 3 | 2023-01-03 | 200 |
| 4 | 2023-01-04 | 175 |
| 5 | 2023-01-05 | 125 |
Statement:
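-- ROWS counts rows, not calendar days: this assumes one row per day with no gaps.
-- Multiplying by 1.0 avoids integer averaging when total_amount is an INT.
SELECT order_date,
       CAST(AVG(1.0 * total_amount) OVER (
                ORDER BY order_date
                ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
            ) AS DECIMAL(10, 2)) AS moving_avg
FROM Orders;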
Output:
| order_date | moving_avg |
|------------|------------|
| 2023-01-01 | 100.00 |
| 2023-01-02 | 125.00 |
| 2023-01-03 | 150.00 |
| 2023-01-04 | 175.00 |
| 2023-01-05 | 166.67 |
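The frame arithmetic is easy to verify by hand: the last row averages 200, 175, and 125. As a quick check, this sketch runs the same three-row frame on the sample data with Python's sqlite3 module. `moving_average` is a hypothetical helper name, and SQLite's AVG always returns a floating-point value, which sidesteps the integer-average question that exists in T-SQL.

```python
import sqlite3


def moving_average(daily_totals):
    """daily_totals: iterable of ('YYYY-MM-DD', amount), one row per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (order_date TEXT, total_amount INTEGER)")
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", daily_totals)
    rows = conn.execute("""
        SELECT order_date,
               ROUND(AVG(total_amount) OVER (
                         ORDER BY order_date
                         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
                     ), 2) AS moving_avg
        FROM Orders
        ORDER BY order_date
    """).fetchall()
    conn.close()
    return rows


sample_totals = [
    ("2023-01-01", 100), ("2023-01-02", 150), ("2023-01-03", 200),
    ("2023-01-04", 175), ("2023-01-05", 125),
]
for row in moving_average(sample_totals):
    print(row)
```

The first two rows average over fewer than three values because the frame is simply cut off at the start of the data.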
22. Identify the first and last order date for each customer
Input (Orders):
| order_id | customer_id | order_date |
|----------|-------------|------------|
| 1 | 101 | 2023-01-15 |
| 2 | 101 | 2023-03-20 |
| 3 | 102 | 2023-02-10 |
| 4 | 101 | 2023-05-05 |
| 5 | 103 | 2023-04-01 |
Statement:
SELECT customer_id,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order
FROM Orders
GROUP BY customer_id;
Output:
| customer_id | first_order | last_order |
|-------------|-------------|-------------|
| 101 | 2023-01-15 | 2023-05-05 |
| 102 | 2023-02-10 | 2023-02-10 |
| 103 | 2023-04-01 | 2023-04-01 |
23. Show product sales distribution (percent of total revenue)
Input (Sales):
| product_id | quantity | price |
|------------|----------|-------|
| 101 | 2 | 50 |
| 102 | 1 | 100 |
| 101 | 3 | 50 |
| 103 | 5 | 20 |
Statement:
WITH TotalRevenue AS (
    SELECT SUM(quantity * price) AS total
    FROM Sales
)
SELECT s.product_id,
       SUM(s.quantity * s.price) AS revenue,
       -- 100.0 forces decimal division; with integers 250 * 100 / 450 would truncate to 55
       CAST(SUM(s.quantity * s.price) * 100.0 / t.total AS DECIMAL(5, 2)) AS revenue_pct
FROM Sales s
CROSS JOIN TotalRevenue t
GROUP BY s.product_id, t.total;
Output:
| product_id | revenue | revenue_pct |
|------------|---------|-------------|
| 101 | 250 | 55.56 |
| 102 | 100 | 22.22 |
| 103 | 100 | 22.22 |
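Percent-of-total queries are a frequent place to trip over integer division. The sketch below computes the share both ways on the sample rows using Python's sqlite3 module, which truncates integer division much as T-SQL does. `revenue_share` is a hypothetical helper name, and the window-function form `SUM(revenue) OVER ()` is an alternative way of getting the grand total that I chose for brevity.

```python
import sqlite3


def revenue_share(sales):
    """sales: iterable of (product_id, quantity, price), all integers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (product_id INTEGER, quantity INTEGER, price INTEGER)")
    conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", sales)
    rows = conn.execute("""
        WITH per_product AS (
            SELECT product_id, SUM(quantity * price) AS revenue
            FROM Sales
            GROUP BY product_id
        )
        SELECT product_id,
               revenue,
               -- integer / integer truncates
               revenue * 100 / SUM(revenue) OVER () AS pct_truncated,
               -- 100.0 promotes the expression to floating point
               ROUND(revenue * 100.0 / SUM(revenue) OVER (), 2) AS pct
        FROM per_product
        ORDER BY product_id
    """).fetchall()
    conn.close()
    return rows


sample_sales = [(101, 2, 50), (102, 1, 100), (101, 3, 50), (103, 5, 20)]
for row in revenue_share(sample_sales):
    print(row)
```

With the truncated version the three shares add up to 99 rather than 100, which is usually the first visible symptom of the problem.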
24. Retrieve customers who made purchases on 2 consecutive days
Input (Orders):
| id | order_date |
|----|------------|
| 101| 2023-01-01 |
| 101| 2023-01-02 |
| 101| 2023-01-04 |
| 102| 2023-01-10 |
| 102| 2023-01-11 |
| 103| 2023-01-15 |
Statement:
WITH cte AS (
    SELECT id, order_date,
           LAG(order_date) OVER (PARTITION BY id ORDER BY order_date) AS prev_order_date
    FROM Orders
)
SELECT id, order_date, prev_order_date
FROM cte
WHERE DATEDIFF(DAY, prev_order_date, order_date) = 1;
Output:
| id | order_date | prev_order_date |
|-----|------------|-----------------|
| 101 | 2023-01-02 | 2023-01-01 |
| 102 | 2023-01-11 | 2023-01-10 |
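The LAG-then-compare pattern can be checked quickly on the sample rows. This sketch uses Python's sqlite3 module (SQLite 3.25 or later assumed); `consecutive_day_orders` is a hypothetical helper, and `date(prev_order_date, '+1 day')` is SQLite's stand-in for the one-day DATEDIFF test.

```python
import sqlite3


def consecutive_day_orders(orders):
    """orders: iterable of (id, 'YYYY-MM-DD'). Returns orders placed the day after the previous one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", orders)
    rows = conn.execute("""
        WITH cte AS (
            SELECT id, order_date,
                   LAG(order_date) OVER (PARTITION BY id ORDER BY order_date)
                       AS prev_order_date
            FROM Orders
        )
        SELECT id, order_date, prev_order_date
        FROM cte
        -- A NULL prev_order_date (first order) never satisfies the comparison
        WHERE order_date = date(prev_order_date, '+1 day')
        ORDER BY id, order_date
    """).fetchall()
    conn.close()
    return rows


sample_orders_q24 = [
    (101, "2023-01-01"), (101, "2023-01-02"), (101, "2023-01-04"),
    (102, "2023-01-10"), (102, "2023-01-11"), (103, "2023-01-15"),
]
for row in consecutive_day_orders(sample_orders_q24):
    print(row)
```

If a customer can order twice on the same day, deduplicating to one row per customer per day before applying LAG keeps same-day pairs from hiding a next-day purchase.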
25. Find churned customers (no orders in the last 6 months)
Input (Orders) - Assuming current date is 2023-07-01:
| customer_id | order_date |
|-------------|------------|
| 101 | 2022-12-15 |
| 101 | 2023-01-20 |
| 102 | 2023-06-15 |
| 103 | 2022-10-10 |
Statement:
SELECT customer_id
FROM Orders
GROUP BY customer_id
HAVING MAX(order_date) < DATEADD(MONTH, -6, GETDATE());
Output:
| customer_id |
|-------------|
| 103 |
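Because the expected output depends on the "current date" assumption, a reproducible check needs the reference date passed in rather than read from the clock. A minimal sketch with Python's sqlite3 module follows; `churned_customers` and its `as_of` parameter are hypothetical names of mine, and `date(?, '-6 months')` plays the role of the DATEADD expression.

```python
import sqlite3


def churned_customers(orders, as_of):
    """Customers whose latest order is more than six months before as_of ('YYYY-MM-DD')."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (customer_id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", orders)
    rows = conn.execute("""
        SELECT customer_id
        FROM Orders
        GROUP BY customer_id
        HAVING MAX(order_date) < date(?, '-6 months')
        ORDER BY customer_id
    """, (as_of,)).fetchall()
    conn.close()
    return [customer_id for (customer_id,) in rows]


sample_orders_q25 = [
    (101, "2022-12-15"), (101, "2023-01-20"),
    (102, "2023-06-15"), (103, "2022-10-10"),
]
print(churned_customers(sample_orders_q25, "2023-07-01"))
```

With a reference date of 2023-07-01 the cutoff is 2023-01-01, so customer 101 (last order on 2023-01-20) is not yet churned. Moving the reference date forward changes the answer, which is worth saying out loud in an interview.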
26. Calculate cumulative revenue by day
Input (Orders):
| order_date | total_amount |
|------------|--------------|
| 2023-01-01 | 100 |
| 2023-01-02 | 150 |
| 2023-01-03 | 200 |
| 2023-01-05 | 175 |
Statement:
SELECT order_date,
       SUM(total_amount) OVER (ORDER BY order_date) AS cumulative_revenue
FROM Orders;
Output:
| order_date | cumulative_revenue |
|------------|--------------------|
| 2023-01-01 | 100 |
| 2023-01-02 | 250 |
| 2023-01-03 | 450 |
| 2023-01-05 | 625 |
27. Identify top-performing departments by average salary
Input (Employee):
| emp_id | department_id | salary |
|--------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 85000 |
| 3 | 102 | 75000 |
| 4 | 102 | 80000 |
| 5 | 103 | 95000 |
Statement:
SELECT department_id,
AVG(salary) AS avg_salary
FROM Employee
GROUP BY department_id
ORDER BY avg_salary DESC;
Output:
| department_id | avg_salary |
|---------------|------------|
| 103 | 95000 |
| 101 | 87500 |
| 102 | 77500 |
28. Find customers who placed more orders than the average customer
Input (Orders):
| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 101 |
| 4 | 102 |
| 5 | 102 |
| 6 | 103 |
Statement:
WITH customer_orders AS (
SELECT customer_id, COUNT(*) AS order_count
FROM Orders
GROUP BY customer_id
)
SELECT * FROM customer_orders
WHERE order_count > (SELECT AVG(order_count) FROM customer_orders);
Output:
| customer_id | order_count |
|-------------|-------------|
| 101 | 3 |
29. Calculate revenue generated from new customers (first-time orders)
Input (Orders):
| customer_id | order_date | total_amount |
|-------------|------------|--------------|
| 101 | 2023-01-15 | 100 |
| 101 | 2023-02-20 | 150 |
| 102 | 2023-01-10 | 200 |
| 103 | 2023-03-01 | 175 |
Statement:
WITH first_orders AS (
SELECT customer_id, MIN(order_date) AS first_order_date
FROM Orders
GROUP BY customer_id
)
SELECT SUM(o.total_amount) AS new_revenue
FROM Orders o
JOIN first_orders f ON o.customer_id = f.customer_id
WHERE o.order_date = f.first_order_date;
Output:
| new_revenue |
|-------------|
| 475 |
30. Find the percentage of employees in each department
Input (Employee):
| emp_id | department_id |
|--------|---------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 102 |
| 4 | 103 |
| 5 | 103 |
| 6 | 103 |
Statement:
SELECT department_id,
COUNT(*) AS emp_count,
COUNT(*) * 100.0 / (SELECT COUNT(*) FROM Employee) AS pct
FROM Employee
GROUP BY department_id;
Output:
| department_id | emp_count | pct |
|---------------|-----------|-------|
| 101 | 2 | 33.33 |
| 102 | 1 | 16.67 |
| 103 | 3 | 50.00 |
31. Retrieve the maximum salary difference within each department
Input (Employee):
| emp_id | department_id | salary |
|--------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 75000 |
| 3 | 102 | 85000 |
| 4 | 102 | 82000 |
| 5 | 103 | 95000 |
Statement:
SELECT department_id,
MAX(salary) - MIN(salary) AS salary_diff
FROM Employee
GROUP BY department_id;
Output:
| department_id | salary_diff |
|---------------|-------------|
| 101 | 15000 |
| 102 | 3000 |
| 103 | 0 |
32. Find products that contribute to 80% of revenue (Pareto Principle)
Input (Sales):
| product_id | quantity | price |
|------------|----------|-------|
| 101 | 10 | 100 |
| 102 | 5 | 200 |
| 103 | 20 | 50 |
| 104 | 8 | 75 |
Statement:
WITH sales_cte AS (
    SELECT product_id, SUM(quantity * price) AS revenue
    FROM Sales
    GROUP BY product_id
),
running AS (
    -- Window functions are not allowed in WHERE, so compute them here and filter outside
    SELECT product_id, revenue,
           SUM(revenue) OVER (ORDER BY revenue DESC, product_id
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
           SUM(revenue) OVER () AS total
    FROM sales_cte
)
SELECT product_id, revenue, running_total
FROM running
WHERE running_total <= total * 0.8
ORDER BY running_total;
Output:
| product_id | revenue | running_total |
|------------|---------|---------------|
| 101 | 1000 | 1000 |
| 102 | 1000 | 2000 |
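The arithmetic behind this result: the four products bring in 1000, 1000, 1000, and 600, for a total of 3600, so the 80% cutoff is 2880 and the running total passes it at the third product. The sketch below reproduces that with Python's sqlite3 module, computing the running total in a CTE and filtering on it afterwards. `pareto_products` is a hypothetical helper, and the `product_id` tie-breaker is my own addition so that the three-way tie at 1000 resolves deterministically.

```python
import sqlite3


def pareto_products(sales, share=0.8):
    """Products whose running revenue total stays within `share` of total revenue."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (product_id INTEGER, quantity INTEGER, price INTEGER)")
    conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", sales)
    rows = conn.execute("""
        WITH sales_cte AS (
            SELECT product_id, SUM(quantity * price) AS revenue
            FROM Sales
            GROUP BY product_id
        ),
        running AS (
            SELECT product_id, revenue,
                   SUM(revenue) OVER (ORDER BY revenue DESC, product_id
                                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                       AS running_total,
                   SUM(revenue) OVER () AS total
            FROM sales_cte
        )
        SELECT product_id, revenue, running_total
        FROM running
        WHERE running_total <= total * ?
        ORDER BY running_total
    """, (share,)).fetchall()
    conn.close()
    return rows


sample_sales_q32 = [(101, 10, 100), (102, 5, 200), (103, 20, 50), (104, 8, 75)]
print(pareto_products(sample_sales_q32))
```

This keeps only products that fit entirely under the cutoff. Some definitions also include the product that crosses the 80% line, so it is worth asking the interviewer which one they mean.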
33. Calculate average time between purchases for each customer
Input (Orders):
| customer_id | order_date |
|-------------|------------|
| 101 | 2023-01-01 |
| 101 | 2023-01-10 |
| 101 | 2023-01-25 |
| 102 | 2023-02-05 |
| 102 | 2023-02-20 |
Statement:
WITH cte AS (
    SELECT customer_id, order_date,
           LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_date
    FROM Orders
)
SELECT customer_id,
       -- 1.0 * avoids an integer average, which would truncate fractional gaps
       CAST(AVG(1.0 * DATEDIFF(DAY, prev_date, order_date)) AS DECIMAL(10, 1)) AS avg_gap_days
FROM cte
WHERE prev_date IS NOT NULL
GROUP BY customer_id;
Output:
| customer_id | avg_gap_days |
|-------------|--------------|
| 101 | 12.0 |
| 102 | 15.0 |
34. Show last purchase for each customer with order amount
Input (Orders):
| customer_id | order_id | total_amount | order_date |
|-------------|----------|--------------|------------|
| 101 | 1001 | 150 | 2023-01-15 |
| 101 | 1002 | 200 | 2023-02-20 |
| 102 | 1003 | 175 | 2023-01-10 |
| 101 | 1004 | 125 | 2023-03-05 |
Statement:
WITH ranked_orders AS (
    SELECT customer_id, order_id, total_amount,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM Orders
)
SELECT customer_id, order_id, total_amount
FROM ranked_orders
WHERE rn = 1;
Output:
| customer_id | order_id | total_amount |
|-------------|----------|--------------|
| 101 | 1004 | 125 |
| 102 | 1003 | 175 |
35. Calculate year-over-year growth in revenue
Input (Orders):
| order_date | total_amount |
|------------|--------------|
| 2021-01-15 | 1000 |
| 2021-02-20 | 1500 |
| 2022-01-10 | 2000 |
| 2022-03-05 | 2500 |
| 2023-02-01 | 3000 |
Statement:
SELECT FORMAT(order_date, 'yyyy') AS year,
       SUM(total_amount) AS revenue,
       SUM(total_amount)
           - LAG(SUM(total_amount)) OVER (ORDER BY FORMAT(order_date, 'yyyy')) AS yoy_growth
FROM Orders
GROUP BY FORMAT(order_date, 'yyyy');
Output:
| year | revenue | yoy_growth |
|------|---------|------------|
| 2021 | 2500 | NULL |
| 2022 | 4500 | 2000 |
| 2023 | 3000 | -1500 |
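Growth is often requested as a percentage as well as an absolute difference. The sketch below computes both on the sample rows with Python's sqlite3 module, aggregating per year in a CTE and then applying LAG. `yoy_growth` is a hypothetical helper name, the `yoy_pct` column is my own extension of the answer, and `strftime('%Y', ...)` stands in for the T-SQL year extraction.

```python
import sqlite3


def yoy_growth(orders):
    """orders: iterable of ('YYYY-MM-DD', amount). Returns (year, revenue, growth, growth_pct)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (order_date TEXT, total_amount INTEGER)")
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", orders)
    rows = conn.execute("""
        WITH yearly AS (
            SELECT strftime('%Y', order_date) AS year,
                   SUM(total_amount) AS revenue
            FROM Orders
            GROUP BY strftime('%Y', order_date)
        )
        SELECT year,
               revenue,
               revenue - LAG(revenue) OVER (ORDER BY year) AS yoy_growth,
               ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY year))
                     / LAG(revenue) OVER (ORDER BY year), 2) AS yoy_pct
        FROM yearly
        ORDER BY year
    """).fetchall()
    conn.close()
    return rows


sample_orders_q35 = [
    ("2021-01-15", 1000), ("2021-02-20", 1500), ("2022-01-10", 2000),
    ("2022-03-05", 2500), ("2023-02-01", 3000),
]
for row in yoy_growth(sample_orders_q35):
    print(row)
```

The first year has no predecessor, so both growth columns come back as NULL (None in Python) rather than zero.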
36. Detect customers with purchases above their 90th percentile
Input (Orders):
| customer_id | order_id | total_amount |
|-------------|----------|--------------|
| 101 | 1001 | 100 |
| 101 | 1002 | 200 |
| 101 | 1003 | 150 |
| 101 | 1004 | 500 |
| 102 | 1005 | 300 |
Statement:
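WITH order_pct AS (
    SELECT customer_id, order_id, total_amount,
           PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY total_amount)
               OVER (PARTITION BY customer_id) AS p90
    FROM Orders
)
SELECT customer_id, order_id, total_amount
FROM order_pct
-- >= so a customer's top order still qualifies when it equals the percentile (e.g. a single order)
WHERE total_amount >= p90;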
Output:
| customer_id | order_id | total_amount |
|-------------|----------|--------------|
| 101 | 1004 | 500 |
| 102 | 1005 | 300 |
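It helps to know what number a continuous 90th percentile actually produces for a small group. As far as I know, SQL Server's PERCENTILE_CONT interpolates linearly between the two nearest ordered values, and the plain-Python sketch below follows that rule. The function name `percentile_cont` is borrowed for readability; this is an illustration of the arithmetic, not the engine's implementation.

```python
def percentile_cont(values, p):
    """Linear-interpolation percentile of `values` for p in [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    ordered = sorted(values)
    # Zero-based fractional position of the percentile among the ordered values
    pos = p * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


customer_101 = [100, 200, 150, 500]
p90 = percentile_cont(customer_101, 0.9)
print(round(p90, 2))
print([amount for amount in customer_101 if amount >= p90])
```

For customer 101 the position falls 70% of the way between 200 and 500, which gives roughly 410, so only the 500 order clears it. A group with a single value has that value as every percentile.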
37. Retrieve longest gap between orders for each customer
Input (Orders):
| customer_id | order_date |
|-------------|------------|
| 101 | 2023-01-01 |
| 101 | 2023-01-10 |
| 101 | 2023-02-15 |
| 102 | 2023-01-05 |
| 102 | 2023-03-01 |
Statement:
WITH cte AS (
    SELECT customer_id, order_date,
           LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_order_date
    FROM Orders
)
SELECT customer_id,
       MAX(DATEDIFF(DAY, prev_order_date, order_date)) AS max_gap
FROM cte
WHERE prev_order_date IS NOT NULL
GROUP BY customer_id;
Output:
| customer_id | max_gap |
|-------------|---------|
| 101 | 36 |
| 102 | 55 |
38. Identify customers with revenue below 10th percentile
Input (Orders):
| customer_id | total_amount |
|-------------|--------------|
| 101 | 100 |
| 101 | 200 |
| 102 | 50 |
| 103 | 300 |
| 104 | 75 |
Statement:
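WITH cte AS (
    SELECT customer_id, SUM(total_amount) AS total_revenue
    FROM Orders
    GROUP BY customer_id
),
threshold AS (
    -- In T-SQL, PERCENTILE_CONT is an analytic function and needs an OVER clause
    SELECT DISTINCT
           PERCENTILE_CONT(0.1) WITHIN GROUP (ORDER BY total_revenue) OVER () AS p10
    FROM cte
)
SELECT c.customer_id, c.total_revenue
FROM cte c
CROSS JOIN threshold t
WHERE c.total_revenue < t.p10;
Output (the 10th percentile of 50, 75, 300, 300 interpolates to 57.5):
| customer_id | total_revenue |
|-------------|---------------|
| 102 | 50 |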
39. Find employees with salary above department average
Input (Employee):
| employee_id | department_id | salary |
|-------------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 85000 |
| 3 | 102 | 95000 |
| 4 | 101 | 80000 |
| 5 | 102 | 90000 |
Statement:
WITH dept_avg AS (
SELECT department_id, AVG(salary) AS avg_salary
FROM Employee
GROUP BY department_id
)
SELECT e.employee_id, e.department_id, e.salary, d.avg_salary
FROM Employee e
JOIN dept_avg d ON e.department_id = d.department_id
WHERE e.salary > d.avg_salary;
Output:
| employee_id | department_id | salary | avg_salary |
|-------------|---------------|--------|------------|
| 1 | 101 | 90000 | 85000 |
| 3 | 102 | 95000 | 92500 |
40. Find duplicate records in a table
Input (your_table):
| column1 | column2 |
|---------|---------|
| John | Doe |
| Jane | Smith |
| John | Doe |
| Mike | Johnson |
| Jane | Smith |
Statement:
SELECT column1, column2, COUNT(*)
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;
Output:
| column1 | column2 | COUNT(*) |
|---------|---------|----------|
| John | Doe | 2 |
| Jane | Smith | 2 |
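Once duplicates are found, the usual next question is which physical rows are the extra copies. One common approach numbers the rows within each duplicate group and keeps everything after the first. The sketch below tries this with Python's sqlite3 module (SQLite 3.25 or later assumed). The `id` column is a hypothetical surrogate key that I added so there is something to order by; the original table only has column1 and column2.

```python
import sqlite3


def extra_copies(rows):
    """rows: iterable of (column1, column2). Returns the second and later copies of each pair."""
    conn = sqlite3.connect(":memory:")
    # id is a hypothetical surrogate key, assigned in insertion order
    conn.execute(
        "CREATE TABLE your_table (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)"
    )
    conn.executemany("INSERT INTO your_table (column1, column2) VALUES (?, ?)", rows)
    result = conn.execute("""
        WITH numbered AS (
            SELECT id, column1, column2,
                   ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS rn
            FROM your_table
        )
        SELECT id, column1, column2
        FROM numbered
        WHERE rn > 1          -- rn = 1 is the copy to keep
        ORDER BY id
    """).fetchall()
    conn.close()
    return result


sample_rows = [
    ("John", "Doe"), ("Jane", "Smith"), ("John", "Doe"),
    ("Mike", "Johnson"), ("Jane", "Smith"),
]
print(extra_copies(sample_rows))
```

The rows returned are the candidates for deletion; the first occurrence of each pair is left alone.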
Conclusion
These questions cover advanced analytical scenarios including statistical calculations
(percentiles), time-based analysis (retention rates), and complex business metrics (Pareto
principle). Each solution demonstrates practical applications of window functions, CTEs, and
advanced joins that are essential for data analysis roles.