A correlated subquery is a subquery that refers to columns from the outer query. This means that for each row processed by the outer query, the subquery is re-evaluated. The correlated subquery typically accesses a table that is also referenced in the outer query.
Here's a general structure of a correlated subquery:
SELECT column1, column2
FROM outer_table AS outer_query
WHERE column3 = (SELECT MAX(inner_column)
FROM inner_table AS inner_query
WHERE inner_query.foreign_key = outer_query.primary_key);
Steps to Perform a Correlated Subquery:
1. Identify the Tables: Determine which tables you need to use in both the outer query and the subquery.
2. Establish the Relationship: Ensure that the subquery refers back to a column from the outer query. This is crucial for it to be considered a correlated subquery.
3. Write the Outer Query: Start writing your main SQL query using SELECT, FROM, and any necessary JOIN operations, filtering with WHERE as needed.
4. Add the Subquery: Add the subquery as a part of the WHERE, SELECT, or HAVING clause, depending on your requirement. Make sure it references columns from the outer query.
Example of a Correlated Subquery:
Let's assume we have two tables:
1. employees (columns: employee_id, name, salary)
2. departments (columns: department_id, department_name, manager_id)
If you'd like to find all employees whose salary is higher than the average salary of their respective departments, you could use a correlated subquery like this:
SELECT e.employee_id, e.name, e.salary
FROM employees e
WHERE e.salary > (
SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.department_id = e.department_id
);
Explanation of the Example:
- In the outer query, we select employee details from the employees table (e).
- The subquery calculates the average salary for employees in the same department (e2.department_id = e.department_id).
- The outer query retrieves employees whose salary is greater than the average salary calculated by the subquery.
Key Points:
- A correlated subquery is executed for each row processed by the outer query.
- The performance of correlated subqueries can be slower than non-correlated subqueries or joins, especially with large datasets, so it's essential to consider optimization strategies when using them.