How do you perform a correlated subquery?
Posted by CarolTh
Last Updated: August 03, 2024
A correlated subquery is a subquery that refers to columns from the outer query. This means that for each row processed by the outer query, the subquery is re-evaluated. The correlated subquery typically accesses a table that is also referenced in the outer query. Here's a general structure of a correlated subquery:
SELECT column1, column2
FROM outer_table AS outer_query
WHERE column3 = (SELECT MAX(inner_column)
                 FROM inner_table AS inner_query
                 WHERE inner_query.foreign_key = outer_query.primary_key);
Steps to Perform a Correlated Subquery:
1. Identify the Tables: Determine which tables you need to use in both the outer query and the subquery. 2. Establish the Relationship: Ensure that the subquery refers back to a column from the outer query. This is crucial for it to be considered a correlated subquery. 3. Write the Outer Query: Start writing your main SQL query using SELECT, FROM, and any necessary JOIN operations, filtering with WHERE as needed. 4. Add the Subquery: Add the subquery as a part of the WHERE, SELECT, or HAVING clause, depending on your requirement. Make sure it references columns from the outer query.
Example of a Correlated Subquery:
Let's assume we have two tables: 1. employees (columns: employee_id, name, salary) 2. departments (columns: department_id, department_name, manager_id) If you'd like to find all employees whose salary is higher than the average salary of their respective departments, you could use a correlated subquery like this:
SELECT e.employee_id, e.name, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
);
Explanation of the Example:
- In the outer query, we select employee details from the employees table (e). - The subquery calculates the average salary for employees in the same department (e2.department_id = e.department_id). - The outer query retrieves employees whose salary is greater than the average salary calculated by the subquery.
Key Points:
- A correlated subquery is executed for each row processed by the outer query. - The performance of correlated subqueries can be slower than non-correlated subqueries or joins, especially with large datasets, so it's essential to consider optimization strategies when using them.