In SQL, a subquery is a query nested inside another query, which can be used to filter data in the main query. Here's how to write a subquery to filter data:
Basic Structure
The basic structure of using a subquery to filter data involves two parts: the inner query (subquery) and the outer query (main query). The subquery typically returns a set of values that the main query then uses for filtering.
Syntax
Here is a general syntax for a subquery used in a WHERE clause:
SELECT column1, column2, ...
FROM table_name
WHERE column_name [operator] (SELECT column_name FROM another_table WHERE condition);
Example
Imagine you have two tables, employees and departments. You want to find all employees that belong to a department with a specific name.
Suppose the departments table has the columns department_id and department_name, and the employees table has the columns employee_id, employee_name, and department_id.
Query
You could use a subquery like so:
SELECT employee_id, employee_name
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE department_name = 'Sales');
Explanation
- The inner query (SELECT department_id FROM departments WHERE department_name = 'Sales') retrieves the department_id for departments named "Sales".
- The outer query then selects employee_id and employee_name from the employees table where the department_id obtained from the inner query matches.
Other Types of Subqueries
You can use subqueries in various parts of a SQL statement, including:
1. SELECT Clause:
SELECT employee_name,
(SELECT COUNT(*) FROM sales WHERE sales.employee_id = employees.employee_id) AS sales_count
FROM employees;
2. FROM Clause:
SELECT e.employee_name
FROM employees e
JOIN (SELECT department_id FROM departments WHERE department_name = 'Sales') d
ON e.department_id = d.department_id;
3. DELETE / UPDATE Clauses:
DELETE FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE department_name = 'Outsourced');
Considerations
- Performance: Subqueries may impact performance, especially if the inner query returns a large dataset. In such cases, consider using joins or common table expressions (CTEs).
- Correlated Subqueries: These are subqueries that reference columns from the outer query. They can be used when each row of the outer query needs to evaluate conditions from the inner query.
Following these principles, you can effectively use subqueries to filter data in your SQL queries.