How can you use UNION ALL to combine results from multiple queries including duplicates?
Posted by JackBrn
Last Updated: August 05, 2024
In SQL, the UNION ALL operator is used to combine the results of two or more SELECT statements into a single result set, including duplicates. This means that if the same row appears in both queries, it will be included in the final result multiple times, reflecting the actual occurrences in the original datasets.
Here’s how it works:
1. Basic Syntax: The syntax for using UNION ALL is straightforward. You simply write multiple SELECT statements, and separate them with the UNION ALL keyword.
SELECT column1, column2
   FROM table1 
   WHERE condition1
   UNION ALL
   SELECT column1, column2
   FROM table2 
   WHERE condition2;
2. Column Matching: The number of columns and the data types in each SELECT statement must match. If SELECT from both queries differs in column count or types, it will result in an error. 3. Example: Suppose you have two tables: employees and contractors, both with similar columns, and you want to retrieve all names from both tables.
SELECT name
   FROM employees 
   UNION ALL
   SELECT name
   FROM contractors;
This query retrieves all names from the employees table, followed by all names from the contractors table, including any duplicates (for instance, if a contractor is also listed as an employee).
When to Use UNION ALL:
- Performance: UNION ALL is generally faster than UNION because it does not attempt to remove duplicates. If you are confident that you want all results, including duplicates, UNION ALL is usually the preferred option. - Preserving Data: If your analysis requires keeping all instances of data for further calculations or reporting, UNION ALL is essential.
Example with Different Conditions:
You can further specify conditions to filter the results as needed:
SELECT id, name, 'Employee' AS type
FROM employees
WHERE department = 'Sales'
UNION ALL
SELECT id, name, 'Contractor' AS type
FROM contractors
WHERE start_date >= '2023-01-01';
In the above query, we select IDs and names from both employees and contractors, along with a type identifier, while still including duplicates.
Conclusion:
Using UNION ALL is a practical method to combine results from multiple SELECT statements when the preservation of duplicate data is important for your needs. Always ensure that the columns being selected match in number and data type to avoid errors.