The DISTINCT keyword is used in SQL to eliminate duplicate rows from the result set of a query. When you include DISTINCT in your SELECT statement, the database will return only unique rows based on the columns you specify.
Here's a basic syntax and an example to demonstrate how to use DISTINCT:
Syntax
SELECT DISTINCT column1, column2, ...
FROM table_name
WHERE condition;
Example
Suppose you have a table called employees with the following columns: id, name, and department. The employees table contains some duplicate entries in the department column.
Sample Data
| id | name | department |
|----|----------|------------|
| 1 | Alice | HR |
| 2 | Bob | IT |
| 3 | Charlie | IT |
| 4 | David | HR |
| 5 | Emma | Marketing |
If you want to get a list of unique departments from this table, you can use the following query:
SELECT DISTINCT department
FROM employees;
Result
The result set will look like this:
| department |
|------------|
| HR |
| IT |
| Marketing |
In this case, you get just the unique departments without duplicates, even though "HR" appears twice in the original table.
Notes
1. Multiple Columns: If you specify multiple columns in your SELECT DISTINCT statement, the combination of those columns must be unique across the result set. For example:
SELECT DISTINCT department, name
FROM employees;
This would return a result set that includes unique combinations of department and name.
2. Performance: Using DISTINCT can affect performance, especially on large datasets, as it requires the database to compare rows to determine uniqueness.
3. NULL Values: DISTINCT treats NULL values as equal, meaning it will only return one row for each unique null occurrence in the output.
By using the DISTINCT keyword, you can effectively filter out duplicate records and display only the unique entries in your queries.