How do you use the DISTINCT keyword to eliminate duplicate rows in a result set?

Posted by FrankMl

Last Updated: June 02, 2024

The DISTINCT keyword in SQL removes duplicate rows from the result set of a query. When you apply DISTINCT, the output contains only one row for each unique combination of values in the columns listed in the SELECT statement. Here’s how you can use the DISTINCT keyword:

Basic Syntax

SELECT DISTINCT column1, column2, ...
FROM table_name;

Example

Assume you have a table named employees with the following data:

| employee_id | name    | department |
|-------------|---------|------------|
| 1           | Alice   | HR         |
| 2           | Bob     | IT         |
| 3           | Alice   | HR         |
| 4           | Charlie | Sales      |
| 5           | Bob     | IT         |

If you want to get a list of unique employee names, you would write:

SELECT DISTINCT name FROM employees;

Result

The result set would be:

| name    |
|---------|
| Alice   |
| Bob     |
| Charlie |
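To try the example without a database server, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's built-in sqlite3 module. The choice of SQLite and the ORDER BY clauses (added only to make the printed order predictable) are assumptions for illustration, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database holding the sample employees table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Alice", "HR"),
        (2, "Bob", "IT"),
        (3, "Alice", "HR"),
        (4, "Charlie", "Sales"),
        (5, "Bob", "IT"),
    ],
)

# Without DISTINCT every row contributes a name, duplicates included.
all_names = [row[0] for row in conn.execute(
    "SELECT name FROM employees ORDER BY employee_id")]

# With DISTINCT each name appears once.
unique_names = [row[0] for row in conn.execute(
    "SELECT DISTINCT name FROM employees ORDER BY name")]

print(all_names)     # ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob']
print(unique_names)  # ['Alice', 'Bob', 'Charlie']
```

Without an ORDER BY clause the database is free to return the distinct names in any order.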

Using DISTINCT with Multiple Columns

If you want to retrieve unique combinations of multiple columns (for example, finding unique employees by name and department), you can specify multiple columns like this:

SELECT DISTINCT name, department FROM employees;

Result

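For the sample data, rows 1 and 3 (Alice, HR) collapse into one row, as do rows 2 and 5 (Bob, IT), so the result set would look like this (row order is not guaranteed without ORDER BY):

| name    | department |
|---------|------------|
| Alice   | HR         |
| Bob     | IT         |
| Charlie | Sales      |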
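With the sample data, the multi-column query returns the same three people as the single-column one, which can hide the fact that DISTINCT compares the whole combination of selected columns. The sketch below adds one hypothetical row (employee 6, a second Alice in Sales, not in the original data) to make the difference visible. It assumes SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Alice", "HR"),
        (2, "Bob", "IT"),
        (3, "Alice", "HR"),
        (4, "Charlie", "Sales"),
        (5, "Bob", "IT"),
        (6, "Alice", "Sales"),  # hypothetical extra row for illustration
    ],
)

# DISTINCT applies to the (name, department) pair, not to each column alone.
pairs = conn.execute(
    "SELECT DISTINCT name, department FROM employees "
    "ORDER BY name, department"
).fetchall()

# Selecting only name still yields one row per name.
names = [row[0] for row in conn.execute(
    "SELECT DISTINCT name FROM employees ORDER BY name")]

print(pairs)  # [('Alice', 'HR'), ('Alice', 'Sales'), ('Bob', 'IT'), ('Charlie', 'Sales')]
print(names)  # ['Alice', 'Bob', 'Charlie']
```

Alice appears twice in the first result because (Alice, HR) and (Alice, Sales) are different combinations, even though the name is the same.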

Important Notes

1. Performance: Using DISTINCT can slow a query down, especially on large datasets, because the database engine has to do extra work (typically sorting or hashing the rows) to identify duplicates.
2. NULL Values: DISTINCT treats NULL values as equal to one another. If several rows have NULL in a selected column, only one NULL is returned in the result set.
3. Combined with Other Clauses: You can combine DISTINCT with other clauses. WHERE filters rows before duplicates are eliminated, and ORDER BY sorts the rows that remain.

By using DISTINCT, you can ensure that your query returns only unique rows based on the selected columns.
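The notes on NULL handling and on combining DISTINCT with WHERE and ORDER BY can be checked with a short sketch. It assumes SQLite via Python's sqlite3 module, and the two employees without a department (Dana and Erin) are hypothetical rows added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Alice", "HR"),
        (2, "Bob", "IT"),
        (3, "Alice", "HR"),
        (4, "Charlie", "Sales"),
        (5, "Bob", "IT"),
        (6, "Dana", None),  # hypothetical: no department assigned
        (7, "Erin", None),  # hypothetical: no department assigned
    ],
)

# Two rows have a NULL department, but DISTINCT returns a single NULL.
# SQLite places NULL first when sorting in ascending order.
departments = [row[0] for row in conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department")]

# WHERE removes rows before de-duplication; ORDER BY sorts what is left.
names = [row[0] for row in conn.execute(
    "SELECT DISTINCT name FROM employees "
    "WHERE department IS NOT NULL "
    "ORDER BY name DESC")]

print(departments)  # [None, 'HR', 'IT', 'Sales']
print(names)        # ['Charlie', 'Bob', 'Alice']
```

Where NULL sorts relative to other values differs between database systems, so the position of None in the first result is specific to SQLite.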
