In SQL, the OVER clause is used with ranking functions to compute a ranking value for each row in the result set based on a specified order and, optionally, a partitioning scheme. Ranking functions fall under a group of functions that provide a unique rank or row number to each row within a partition of a result set.
The most commonly used ranking functions that utilize the OVER clause are:
1. ROW_NUMBER(): Assigns a unique sequential integer to each row within a partition, starting at 1.
2. RANK(): Assigns a rank to each distinct row within a partition, with gaps in ranking for ties.
3. DENSE_RANK(): Similar to RANK(), but without gaps in the ranking for ties.
4. NTILE(n): Divides the result set into n number of roughly equal parts and assigns a bucket number to each row.
General Syntax
The general syntax for using ranking functions with the OVER clause is as follows:
RANK_FUNCTION() OVER (
[PARTITION BY column1, column2, ...]
ORDER BY column3 [ASC|DESC], column4 [ASC|DESC], ...
)
Example
Here's an example using a hypothetical employees table with columns id, name, department, and salary.
Example: Using ROW_NUMBER()
SELECT
id,
name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM
employees;
In this example:
- The ROW_NUMBER() function is used to assign a unique row number for each employee in each department.
- The PARTITION BY department clause means that the row numbering restarts for each department.
- The ORDER BY salary DESC clause sorts the employees in each department by their salary in descending order.
Example: Using RANK()
SELECT
id,
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM
employees;
In this case:
- The RANK() function is used to assign a rank based on salary within each department. If two employees have the same salary, they will receive the same rank, and the next rank will skip accordingly.
Example: Using DENSE_RANK()
SELECT
id,
name,
department,
salary,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM
employees;
In this example:
- The DENSE_RANK() function ranks employees, handling ties similarly to RANK(), but without gaps in the subsequent ranks for tied salaries.
Example: Using NTILE(n)
SELECT
id,
name,
department,
salary,
NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS quartile
FROM
employees;
Here:
- The NTILE(4) function divides the result set into 4 parts (quartiles) based on the salary within each department.
Summary
Using the OVER clause with ranking functions allows you to effectively analyze and rank data in SQL based on specified criteria. Remember to choose the appropriate ranking function based on how you want to handle ties and the structure of your dataset.