The RANK() function is a window function used in SQL to assign a unique rank to rows within a partition of a dataset. This can be particularly useful for sorting and categorizing data based on a specific column or set of columns. The ranks are assigned based on the order specified in the ORDER BY clause, and if two or more rows have the same value, they will receive the same rank, causing gaps in the ranking sequence.
Here’s how to use the RANK() function:
Syntax
RANK() OVER (PARTITION BY column1, column2, ... ORDER BY column3 [ASC|DESC])
- PARTITION BY: Divides the result set into partitions to which the RANK() function is applied. If omitted, the entire result set is treated as a single partition.
- ORDER BY: Specifies the order in which the ranks are assigned within each partition.
Example
Suppose you have a table named Sales with the following columns: Salesperson, Region, and SalesAmount. You want to rank Salespersons based on their SalesAmount within each Region.
SELECT
Salesperson,
Region,
SalesAmount,
RANK() OVER (PARTITION BY Region ORDER BY SalesAmount DESC) AS SalesRank
FROM
Sales;
Explanation:
- The RANK() function assigns ranks based on the SalesAmount within each Region.
- The PARTITION BY Region clause creates a separate ranking for each region.
- The ORDER BY SalesAmount DESC clause ranks rows within each partition, with the highest sales amount getting rank 1.
Result Set Example
For the following Sales data:
| Salesperson | Region | SalesAmount |
|-------------|----------|-------------|
| Alice | North | 500 |
| Bob | North | 400 |
| Carol | North | 400 |
| Dave | South | 300 |
| Eve | South | 200 |
The query would produce:
| Salesperson | Region | SalesAmount | SalesRank |
|-------------|----------|-------------|-----------|
| Alice | North | 500 | 1 |
| Bob | North | 400 | 2 |
| Carol | North | 400 | 2 |
| Dave | South | 300 | 1 |
| Eve | South | 200 | 2 |
Key Points:
- Ranks are assigned such that identical values receive the same rank.
- The next rank after duplicates jumps by the number of duplicates (for instance, if two people tie for rank 2, the next rank will be 4).
- You can also use DENSE_RANK() if you want consecutive ranking without gaps.
Using RANK() effectively can help in various reporting and analytical tasks by allowing you to categorize and evaluate data based on your ranking criteria.