The FIRST_VALUE function is a window function in SQL that allows you to retrieve the first value in a set of ordered values within a specified window or partition. It is particularly useful when you're working with analytical queries that require aggregation over a subset of data.
Syntax:
FIRST_VALUE(expression) OVER (
[PARTITION BY partition_expression]
ORDER BY order_expression
[ROWS BETWEEN ...]
)
Parameters:
- expression: The column or expression from which you want to retrieve the first value.
- PARTITION BY: (Optional) This clause divides the result set into partitions to which the function is applied. If omitted, the function treats the entire result set as a single partition.
- ORDER BY: This clause defines the logical order in which the rows are considered in each partition.
- ROWS BETWEEN: (Optional) This clause can be used to specify the range of rows to consider for the function if you want it to be limited to a specific set of rows rather than the entire partition.
Example Usage:
Assume you have a table named sales with the following schema:
| id | employee_id | sale_date | amount |
|----|-------------|-------------|--------|
| 1 | 101 | 2023-01-01 | 100 |
| 2 | 101 | 2023-01-05 | 150 |
| 3 | 102 | 2023-01-02 | 200 |
| 4 | 102 | 2023-01-06 | 250 |
To get the first sale amount for each employee based on sale_date, you can use the FIRST_VALUE function like this:
SELECT
employee_id,
sale_date,
amount,
FIRST_VALUE(amount) OVER (
PARTITION BY employee_id
ORDER BY sale_date
) AS first_sale_amount
FROM
sales
ORDER BY
employee_id, sale_date;
Explanation:
- The PARTITION BY employee_id clause creates a separate partition for each employee.
- The ORDER BY sale_date clause orders the sales within each employee's partition by the sale date.
- The FIRST_VALUE(amount) retrieves the first sale amount for each employee based on the order defined in the ORDER BY clause.
- The result set will show each sale along with the first sale amount for the respective employee for easy analysis.
Output Example:
Given the above SQL query, the output would look like this:
| employee_id | sale_date | amount | first_sale_amount |
|-------------|-------------|--------|--------------------|
| 101 | 2023-01-01 | 100 | 100 |
| 101 | 2023-01-05 | 150 | 100 |
| 102 | 2023-01-02 | 200 | 200 |
| 102 | 2023-01-06 | 250 | 200 |
Here, each row shows the sale amount along with the respective first sale amount for that employee.