The LEAD() and LAG() functions in SQL are window functions that allow you to access data from subsequent or previous rows within the same result set without needing to perform a self-join. These functions are particularly useful for comparing values between rows in a query.
Basic Syntax
- LAG(value_expression [, offset [, default]]) OVER (PARTITION BY ... ORDER BY ...)
- LEAD(value_expression [, offset [, default]]) OVER (PARTITION BY ... ORDER BY ...)
- value_expression: The column you want to look at.
- offset: The number of rows forward or backward from the current row. (default is 1)
- default: The value returned if the offset row does not exist. (default is NULL)
- PARTITION BY: This clause divides the query result set into partitions to which the function is applied.
- ORDER BY: This clause defines the order in which the rows are evaluated.
Example Usage
Scenario
Assume we have a table named sales with the following columns: sale_date, amount, and store_id. You want to compare the amount of sales for each store between each month.
CREATE TABLE sales (
sale_date DATE,
amount DECIMAL,
store_id INT
);
Using LAG
To calculate the change in sales amount from one month to the previous month for each store, you can use LAG() as follows:
SELECT
sale_date,
amount,
LAG(amount) OVER (PARTITION BY store_id ORDER BY sale_date) AS previous_amount,
amount - LAG(amount) OVER (PARTITION BY store_id ORDER BY sale_date) AS amount_change
FROM
sales;
Explanation:
- LAG(amount) OVER (PARTITION BY store_id ORDER BY sale_date) retrieves the amount from the previous row (previous sale for that store).
- amount - LAG(amount) calculates the difference in sales between the current sale and the previous sale, showing how sales have changed.
Using LEAD
Similarly, you can use LEAD() to compare the current row to the next row:
SELECT
sale_date,
amount,
LEAD(amount) OVER (PARTITION BY store_id ORDER BY sale_date) AS next_amount,
LEAD(amount) OVER (PARTITION BY store_id ORDER BY sale_date) - amount AS amount_change_forward
FROM
sales;
Explanation:
- LEAD(amount) OVER (PARTITION BY store_id ORDER BY sale_date) retrieves the amount from the next row (next sale for that store).
- The calculation LEAD(amount) - amount shows the difference in sales from the current sale to the next sale.
Conclusion
Using LEAD() and LAG() helps you easily compare values between rows in your dataset, making it possible to perform complex analyses and track changes over time without convoluted self-joins or nested queries.