The LAG and LEAD functions are window functions in SQL that allow you to access data from previous or subsequent rows in your result set without the need for self-joins. These functions are particularly useful for analytical queries where comparing rows in a dataset is necessary.
Syntax
- LAG:
LAG(column_name, offset, default_value) OVER (PARTITION BY partition_column ORDER BY order_column)
- LEAD:
LEAD(column_name, offset, default_value) OVER (PARTITION BY partition_column ORDER BY order_column)
Parameters
- column_name: The column from which you want to retrieve the previous or next value.
- offset: The number of rows back (for LAG) or forward (for LEAD) to look. The default value is 1 if not specified.
- default_value: An optional value to return if there is no previous or next row (out of bounds).
- PARTITION BY: This clause is used to divide the result set into partitions to which the function is applied.
- ORDER BY: This clause defines the order in which the rows are processed.
Examples
Assume you have a table named sales with the following structure:
| id | sales_amount | sales_date |
|----|--------------|------------|
| 1 | 100 | 2023-01-01 |
| 2 | 150 | 2023-01-02 |
| 3 | 200 | 2023-01-03 |
| 4 | 175 | 2023-01-04 |
| 5 | 250 | 2023-01-05 |
Using LAG
To get the previous day's sales amount:
SELECT
id,
sales_amount,
LAG(sales_amount) OVER (ORDER BY sales_date) AS previous_sales
FROM
sales;
The result would look something like this:
| id | sales_amount | previous_sales |
|----|--------------|----------------|
| 1 | 100 | NULL |
| 2 | 150 | 100 |
| 3 | 200 | 150 |
| 4 | 175 | 200 |
| 5 | 250 | 175 |
Using LEAD
To get the next day's sales amount:
SELECT
id,
sales_amount,
LEAD(sales_amount) OVER (ORDER BY sales_date) AS next_sales
FROM
sales;
The result would look something like this:
| id | sales_amount | next_sales |
|----|--------------|-------------|
| 1 | 100 | 150 |
| 2 | 150 | 200 |
| 3 | 200 | 175 |
| 4 | 175 | 250 |
| 5 | 250 | NULL |
Conclusion
By using LAG and LEAD, you can efficiently analyze trends in your data by comparing current rows with their predecessors or successors, enabling more insightful reporting and analysis.