May 31, 2024
Summary: In this tutorial, you will learn how to use the PostgreSQL PARTITION BY
clause to change how the window function calculates the result.
Table of Contents
PostgreSQL PARTITION BY clause overview
The PARTITION BY
clause is a subclause of the OVER
clause. The PARTITION BY
clause divides a query’s result set into partitions. The window function is operated on each partition separately and recalculate for each partition.
The following shows the syntax of the PARTITION BY
clause:
window_function ( expression ) OVER (
PARTITION BY expression1, expression2, ...
order_clause
frame_clause
)
You can specify one or more columns or expressions to partition the result set. The expression1
, expression1
, etc., can only refer to the columns derived by the FROM
clause. They cannot refer to expressions or aliases in the select list.
The expressions of the PARTITION BY
clause can be column expressions, scalar subquery, or scalar function. Note that a scalar subquery and scalar function always returns a single value.
If you omit the PARTITION BY
clause, the whole result set is treated as a single partition.
PARTITION BY vs. GROUP BY
The GROUP BY
clause is used often used in conjunction with an aggregate function such as SUM()
and AVG()
. The GROUP BY
clause reduces the number of rows returned by rolling them up and calculating the sums or averages for each group.
For example, the following statement returns the average salary of employees by departments:
SELECT
department_id,
ROUND(AVG(salary)) avg_department_salary
FROM
employees
GROUP BY
department_id
ORDER BY
department_id;
The following picture shows the result:
The PARTITION BY
clause divides the result set into partitions and changes how the window function is calculated. The PARTITION BY
clause does not reduce the number of rows returned.
The following statement returns the employee’s salary and also the average salary of the employee’s department:
SELECT
first_name,
last_name,
department_id,
ROUND(AVG(salary) OVER (
PARTITION BY department_id
)) avg_department_salary
FROM
employees;
Here is the partial output:
In simple words, the GROUP BY
clause is aggregate while the PARTITION BY
clause is analytic.
Summary
In this tutorial, you have learned about the PostgreSQL PARTITION BY
clause that changes how the window function’s result is calculated.