PostgreSQL Tutorial: PARTITION BY

May 31, 2024

Summary: In this tutorial, you will learn how to use the PostgreSQL PARTITION BY clause to change how the window function calculates the result.

Table of Contents

PostgreSQL PARTITION BY clause overview

The PARTITION BY clause is a subclause of the OVER clause. The PARTITION BY clause divides a query’s result set into partitions. The window function is operated on each partition separately and recalculate for each partition.

The following shows the syntax of the PARTITION BY clause:

window_function ( expression ) OVER (
    PARTITION BY expression1, expression2, ...
    order_clause
    frame_clause
)

You can specify one or more columns or expressions to partition the result set. The expression1, expression1, etc., can only refer to the columns derived by the FROM clause. They cannot refer to expressions or aliases in the select list.

The expressions of the PARTITION BY clause can be column expressions, scalar subquery, or scalar function. Note that a scalar subquery and scalar function always returns a single value.

If you omit the PARTITION BY clause, the whole result set is treated as a single partition.

PARTITION BY vs. GROUP BY

The GROUP BY clause is used often used in conjunction with an aggregate function such as SUM() and AVG(). The GROUP BY clause reduces the number of rows returned by rolling them up and calculating the sums or averages for each group.

For example, the following statement returns the average salary of employees by departments:

SELECT 
    department_id, 
    ROUND(AVG(salary)) avg_department_salary
FROM
    employees
GROUP BY 
    department_id
ORDER BY
    department_id;

The following picture shows the result:

SQL Partition By - group by clause

The PARTITION BY clause divides the result set into partitions and changes how the window function is calculated. The PARTITION BY clause does not reduce the number of rows returned.

The following statement returns the employee’s salary and also the average salary of the employee’s department:

SELECT 
    first_name,
    last_name,
    department_id, 
    ROUND(AVG(salary) OVER (
        PARTITION BY department_id
    )) avg_department_salary
FROM
    employees;

Here is the partial output:

sql partition by - window function AVG example

In simple words, the GROUP BY clause is aggregate while the PARTITION BY clause is analytic.

Summary

In this tutorial, you have learned about the PostgreSQL PARTITION BY clause that changes how the window function’s result is calculated.