PostgreSQL Tutorial: Enable Atomic Writes at Storage Layer

May 20, 2026

Summary: In this tutorial, you will learn how to enable atomic writes at the storage layer and disable full_page_writes to boost OLTP processing performance of PostgreSQL.

In traditional PostgreSQL architectures, guaranteeing atomic 8KB page writes at the storage layer is the key technical prerequisite for safely turning off full_page_writes (FPW). The logic is straightforward: if the underlying storage guarantees that every 8KB data page write either completes fully or does not happen at all, the torn-page problem that FPW was designed to solve cannot occur.

Technical Essence of Atomic Writes & Its Substitution for FPW

Root Cause of Torn Pages

PostgreSQL adopts 8KB data pages by default, while modern storage stacks are layered as follows:

  • Application layer: PostgreSQL 8KB pages
  • Filesystem layer: typically 4KB blocks on x86 systems
  • Block device layer: 512-byte sectors for traditional HDDs, 4KB physical sectors for modern SSDs

When PostgreSQL writes an 8KB page, the operating system may split it into two 4KB filesystem block writes, which are in turn divided into multiple disk sector writes. If a power outage or system crash occurs mid-write, only part of the page may reach disk, leaving a torn page that mixes old and new data.
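
To make the layering concrete, it helps to count how many lower-level units a single page spans. The sketch below is only an illustration: the helper name pieces_per_page is made up here, and the commented commands assume a Linux host with GNU coreutils and util-linux, with example paths and device names.

```shell
# pieces_per_page PAGE_BYTES UNIT_BYTES
# Prints how many lower-level units (filesystem blocks or disk sectors)
# one database page spans. Any result above 1 means the page can tear.
pieces_per_page() (
  page=$1
  unit=$2
  echo $(( (page + unit - 1) / unit ))
)

pieces_per_page 8192 4096   # 8KB page over 4KB filesystem blocks, prints 2
pieces_per_page 8192 512    # 8KB page over 512-byte sectors, prints 16

# Where the real sizes might come from (example path and device):
#   stat -f -c %S /var/lib/pgsql       # filesystem block size
#   blockdev --getss /dev/nvme0n1      # logical sector size
#   blockdev --getpbsz /dev/nvme0n1    # physical sector size
```

A result of 1 at every layer is the situation the rest of this tutorial tries to reach by other means.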

How Atomic Writes Replace FPW

FPW works by storing full page copies in WAL to serve as a safe baseline for crash recovery. With atomic write guarantees from the storage layer:

Any 8KB page write will either write the entire 8KB data to disk successfully, or write nothing at all, keeping the original intact page on disk.

In the event of a crash, pages on disk remain either fully original or fully updated, with no intermediate torn-page states possible. Crash recovery can directly replay incremental modifications in WAL without relying on full page copies provided by FPW.

Detailed Introduction to Mainstream 8KB Atomic Write Storage Solutions

Solution 1: Hardware RAID Controller with BBU (Battery Backup Unit)

This is the most mature and reliable solution for enterprise environments, and also the most widely adopted choice among traditional DBAs.

Working Mechanism

  • The RAID controller is equipped with an independent Battery Backup Unit (BBU) or Flash Backup Unit (FBU).
  • When the OS sends an 8KB write request, data is first written to the controller’s built-in high-speed cache.
  • Once data enters the battery-protected cache, the controller returns a write-completed acknowledgment to the operating system.
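  • The controller flushes cached data to the physical disks in batches in the background.
  • During a power failure, a BBU keeps the cache powered (typically for hours up to a few days), while a flash-backed unit copies the cache contents to non-volatile flash, so acknowledged writes can be completed once power returns.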

Key Configuration Requirements

  1. Write-Back Mode Must Be Enabled

    • Disable Write-Through mode: In write-through mode, data is written directly to disks bypassing cache, which cannot guarantee atomicity.
    • Check RAID card cache policy: MegaCli -LDGetProp -Cache -LAll -aAll
    • Set write-back mode: MegaCli -LDSetProp WB -LAll -aAll
  2. Ensure Healthy Status of BBU/FBU

    • Check battery status regularly: MegaCli -AdpBbuCmd -GetBbuStatus -aAll
    • When battery capacity drops below the threshold, or while the battery runs a learn cycle, the RAID controller typically falls back to write-through mode, which removes the atomic write guarantee until write-back is restored.
    • Replace BBU batteries every 3 to 5 years as recommended.
  3. Disable Physical Disk Internal Cache

    • Disk write caches are usually unprotected by batteries and prone to data loss after power cuts.
    • Configuration command: MegaCli -LDSetProp -DisDskCache -LAll -aAll
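
Because the controller can drop out of write-back mode on its own, the cache policy check is worth automating. The following is a minimal sketch, assuming the controller output contains one "Cache Policy" line per logical drive that names WriteBack or WriteThrough; the function name all_writeback and the sample line are illustrative, and the exact output format should be confirmed on the controller in use.

```shell
# all_writeback: reads cache-policy text on stdin and succeeds only if at
# least one "Cache Policy" line exists and every such line reports WriteBack.
all_writeback() {
  awk '
    /Cache Policy/ { seen++; if ($0 !~ /WriteBack/) bad++ }
    END { if (seen > 0 && bad == 0) exit 0; exit 1 }'
}

# Sample line in the assumed format
printf '%s\n' 'Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAdaptive, Direct, No Write Cache if bad BBU' \
  | all_writeback && echo "write-back active"

# Possible use from cron or a monitoring agent:
#   MegaCli -LDGetProp -Cache -LAll -aAll | all_writeback || echo "ALERT: not in write-back mode"
```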

Verification Method

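# Issue 8KB-aligned synchronous direct writes with dd; cut power repeatedly while this runs
dd if=/dev/zero of=/var/lib/pgsql/16/data/testfile bs=8k count=100000 oflag=direct,sync

# After reboot, confirm the file is readable and record its checksum.
# Note: a file of zeros looks the same whether or not a page was torn, so a patterned file gives a stronger test.
md5sum /var/lib/pgsql/16/data/testfile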
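
A file of zeros cannot show a torn page, so a patterned file is a useful complement. The sketch below writes pages whose two 4KB halves carry identical text and then reports any page whose halves differ. The names write_pattern, find_torn and DD_EXTRA are hypothetical helpers for this illustration; on a real test the target file would live on the storage under test, DD_EXTRA would carry something like oflag=direct, and the pattern would be rewritten with a new tag while power is cut.

```shell
PAGE=8192
HALF=4096

# write_pattern FILE BLOCKS TAG
# Writes BLOCKS pages of 8KB. Both 4KB halves of a page hold the same text
# (tag plus page index, space-padded), so a half-written page is detectable.
write_pattern() (
  file=$1; blocks=$2; tag=$3
  page=$(mktemp) || exit 2
  i=0
  while [ "$i" -lt "$blocks" ]; do
    half=$(printf '%-4096s' "tag=$tag blk=$i")
    printf '%s%s' "$half" "$half" > "$page"
    # One dd call per page, so each page is submitted as a single 8KB write.
    dd if="$page" of="$file" bs="$PAGE" seek="$i" count=1 conv=notrunc ${DD_EXTRA:-} 2>/dev/null
    i=$(( i + 1 ))
  done
  rm -f "$page"
)

# find_torn FILE
# Prints the index of every page whose two halves differ; fails if any exist.
find_torn() (
  file=$1
  tmp=$(mktemp -d) || exit 2
  blocks=$(( $(wc -c < "$file") / PAGE ))
  torn=0
  i=0
  while [ "$i" -lt "$blocks" ]; do
    dd if="$file" of="$tmp/a" bs="$HALF" skip=$(( 2 * i )) count=1 2>/dev/null
    dd if="$file" of="$tmp/b" bs="$HALF" skip=$(( 2 * i + 1 )) count=1 2>/dev/null
    if ! cmp -s "$tmp/a" "$tmp/b"; then
      echo "torn page at index $i"
      torn=1
    fi
    i=$(( i + 1 ))
  done
  rm -rf "$tmp"
  [ "$torn" -eq 0 ]
)

demo=$(mktemp)
write_pattern "$demo" 4 gen1
find_torn "$demo" && echo "no torn pages"
rm -f "$demo"
```

The design choice is deliberate: because both halves of a page are written from the same text, any page that mixes an old and a new tag after a crash is evidence of a non-atomic write.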

Pros & Cons

  • Advantages: Excellent performance, broad compatibility across all OS and filesystems, enterprise-grade reliability.
  • Disadvantages: High hardware cost, dependent on BBU battery lifespan, requires regular maintenance.

Solution 2: Atomic Write Commands for NVMe SSDs

Modern NVMe SSDs natively support atomic write commands, enabling atomicity guarantees without additional RAID controllers.

Working Mechanism

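  • The NVMe specification defines atomic write unit parameters. The controller reports AWUN (normal operation) and AWUPF (power fail), and a namespace may report its own NAWUN and NAWUPF values. For crash safety, the power-fail value is the one that matters.
  • Reported values vary widely by vendor and model. Some drives report only a single logical block, while others report 16KB, 64KB or more, so check the actual value instead of assuming it.
  • SSD firmware ensures write atomicity as long as each write request is aligned and does not exceed the atomic write unit.
  • For PostgreSQL 8KB pages, a power-fail atomic write unit of 8KB or higher meets the requirement.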

Key Configuration Requirements

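  1. Confirm SSD Atomic Write Capability
# Query the controller and namespace atomic write fields
nvme id-ctrl /dev/nvme0 | grep -iE "awun|awupf"
nvme id-ns /dev/nvme0n1 | grep -iE "nawun|nawupf"
  • These fields are 0-based counts of logical blocks, so the power-fail value must be at least 15 with 512-byte blocks (16 × 512 bytes = 8KB) or at least 1 with 4KB blocks.
  2. Keep Each 8KB Page Write Intact Through the I/O Stack
  • PostgreSQL uses buffered I/O for data files by default; it does not open them with O_DIRECT. (PostgreSQL 16 and later expose direct I/O only through the developer option debug_io_direct.)
  • The device guarantee therefore helps only if the kernel, filesystem and block layer submit each 8KB page as a single aligned write. Verify this on the target kernel and filesystem rather than assuming it.
  3. Disable Volatile SSD Write Cache
  • Many enterprise-grade NVMe SSDs ship without a volatile write cache or with it disabled.
  • Inspection command: nvme get-feature /dev/nvme0n1 -f 0x06
  • If write cache is enabled, confirm the SSD is fitted with Power Loss Protection (PLP) capacitors.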
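
NVMe reports atomic write units as 0-based counts of logical blocks, which is easy to misread. The sketch below converts a raw value into bytes and compares it with the PostgreSQL page size. The function names are made up for illustration, and the raw value and logical block size are assumed to have been read from the power-fail field (awupf or nawupf) and the namespace's LBA format.

```shell
# atomic_bytes RAW_VALUE LBA_BYTES
# A raw value of 15 with 512-byte blocks means 16 x 512 = 8192 bytes.
atomic_bytes() (
  echo $(( ($1 + 1) * $2 ))
)

# covers_page RAW_VALUE LBA_BYTES [PAGE_BYTES]
# Succeeds when the atomic write unit is at least one page (default 8192).
covers_page() (
  page=${3:-8192}
  [ "$(atomic_bytes "$1" "$2")" -ge "$page" ]
)

atomic_bytes 15 512     # prints 8192
atomic_bytes 0 4096     # prints 4096
covers_page 0 4096 || echo "a single 4KB block does not cover an 8KB page"
```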

Verification Method

Use the fio tool to validate atomic writes:

fio --name=atomic-write-test --filename=/var/lib/pgsql/testfile --rw=write --bs=8k --size=10G --ioengine=libaio --direct=1 --iodepth=32 --runtime=300 --time_based

Perform repeated power cuts during the test, then verify filesystem and data integrity after reboot.

Pros & Cons

  • Advantages: No RAID controller required, lower latency, better performance and lower power consumption.
  • Disadvantages: Reliant on SSD firmware implementation with inconsistent quality across vendors; consumer-grade SSDs may lack PLP support.

Solution 3: Copy-on-Write (CoW) Filesystems (ZFS, Btrfs)

The inherent write mechanism of CoW filesystems guarantees atomicity, making them the optimal software-based atomic write implementation.

Working Mechanism

  • CoW filesystems never overwrite data in-place; new data is written to free disk space instead.
  • Metadata pointers are updated to point to new data blocks only after new data is fully persisted.
  • The entire process is atomic, so only complete old or new data remains after crashes with no intermediate states.
  • Both ZFS and Btrfs ensure atomicity for PostgreSQL 8KB page writes regardless of underlying disk sector sizes.
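
The write-new-then-switch-pointer idea can be illustrated at file level with a temporary file and an atomic rename. This is only an analogy under stated assumptions, not how ZFS or Btrfs are implemented internally; the function name cow_update is hypothetical.

```shell
# cow_update TARGET CONTENT
# Writes CONTENT to a fresh temporary file in the same directory, flushes it,
# then renames it over TARGET. The rename is atomic on POSIX filesystems, so
# a reader sees either the complete old file or the complete new one.
cow_update() (
  target=$1
  dir=$(dirname "$target")
  tmp=$(mktemp "$dir/.cow.XXXXXX") || exit 1
  printf '%s' "$2" > "$tmp"   # new data goes to free space first
  sync                        # make sure the new data is persisted
  mv -f "$tmp" "$target"      # then switch the "pointer" in one step
)

d=$(mktemp -d)
cow_update "$d/page" "version 1"
cow_update "$d/page" "version 2"
cat "$d/page"; echo           # prints "version 2"
rm -rf "$d"
```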

Key ZFS Configuration Requirements

  1. Create PostgreSQL-optimized ZFS Pool
zpool create -o ashift=12 tank /dev/nvme0n1
# ashift=12 sets a 4KB (2^12 bytes) minimum block size, matching the physical sectors of modern SSDs
  2. Create PostgreSQL Dataset
zfs create -o recordsize=8k -o compression=lz4 -o atime=off -o logbias=throughput tank/pgdata
  • recordsize=8k: Matches the PostgreSQL page size to avoid read-modify-write amplification on partial-record updates.
  • compression=lz4: Enables LZ4 compression, which usually has low CPU overhead and can improve throughput by reducing I/O.
  • atime=off: Disables access time updates to reduce redundant writes.
  • logbias=throughput: Writes synchronous data to the main pool instead of the log device, favoring throughput over latency.
  3. Configure ZFS Intent Log (ZIL)
  • Deploy a dedicated high-speed SSD as a separate log (SLOG) device for write-heavy workloads.
  • Command: zpool add tank log /dev/nvme1n1
  • A dataset with logbias=throughput does not use the log device; keep the default logbias=latency on datasets that should benefit from it.
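
Dataset properties can drift after maintenance, so a small check is useful. This sketch reads tab-separated "property value" pairs and compares them with the values used in this section. The function name check_zfs_props is illustrative, and the input is assumed to come from zfs get with the -H and -p options, which print script-friendly, exact values (recordsize as 8192 rather than 8K).

```shell
# check_zfs_props: reads "property<TAB>value" lines on stdin and reports
# every property that differs from the values used for the pgdata dataset.
check_zfs_props() {
  awk -F '\t' '
    BEGIN {
      want["recordsize"]  = "8192"
      want["compression"] = "lz4"
      want["atime"]       = "off"
      want["logbias"]     = "throughput"
    }
    ($1 in want) && ($2 != want[$1]) {
      printf "mismatch: %s=%s (expected %s)\n", $1, $2, want[$1]
      bad = 1
    }
    END { exit bad + 0 }'
}

# Sample input with a wrong recordsize
printf 'recordsize\t131072\natime\toff\n' | check_zfs_props || echo "dataset needs attention"

# Possible use against a live dataset:
#   zfs get -Hp -o property,value recordsize,compression,atime,logbias tank/pgdata | check_zfs_props
```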

Key Btrfs Configuration Requirements

# Mount parameters
mount -o noatime,nodiratime,compress=zstd,space_cache=v2 /dev/nvme0n1 /var/lib/pgsql
  • Avoid nodatacow: This parameter disables copy-on-write and invalidates atomic write guarantees.
  • compress=zstd: Activate Zstandard compression. Btrfs supports zlib, lzo and zstd, but not LZ4.
  • space_cache=v2: Use the free space tree, the more efficient space caching mechanism.
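
A quick guard against an accidental nodatacow mount could look like the sketch below. The function name cow_enabled is hypothetical, and the option string is assumed to come from findmnt. Copy-on-write can also be switched off per file or directory with chattr +C, which this check does not cover.

```shell
# cow_enabled OPTIONS
# Succeeds when the comma-separated mount option string lacks nodatacow.
cow_enabled() {
  case ",$1," in
    *,nodatacow,*) return 1 ;;
    *)             return 0 ;;
  esac
}

cow_enabled "rw,noatime,compress=zstd:3,space_cache=v2" && echo "CoW active"
cow_enabled "rw,noatime,nodatacow" || echo "CoW disabled: atomicity not guaranteed"

# Possible use against the live mount:
#   cow_enabled "$(findmnt -no OPTIONS /var/lib/pgsql)" || echo "ALERT: nodatacow is set"
```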

Pros & Cons

  • Advantages: Pure software implementation with no special hardware demands; built-in checksum, snapshot and compression features.
  • Disadvantages: Moderate performance overhead; ZFS is maintained outside the mainline Linux kernel, so it integrates less smoothly than ext4 or XFS; Btrfs has stability concerns in certain scenarios.

Solution 4: Enterprise-Grade SAN Storage

Enterprise SAN storage arrays (e.g. EMC, NetApp, HPE) provide array-level atomic write guarantees.

Working Mechanism

  • SAN controllers are equipped with large-capacity battery-backed cache.
  • All write requests enter cache first before being flushed to disks in batches by controllers.
  • Controllers ensure atomicity for all writes equal to or smaller than the array block size (commonly 8KB or 16KB).

Key Configuration Requirements

  • Set the SAN array block size to 8KB or larger.
  • Enable controller write-back cache mode.
  • Maintain normal operation of controller battery backup units.
  • Deploy multipath software such as DM-Multipath for path redundancy.

Solution Comparison & Selection Guidelines

| Solution | Atomicity Guarantee | Performance | Cost | OPS Complexity | Applicable Scenarios |
|---|---|---|---|---|---|
| Hardware RAID with BBU | Extremely High | High | High | Medium | Traditional enterprise data centers with strict reliability requirements |
| NVMe SSD with PLP | High | Extremely High | Medium | Low | Modern servers & high-performance OLTP workloads |
| ZFS Filesystem | Extremely High | Medium-High | Low | Medium | General scenarios requiring snapshots, compression and other advanced features |
| Btrfs Filesystem | Medium-High | Medium-High | Low | Low | Testing environments & non-core businesses |
| Enterprise SAN | Extremely High | Medium | Extremely High | High | Large enterprises with existing SAN infrastructure |

Selection Priority

  1. Core Business: Prioritize hardware RAID with BBU or ZFS filesystem.
  2. High-Performance Workloads: Choose enterprise-grade PLP-enabled NVMe SSDs.
  3. Cost-Sensitive Scenarios: Adopt ZFS to leverage software-native advanced features.
  4. Existing SAN Infrastructure: Utilize built-in atomic write capabilities of SAN storage directly.

Risk Assessment & Limitations

Even with 8KB atomic write guarantees at the storage layer, disabling FPW still comes with inherent risks and limitations.

Unresolvable Issues

Silent Data Corruption: Atomic writes only ensure write integrity and cannot fix silent corruption caused by disk media damage or firmware bugs.

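  • Enable PostgreSQL data checksums: initdb --data-checksums
  • Run pg_checksums --check periodically for a full integrity inspection. The tool works only on a cleanly shut down cluster, so schedule it in a maintenance window or run it against a restored backup.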

Extreme Failure Scenarios: Metadata write failures may still trigger filesystem-level corruption.

  • Use journaling filesystems (ext4, XFS) or copy-on-write filesystems (ZFS, Btrfs), which protect their own metadata across crashes.
  • Schedule regular filesystem consistency checks.

Concurrent Primary & Standby Outage: Local crash recovery is still required if both primary and standby databases go offline simultaneously.

  • Atomic writes eliminate torn pages, yet other types of data corruption remain possible.
  • Ensure full reliability and recoverability of backup solutions.

Common Pitfalls

BBU Battery Degradation: Insufficient BBU power triggers automatic fallback to write-through mode and breaks atomic write guarantees.

  • Establish dedicated monitoring and alerting for BBU health status.
  • Arrange periodic battery calibration and replacement.

SSD Firmware Defects: Flawed atomic write implementations in certain SSD firmware versions may lead to data corruption.

  • Select SSD models with a documented track record under PostgreSQL workloads, or validate them yourself with power-loss testing.
  • Keep SSD firmware updated to the latest stable release.

Incorrect Filesystem Configuration: Parameters like nodatacow on Btrfs disable CoW and break atomicity.

  • Follow official recommended configuration standards strictly.
  • Regularly verify filesystem mount parameters.

Complete Implementation Plan

Pre-Deployment Preparation

Storage Selection & Testing

  • Select appropriate storage solutions based on business demands.
  • Conduct pressure testing and fault injection tests to verify atomic write effectiveness.
  • Evaluate performance metrics under diverse configuration combinations.

Database Preparation

  • Enable data checksums: Use initdb --data-checksums for new instances; for existing databases on PostgreSQL 12 or later, shut the cluster down cleanly and run pg_checksums --enable.
  • Upgrade PostgreSQL to the latest stable version.
  • Deploy comprehensive monitoring systems.

Rollout Procedures

Validation in Test Environments

  • Replicate identical storage and database configurations matching production environments.
  • Execute business pressure tests and simulate various failure scenarios including power cuts, server restarts and disk failures.
  • Maintain stable operation observation for at least one month to confirm zero data corruption.

Phased Production Rollout

  1. Deploy on non-core business systems first with 1-2 weeks of observation.
  2. Implement the configuration on standby nodes of core databases for another 1-2 weeks of monitoring.
  3. Finally apply changes to primary core database nodes.

Parameter Configuration

# postgresql.conf
full_page_writes = off
wal_log_hints = on  # Needed by pg_rewind when data checksums are off; it logs full-page images for hint-bit updates, which adds WAL volume
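
When a parameter appears more than once in postgresql.conf, the last occurrence wins, which makes manual review error-prone. The sketch below is a simplified reader for a single file. The function name conf_value is hypothetical, and it ignores include directives and postgresql.auto.conf, so running SHOW full_page_writes on the live server remains the authoritative check.

```shell
# conf_value FILE PARAM
# Prints the last value assigned to PARAM in FILE, with comments, trailing
# whitespace and single quotes removed. Prints an empty line if PARAM is absent.
conf_value() (
  awk -v key="$2" -v q="'" '
    { sub(/#.*/, "") }
    $0 ~ ("^[ \t]*" key "[ \t]*(=|[ \t])") {
      val = $0
      sub("^[ \t]*" key "[ \t]*=?[ \t]*", "", val)
      sub(/[ \t]+$/, "", val)
      gsub(q, "", val)
    }
    END { print val }' "$1"
)

conf=$(mktemp)
printf '%s\n' 'full_page_writes = on' 'full_page_writes = off   # rollout step 3' > "$conf"
conf_value "$conf" full_page_writes   # prints off
rm -f "$conf"
```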

Daily OPS & Monitoring

Storage Layer Monitoring

  • Monitor RAID controller status and BBU battery health.
  • Track SSD health metrics, wear level and operating temperature.
  • Oversee filesystem usage, inode consumption and disk error logs.

Database Layer Monitoring

  • Track checksum failure errors via the checksum_failures field in pg_stat_database.
  • Monitor the WAL generation rate, which should drop noticeably after FPW is disabled.
  • Observe transaction throughput and latency to confirm expected performance gains.
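
One way to quantify the effect is the share of WAL records that carry full-page images before and after the change. The sketch below only does the arithmetic. The function name fpi_pct is illustrative, and the counters are assumed to come from the wal_records and wal_fpi columns of pg_stat_wal, which is available in PostgreSQL 14 and later.

```shell
# fpi_pct WAL_RECORDS WAL_FPI
# Prints full-page images as a percentage of WAL records, one decimal place.
fpi_pct() {
  awk -v r="$1" -v f="$2" 'BEGIN {
    if (r > 0) printf "%.1f\n", 100 * f / r
    else print "0.0"
  }'
}

fpi_pct 1000 250   # prints 25.0

# Possible data sources (psql -At prints values separated by "|"):
#   psql -Atc "SELECT wal_records, wal_fpi FROM pg_stat_wal"
#   psql -Atc "SELECT sum(checksum_failures) FROM pg_stat_database"
```

Comparing the percentage over equal time windows, rather than cumulative totals, avoids mixing pre-change and post-change counters.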

Periodic Maintenance Tasks

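  • Run full-database integrity checks with pg_checksums --check weekly against a cleanly shut down copy of the cluster (for example a restored backup), since the tool cannot verify a running server.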
  • Conduct backup restoration drills monthly.
  • Execute fault injection tests quarterly.

Conclusion

Disabling FPW by leveraging storage-layer 8KB atomic writes is a technically feasible solution with strict prerequisites for traditional PostgreSQL architectures. It delivers substantial performance improvements especially for write-heavy workloads, while imposing higher requirements on underlying storage infrastructure and operational capabilities.

Core Conclusions

  1. Hardware RAID with BBU and ZFS filesystem are the two most reliable solutions proven by long-term production practice.
  2. Data checksums must be enabled alongside complete backup and high availability architectures even with atomic write support.
  3. Disabling full_page_writes is a high-risk operation that requires sufficient verification and gradual phased deployment.
