Advantages of PostgreSQL Native Storage Engine

By John Doe, November 17, 2025

Summary: In this article, you will learn about the advantages of PostgreSQL’s native storage engine.

Ultimate Performance and Efficiency

The query optimizer can formulate optimal execution plans based on precise characteristics of the storage engine—such as data distribution, index types, and page layout. For example, the optimizer knows exactly how a particular index is organized and stored, enabling highly accurate cost estimation. In contrast, under a pluggable architecture, the optimizer must rely on generic assumptions, potentially failing to fully leverage the capabilities of any specific storage engine.

The cost model of the query optimizer can be deeply integrated with the native storage engine’s characteristics. The optimizer has precise knowledge of the real costs associated with operations like sequential reads, random reads, and index lookups. PostgreSQL collects and utilizes extremely detailed statistics—including per-column histograms, lists of most common values, and correlation statistics between columns—allowing it to estimate selectivity for multi-condition queries far more accurately than systems using pluggable storage engines.
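
As a quick illustration, the statistics described above are directly visible in the pg_stats system view. The snippet below is a minimal sketch using Python and psycopg2; the DSN and the orders table are placeholders, and it assumes ANALYZE has already run, but the columns it reads (n_distinct, most_common_vals, histogram_bounds, correlation) are the actual per-column statistics PostgreSQL maintains for the planner.

```python
# Minimal sketch: inspect the per-column statistics PostgreSQL keeps for the planner.
# Assumes a local database, a hypothetical "orders" table, and that ANALYZE has run.
import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN
cur = conn.cursor()
cur.execute(
    """
    SELECT attname, n_distinct, most_common_vals, histogram_bounds, correlation
    FROM pg_stats
    WHERE schemaname = 'public' AND tablename = %s
    """,
    ("orders",),
)
for attname, n_distinct, mcv, histogram, correlation in cur.fetchall():
    # A correlation close to +/-1 tells the planner that an index scan will touch
    # nearly sequential heap pages, which feeds directly into its cost estimates.
    print(attname, n_distinct, mcv, histogram, correlation)
cur.close()
conn.close()
```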

Moreover, the query executor’s execution plans are tightly coupled with the behavioral traits of the native storage engine, enabling the entire database system to achieve peak performance. Bitmap Heap Scans, for example, are built around the heap table’s page and tuple (TID) layout, and executor strategies such as Hash Join are costed against that same storage layer.
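
To see this coupling in practice, the sketch below simply asks the planner for a plan on a hypothetical orders table. Assuming indexes exist on customer_id and status, PostgreSQL will typically combine two index scans with a BitmapOr feeding a Bitmap Heap Scan; the exact plan depends on data volume and statistics, and all names here are illustrative.

```python
# Sketch: print the plan PostgreSQL chooses for an OR of two indexed predicates.
# With indexes on customer_id and status, this commonly yields a BitmapOr feeding
# a Bitmap Heap Scan; the exact plan depends on data volume and statistics.
import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN
cur = conn.cursor()
cur.execute(
    "EXPLAIN SELECT * FROM orders WHERE customer_id = 42 OR status = 'pending'"
)
for (plan_line,) in cur.fetchall():
    print(plan_line)
cur.close()
conn.close()
```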

High Stability and Reliability

Unified WAL Logging: In the native architecture, there is only one unified redo log (Write-Ahead Log, or WAL) and undo mechanism designed for the entire database system. Both the transaction manager and the storage engine share this single logging system to guarantee crash recovery and ensure atomicity and durability of transactions. This avoids the complexities inherent in pluggable architectures—for instance, coordinating transactions across multiple storage engines (e.g., XA transactions), which introduces both performance overhead and operational complexity. Additionally, logical replication (publication/subscription) built atop the WAL becomes more efficient and reliable.
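
The fact that logical replication is decoded straight out of that same WAL can be observed with a logical replication slot. The sketch below is illustrative only: it assumes wal_level = logical, a free replication slot, a role with replication privileges, and a hypothetical orders table; test_decoding is the sample output plugin shipped with PostgreSQL.

```python
# Sketch: create a logical replication slot and peek at changes decoded from the WAL.
# Assumes wal_level = logical, a free replication slot, replication privileges,
# and a hypothetical "orders" table. test_decoding is PostgreSQL's demo plugin.
import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()
cur.execute("SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding')")
cur.execute("INSERT INTO orders (customer_id, status) VALUES (42, 'pending')")
# Every change printed below was reconstructed from the single, unified WAL stream.
cur.execute("SELECT lsn, data FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL)")
for lsn, data in cur.fetchall():
    print(lsn, data)
cur.execute("SELECT pg_drop_replication_slot('demo_slot')")
cur.close()
conn.close()
```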

Simplified Crash Recovery: Because logging and storage are unified, crash recovery is deterministic and straightforward. Upon restart, the database simply replays the unified WAL to restore data consistency. In a pluggable architecture, if multiple engines participate in the same transaction, recovery becomes significantly more complex. Since WAL replay is strictly sequential, an error during replay by any pluggable storage engine could bring the entire database system to a halt.

Unified Backup and Restore: Native backup tools interact directly with the native storage engine, enabling the creation of physically or logically consistent snapshots. In contrast, pluggable architectures often require different backup strategies or tools for different engines, increasing operational complexity.

Because the native storage engine forms a complete, thoroughly tested, and cohesive system, its architectural consistency is higher, and the likelihood of edge-case bugs or hard-to-debug issues is comparatively lower. Over many years, all components have been refined together, resulting in exceptional stability and maturity.

Limitations of MySQL’s Pluggable Architecture

Inherent Weaknesses in Query Optimization

Pluggable architectures introduce a well-defined API (e.g., MySQL’s handler class) between the optimizer and the storage engines. While this “wall” between the layers provides flexibility, it also leads to the loss of critical information and incurs performance overhead.

The optimizer’s core decisions—such as which index to use or what join order to follow—rely heavily on statistical information. However, the optimizer can only request statistics through the storage engine’s API. To maintain generality, this API is typically simple and coarse-grained (a conceptual sketch follows the list below). For example:

  • records_in_range(): Estimates the number of records within an index range by querying the storage engine. For B+Tree-based engines like InnoDB, this is often done via sampling non-leaf nodes, leading to potentially large estimation errors.
  • info(): Retrieves rough table-level statistics, such as total row count—which itself may be an approximation.
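
To make the information loss concrete, here is a purely conceptual Python sketch, not MySQL’s actual C++ handler class: the optimizer can only ask coarse questions such as “roughly how many rows fall in this range?” and “roughly how many rows are in the table?”, while everything the engine knows about skew inside the range stays hidden behind the interface.

```python
# Conceptual sketch only: a narrow, handler-style interface between an optimizer
# and a pluggable storage engine. This is NOT MySQL source code; it just shows
# how coarse-grained calls hide detail the engine actually has.
from abc import ABC, abstractmethod


class StorageEngineHandler(ABC):
    @abstractmethod
    def records_in_range(self, index_name: str, low, high) -> int:
        """Rough estimate of rows with low <= key <= high on one index."""

    @abstractmethod
    def table_rows(self) -> int:
        """Approximate total row count (similar in spirit to handler::info())."""


class SampledEngine(StorageEngineHandler):
    """Engine that extrapolates from a handful of sampled index pages."""

    def __init__(self, rows_per_page: float, pages_in_range: int, total_rows: int):
        self.rows_per_page = rows_per_page
        self.pages_in_range = pages_in_range
        self.total_rows = total_rows

    def records_in_range(self, index_name, low, high):
        # Only an extrapolation crosses the interface; any skew between low and
        # high is invisible to the caller, so the optimizer's costs inherit the error.
        return int(self.rows_per_page * self.pages_in_range)

    def table_rows(self):
        return self.total_rows


engine = SampledEngine(rows_per_page=95.0, pages_in_range=120, total_rows=1_000_000)
estimate = engine.records_in_range("idx_customer_id", low=1, high=500)
print(f"optimizer sees ~{estimate} rows; the real count could differ wildly")
```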

Consequently, the MySQL optimizer frequently makes decisions based on incomplete or inaccurate information. While this has minimal impact on simple queries (e.g., primary key lookups or basic single-table filters), estimation errors compound dramatically in complex scenarios involving multi-table joins or intricate filter conditions, often resulting in severely suboptimal execution plans.

Additionally, the optimizer’s cost model translates various operations (disk I/O, CPU computation) into comparable costs. MySQL’s cost model is relatively generic and hard-coded, attempting to establish a universal standard applicable to all storage engines. Yet engines like InnoDB and MyISAM exhibit vastly different performance characteristics. As a result, the cost model often fails to reflect the true behavior—even of InnoDB, MySQL’s default engine.

For instance, the optimizer might conclude that “using Index A followed by sorting” is cheaper than “using Index B”, when in reality, the former could be significantly slower on InnoDB—simply because the optimizer lacks awareness of InnoDB-specific mechanisms like clustered indexes or the Change Buffer.
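
The generic, server-wide nature of these cost constants can be inspected directly: MySQL stores them in the mysql.server_cost and mysql.engine_cost tables. The sketch below assumes a local MySQL 8.0 server and the mysql-connector-python driver; credentials are placeholders.

```python
# Sketch: read the server-wide cost constants MySQL's optimizer uses. They are
# shared across storage engines, so they cannot capture InnoDB-specific behavior
# such as clustered-index lookups. Assumes a local MySQL 8.0 server and
# mysql-connector-python; credentials are placeholders.
import mysql.connector

conn = mysql.connector.connect(user="root", password="secret", host="127.0.0.1")
cur = conn.cursor()
cur.execute("SELECT cost_name, cost_value FROM mysql.server_cost")
for cost_name, cost_value in cur.fetchall():
    # A NULL cost_value means the compiled-in default is in effect.
    print("server:", cost_name, cost_value)
cur.execute("SELECT engine_name, cost_name, cost_value FROM mysql.engine_cost")
for engine_name, cost_name, cost_value in cur.fetchall():
    # engine_name is 'default' out of the box: one set of I/O costs for every engine.
    print("engine:", engine_name, cost_name, cost_value)
cur.close()
conn.close()
```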

It should be noted that the MySQL team is well aware of these limitations and has been actively working to “tear down this wall”:

  • InnoDB as the de facto standard: With MyISAM’s decline, MySQL migrated system tables to InnoDB, effectively turning MySQL into a database “natively optimized for InnoDB,” enabling deeper integration.
  • Ongoing cost model improvements: MySQL 8.0 introduced histograms, offering finer-grained insight into column data distributions (see the example after this list).
  • Optimizer enhancements: Each major release has expanded optimizer capabilities—adding features like Hash Join, Anti-Join, and continuously refining cost estimation algorithms.
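
As an illustration of the histogram work, the sketch below builds a histogram on a hypothetical orders.status column and then reads it back from the data dictionary. It assumes a local MySQL 8.0 server and mysql-connector-python; the credentials, schema, table, and column names are placeholders.

```python
# Sketch: build a MySQL 8.0 histogram and read it back from the data dictionary.
# Assumes a local MySQL 8.0 server, mysql-connector-python, and a hypothetical
# shop.orders table; credentials and names are placeholders.
import mysql.connector

conn = mysql.connector.connect(
    user="root", password="secret", host="127.0.0.1", database="shop"
)
cur = conn.cursor()
cur.execute("ANALYZE TABLE orders UPDATE HISTOGRAM ON status WITH 64 BUCKETS")
print(cur.fetchall())  # status rows reporting whether the histogram was built
# The histogram itself is stored as JSON in the data dictionary:
cur.execute(
    """
    SELECT COLUMN_NAME, HISTOGRAM
    FROM information_schema.COLUMN_STATISTICS
    WHERE SCHEMA_NAME = 'shop' AND TABLE_NAME = 'orders'
    """
)
for column_name, histogram_json in cur.fetchall():
    print(column_name, histogram_json)
cur.close()
conn.close()
```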

Despite these efforts, MySQL’s historical architectural choices impose significant legacy constraints, making remediation far more challenging than in systems designed from the ground up with tight integration.

Root Cause of Logical Replication Latency

Logical replication (via the binary log, or Binlog) is a server-layer component, decoupled from the underlying storage engine. To ensure consistency across all engines, the Binlog must record changes after transaction commit, at a layer above the storage engine. This upper-layer, engine-decoupled design naturally prevents it from deeply understanding page-level changes within the storage engine. Instead, it sees only a global, linear sequence of operations.

In contrast, physical replication systems such as Oracle Data Guard, SQL Server Always On, and PostgreSQL’s own streaming replication ship low-level WAL or page changes, allowing replicas to apply them through continuous recovery rather than re-executing SQL, thereby preserving parallelism and keeping performance much closer to that of the primary.

Thus, the high latency of MySQL’s logical replication is fundamentally rooted in its pluggable storage engine architecture. This design forces the replication mechanism (the server-layer Binlog) to operate independently of the storage engine (InnoDB), relegating it to inefficient logical replay: re-executing SQL statements or row events through the full server stack instead of applying low-level changes directly.

As a result, logical replication functions as a relatively bulky and inefficient data synchronization mechanism operating at the periphery of the database system, with a performance ceiling far below that of native replication solutions deeply integrated with the storage engine. Although MySQL has introduced improvements like parallel replication and WRITESET-based transaction dependency tracking for parallel apply, as long as the core architecture remains unchanged, this issue cannot be fully resolved.

Conclusion

In short, the strength of a native storage engine lies in the deep integration that delivers ultimate performance and simplicity, whereas the advantage of a pluggable storage engine lies in its flexibility.

A native storage engine is designed together with the rest of the database management system (such as the query optimizer, executor, transaction manager, recovery module, etc.), forming a tightly coupled storage management component. Typical examples are commercial database products like Oracle, SQL Server, and DB2.

Conversely, pluggable storage engines excel in flexibility and ecosystem diversity, allowing a single database to adapt to varied workloads (e.g., OLTP, OLAP, key-value stores) and fostering innovation and competition in storage technologies. Indeed, much of MySQL’s vibrant ecosystem stems directly from its pluggable storage engine architecture.