May 7, 2024
Summary: In this tutorial, you will learn how to tune the page cache in Linux.
Introduction
The page cache is an in-memory disk cache that holds pages of file contents, including executable programs and data read from block devices. Serving these pages from memory reduces the number of disk reads.
File system caching in Linux is a mechanism that allows the kernel to store frequently accessed data in memory for faster access. The kernel uses the page cache to store recently-read data from files and file system metadata.
For instance, when a program reads data from a file, the kernel performs several tasks:
- checks the page cache to see if the data is already in memory.
- if the data is in memory, the kernel simply returns the data from the cache.
- otherwise, it reads the data from the drive and stores a copy of it in the cache for future use.
In addition, the kernel uses the dentry cache to store directory entries, which map path names to inodes, and the inode cache to store file metadata.
Hence, the page cache handles file data, while the dentry and inode caches handle file system objects.
The kernel manages these caches with a policy based on the Least Recently Used (LRU) algorithm. In other words, when memory runs short and there’s more data to add, the kernel evicts the data that was used least recently to make room for the new data.
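The lookup and eviction behavior described above can be illustrated with a toy simulation. This is only a sketch: the function name lru_sim and its arguments are made up for illustration, and the real kernel uses a more elaborate approximation of LRU than a strict ordering by last access.

```shell
# lru_sim CAPACITY PAGE...
# Hypothetical helper: replays a sequence of page accesses against a
# fixed-size cache and evicts the least recently used page on a miss
# when the cache is full.
lru_sim() {
  capacity=$1
  shift
  printf '%s\n' "$@" | awk -v cap="$capacity" '
    {
      page = $1
      if (page in last_used) {
        hits++                      # page already cached
      } else {
        misses++                    # page must be "read from disk"
        if (count == cap) {
          # find the cached page with the oldest access time
          victim = ""
          oldest = NR + 1
          for (p in last_used) {
            if (last_used[p] < oldest) {
              oldest = last_used[p]
              victim = p
            }
          }
          delete last_used[victim]
          count--
          printf "evict %s\n", victim
        }
        count++
      }
      last_used[page] = NR          # record this access as the most recent
    }
    END { printf "hits=%d misses=%d\n", hits, misses }
  '
}

# Example: a cache of 2 pages and the access sequence a b a c b
#   lru_sim 2 a b a c b
```

With a capacity of 2, the second access to a is a hit, while c and the final b each force an eviction.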
Checking the Cache
The vmstat command provides detailed information about virtual memory. In particular, it shows the amount of memory in use for caching:
$ vmstat
procs -----------memory---------- ---swap--- -----io---- --system-- ------cpu-----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 6130448  11032 589532    0    0   422    52  160  362  3  3 76 18  0
The cache column shows the amount of memory used for file system caching in kibibytes (KiB). To get more details from vmstat, we can use the -s flag:
$ vmstat -s
8016140 K total memory
1282340 K used memory
207744 K active memory
711356 K inactive memory
6133536 K free memory
11032 K buffer memory
589232 K swap cache
2097148 K total swap
0 K used swap
2097148 K free swap
3458 non-nice user cpu ticks
389 nice user cpu ticks
3371 system cpu ticks
60823 idle cpu ticks
20782 IO-wait cpu ticks
0 IRQ cpu ticks
34 softirq cpu ticks
0 stolen cpu ticks
494275 pages paged in
56168 pages paged out
0 pages swapped in
0 pages swapped out
170063 interrupts
384058 CPU context switches
1673971944 boot time
5151 forks
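For scripting, the cache figure can be pulled out of the plain vmstat output. The following is a minimal sketch; the function name vmstat_cache_kb is an assumption made for illustration, and it expects the default three-line vmstat layout shown earlier (a group header, a column header, and one row of values).

```shell
# vmstat_cache_kb
# Hypothetical helper: reads default vmstat output on stdin, locates the
# "cache" column by its header name, and prints that column's value from
# the first data row.
vmstat_cache_kb() {
  awk '
    NR == 2 { for (i = 1; i <= NF; i++) if ($i == "cache") col = i }
    NR == 3 && col { print $col }
  '
}

# Usage on a Linux system:
#   vmstat | vmstat_cache_kb
```

Looking the column up by name rather than by position keeps the helper working if a vmstat version adds or reorders columns.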
Alternatively, we can use the free command to check the amount of file system cache memory in the system. It shows this usage in kibibytes (KiB) under the buff/cache column:
$ free
               total        used        free      shared  buff/cache   available
Mem:         8016140     1284652     6130952      144680      600536     6353032
Swap:        2097148           0     2097148
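The -m flag switches the output values to mebibytes. Notably, the value of the buff/cache column is approximately the sum of the buffer memory and swap cache rows of vmstat -s (11032 + 589232 = 600264 in the output above). Despite its name, the swap cache row reports cached file data rather than swapped-out pages. The small gap to the 600536 that free reports is most likely because the two commands ran at different moments.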
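The same figure can be derived directly from /proc/meminfo. The sketch below assumes the procps-ng definition of buff/cache, namely Buffers plus Cached plus SReclaimable; other versions of free may compute it differently, and the function name buff_cache_kb is made up for illustration.

```shell
# buff_cache_kb
# Hypothetical helper: reads meminfo-style lines on stdin and prints the
# sum of the Buffers, Cached and SReclaimable fields (all in kB).
buff_cache_kb() {
  awk '
    $1 == "Buffers:" || $1 == "Cached:" || $1 == "SReclaimable:" { sum += $2 }
    END { printf "%.0f\n", sum }
  '
}

# Usage on a Linux system:
#   buff_cache_kb < /proc/meminfo
```

Note that the match on "Cached:" is exact, so the separate SwapCached field is not counted.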
Page Cache Settings
To optimize the page cache, we can modify several parameters:
- vm.vfs_cache_pressure
- vm.swappiness
- vm.dirty_background_ratio
- vm.dirty_background_bytes
- vm.dirty_ratio
- vm.dirty_bytes
- vm.dirty_writeback_centisecs
- vm.dirty_expire_centisecs
These parameters don’t set a fixed cache size. Instead, they control how readily the kernel reclaims cached memory and how many dirty pages can accumulate before the kernel writes them to storage. Importantly, dirty pages are memory pages that have been modified but not yet written to secondary storage.
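To see how much dirty data is currently waiting in memory, we can look at the Dirty and Writeback fields of /proc/meminfo. The helper below is a small sketch for that; its name dirty_summary is an assumption made for illustration.

```shell
# dirty_summary
# Hypothetical helper: reads meminfo-style lines on stdin and prints only
# the Dirty (modified, not yet written) and Writeback (currently being
# written) fields.
dirty_summary() {
  awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }'
}

# Usage on a Linux system:
#   dirty_summary < /proc/meminfo
```

Running it repeatedly during a large write shows the Dirty value grow and then shrink as the kernel flushes pages to storage.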
In general, we can use the sysctl command to configure the file system cache in Linux. A value set with sysctl -w takes effect immediately but lasts only until the next reboot. To make a setting persistent, we add it to the /etc/sysctl.conf file, which holds system-wide kernel parameters that are applied at boot or on demand with sysctl -p.
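Before changing anything, it helps to record the current values. Each of the vm.* keys listed above is also exposed as a file under /proc/sys, with the dots in the key replaced by slashes. The sketch below relies on that mapping; the function names sysctl_path and show_vm_tunables are made up for illustration.

```shell
# sysctl_path KEY
# Hypothetical helper: maps a sysctl key such as vm.swappiness to its
# /proc/sys path by replacing dots with slashes.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# show_vm_tunables
# Prints the current value of each tunable discussed in this tutorial.
# Keys whose file is not readable are skipped.
show_vm_tunables() {
  for key in vm.vfs_cache_pressure vm.swappiness \
             vm.dirty_background_ratio vm.dirty_background_bytes \
             vm.dirty_ratio vm.dirty_bytes \
             vm.dirty_writeback_centisecs vm.dirty_expire_centisecs; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
      printf '%s = %s\n' "$key" "$(cat "$path")"
    fi
  done
}

# Usage on a Linux system:
#   show_vm_tunables
```

Reading these files needs no special privileges, so the snapshot can be taken as a regular user.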
vm.vfs_cache_pressure
The vm.vfs_cache_pressure parameter controls the tendency of the kernel to reclaim the memory used for caching directory and inode objects:
$ sudo sysctl -w vm.vfs_cache_pressure=50
vm.vfs_cache_pressure = 50
Here, we set the vfs_cache_pressure value to 50 via the -w switch of sysctl. Consequently, the kernel will prefer to keep the inode and dentry caches rather than the page cache. This can help improve performance on systems with a large number of files.
Notably, the default value is 100. A higher value makes the kernel prefer to reclaim inodes and dentries over cached memory. On the other hand, a lower value makes it reclaim cached memory over inodes and dentries. Hence, we can adjust the value according to our workload.
vm.swappiness
Swappiness controls how aggressively the kernel swaps memory pages. Lowering the value of swappiness makes the kernel less likely to swap out infrequently used memory pages, so they stay in RAM for faster access. In exchange, the kernel reclaims page cache more readily when memory runs short.
Further, we can again use sysctl to set the vm.swappiness parameter:
$ sudo sysctl -w vm.swappiness=10
vm.swappiness = 10
Here, the command sets the value of vm.swappiness to 10. Again, lower values make the kernel prefer to keep more data in RAM. Conversely, higher values make the kernel swap more.
vm.dirty_background_ratio
The vm.dirty_background_ratio parameter is the percentage of system memory that can be filled with dirty pages before the kernel starts writing them to the drive in the background. For instance, if we set vm.dirty_background_ratio to 10 on a system with 64GB of RAM, roughly 6.4GB of dirty pages can stay in RAM before background writeback begins. Strictly speaking, the kernel computes the percentage from available memory (free plus reclaimable pages) rather than total RAM, so the actual threshold is somewhat lower.
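The arithmetic behind the 64GB example can be checked with a one-line shell function. This is only a sketch: the name dirty_threshold_bytes is made up for illustration, and it applies the percentage to whatever memory figure we pass in, which serves as an upper-bound estimate.

```shell
# dirty_threshold_bytes MEM_BYTES PERCENT
# Hypothetical helper: prints PERCENT percent of MEM_BYTES using integer
# shell arithmetic.
dirty_threshold_bytes() {
  echo $(( $1 * $2 / 100 ))
}

# 10% of 64 GB (decimal gigabytes); prints 6400000000, i.e. 6.4 GB
dirty_threshold_bytes 64000000000 10
```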
Now, let’s configure the value of vm.dirty_background_ratio for our system:
$ sudo sysctl -w vm.dirty_background_ratio=10
vm.dirty_background_ratio = 10
Alternatively, we can set the vm.dirty_background_bytes variable in place of vm.dirty_background_ratio. The *_bytes version takes the amount of memory in bytes. For example, we can set the amount of memory for dirty background caching to 512MB (512 * 1024 * 1024 bytes):
$ sudo sysctl -w vm.dirty_background_bytes=536870912
However, the *_ratio variant will become 0 if we set the *_bytes variant, and vice versa.
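Since the *_bytes variables expect a raw byte count, it’s easy to mistype the number. A small helper can compute it instead; the name mib_to_bytes is an assumption made for illustration, and it treats 1MB as 1024 * 1024 bytes.

```shell
# mib_to_bytes MIB
# Hypothetical helper: converts a size in mebibytes to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# 512 MiB in bytes; prints 536870912
mib_to_bytes 512

# Possible usage (requires root, changes a live kernel setting):
#   sudo sysctl -w vm.dirty_background_bytes="$(mib_to_bytes 512)"
```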
vm.dirty_ratio
Specifically, vm.dirty_ratio is the absolute maximum percentage of system memory that can be filled with dirty pages. At this level, processes that generate new writes are blocked until enough dirty pages have been written to storage.
Notably, vm.dirty_bytes turns to 0 when we set a percentage value for vm.dirty_ratio, and vice versa. To illustrate, let’s define the value for vm.dirty_ratio:
$ sudo sysctl -w vm.dirty_ratio=20
vm.dirty_ratio = 20
Similarly, vm.dirty_ratio will become 0 if we configure a value in bytes for vm.dirty_bytes.
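Because only one of the two variables is in effect at a time, a quick check of which one is non-zero tells us how the limit is currently expressed. The following is a minimal sketch; the function name active_dirty_limit and its optional directory argument are made up for illustration, with the argument there only so the function can be tried against sample files.

```shell
# active_dirty_limit [DIR]
# Hypothetical helper: reads dirty_bytes and dirty_ratio from DIR
# (default /proc/sys/vm) and prints whichever setting is in effect.
active_dirty_limit() {
  dir=${1:-/proc/sys/vm}
  bytes=$(cat "$dir/dirty_bytes")
  ratio=$(cat "$dir/dirty_ratio")
  if [ "$bytes" -gt 0 ]; then
    # a non-zero byte limit means the ratio is ignored
    echo "dirty_bytes=$bytes"
  else
    echo "dirty_ratio=$ratio"
  fi
}

# Usage on a Linux system:
#   active_dirty_limit
```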
vm.dirty_expire_centisecs and vm.dirty_writeback_centisecs
Of course, dirty data held in system memory is at risk of loss in case of a power outage. Hence, to limit this exposure, the following variables dictate how long dirty data can wait and how often it’s written to secondary storage:
- vm.dirty_expire_centisecs
- vm.dirty_writeback_centisecs
The vm.dirty_expire_centisecs parameter manages how long dirty data can be in the cache before it’s written to the drive. Let’s set the variable so that data can stay for 40 seconds in the cache:
$ sudo sysctl -w vm.dirty_expire_centisecs=4000
vm.dirty_expire_centisecs = 4000
In this case, dirty data can stay in the cache for up to 40 seconds before it becomes eligible for writing to the drive. Notably, 1 second equals 100 centisecs.
Further, vm.dirty_writeback_centisecs sets how often the kernel flusher threads wake up to check whether there’s data to write to secondary storage. Thus, the lower the value, the higher the frequency, and vice versa.
Let’s configure vm.dirty_writeback_centisecs to check the cache every 5 seconds:
$ sudo sysctl -w vm.dirty_writeback_centisecs=500
vm.dirty_writeback_centisecs = 500
Again, the 500 centisecs value is equal to 5 seconds.
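To avoid unit mistakes with these two variables, the conversion can be wrapped in a tiny helper. This is just a sketch; the name secs_to_centisecs is made up for illustration, and it accepts whole seconds only.

```shell
# secs_to_centisecs SECONDS
# Hypothetical helper: converts whole seconds to centiseconds
# (1 second = 100 centiseconds).
secs_to_centisecs() {
  echo $(( $1 * 100 ))
}

secs_to_centisecs 40   # prints 4000
secs_to_centisecs 5    # prints 500

# Possible usage (requires root, changes a live kernel setting):
#   sudo sysctl -w vm.dirty_expire_centisecs="$(secs_to_centisecs 40)"
```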