Caches and Data Integrity
Page Cache
In Linux, when a normal application saves some data, the write usually lands in memory first and only reaches the disk later.
Linux uses otherwise unused memory (RAM) as a cache to speed up your applications' I/O operations. This cache is called the "page cache". Linux does this because writing to memory is much faster than writing to disk, in terms of both latency and bandwidth. Since clean cached memory is easily reclaimed when applications request it, there is generally no associated performance penalty, and the operating system might even report such memory as "free" or "available". If the cache already holds data that is subsequently requested, the request can be fulfilled without ever hitting the disk. This speeds up read requests, whilst also reducing the number of requests the disk needs to handle.
A page that has been modified in memory but still needs to be written to disk is called a "dirty page". In the event of sudden power loss, any dirty pages are lost.
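To get a feel for how much data is sitting in the page cache, and how much of it is dirty, you can read /proc/meminfo. The following is a minimal Python sketch, assuming a Linux system; the function names are my own, and the fields it picks out (Cached, Dirty, Writeback) are the ones I believe are most relevant here.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts, so treat the unit as an assumption per field.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def page_cache_summary(path: str = "/proc/meminfo") -> dict:
    """Return the cached, dirty and in-flight writeback amounts."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {key: info.get(key, 0) for key in ("Cached", "Dirty", "Writeback")}


if __name__ == "__main__":
    for field, value in page_cache_summary().items():
        print(f"{field}: {value} kB")
```

The "Dirty" figure is roughly the amount of data that would be at risk if the power went out at that moment.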
Direct I/O
To prevent data loss in the event of sudden power loss, certain applications, such as databases, use Direct I/O to bypass the page cache and write directly to the disk.
Whilst this improves data integrity, it comes at a cost in performance.
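As a rough illustration of what bypassing the page cache looks like from an application's point of view, here is a minimal Python sketch. It assumes Linux, a filesystem that supports the O_DIRECT flag (some, such as tmpfs on older kernels, do not), and a 4096-byte block size. The helper names and the example path are made up for illustration.

```python
import mmap
import os

BLOCK_SIZE = 4096  # assumed block size; real devices may differ


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pad data with zero bytes up to the next multiple of block_size."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + b"\x00" * (block_size - remainder)


def write_direct(path: str, data: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Write data to path using O_DIRECT (Linux only).

    Returns the number of bytes written, which includes the padding.
    A robust version would loop to handle partial writes.
    """
    if not data:
        raise ValueError("nothing to write")
    padded = pad_to_block(data, block_size)
    # An anonymous mmap is page-aligned, which O_DIRECT generally requires.
    buf = mmap.mmap(-1, len(padded))
    buf.write(padded)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        return os.write(fd, buf)
    finally:
        os.close(fd)
        buf.close()


# Example (hypothetical path):
# write_direct("/var/tmp/direct-io-example.bin", b"hello")
```

Direct I/O generally needs the buffer address, the length and the file offset to be aligned to the block size, which is why the data is padded and copied into an mmap-backed buffer first. That extra bookkeeping, plus the loss of caching, is part of the performance cost.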
Disk Cache
Disks have their own cache too. This cache improves performance in the same way that the page cache does, but disk caches are usually quite small. For example, my WD 3TB Red has only 64 MB of cache, compared to the many GB of spare RAM I usually have. Unlike the page cache, the disk cache is not bypassed with Direct I/O, so it can also lead to data loss in the event of sudden power loss. This cache can be disabled to prevent such an issue, but most enterprise-grade parts offer an alternative that avoids having to do so:
- Enterprise SSDs have capacitors built into them, giving them enough power to finish writing their caches to non-volatile storage.
- The caches on the storage devices can be non-volatile.
- Servers can be fitted with RAID cards that have their own cache, backed by a battery backup unit (BBU). This can allow the RAID card's memory to retain its data for many hours until the disks come back on-line.
- In this case one may still need to disable the disks' cache, but there will be no performance hit because there is still a cache on the RAID card.
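Before deciding whether a disk's cache needs disabling, it can help to see how the kernel classifies it. The sketch below assumes a reasonably recent Linux kernel that exposes the queue/write_cache attribute under /sys/block, which I understand reports either "write back" (a volatile cache the kernel must flush) or "write through". The function names and the device name "sda" are placeholders.

```python
import os


def write_cache_mode(device: str, sysfs_root: str = "/sys/block") -> str:
    """Return the kernel's reported write cache mode for a block device."""
    path = os.path.join(sysfs_root, device, "queue", "write_cache")
    with open(path) as f:
        return f.read().strip()


def has_volatile_write_cache(device: str, sysfs_root: str = "/sys/block") -> bool:
    """True if the kernel treats the device's cache as volatile write-back."""
    return write_cache_mode(device, sysfs_root) == "write back"


# Example (hypothetical device name):
# print(has_volatile_write_cache("sda"))
```

This only shows the kernel's view of the device. Behind a RAID card, the value may describe the card's virtual disk rather than the physical disks, so treat it as a hint and not as proof.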
First published: 16th August 2018