My understanding is that hard drives and SSDs implement some basic error correction inside the drive, and most RAID configurations (e.g. mdadm) depend on the drive to report when it has failed to correct an error and needs to be taken offline. However, this relies on the drive being 100% accurate in its error diagnosis. That's not so, and a common configuration like a two-drive RAID-1 mirror is vulnerable: suppose some bits on one drive are silently corrupted and the drive does not report a read error. The mirror then has no way to notice that the data is bad, let alone decide which copy is correct. This is why file systems like btrfs and ZFS implement their own checksums, so as not to trust buggy drive firmware, glitchy SATA cables, and so on.
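To make the failure mode concrete, here is a toy Python model (not how mdadm or btrfs are actually implemented). The function names `read_mirror` and `read_verified` are made up for illustration, and CRC32 stands in for whatever checksum a real file system would use.

```python
import zlib

def read_mirror(legs, preferred=0):
    # Plain mirror read: take one leg's data at face value,
    # because the drive reported no error.
    return legs[preferred]

def read_verified(legs, expected_crc):
    # Checksumming layer: return the first copy whose CRC matches
    # the checksum recorded when the block was written.
    for data in legs:
        if zlib.crc32(data) == expected_crc:
            return data
    raise IOError("every copy fails its checksum")

block = b"important data" * 4
crc = zlib.crc32(block)          # stored separately at write time

corrupted = bytearray(block)
corrupted[3] ^= 0x01             # one silently flipped bit on drive 0
legs = [bytes(corrupted), block]

plain = read_mirror(legs)            # bad data, returned without any error
verified = read_verified(legs, crc)  # skips the bad copy, returns the good one
```

Without the stored checksum, comparing the two legs would only reveal that they differ, not which one is right.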
Similarly, RAM can have reliability problems, which is why we have ECC RAM.
My question is this: what's the canonical way to protect the Linux swap file from silent corruption / bit rot not caught by drive firmware on a two-disk configuration, using only mainline kernel drivers? It seems to me that a configuration that lacks end-to-end protection here (such as that provided by btrfs) somewhat negates the peace of mind brought by ECC RAM. Yet I cannot think of a good way:
- btrfs does not support swapfiles at all. You could set up a loop device from a btrfs file and make a swap on that. But that has problems:
  - Random writes don't perform well: https://btrfs.wiki.kernel.org/index.php/Gotchas#Fragmentation
  - The suggestion there to disable copy-on-write will also disable checksumming, thus defeating the whole point of this exercise. Their assumption is that the data file has its own internal protections.
- ZFS on Linux allows using a ZVOL as swap, which I guess could work: http://zfsonlinux.org/faq.html#CanIUseaZVOLforSwap. However, from my reading, ZFS is normally demanding on memory, and getting it to work in a swap-only role sounds like it would take some figuring out. It is not my first choice. Why you would have to use an out-of-tree kernel module just to have reliable swap is beyond me. Surely there is a way to accomplish this with most modern Linux distributions / kernels in this day and age?
- There was actually a thread on a Linux kernel mailing list with patches to enable checksums within the memory manager itself, for exactly the reasons I discuss in this question: http://thread.gmane.org/gmane.linux.kernel/989246. Unfortunately, as far as I can tell, the patch died and never made it upstream, for reasons unknown to me. Too bad, it sounded like a nice feature. On the other hand, if you put swap on a RAID-1 and the corruption is beyond the ability of the checksum to repair, you'd want the memory manager to try reading from the other drive before panicking or whatever, which is probably outside the scope of what a memory manager should do.
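The behaviour I have in mind in that last bullet (checksum on swap-out, verify on swap-in, fall back to the other drive) could look roughly like the following toy Python model. This is not what the patch set did and not real kernel code: the class `ChecksummedSwap` and its methods are hypothetical, the two "drives" are just dictionaries, and the digests are assumed to stay in ECC-protected RAM.

```python
import hashlib

class ChecksummedSwap:
    def __init__(self, n_legs=2):
        # Each leg simulates one drive of the mirror: slot -> page bytes.
        self.legs = [dict() for _ in range(n_legs)]
        # Digests are kept in RAM, never written to the swap devices.
        self.digests = {}

    def swap_out(self, slot, page):
        # Record the digest, then write the page to every leg.
        self.digests[slot] = hashlib.sha256(page).digest()
        for leg in self.legs:
            leg[slot] = page

    def swap_in(self, slot):
        # Try each leg in turn and accept the first page that verifies.
        expected = self.digests[slot]
        for leg in self.legs:
            page = leg[slot]
            if hashlib.sha256(page).digest() == expected:
                return page
        # No good copy left: the real equivalent would be to kill the
        # owning process or panic rather than hand back bad memory.
        raise RuntimeError(f"swap slot {slot}: all copies corrupted")

swap = ChecksummedSwap()
page = b"\xaa" * 4096
swap.swap_out(7, page)
swap.legs[0][7] = b"\x55" + page[1:]   # silent corruption on the first drive
recovered = swap.swap_in(7)            # served from the second drive instead
```

The awkward part is visible even in the toy version: the layer doing the checksumming has to know about the individual mirror legs, which is exactly the layering problem I mention above.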
In summary:
- RAM has ECC to correct errors
- Files on permanent storage have btrfs to correct errors
- Swap has ??? <--- this is my question