1. Flash-based storage
Does it depend upon the disk type (traditional hard drives vs. solid-state disks) or any other variable that I might not be aware of? Does it happen (if it does) only in Linux or is this present in other OSes?
When you have a choice, you should not allow flash-based storage to lose power without a clean shutdown.
On low-cost storage like SD cards, you can expect to lose entire erase-blocks (several times larger than 4KB), losing data which could belong to different files or essential structures of the filesystem.
Some expensive SSDs may claim to offer better guarantees in the face of power failure. However third-party testing suggests that many expensive SSDs fail to do so. The layer that remaps blocks for "wear levelling" is complex and proprietary. Possible failures include loss of all data on the drive.
Applying our testing framework, we test 17 commodity SSDs from six different vendors using more than three thousand fault injection cycles in total. Our experimental results reveal that 14 of the 17 tested SSD devices exhibit surprising failure behaviors under power faults, including bit corruption, shorn writes, unserializable writes, metadata corruption, and total device failure.
2017: https://dl.acm.org/citation.cfm?id=2992782&preflayout=flat
2013: https://www.usenix.org/system/files/conference/fast13/fast13-final80.pdf?wptouch_preview_theme=enabled
2. Spinning hard disk drives
Spinning HDDs have different characteristics. For safety and simplicity, I recommend assuming they have the same practical uncertainty as flash-based storage.
Unless you have specific evidence, which you clearly don't. I don't have comparative figures for spinning HDDs.
A HDD might leave one incompletely written sector with a bad checksum, which will give us a nice read failure later on. Broadly speaking, this failure mode of HDDs is entirely expected; native Linux filesystems are designed with it in mind. They aim to preserve the contract of fsync()
in the face of this type of power loss fault. (We'd really like to see this guaranteed on SSDs).
However I'm not sure whether Linux filesystems achieve this in all cases, or whether that's even possible.
The next boot after this type of fault may require a filesystem repair. This being Linux, it is possible that the filesystem repair will ask some questions that you do not understand, where you can only press Y and hope that it will sort itself out.
2.1 If you don't know what the fsync() contract is
The fsync() contract is a source of both good news and bad news. You must understand the good news first.
Good news: fsync()
is well-documented as the correct way to write file data e.g. when you hit "save". And it is widely understood that e.g. text editors must replace existing files atomically using rename()
. This is meant to make sure that you always either keep the old file, or get the new file (which was fsync()
ed before the rename). You don't want to be left with a half-written version of the new file.
Bad news: for many years, calling fsync() on the most popular Linux filesystem could effectively leave the whole system hanging for tens of seconds. Since applications can do nothing about this, it was very common to optimistically use rename() without fsync(), which appeared to be relatively reliable on this filesystem.
Therefore, applications exist which do not use fsync() correctly.
The next version of this filesystem generally avoided the fsync() hang - at the same time as it started relying on the correct use of fsync().
This is all pretty bad. Understanding this history is probably not helped by the dismissive tone and invective which was used by many of the conflicting kernel developers.
The current resolution is that the current most popular Linux filesystem defaults to supporting the rename() pattern without requiring fsync() implements "bug-for-bug compatibility" with the previous version. This can be disabled with the mount option noauto_da_alloc
.
This is not a complete protection. Basically it flushes the pending IO at rename() time, but it doesn't wait for the IO to complete before renaming. This is much better than e.g. a 60 second danger window though! See also the answer to Which filesystems require fsync() for crash-safety when replacing an existing file with rename()?
Some less popular filesystems do not provide protection. XFS refuses to do so. And UBIFS has not implemented it either, apparently it could be accepted but needs a lot of work to make it possible. The same page points out that UBIFS has several other "TODO" issues for data integrity, including on power loss. UBIFS is a filesystem used directly on flash storage. I imagine some of the difficulties UBIFS mentions with flash storage could be relevant to the SSD bugs.
sync
, and applications mustflush
to guarantee caches are written back, but even a sucessfullsync
does not guarantee write back to physical disk only that kernel caches are flushed to disk, which may have latency in the driver or disk hardware (e.g. on-drive cache that you lose) – crasic Aug 23 '18 at 02:41mmap()
syscall), which I think is often used on large files for improved performance, and edits may occur there in memory. – Sergiy Kolodyazhnyy Aug 25 '18 at 03:58