
This is my third question in a day about hibernation, after the one about hibernating in a dual boot with shared swap, and another about hibernating in a dual boot with a shared writable partition.

I have realized that the dangers of hibernation also concern single-boot machines. For example, you may hibernate, power up your computer, and select the wrong kernel from the GRUB menu. If I understand correctly, this may seriously damage your system. Moreover, in NixOS you choose at boot not just a kernel, but a complete system among several independent "snapshots."

It looks quite wrong to me that a sequence of innocent and common actions, where the user does not even get a warning, can destroy the system. Thus, it seems to me that, without some kind of safeguard, hibernation is an undesirable feature.

Has anybody found a solution to this problem? Are there any workarounds?

The most logical solution IMO would be to disallow loading any system other than the hibernated one, or to make systems clear the swap data at boot if something does not match up.


To be clearer, let me emphasize that I am talking about problems in a "single boot" setup, where multiple versions (kernels, snapshots) of the same system can exist and be loaded.

Alexey
  • Common systems do not have multi-OS and multi-kernel boot options which would make hibernation a standard problem, especially not sharing a swap partition. Since you implemented those and asked so many questions about them, you should be well aware of the specific reasons this can cause you issues and avoid them. – Julie Pelletier Jun 05 '16 at 16:31
  • Sorry, what did you mean by "implemented"? – Alexey Jun 05 '16 at 16:34
  • You installed that system and set it up as it is. Most people have a single OS on their machine, which would never be subject to this kind of issue, as you know. Those who do have multiple OSes still normally use only the default boot options. Those knowledgeable enough to install multiple boot options are responsible for knowing the issues you are facing. I don't see why you make such a case of something that is not really an issue as soon as you realize its possibility. Just either stop hibernating, or stop switching boot options with the same swap space! – Julie Pelletier Jun 05 '16 at 16:38
  • Nothing is an issue as long as you do not do it. Anyway, my question was not about dual boot. Upgrading to a new kernel in Ubuntu usually leaves the previous ones in the GRUB menu. Also, there is absolutely no reason to not have several versions of your system accessible from the boot menu. Maybe you are right that all but one should be hidden, and when accessing them, there should be a warning that the system must not have been hibernated. – Alexey Jun 05 '16 at 16:44
  • I have not installed anything yet, but this is irrelevant. – Alexey Jun 05 '16 at 16:45
  • Alexey, I do not understand your problem. The thing you are trying to use is not designed to be used the way you want. All caveats are documented and warning signs are placed in the docs. Questions like "is it evil or not" invite opinion-based answers IMO. You know perfectly well that it is possible to hammer a nail with a microscope, but that is not what a microscope is for. – Serge Jun 05 '16 at 18:38
  • I wanted to know educated opinions about the problem and existing or planned solutions. Sorry about the title if it is misleading. – Alexey Jun 05 '16 at 18:41
  • I changed the title (the original one was "Is hibernation 'evil'?"). – Alexey Jun 06 '16 at 18:32

2 Answers

2

I do think the shared filesystem situation is kind of evil :(. It is mitigated by a patchwork of different measures, but there are undoubtedly plenty of holes you can fall through.

The shared partition case is kind of nice in that once you know it's horribly dangerous, you can "just" avoid setting the system up that way. Despite how useful it would be if it wasn't so dangerous. However something like the memory card slot in my Thinkpad, or the common USB stick, is somewhat harder to keep under control.

  1. The most common and simplest case, dual-booting between Windows and a single Linux OS, eventually forced NTFS-3G to address the problem. It should warn quite loudly if you try to mount a Windows system partition which includes a hibernation image.

    I'm not sure about secondary NTFS partitions. There's certainly a potential mechanism (the "dirty" bit). I think at least some versions of NTFS-3G give a warning first, but I would definitely want to test before relying on it. link: search for "dirty".

    (This doesn't mean using NTFS to share between different Linux installs is necessarily a good idea. IIRC once your NTFS is marked as dirty, the recommended way to make sure it's fixed is using Windows. Or reformatting it :) )

  2. Selecting the wrong kernel version but the right OS will not cause any more damage than a power failure. There's a kernel version check to avoid this causing any subtle problems. Linux hibernation software used to give you a prompt so you could reboot & try again if you wanted, but more recently it just seems to go ahead and wipe the hibernation image. You'll notice that modern system software doesn't self-destruct when your laptop runs out of battery unexpectedly. Some application software will inevitably not be as well written.

  3. Mounting using udisks (e.g. via the GUI) should include the option errors=remount-ro by default. As soon as the filesystem driver actually notices corruption, it remounts the filesystem read-only and stops writing any further data to it. This does not prevent filesystem corruption altogether. However, in many cases it will avoid the worst case, where you keep running unaware or confused and massive corruption spreads as the filesystem continues to be written to.

  4. I imagine Nix snapshots might cause a worse problem, but only if you have created snapshots using different swap (hibernation) partitions, and only if both of them still exist. I suggest this is unfortunate but not something that will happen frequently. The main reason for Nix is package management. These conditions also show how you can rule it out: just delete the old swap partition first.
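
To make point 3 concrete, here is a minimal sketch that scans a mount table in the `/proc/mounts` layout (device, mount point, type, options) and lists mount points that lack `errors=remount-ro`. The function name, the default list of filesystem types, and the sample text are assumptions made for illustration; whether a given filesystem reports this option in its mount table is something to verify on your own system.

```python
from typing import Iterable, List

def mounts_missing_remount_ro(
    mounts_text: str,
    fstypes: Iterable[str] = ("ext2", "ext3", "ext4", "vfat"),  # assumed set of interest
) -> List[str]:
    """Return mount points of the given types that lack errors=remount-ro."""
    wanted = set(fstypes)
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        if fstype not in wanted:
            continue
        if "errors=remount-ro" not in options.split(","):
            missing.append(mountpoint)
    return missing

# Hypothetical sample in the /proc/mounts layout.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0
/dev/sdb1 /media/usb vfat rw,nosuid,nodev,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
"""

print(mounts_missing_remount_ro(SAMPLE))
```

On a Linux machine the same function could be fed the contents of `/proc/mounts` instead of the sample string.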

It's not conceptually difficult to fix for a Linux filesystem, i.e. to fail resume from hibernation instead of causing near-certain filesystem corruption. AFAIK it's just not been done. A basic check could be implemented on most filesystems using the dirty bit, though on its own that's not strong enough, as FAT filesystems mounted by current Linux OSes will tend to stay dirty once they've been marked dirty due to a power failure.
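
As a rough illustration of what such a check could look like, the sketch below records a small fingerprint of each mounted filesystem at hibernation time and refuses to resume if any fingerprint differs at wake-up. Everything here is hypothetical: `FsState`, its two fields, and `safe_to_resume` are names invented for the example, and this does not describe an existing kernel feature.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class FsState:
    """Hypothetical per-filesystem fingerprint taken from its superblock."""
    mount_count: int   # how many times the filesystem has been mounted
    last_write: int    # last write time, seconds since the epoch

def safe_to_resume(at_hibernate: Dict[str, FsState],
                   at_resume: Dict[str, FsState]) -> bool:
    """Allow resume only if every filesystem looks untouched since hibernation."""
    for device, saved in at_hibernate.items():
        current = at_resume.get(device)
        if current is None:
            return False  # device is gone: be conservative and refuse
        if current != saved:
            return False  # someone mounted or wrote to it in the meantime
    return True

saved = {"/dev/sdb1": FsState(mount_count=5, last_write=1000)}
print(safe_to_resume(saved, {"/dev/sdb1": FsState(5, 1000)}))
print(safe_to_resume(saved, {"/dev/sdb1": FsState(6, 1200)}))
```

Comparing a counter plus a timestamp, rather than a single dirty flag, sidesteps the problem of a flag that stays set after an unrelated power failure.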

I don't think e.g. a GNOME-based OS is going to try and make shared filesystems + hibernation actually usable any time soon, because it's a hard conflict to resolve. The filesystem can't be unmounted if any of the files on it are open, and it's likely some applications would respond poorly if their open files were revoked. In theory you could handle it like other interleaved accesses (by a different program, or by filesystem access over the network). In practice, this would require a lot of work on the kernel and would still be somewhat surprising when you forget that you had a hibernated application accessing the same file.

sourcejedi
  • Thank you, but could you clarify point (2) a bit, please: why is booting the wrong kernel on wake-up from hibernation safe? How is this related to power failure? My question is mostly about a single-OS setup. – Alexey Jun 06 '16 at 07:36
  • If you hibernate, erase the image, and then boot normally, that's the same as removing power while the system was running. – sourcejedi Jun 06 '16 at 07:38
  • Sorry, I do not understand something: if you just select the wrong (older) kernel from the GRUB menu, I do not think this will erase the hibernation image; I think it will be loaded. – Alexey Jun 06 '16 at 07:40
  • Single OS setup is the safest case. I think you'd only have to worry about removable devices. – sourcejedi Jun 06 '16 at 07:40
  • Alexey - no. The resuming kernel checks the kernel version of the hibernation image. It simply will not load an image created by a different kernel version. – sourcejedi Jun 06 '16 at 07:42
  • Does this equally apply in a dual boot setup where the two OS have different kernels? – Alexey Jun 06 '16 at 07:43
  • The version check is why the initrd has to wipe the hibernation image. I think if it didn't you'd get a normal boot, but you wouldn't be able to use the swap partition, and you'd have this horribly dangerous stale hibernation image hanging around. But that depends on a couple of details I might be mis-remembering. – sourcejedi Jun 06 '16 at 07:44
  • If you boot the wrong OS and they share a filesystem, you're toast. If they share a swap partition I don't think it will have the same problem though; if that works it would even give you some protection against running one of the systems while the other was hibernated. – sourcejedi Jun 06 '16 at 07:46
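
The version check described in these comments can be sketched as follows. This is an illustration only, not the kernel's code: it assumes the image header stores the same identifying fields that `uname` reports, and the function names and the returned action strings are made up for the example.

```python
import os
from typing import Dict, Optional

# Fields assumed to identify a kernel build (the ones `uname` reports).
FIELDS = ("sysname", "release", "version", "machine")

def running_kernel_id() -> Dict[str, str]:
    """Identity of the kernel that is booting right now."""
    u = os.uname()
    return {name: getattr(u, name) for name in FIELDS}

def image_matches(image_header: Dict[str, str], running: Dict[str, str]) -> bool:
    """True only if every identifying field is present and equal."""
    return all(
        name in image_header and image_header[name] == running.get(name)
        for name in FIELDS
    )

def boot_action(image_header: Optional[Dict[str, str]],
                running: Dict[str, str]) -> str:
    if image_header is None:
        return "normal boot"
    if image_matches(image_header, running):
        return "resume"
    # Mismatch: never load the image; wipe it so it cannot linger as a stale copy.
    return "wipe image, normal boot"

now = running_kernel_id()
older = dict(now, release=now["release"] + "-older")  # pretend a different kernel wrote it
print(boot_action(now, now))
print(boot_action(older, now))
```

The mismatch branch corresponds to the "same as a power failure" case: the hibernated session is lost, but nothing stale is ever loaded into memory.
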
0

The point of hibernation is to power down the hardware. It does not really help you run multiple operating systems on the same hardware. You can do that, but only if the OSes are completely independent (or close enough).

If you have a single OS, you need to be careful not to hibernate a deleted kernel. It's best not to do kernel upgrades until you're ready to reboot.

Dual boot is rather evil. If you want to run multiple operating systems, run them in virtual machines.

  • Thanks, but my question was mostly about single boot. Not only can there be multiple kernels, but also multiple versions of the system (NixOS). – Alexey Jun 06 '16 at 00:03
  • @Alexey Multiple versions of a system with the same kernel are still different OSes and multiple boot. If you choose an option at boot time, it isn't single boot. – Gilles 'SO- stop being evil' Jun 06 '16 at 10:17
  • In this sense almost every system at some periods of its life is or should be "dual boot": when you upgrade a system, it is better not to do it destructively and to keep the old version around for a while. – Alexey Jun 06 '16 at 18:30