0

I strongly despise any kind of automatic OOM killer and would like to resolve such situations manually. So for a long time I have had

vm.overcommit_memory=1
vm.overcommit_ratio=200

But this way, when memory is exhausted, the system becomes unresponsive. On my old laptop with an HDD and 6 GB of RAM, I sometimes had to wait many minutes to switch to a text VT, issue some commands and wait for them to be executed. That's why I have numerous performance indicators to notice such situations beforehand, and I often get asked why I would need them at all. And they don't always help either, because if memory runs out while I'm not at the laptop, it's already too late.

I suspected the situation would be better on a newer laptop with an SSD and 12 GB of RAM, but in fact it's even worse. I have zRam with vm.swappiness=200, which allows up to 16.4 GB of compressed swap, and when it's nearly exhausted, the system becomes even more unresponsive than the old laptop did, to the point that even VT switching barely works and I cannot SSH into the system from the local network, so my only resort is blindly invoking the kernel's manual OOM killer with Alt+SysRq+RF, which sometimes chooses to kill an important process like dbus-daemon. I might write a daemon that plays a sound alert when the swap is almost full, but that's only a partial stopgap again, as I may not get there in time anyway.
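
For what it's worth, such an alert daemon is easy to prototype. Here is a rough sketch with assumptions of my own that the question does not prescribe: a 90% swap-usage threshold, a 5-second polling interval, and a BEL character written to /dev/console as the "sound alert".

/* swapalert.c - rough sketch of a "swap almost full" alert daemon.
 * The 90% threshold, the 5 s polling interval and the BEL-to-/dev/console
 * alert are illustrative assumptions only.
 * Build: cc -O2 -o swapalert swapalert.c
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return the value (in kB) of a /proc/meminfo field such as "SwapTotal:". */
static long meminfo_kb(const char *key)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long kb = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, key, strlen(key)) == 0) {
            sscanf(line + strlen(key), "%ld", &kb);
            break;
        }
    }
    fclose(f);
    return kb;
}

int main(void)
{
    /* Lock the daemon itself in RAM so it keeps running while the rest
     * of the system is thrashing (requires CAP_IPC_LOCK or a sufficient
     * RLIMIT_MEMLOCK). */
    mlockall(MCL_CURRENT | MCL_FUTURE);

    for (;;) {
        long total = meminfo_kb("SwapTotal:");
        long freek = meminfo_kb("SwapFree:");

        if (total > 0 && freek >= 0 &&
            (total - freek) * 100 / total >= 90) {
            FILE *con = fopen("/dev/console", "w");
            if (con) {
                fputs("\aswap is almost full\n", con);
                fclose(con);
            }
        }
        sleep(5);
    }
}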

In the past, I tried to mitigate such situations with thrash-protect. It sends SIGSTOP to greedy processes and then automatically SIGCONT-s them, which helped a lot to postpone a total lockup and resolve the situation manually, but under heavy overload it starts freezing virtually everything (specific processes can be explicitly allowlisted, though). And it has a lot of irritating side effects. For example, if a shell is frozen, its child processes may remain frozen after the shell is thawed. If two processes share a message bus and one of them is frozen, messages rapidly accumulate in the bus, which again leads to rapidly growing RAM usage, or to lockups (graphical servers and multi-process browsers are especially prone to this).
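
To make the mechanism concrete, this is only an illustration of the SIGSTOP/SIGCONT idea, not thrash-protect itself (which also chooses its victims automatically and honours an allowlist): a tiny tool that freezes a given PID for a few seconds and then thaws it could look like this.

/* freeze.c - illustration of the SIGSTOP/SIGCONT mechanism described above.
 * Usage: ./freeze <pid> <seconds>
 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s pid seconds\n", argv[0]);
        return 1;
    }
    pid_t pid = (pid_t)atol(argv[1]);
    unsigned int secs = (unsigned int)atoi(argv[2]);

    if (kill(pid, SIGSTOP) != 0) {   /* pause the greedy process */
        perror("kill(SIGSTOP)");
        return 1;
    }
    sleep(secs);                     /* give the rest of the system room to breathe */
    if (kill(pid, SIGCONT) != 0) {   /* thaw it again */
        perror("kill(SIGCONT)");
        return 1;
    }
    return 0;
}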

I tried running sshd with a -20 priority, as suggested in a similar question, but that doesn't really help: it's just as unresponsive as with the default priority.

I would like to have some emergency console which is always locked in RAM and remains usable regardless of how overloaded the rest of the system is. Something akin to the Ctrl+Alt+Del screen in Windows NT ≥ 6, or even better. Given that it's possible to reserve some RAM with the crashkernel parameter, which I use for kdump, I suspect it's possible to exploit this or some other kernel mechanism for this task too?

  • Somewhat off topic, but why don't you consider earlyoom? It is configurable to intervene before the kernel OOM killer, and it can prefer killing specific processes and avoid others. It also sends a SIGTERM first in the hope of gracefully terminating the process, or a SIGKILL if the pressure becomes too high. – td211 Oct 23 '23 at 17:25
  • @td211 Because there's no stable set of criteria for what is safe to kill. For example, it's usually okay to kill a greedy tab if that's Element Web or the like, but otherwise I would rather save it and kill something else. Or sometimes I'd prefer not to kill anything at all, and instead freeze some processes and let others exit gracefully (but that no longer makes sense on the new laptop, as it reaches the state where there's almost no free swap much more quickly; on the old one, disk cache issues and thrashing usually arose earlier). – bodqhrohro Oct 23 '23 at 22:20
  • My criteria are to avoid killing the package manager, the DE and the login manager, and to prefer killing browser tabs if necessary. As I said, it sends SIGTERM first, then SIGKILL if the pressure is too high. – td211 Oct 24 '23 at 04:34

2 Answers

1

For your use case, try the mlockall() system call to force a specific process to never be swapped out, thus avoiding the swap-thrashing slowdown.
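
A minimal sketch of the call, not taken from any of the tools mentioned here: the process locks all of its current and future pages at startup, so it stays resident even while everything else is thrashing. This is the mechanism earlyoom's README describes for keeping the daemon itself responsive. Note that the locks are dropped across execve(), so the call has to be made by the process you want to protect, not by a wrapper that exec's it, and it needs CAP_IPC_LOCK or a large enough RLIMIT_MEMLOCK.

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* MCL_CURRENT locks everything already mapped; MCL_FUTURE also locks
     * whatever gets mapped later (heap growth, newly loaded libraries). */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }

    /* ... the process's real work goes here ... */
    return 0;
}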

I would recommend earlyoom with custom rules over this hack.

td211
  • 374
  • Hmm, I had experimented with hakavlad's prelockd back in 2020, which does exactly this, but dropped it for some reason; I don't remember why. Maybe because it locked too much memory for my old laptop (≈0.5 GB with the default configuration), or I considered it too bloated because it's written in Python, or I noticed some unwanted side effects. I dug it out now, and it does seem to help keep critical processes responsive when the zRam swap is completely exhausted, even though VT switching can still freeze. Good enough for now, thanks. – bodqhrohro Oct 24 '23 at 22:19
  • Actually, I got this info from reading the README of earlyoom. It does this so that the daemon would remain responsive under pressure. – td211 Oct 25 '23 at 15:06
-1

You need a swap partition or a contiguous swap file.

One uses swap space to control what happens when programs allocate all the real memory and want more. After all releasable cache has been released (some cached blocks are "in use" and cannot be freed), the system enters the out-of-memory state. In that condition, with swap, some task's memory is written to disk (swapped out) and freed for reuse, then read back into memory (swapped in) later when the task runs. Without swap, the system may freeze, and the dreaded OOM killer (a pseudo-process hard-coded in the kernel) runs and picks a process to kill in order to free memory. The OOM killer is known for inconvenient choices.

System hibernation requires a RAM-sized, contiguous swap area.

Read the man pages for mkswap, fallocate, filefrag, swapon and fstab.

waltinator
  • 4,865
  • Please read the section on fallocate in man swapon. – Stephen Kitt Oct 23 '23 at 19:42
  • It seems like you missed the question completely. I clearly mentioned that 1) I have explicitly tuned overcommit so the automatic OOM killer almost never kicks in; 2) I have zRam partitions, and the freeze happens when they're filled completely (which is the case for disk swap too). Swap postpones the issue, but does not eliminate it. – bodqhrohro Oct 23 '23 at 23:36