
This:

$ seq 100000 | xargs -P0 -n1 -I {} bash -c 'echo {};sleep {}'
:
5514
bash: fork: retry: No child processes

started complaining around 5500 when the system had 11666 processes running. Now, 11666 was really surprising to me given:

$ ulimit -u
313370
$ cat /proc/sys/kernel/pid_max
313370
$ grep hard.*nproc /etc/security/limits.conf
*                hard    nproc           313370
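The values above come from the shell's `ulimit` and from limits.conf; the limit the kernel actually enforces on a given process can be read from `/proc` (and, where util-linux's `prlimit` is installed, queried per PID). A quick check:

```shell
# The nproc limit actually applied to this shell, as the kernel sees it:
grep -i 'max processes' /proc/self/limits

# Equivalent query via util-linux, if available:
prlimit --nproc --pid $$ 2>/dev/null || true
```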

Why can I only run 11600 processes?
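To watch how far it actually gets, a minimal sketch (assuming POSIX `ps` and `id`) that counts the current user's processes; run it from a second terminal while the xargs pipeline is spawning sleeps:

```shell
# Count the processes owned by the current user; repeat (or run
# under watch(1)) to see where the count plateaus.
count_procs() {
    ps -u "$(id -un)" -o pid= | wc -l
}
count_procs
```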

Edit:

Testing as another user I get to 6100 (i.e. 12200 procs), for a total of around 24000 procs. So the limit is not system-wide.

$ uname -a
Linux aspire 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ grep -i tasksmax /etc/systemd/*
/etc/systemd/logind.conf:#UserTasksMax=12288
/etc/systemd/system.conf:#DefaultTasksMax=
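On a systemd system, logind's `UserTasksMax` ends up as the `TasksMax` property of the per-user slice, so a way to check the limit actually in force (the property exists in systemd 227 and later) is:

```shell
# Show the task limit systemd enforces on this user's slice;
# a value of 12288 here would point at UserTasksMax as the culprit.
systemctl show "user-$(id -u).slice" --property=TasksMax
```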

So the 12288 could be the culprit. I changed that to 1000 and did:

sudo systemctl daemon-reexec
sudo systemctl restart systemd-logind

If I now log in as a user I have not logged in as before, the new limit works. But if I log in as a user that has recently been logged in, the limit active at the first login is enforced. So the limit is cached somewhere.

Using the above I tested up to 30000 procs and this works, but only for users that have not logged in before.

So what is caching the limit from /etc/systemd/logind.conf? And how can I flush that cache?

The new limit is well above 60000 procs (and could well be the 313370 I would expect).

Ole Tange

  • Are you running out of memory? Is this an openvz container? – jordanm Apr 16 '18 at 00:45
  • There is not enough information in this question for it to be answerable. For starters, the version of Linux (yes, the kernel) is important. So too is what system and service managers are being used. Different kernels and different system/service managers set limits differently. It is even important what specific version of the system/service manager one is using. Hint: https://unix.stackexchange.com/questions/253903/ – JdeBP Apr 16 '18 at 00:50
  • However, even if there are some gaps in the background information, as far as I understand the given summary you may benefit from "Do changes in limits.conf require ..." and from working with limits.conf and prlimit as mentioned in that thread. – U880D Apr 16 '18 at 12:33
  • @U880D It is clear that the limit is not from limits.conf, but from /etc/systemd/logind.conf. The question is: how do I flush the cache of this, so the limits given in that file are respected when I ssh to localhost as myself. – Ole Tange Apr 16 '18 at 13:05

1 Answer


The system in question runs systemd, which is one thing that uses cgroups to divide system resources among groups of processes.

It is probable that the sysctl kernel.sched_autogroup_enabled = 1 is set. That would be a second thing dividing the system resources using cgroups.
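Whether autogrouping is in play is easy to check; a quick sketch:

```shell
# 1 means the scheduler groups tasks per session (autogrouping);
# the file is absent on kernels built without SCHED_AUTOGROUP.
cat /proc/sys/kernel/sched_autogroup_enabled 2>/dev/null \
    || echo "autogroup not available"
# Same value via sysctl(8):
sysctl kernel.sched_autogroup_enabled 2>/dev/null || true
```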

There's a possibility that once a cgroup or a set of cgroups for a particular user has been initialized, it stays untouched until reboot.
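If that is what is happening, the live value can be inspected, and the running slice can be changed without a reboot. The cgroup path below is an assumption (a cgroup v1 layout with the pids controller mounted, as on Ubuntu systems of that era); `systemctl set-property` modifies the running slice directly:

```shell
# Inspect the pids limit on the live per-user cgroup
# (path is an assumption: v1 hierarchy, pids controller mounted):
cat "/sys/fs/cgroup/pids/user.slice/user-$(id -u).slice/pids.max" \
    2>/dev/null || echo "pids cgroup not found here"

# Override the cached limit on the running slice (needs root):
# sudo systemctl set-property "user-$(id -u).slice" TasksMax=30000
```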

I don't have a way to determine whether it is because of systemd or autogrouping, or whether it is due to a process-number limit or a memory limit (inside a cgroup), nor the time to hunt through the source code. I wanted to comment instead of answering, but I don't have enough reputation.

Jake F