I'm currently running Debian testing, which as of right now is buster. Here is the scenario:

I have two services in particular, lighttpd and chrony, that fail to start normally after booting. The weird part is that they start normally only if I log in through the virtual console on tty1 as root. Logging in as a non-root user sometimes works as well, but it's inconsistent. Unless I do this, lighttpd does not complete its startup procedure. When I check the systemd service, it shows only 2 forked php-cgi processes rather than the usual 5-6, and although the status reports the service as active and started, I'm unable to reach the webserver. chrony, for its part, times out and fails.

I can SSH into the machine without any issues, but some systemd-related commands also hang. For example, if I attempt to disable or restart either service, the command hangs or times out. Once I log in through tty1, it resumes. I only need to log in once, and then I can log out. From there everything works as normal, even through SSH. The issue occurs only after booting. If I log in through tty1 as soon as it is available after booting, both services start fine.

As far as troubleshooting goes, I uninstalled lighttpd, php* and chrony, along with any packages that depended on them, using apt purge, made sure any leftover files were removed, and then reinstalled them. For lighttpd and chrony, I also added overrides to the systemd unit files so that After= and Wants= point to network-online.target instead of network.target, but that had no effect.

I've also booted up with both services uninstalled. If I try to install chrony again through SSH, it will hang at the point where systemd creates the symlinks to enable the systemd unit files. When I log in through tty1 while it's at this point, it will continue and complete the setup. So it seems to me there's some sort of issue with the boot procedure systemd is going through that's causing this, as if the boot process somehow doesn't completely finish.

I've looked at some logs, but I wasn't able to find any information that would point me in the right direction in order to resolve this.

1 Answer

After digging through the logs some more, I saw the message kernel: random: crng init done come up, and the services would start shortly after that. It turns out to be a bug in the Linux kernel that was introduced in 4.16, according to this answer to a question regarding the same message.
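To see how long after boot that message appears, a small script can pull its timestamp out of the kernel log. This is only a sketch: it assumes the log is in dmesg's default format, where each line starts with seconds since boot in square brackets, and the function names are made up for illustration.

```python
import re
import subprocess

# Assumes dmesg-style lines: "[<seconds since boot>] random: crng init done"
CRNG_DONE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+random: crng init done", re.MULTILINE
)


def crng_init_seconds(log_text):
    """Return seconds since boot when the CRNG finished initialising, or None."""
    match = CRNG_DONE.search(log_text)
    return float(match.group(1)) if match else None


def read_kernel_log():
    """Return the kernel ring buffer as text (may require root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling crng_init_seconds(read_kernel_log()) returns None if the message has not been logged yet. A value far larger than the usual boot time would be consistent with services stalling while they wait for randomness.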

Given that urandom was being blocked, even though it shouldn't be, logging in through the virtual console seems to have been enough to gather additional entropy and let the services continue starting.
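One way to confirm this from an SSH session while the services are still stuck is to ask the kernel directly. The following is a rough sketch, assuming Linux and Python 3.6 or later; the helper names are hypothetical, and /proc/sys/kernel/random/entropy_avail is the kernel's own entropy estimate.

```python
import os

ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate as an integer."""
    with open(path) as f:
        return int(f.read().strip())


def crng_ready():
    """Return True if the kernel's random pool has been initialised.

    With GRND_NONBLOCK, os.getrandom() raises BlockingIOError instead of
    waiting when the pool is not yet initialised.
    """
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False
```

If crng_ready() returns False long after boot and only flips to True once someone types on the console, that matches the behaviour described here.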

I installed haveged and the problem cleared up.

slm
  • 369,824