
I've created a systemd service that uses ssh to call home to a central server.

[Unit]
Description=Create a tunnel in the cloud back to SSH on this machine
After=network-online.target

[Service]
User=mindhive
ExecStart=/usr/bin/ssh -o ServerAliveInterval=20 -o ServerAliveCountMax=3 -o ExitOnForwardFailure=yes -o StrictHostKeyChecking=no -i /home/mindhive/.ssh/tunnel.id_rsa -N -T -R1822:localhost:22 tunnel@***server-hidden***
RestartSec=60
Restart=always

[Install]
WantedBy=multi-user.target

I use this on many of our servers, all running Ubuntu 16.04. Suddenly it has stopped working on one of them. The logs (below) show that the problem is access to .ssh in the home directory. Because the service is set to Restart=always and RestartSec=60, after a reboot it retries every minute and fails each time. However, if I manually run sudo systemctl restart ssh-tunnel.service, it starts without any problem.

The journalctl output below shows it failing and retrying after the reboot; when I restarted it manually at 14:54:10, it started fine.

From Googling, I've tried adding both WorkingDirectory=~ and ProtectHome=off to the service, but that made no difference.

Why can't ssh access the user's homedir when started by systemd after a reboot, but it can if started manually by systemctl?

-- Reboot --
Feb 24 14:50:12 ***servername*** systemd[1]: Started Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:50:12 ***servername*** ssh[1252]: Warning: Identity file /home/mindhive/.ssh/tunnel.id_rsa not accessible: No such file or directory.
Feb 24 14:50:13 ***servername*** ssh[1252]: Could not create directory '/home/mindhive/.ssh'.
Feb 24 14:50:14 ***servername*** ssh[1252]: Failed to add the host to the list of known hosts (/home/mindhive/.ssh/known_hosts).
Feb 24 14:50:15 ***servername*** systemd[1]: ssh-tunnel.service: Main process exited, code=exited, status=255/n/a
Feb 24 14:50:15 ***servername*** systemd[1]: ssh-tunnel.service: Unit entered failed state.
Feb 24 14:50:15 ***servername*** systemd[1]: ssh-tunnel.service: Failed with result 'exit-code'.
Feb 24 14:51:15 ***servername*** systemd[1]: ssh-tunnel.service: Service hold-off time over, scheduling restart.
Feb 24 14:51:15 ***servername*** systemd[1]: Stopped Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:51:15 ***servername*** systemd[1]: Started Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:51:15 ***servername*** ssh[1367]: Warning: Identity file /home/mindhive/.ssh/tunnel.id_rsa not accessible: No such file or directory.
Feb 24 14:51:16 ***servername*** ssh[1367]: Could not create directory '/home/mindhive/.ssh'.
Feb 24 14:51:17 ***servername*** ssh[1367]: Failed to add the host to the list of known hosts (/home/mindhive/.ssh/known_hosts).
Feb 24 14:51:18 ***servername*** systemd[1]: ssh-tunnel.service: Main process exited, code=exited, status=255/n/a
Feb 24 14:51:18 ***servername*** systemd[1]: ssh-tunnel.service: Unit entered failed state.
Feb 24 14:51:18 ***servername*** systemd[1]: ssh-tunnel.service: Failed with result 'exit-code'.
Feb 24 14:52:18 ***servername*** systemd[1]: ssh-tunnel.service: Service hold-off time over, scheduling restart.
Feb 24 14:52:18 ***servername*** systemd[1]: Stopped Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:52:18 ***servername*** systemd[1]: Started Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:52:18 ***servername*** ssh[1370]: Warning: Identity file /home/mindhive/.ssh/tunnel.id_rsa not accessible: No such file or directory.
Feb 24 14:52:19 ***servername*** ssh[1370]: Could not create directory '/home/mindhive/.ssh'.
Feb 24 14:52:20 ***servername*** ssh[1370]: Failed to add the host to the list of known hosts (/home/mindhive/.ssh/known_hosts).
Feb 24 14:52:20 ***servername*** systemd[1]: ssh-tunnel.service: Main process exited, code=exited, status=255/n/a
Feb 24 14:52:20 ***servername*** systemd[1]: ssh-tunnel.service: Unit entered failed state.
Feb 24 14:52:20 ***servername*** systemd[1]: ssh-tunnel.service: Failed with result 'exit-code'.
Feb 24 14:53:21 ***servername*** systemd[1]: ssh-tunnel.service: Service hold-off time over, scheduling restart.
Feb 24 14:53:21 ***servername*** systemd[1]: Stopped Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:53:21 ***servername*** systemd[1]: Started Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:53:21 ***servername*** ssh[1393]: Warning: Identity file /home/mindhive/.ssh/tunnel.id_rsa not accessible: No such file or directory.
Feb 24 14:53:21 ***servername*** ssh[1393]: Could not create directory '/home/mindhive/.ssh'.
Feb 24 14:53:22 ***servername*** ssh[1393]: Failed to add the host to the list of known hosts (/home/mindhive/.ssh/known_hosts).
Feb 24 14:53:23 ***servername*** systemd[1]: ssh-tunnel.service: Main process exited, code=exited, status=255/n/a
Feb 24 14:53:23 ***servername*** systemd[1]: ssh-tunnel.service: Unit entered failed state.
Feb 24 14:53:23 ***servername*** systemd[1]: ssh-tunnel.service: Failed with result 'exit-code'.
Feb 24 14:54:10 ***servername*** systemd[1]: Stopped Create a tunnel in the cloud back to SSH on this machine.
Feb 24 14:54:10 ***servername*** systemd[1]: Started Create a tunnel in the cloud back to SSH on this machine.
  • Try it with Requires=home-mindhive.mount and After=home-mindhive.mount. – jasonwryan Feb 24 '18 at 03:48
  • @jasonwryan Added those under [unit] and no difference after a reboot. I had tried WorkingDirectory=~ before because I thought that implicitly added these dependencies. – Damon Maria Feb 24 '18 at 04:01
  • And is the service file at /etc/systemd/system/home-mindhive.mount.wants/whatever.service? – jasonwryan Feb 24 '18 at 06:07
  • No, there's no /etc/systemd/system/home-mindhive.mount.wants (or anything similar, just the standard system ones). I have both the home-mindhive.mount clauses from above and WorkingDirectory=~. systemctl status doesn't report anything out of the ordinary. Also, I'm guessing it's not an issue with starting too early before a requirement is met, since systemd retries starting it again and again and it continuously fails. – Damon Maria Feb 24 '18 at 09:13
  • @jasonwryan Figured it out. Answer below. Trying what you suggested did unearth the right evidence. – Damon Maria Mar 19 '18 at 10:12

1 Answer

OK, I managed to figure it out myself... eventually. Ubuntu had been installed with an encrypted home directory on this server, so the files in ~/.ssh weren't accessible to the service until I logged in (which decrypts and mounts the home directory) and restarted it. That explains why the automatic restarts after a reboot kept failing while a manual restart worked.
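
For anyone wanting to check for the same situation, here is a minimal sketch. It assumes the encryption is eCryptfs (which is what the "Encrypt my home folder" installer option sets up on Ubuntu 16.04; other encryption setups will not match) and that a decrypted home directory shows up in /proc/mounts with filesystem type ecryptfs. The function name is_ecryptfs_mounted is made up for illustration.

```shell
#!/bin/sh
# Sketch only: report whether a directory is currently an eCryptfs mount point.
# Assumption: the home directory is encrypted with eCryptfs.
# Arguments: $1 = directory to check
#            $2 = mounts table to read (optional, defaults to /proc/mounts)
# Exit status: 0 if the directory is mounted as ecryptfs, non-zero otherwise.
is_ecryptfs_mounted() {
    dir=$1
    mounts=${2:-/proc/mounts}
    # In the mounts table, field 2 is the mount point and field 3 the filesystem type.
    awk -v dir="$dir" '
        $2 == dir && $3 == "ecryptfs" { found = 1 }
        END { exit !found }
    ' "$mounts"
}

# Example use (run it before and after logging in as the user):
#   if is_ecryptfs_mounted /home/mindhive; then
#       echo "home is decrypted and mounted"
#   else
#       echo "home is not an active eCryptfs mount"
#   fi
```

If the check fails right after a reboot and succeeds once the user has logged in, the key in ~/.ssh is only visible while the home directory is mounted, which matches the behaviour in the logs above.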

Removing the encryption was not easy. I found these instructions best.