I have a systemd user service that can take many minutes to shut down properly. It works with a very large HDF5 database (several GB), and if the process does not stop cleanly, the database is corrupt afterwards.
I've found many threads here, like "How to change systemd service timeout value?", but sadly they didn't help me at all: I still cannot increase the timeout.
Because I had a little trouble with the timeouts, I wrote this example:
#!/bin/bash
RUNNING=true
VAR_RUN=${HOME}/.run
PIDFILE=${VAR_RUN}/mondas_ctrl_launcher.pid
LOG_DIR=${HOME}/logs
LOG_FILE=${LOG_DIR}/mondas_ctrl_launcher.log
MONDAS_BASE=${HOME}/src/mondas

# signal handler: tells the sleep loop in "start" to exit
on_sigint()
{
    RUNNING=false
}

# print a timestamped message and append it to the log file
log()
{
    local now message
    now=$(date)
    message="${now}: ${*}"
    echo "${message}"
    echo "${message}" >> "${LOG_FILE}"
}

# taken from
# https://blog.dhampir.no/content/sleeping-without-a-subprocess-in-bash-and-how-to-sleep-forever
# Execute this with BASH as it uses bash extensions
snore()
{
    local IFS
    [[ -n "${_snore_fd:-}" ]] || { exec {_snore_fd}<> <(:); } 2>/dev/null ||
    {
        # workaround for MacOS and similar systems
        local fifo
        fifo=$(mktemp -u)
        mkfifo -m 700 "$fifo"
        exec {_snore_fd}<>"$fifo"
        rm "$fifo"
    }
    read ${1:+-t "$1"} -u $_snore_fd || :
}

_mondas_ctrl()
{
    # my true program that starts a lot of
    # processes in a tmux session
    # mondas_ctrl "${@}" >> "${LOG_FILE}" 2>&1
    # doing just "true" for testing purposes
    true
}

mkdir -p "${VAR_RUN}"
mkdir -p "${LOG_DIR}"

case "${1}" in
    start)
        log "Starting mondas PWD: $(pwd)"
        _mondas_ctrl start
        log "mondas_ctrl start executed"
        echo "${$}" > "${PIDFILE}"
        # SIGINT
        trap on_sigint 2
        while ${RUNNING} ; do
            snore 1
        done
        log "Exiting sleep loop"
        ;;
    stop)
        log "Stopping mondas"
        _mondas_ctrl stop
        # simulating long shutdown
        snore 275
        log "mondas_ctrl stop executed"
        if test -f "${PIDFILE}" ; then
            # send SIGINT to the process running "start"
            kill -2 "$(cat "${PIDFILE}")"
            rm -f "${PIDFILE}"
        fi
        ;;
    *)
        echo "usage: $0 start|stop" >&2
        exit 1
        ;;
esac
and my systemd user service:
[Unit]
Description=Mondas
Wants=network.target
After=network.target
[Service]
Type=simple
RemainAfterExit=no
ExecStart=%h/bin/mondas_ctrl_launcher start
ExecStop=%h/bin/mondas_ctrl_launcher stop
TimeoutStartSec=120
TimeoutStopSec=500
Restart=always
RestartSec=1
[Install]
WantedBy=default.target
So I enabled and started it, and checked the timeout setting of the daemon:
$ systemctl --user enable mondas2.service
$ systemctl --user start mondas2.service
$ systemctl --user show mondas2.service -p TimeoutStopUSec
TimeoutStopUSec=8min 20s
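As a sanity check, the time span that systemctl prints can be converted back to seconds to confirm that it matches the 500 seconds from the unit file. The following is only a minimal sketch: span_to_seconds is a hypothetical helper name, and it handles only the h, min and s units, not the full systemd time span syntax.

```shell
#!/bin/bash
# Convert a systemd time span such as "8min 20s" to whole seconds.
# Hypothetical helper; only h, min and s are supported in this sketch.
span_to_seconds()
{
    local total=0 token value
    for token in ${1}; do              # unquoted on purpose: split on whitespace
        value=${token%%[a-z]*}         # keep the leading digits only
        if ! [[ ${value} =~ ^[0-9]+$ ]]; then
            echo "cannot parse '${token}'" >&2
            return 1
        fi
        case "${token}" in
            *ms|*us) echo "unsupported unit in '${token}'" >&2; return 1 ;;
            *h)      total=$(( total + 10#${value} * 3600 )) ;;
            *min)    total=$(( total + 10#${value} * 60 )) ;;
            *s)      total=$(( total + 10#${value} )) ;;
            *)       echo "unsupported unit in '${token}'" >&2; return 1 ;;
        esac
    done
    echo "${total}"
}
```

Calling span_to_seconds "8min 20s" gives 500, so the service picked up TimeoutStopSec=500. The input could also come straight from systemctl --user show mondas2.service -p TimeoutStopUSec --value.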
However, if I execute reboot as root, on the console I can see
[***] A stop job is running for User Manager for UID 1000 (20s / 2min)
and after 90 seconds systemd just kills the process. The log file mondas_ctrl_launcher.log is missing the "mondas_ctrl stop executed" log entry.
I even changed /etc/systemd/system.conf and set
DefaultTimeoutStartSec=300s
DefaultTimeoutStopSec=300s
but when I execute reboot, the console still displays a maximum timeout of 2 minutes, and after 90 seconds the process is just killed. No matter what I do, I cannot change this behaviour.
What am I doing wrong? Or did I just misinterpret the meaning of TimeoutStopSec? Or could it be that the TimeoutStopSec value in the service file does not affect the real timeout when doing a reboot or poweroff, and only applies when stopping the service manually via systemctl --user stop? If so, how can I increase the reboot/poweroff timeout?
I'm testing this on a current Debian 10.5 installation.
user.slice which spawns user-1000.slice from this file: /lib/systemd/system/user@.service. – Pablo Sep 08 '20 at 17:01
systemctl edit user@1000 created the file /etc/systemd/system/user@1000.service.d/override.conf, where I put TimeoutStopSec=500s, thus increasing the timeout. – Pablo Sep 08 '20 at 17:17
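For reference, the drop-in described in the last comment can be sketched as a small script. This is an illustration under assumptions, not the exact file from the comment: write_user_manager_override is a hypothetical helper name, and the [Service] header is assumed, since the comment only quotes the TimeoutStopSec=500s line (the directive belongs to the [Service] section).

```shell
#!/bin/bash
# Write a drop-in that raises the stop timeout of the per-user manager.
# Hypothetical helper: $1 is the drop-in directory, $2 the timeout value.
write_user_manager_override()
{
    local dropin_dir=${1}
    local timeout=${2}
    mkdir -p "${dropin_dir}"
    # The [Service] header is an assumption; the comment only shows the
    # TimeoutStopSec line itself.
    printf '[Service]\nTimeoutStopSec=%s\n' "${timeout}" \
        > "${dropin_dir}/override.conf"
}

# Example (as root), using the path named in the comment:
#   write_user_manager_override /etc/systemd/system/user@1000.service.d 500s
#   systemctl daemon-reload
#   systemctl show user@1000.service -p TimeoutStopUSec
```

Passing the directory as an argument makes it possible to try the function against a scratch directory first. Unlike systemctl edit, writing the file by hand needs the explicit systemctl daemon-reload afterwards.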