I have a process which has several times now stopped responding and appears to lock up completely. It doesn't respond to any attempt to attach strace or to inspect it with gdb (gdb just hangs on a wait4() syscall). The process is runnable, and is neither waiting on a syscall (/proc/X/syscall: running) nor in uninterruptible sleep (/proc/X/status: State: R (running)).
What state is this process in exactly? Is this possibly a kernel bug of some type?
The process is redis, and this has happened a few times now. The only thing that can kill the process is a reboot, it seems. The OS is CentOS 7.
Edit: Kernel version is 3.10.0-123.13.2.el7.x86_64. Trying an update to 3.10.0-229.11.1.el7 to see if that makes any difference.
dmesg output? – Ho1 Aug 07 '15 at 21:14
dmesg might also show if the "hung task detector" triggers (if that's enabled, it's supposed to show when tasks are stuck inside the kernel for too long). – sourcejedi Aug 07 '15 at 21:15
/proc/X/syscall is supposed to show while a page fault is being serviced (e.g. reading pages of a file through mmap() memory). – sourcejedi Aug 07 '15 at 21:17
dmesg contains nothing related to this task, and nothing out of the ordinary. No 'hung task' detection. – alienth Aug 08 '15 at 00:24
What do /proc/<pid>/stack (and /proc/<pid>/task/*/stack) contain? Has that process got several threads? – Stéphane Chazelas Aug 09 '15 at 20:49
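For anyone wanting to gather the files mentioned above in one go, here is a rough Python sketch that reads the State: field, syscall, and stack for every thread under /proc/<pid>/task. The names snapshot, process_state, read_proc_file, and the proc_root parameter are made up for illustration. It assumes a Linux /proc layout; reading the stack files usually requires root and a kernel built with stack trace support, so an unreadable file is reported rather than treated as an error.

```python
import os
import sys


def read_proc_file(path):
    """Return the stripped contents of a file, or a note saying why it was unreadable."""
    try:
        with open(path, "r") as f:
            return f.read().strip()
    except OSError as exc:
        return "<unreadable: %s>" % exc.strerror


def process_state(status_text):
    """Extract the value of the 'State:' line from the text of a status file."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None


def snapshot(pid, proc_root="/proc"):
    """Collect state, syscall, and kernel stack for each thread of pid.

    proc_root is a hypothetical parameter so the function can be pointed
    at a copy of the files; on a live system leave it as /proc.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    try:
        tids = sorted(os.listdir(task_dir), key=int)
    except OSError:
        tids = []  # no such process, or no permission to list its threads
    report = {}
    for tid in tids:
        base = os.path.join(task_dir, tid)
        report[tid] = {
            "state": process_state(read_proc_file(os.path.join(base, "status"))),
            "syscall": read_proc_file(os.path.join(base, "syscall")),
            "stack": read_proc_file(os.path.join(base, "stack")),
        }
    return report


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: run as root with the PID of the stuck process as the only argument.
    for tid, info in snapshot(int(sys.argv[1])).items():
        print("thread %s: state=%s syscall=%s" % (tid, info["state"], info["syscall"]))
        print(info["stack"])
```

Running it a few times in a row and comparing the per-thread stacks might show whether a thread is spinning in the same kernel function each time, which is the kind of detail the last comment is asking about.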