Related to the question: What if 'kill -9' does not work?

I have the following situation: a zombie process with threads that is not being collected by init:

[root@Arch64]# ps auxH | grep java
gwpl       569  0.0  0.0      0     0 ?        Zl   04:23   0:00 [java] <defunct>
gwpl       569  5.5 49.0 1466648 375572 ?      Rl   07:25  23:55 [java] <defunct>
gwpl       569 16.0 49.0 1466648 375572 ?      Rl   12:27  20:54 [java] <defunct>
gwpl       569 17.9 49.0 1466648 375572 ?      Rl   12:47  19:48 [java] <defunct>
root     10466  0.0  0.0   6740   628 pts/0    S+   14:38   0:00 grep java
[root@Arch64]# pstree -s 569
init---java---3*[{java}]

Can I do anything about that?

Or is it an init bug, as suggested in a comment on https://unix.stackexchange.com/a/11173/9689 ?

If it's a bug, what should I dump to help get it fixed?


The listing above uses the following status codes: Zl, Rl, S+. Here is a cheat sheet from man ps to decode them:

PROCESS STATE CODES
       (...)
       R    Running or runnable (on run queue)
       S    Interruptible sleep (waiting for an event to complete)
       (...)
       Z    Defunct ("zombie") process, terminated but not reaped by its parent.

       For BSD formats and when the stat keyword is used, additional characters may be displayed:
       (...)
       l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
       (...)
       +    is in the foreground process group

2 Answers

1

This is probably too late to do any good, but I wonder whether process 569 is entirely a zombie. Maybe this is what you get on this OS (which OS is it?) when the initial thread has terminated but other threads are still running. If that's the case, kill(1) should still be effective. If it isn't, the next thing I would try is tgkill(2), or your OS's equivalent, on the threads listed in R state (this will require you to write some C, as there probably aren't canned shell utilities that invoke those system calls).
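For anyone who would rather not compile anything, the same system call can probably be reached from Python through ctypes. The following is only a rough sketch under several assumptions: a Linux system with /proc mounted, an x86_64 or aarch64 machine (the tgkill syscall number is architecture-specific and is hard-coded here), and enough privileges to signal the target. The helper names thread_states and tgkill are made up for illustration.

```python
import ctypes
import os
import platform
import signal

# tgkill(2) syscall numbers are architecture-specific.
# Assumption: only x86_64 and aarch64 are covered by this sketch.
_SYS_TGKILL = {"x86_64": 234, "aarch64": 131}

# Handle to the C library already loaded into the interpreter.
_libc = ctypes.CDLL(None, use_errno=True)


def thread_states(pid):
    """Return {tid: state letter} for every thread of pid, read from /proc."""
    states = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            stat = f.read()
        # The command name sits in parentheses and may contain spaces,
        # so the state is the first field after the last ')'.
        states[int(tid)] = stat[stat.rfind(")") + 1:].split()[0]
    return states


def tgkill(tgid, tid, sig):
    """Send sig to thread tid of thread group tgid via the raw syscall."""
    nr = _SYS_TGKILL[platform.machine()]
    rc = _libc.syscall(ctypes.c_long(nr), ctypes.c_int(tgid),
                       ctypes.c_int(tid), ctypes.c_int(int(sig)))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


# Hypothetical usage with the PID from the question (left commented out
# on purpose, since it sends SIGKILL):
# for tid, state in thread_states(569).items():
#     if state == "R":
#         tgkill(569, tid, signal.SIGKILL)
```

Keep in mind that a fatal signal such as SIGKILL acts on the whole process no matter which thread it is addressed to, so this is mostly a way of aiming at the threads that are still alive when the leader is already defunct. Sending signal 0 first is a harmless way to check that a given thread ID can be signalled at all.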

Also, attempting to attach strace or gdb to the process may reveal something useful for diagnosis.

A bug in init's reaping of zombies is extraordinarily unlikely; if this is a bug, I would say it is more likely to be a kernel bug in which multithreaded processes are (under some conditions) not properly disposed of.

zwol
  • At least in Linux, threads are just kernel tasks, just like regular one-thread processes. Sure, there might be a glitch in handling them, but it is extremely unlikely. Zombies are the result of parents not handling the exit of their children; haven't ever seen a different reason. – vonbrand Mar 30 '14 at 01:20
1

Init always waits, goes the old saying. So a bug in the kernel is rather unlikely. What is the ppid of process 569? ps -alef would give you that, or, if you are of the BSD persuasion, ps axo stat,pid,ppid,comm. If the parent is not init, 569 is probably an ordinary zombie. And as zombies are only dangerous when there are a lot of them, you can just let this one stand in the corner and have the next reboot dispose of it.
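On Linux the parent can also be looked up without ps, by reading /proc/<pid>/status. A minimal sketch, assuming a Linux system with /proc mounted; the helper name proc_status is made up for illustration.

```python
import os


def proc_status(pid):
    """Parse /proc/<pid>/status into a dict of field name -> value string."""
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            # Each line looks like "PPid:\t1"; split on the first colon.
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


# Hypothetical usage with the PID from the question:
# st = proc_status(569)
# print(st["State"], st["PPid"], st["Threads"])
```

A PPid of 1 means the parent is init. The State field shows the same letter as the STAT column of ps, and Threads gives the number of threads in the process.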

wallenborn