Why cannot we kill a zombie?

Question

I am editing this question because it was marked as a duplicate of another question about how to kill a zombie process.

I am not looking for an answer to how to kill a zombie process. I do not have zombies on my system and I am aware of how zombies are created.

Let me try to re-phrase the question. Presently, these are the accepted methods of removing a zombie:

Sending SIGCHLD to the parent. Works theoretically, not always in practice since one of the reasons the zombie was created in the first place could be because the parent wasn't responding properly to SIGCHLD.
Killing the parent process.
Fixing the bug in the program that created the zombie.
Rebooting.
Or as mentioned here
Or for this reason mentioned by @richard in the comments to my question before this edit:

..to prevent the pid being reused. The parent has the pid of the child and may signal the child (may try to kill it), just from the pid it received when it created the child. It would be bad for the pid to be reused. Therefore the child remains in a zombie state until the parent acknowledges the death of the child, or the parent dies.

Now my questions are:

Why is there no straightforward, direct method to clean up a zombie or zombies?
What would be the side effects or consequences if a zombie could be killed with a signal?
What is stopping the *NIX maintainers from creating a signal or a command (my apologies if the phrase 'creating a new signal' is not technically acceptable) that cleans up a zombie?

Pavel Šimerda · Answer 1 · 2015-01-04T20:26:43.843

3

The process is already dead at that point, so it doesn't make sense to kill it again. It is still recorded in the process table to allow the parent to pick up its exit status.

Note that every process becomes a zombie when it terminates, whether it exits on its own or is killed. You just don't see them because most parent processes clean up their children very quickly. You might want to file a bug report if a program doesn't clean up its zombies and they only get cleaned up by init once the parent process exits.
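To make the lifecycle above concrete, here is a minimal sketch. It assumes Linux with /proc mounted and Python 3; `proc_state` and `observe_zombie` are names made up for this illustration, not part of any library.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state is the first field after the closing parenthesis of the command name.
    return data.rsplit(")", 1)[1].split()[0]


def observe_zombie():
    """Fork a child that exits at once; report its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)  # the entry is still in the process table
    os.waitpid(pid, 0)        # reap: the kernel can now drop the entry
    after = proc_state(pid)
    return before, after
```

On a typical Linux system `observe_zombie()` should return `("Z", None)`: the child shows up as a zombie until the parent waits for it, and disappears afterwards.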

Your additional question is why SIGKILL doesn't remove it from the process table. You should first tell us why it should. I'm not aware of any single reason to let the user remove a defunct process from the process table. In operating system design, you should always have a good reason to do something before you ask why not to do it. You're asking for a feature with no actual use case.

Apart from the lack of purpose, it would have bad consequences. The parent calls wait() and/or waitpid() to learn the status of the children that exited. The result of the call is consistent whether it is called before the process becomes defunct, before you issue the SIGKILL, or after that.
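The consistency described above can be checked with a small experiment. This is only a sketch, assuming Linux and Python 3; `reap_after_kill` is a hypothetical helper name. It sends SIGKILL to a child that has already exited and then collects the status.

```python
import os
import signal


def reap_after_kill(exit_code=7):
    """Send SIGKILL to an already-exited child, then return the exit code the parent sees."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit with a recognizable code
    # Wait until the child is dead but keep it as a zombie (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    # The signal is accepted, but there is nothing left to kill.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None  # would mean the child was reported as killed by a signal
```

The parent still receives the child's original exit code; the late SIGKILL does not change what `waitpid()` reports.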

If the kernel didn't keep the records (i.e. if the process were removed from the process table), the behavior would have to be inconsistent, and the parent would have to expect the inconsistency and cope with it for no valid reason. On systems that reuse old pid values, not keeping the records could result in much more severe problems, such as a program killing an entirely different process by accident.
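The pid point can be illustrated too. This is a rough sketch, assuming a Unix system with `os.waitid` and Python 3; `pid_exists` and `pid_stays_reserved` are invented names. Signal 0 delivers nothing and only checks whether the pid currently refers to a process.

```python
import os


def pid_exists(pid):
    """Probe a pid with signal 0, which checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def pid_stays_reserved():
    """Check that a zombie keeps its pid until the parent reaps it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately
    # Wait for the child to exit without reaping it (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    while_zombie = pid_exists(pid)  # the pid still names the dead child
    os.waitpid(pid, 0)              # reap: the pid is now free for reuse
    # Could be True again if the pid were reused right away,
    # which is exactly the hazard discussed above.
    after_reap = pid_exists(pid)
    return while_zombie, after_reap
```

As long as the parent has not waited for the child, any signal it sends to that pid can only reach the dead child. Once the child is reaped, the same number may belong to an unrelated process.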

See also: http://en.wikipedia.org/wiki/Zombie_process

edited Jan 04 '15 at 20:26

answered Jan 04 '15 at 09:15

Pavel Šimerda

6,544

Yes. I am aware of that. My question is why wouldn't the SIGKILL remove the zombie from the process table? – Sreeraj Jan 04 '15 at 09:29
1

I think now it is pretty clear. – Pavel Šimerda Jan 04 '15 at 09:49
1

and to prevent the pid being reused. The parent has the pid of the child and may signal the child (may try to kill it), just from the pid it received when it created the child. It would be bad for the pid to be reused. Therefore the child remains in a zombie state until the parent acknowledges the death of the child, or the parent dies. – ctrl-alt-delor Jan 04 '15 at 11:54
1

@sree Because it is already dead: When a process dies it becomes a zombie (almost all resources are freed, except pid and exit status). This is true of all processes, however they usually don't stay in this state for long. If they no longer have a parent, then they are adopted by init (pid=1). Then when the parent (or init) acknowledges the death of the child, the process is reaped and is no longer in the process table. – ctrl-alt-delor Jan 04 '15 at 12:00
@richard Incorporated info from your comments, feel free to make another edit. I didn't include the pid part though, as I don't think that applies. You can easily avoid it by only assigning incrementally, which I think is the case already. – Pavel Šimerda Jan 04 '15 at 12:41
On a system that creates a lot of processes, the next pid will wrap around (PIDs will be reused). Imagine a parent process with a window and a red stop button. It starts several child processes. The user presses the red button, so the parent sends a kill to all children (based on a list of all pids). Halfway through this very long list, one of the children finishes (this is ok, you cannot kill something that is already dead), but then another process starts and gets the same pid. This could easily happen on a system where a lot of processes are being created, or where the parent is slow. – ctrl-alt-delor Jan 04 '15 at 19:58
Does it happen in practice, or is it just a theoretical problem? Also isn't it much slower to allocate old pid values than to allocate unused ones incrementally? – Pavel Šimerda Jan 04 '15 at 20:24
