17

I have a Java process running on a RedHat Linux instance.

The problem is that it keeps reappearing after I kill it, and I am not sure where to look. I've already checked crontab, but no luck.

I've looked at the PPID, but it points to init (1).

Any idea how I can find out the source?

Jose
  • 1
    Can you give us something to go on? Does the process write to any files for example? Can you show us the output of ps xf showing the process tree? As it stands, we have very little to go on. – terdon Oct 22 '14 at 18:55
  • You said you checked crontab... Have you also checked at jobs (e.g. with atq) to see if one of those is responsible? – YoMismo Oct 23 '14 at 06:56
  • Can you tell us what java software you are actually running. I have seen tools like Cassandra, that actually have a built-in watchdog in certain setups that just fires up another instance of the database once the first instance failed (was not gracefully stopped). – Matthias Steinbauer Oct 23 '14 at 08:54

4 Answers

15

There are a number of possibilities (some mentioned in other answers):

  1. A system or user cronjob executing often,
  2. In SysV init, an /etc/inittab entry for the service with the respawn directive,
  3. In systemd, a unit file with the Restart option set to a value other than no,
  4. In Upstart, a service configuration file with the respawn directive,
  5. A process monitoring tool such as monit, or
  6. An ad-hoc watchdog process for that particular service.
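
Items 2 to 4 can be checked with a quick search of the usual configuration locations. The following is a minimal sketch rather than a definitive check: find_respawn_config and its optional root argument are names made up for illustration, and the directories listed are common defaults that may differ on your system.

```shell
# Hypothetical helper: list respawn-style directives under a filesystem root.
# $1 is an optional root prefix (default: the real filesystem), mainly useful for testing.
find_respawn_config() {
    root="${1:-}"
    # SysV init: /etc/inittab lines of the form id:runlevels:respawn:command
    if [ -f "$root/etc/inittab" ]; then
        grep -HnE '^[^#:]*:[^:]*:respawn:' "$root/etc/inittab"
    fi
    # Upstart: job files in /etc/init containing a "respawn" stanza
    if [ -d "$root/etc/init" ]; then
        grep -rHnE '^[[:space:]]*respawn' "$root/etc/init"
    fi
    # systemd: unit files whose Restart= value is anything other than "no"
    for dir in "$root/etc/systemd/system" "$root/usr/lib/systemd/system"; do
        if [ -d "$dir" ]; then
            grep -rHnE '^[[:space:]]*Restart=' "$dir" | grep -vE 'Restart=no[[:space:]]*$'
        fi
    done
    return 0
}

# Usage: run find_respawn_config, then open the files it lists and look for the Java command.
```

Each output line has the form file:line:text, so the matching file can be inspected directly.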

An interesting new (Linux-only) tool that could provide more insight into where the process is being started is sysdig.

Sysdig uses the Linux kernel's tracepoint features to provide what amounts to a fast, system-wide strace.

For example, if I want to see every process starting ls, I can issue:

sudo sysdig evt.type=execve and evt.arg.exe=ls

When ls is run somewhere, I will get a message like this:

245490 16:53:54.090856066 3 ls (10053) < execve res=0 exe=ls args=--color=auto. tid=10053(ls) pid=10053(ls) ptid=9204(bash) cwd=/home/steved fdlimit=1024 pgft_maj=0 pgft_min=37 vm_size=412 vm_rss=4 vm_swap=0 env=...

I truncated the environment information returned, but as you can see, the ptid field gives the name and PID of the program calling execve. execve is the system call Linux uses to execute new commands (all other exec calls are just front ends to execve).
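
To apply this to the respawning process, the same filter can presumably be pointed at java instead of ls, and the parent read from the ptid field. The sketch below assumes output lines shaped like the example above; parent_from_sysdig is a hypothetical helper name, and the evt.arg.exe=java filter is an untested adaptation of the ls example.

```shell
# Hypothetical helper: read sysdig execve lines on stdin and print
# "parent-pid parent-name", taken from the ptid=PID(NAME) field.
parent_from_sysdig() {
    sed -n 's/.* ptid=\([0-9][0-9]*\)(\([^)]*\)).*/\1 \2/p'
}

# Assumed usage, adapted from the ls example (not verified on every sysdig version):
#   sudo sysdig evt.type=execve and evt.arg.exe=java | parent_from_sysdig
```

Lines without a ptid field are silently dropped, so only the launching processes are printed.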

Steven D
  • 2
    sysdig is great advice! BTW, it's now available for Windows (and Mac, I think) with limited functionality. – Neowizard Aug 09 '15 at 18:22
  • How does monit help here? I started reading through the manual but it looks like an alternative or backup to something like Nagios. I'm not seeing how it would help you track down a respawning process. – Jefferson Hudson Mar 08 '19 at 22:37
11

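I believe you could use pstree, passing it the process ID:

pstree -s -p PID

With -p alone, pstree prints the tree rooted at PID, that is, the process and its children. The -s option (available in reasonably recent versions of psmisc) additionally shows the parents of the Java application.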
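
Where pstree is not installed, the chain of ancestors can also be read from /proc on Linux. This is a minimal sketch under that assumption; ancestors is a hypothetical helper name, and its second argument (an alternative proc directory) exists only to make the function easy to try out.

```shell
# Hypothetical helper: print "PID NAME" for a process and each of its ancestors.
# $1 = starting PID, $2 = proc directory (assumed default: /proc, Linux only).
ancestors() {
    pid="$1"
    proc="${2:-/proc}"
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "$proc/$pid/status" ]; do
        # The status file holds "Name:" and "PPid:" lines for the process
        name=$(awk '/^Name:/ {print $2; exit}' "$proc/$pid/status")
        echo "$pid $name"
        pid=$(awk '/^PPid:/ {print $2; exit}' "$proc/$pid/status")
    done
}

# Usage: ancestors 12345
```

If the original parent has already exited, the chain simply jumps from the Java process to PID 1, which is the situation described in the question.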

Ramesh
8

You could have a look at its PPID (parent process ID):

$ ps -eo pid,ppid,args | grep '[j]ava'

Once you've got the PPID (second column) of your Java process, use ps again to find the associated process:

$ ps -p [PPID]

Edit: if the parent is 1 (init), then the original parent of your Java process died right after "giving birth" (how sad), and init adopted the orphan. Because of that, you can't use the current process hierarchy to find it. The first thing I would recommend is to check ps -ef. You might find the culprit just by reading the output.

Then, have a look at crontabs (you did it already, but it won't hurt):

$ for user in $(cut -f1 -d: /etc/passwd); do echo "$user"; crontab -u "$user" -l; done

This will require root privileges.
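
The loop above only covers per-user crontabs; system-wide cron entries usually live in separate files. A minimal sketch for searching those as well follows. search_system_cron is a hypothetical helper name, its second argument (a root prefix) is there for testing, and the list of paths reflects common defaults that may not all exist on your system.

```shell
# Hypothetical helper: search common system cron locations for a pattern.
# $1 = pattern (e.g. java), $2 = optional root prefix (default: the real filesystem).
search_system_cron() {
    pattern="$1"
    root="${2:-}"
    for p in etc/crontab etc/anacrontab etc/cron.d etc/cron.hourly \
             etc/cron.daily etc/cron.weekly etc/cron.monthly var/spool/cron; do
        if [ -e "$root/$p" ]; then
            # -r descends into directories, -I skips binary files
            grep -rIHn -- "$pattern" "$root/$p"
        fi
    done
    return 0
}

# Usage (as root): search_system_cron java
```

Keep in mind that a cron job may call a wrapper script that never mentions java by name, so an empty result does not rule cron out.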

Still can't see a Java process scheduled? Dang it. Let's try something else. If your Java process has been present since boot, have a look at programs scheduled at boot time. I would suggest something like...

$ grep -iR java /etc/rc*

If you still can't find anything then... Well I admit I'm running out of ideas. You should really have another look at ps -ef, and locate processes associated with Java-based programs. You should come across a daemon, or a "launcher", responsible for the constant respawning of your Java process.

John WH Smith
  • I've tried looking up the parent process, but it just points to init (PPID = 1). I'll modify question with this info. – Jose Oct 22 '14 at 14:39
  • @JoseChavez, if your PPID is 1, then the java processes that are getting created are zombie processes. Check this answer here. – Ramesh Oct 22 '14 at 14:43
  • @JoseChavez I edited my answer with a few more tracks to investigate in your case. – John WH Smith Oct 22 '14 at 14:52
  • 2
    @Ramesh If the PPID is 1, they may or may not be zombies. If they were not actually spawned by init, they are at least orphans. The state specifier to ps will show if they are zombies (e.g., ps -eo pid,ppid,state,comm); the state will be Z. – goldilocks Oct 22 '14 at 14:56
  • @goldilocks, thanks for the clarification. I appreciate it. :) – Ramesh Oct 22 '14 at 14:58
  • 1
    @goldilocks: If the PPID is 1, they are not zombies, unless the init process is malfunctioning; it ought to run a wait loop that reaps all orphaned zombies immediately. – hmakholm left over Monica Oct 22 '14 at 17:33
  • @Ramesh: You probably meant "daemons" rather than "zombies". – hmakholm left over Monica Oct 22 '14 at 17:34
  • @HenningMakholm I did not mean daemons but right you are -- an orphan cannot become a zombie because it has an (adoptive) parent that will not mishandle it! Although there looks to be occasional freak exceptions to this: http://unix.stackexchange.com/questions/11172/how-can-i-kill-a-defunct-process-whose-parent-is-init – goldilocks Oct 22 '14 at 17:41
1

If you don't know who the parent is, you should use a system tracer such as auditd.

You'd enable logging with:

auditctl -a exit,always -S execve -F path=/usr/bin/rrdtool

and then in /var/log/audit/audit.log find lines like:

type=SYSCALL msg=audit(1414027338.620:6232): arch=c000003e syscall=59
success=yes exit=0 a0=7fdea0e4db23 a1=7fffec7c5220 a2=7fffec7c87d0
a3=7fdea1b559d0 items=2 ppid=17176 pid=18182 auid=1000 uid=1000 gid=1000 
euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts8 
ses=2 comm="sh" exe="/bin/dash" key=(null)

(broken into multiple lines for readability). You're interested in exe="/bin/dash" and/or pid=18182, which identify the rogue process you are looking for, and ppid=17176, which identifies the parent that executed it.
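
Picking those fields out by eye gets tedious in a busy log. Here is a minimal sketch that prints the parent PID, PID and executable of each execve record. It assumes one record per line in the layout shown above and an x86_64 system, where execve is syscall 59; execve_parents is a hypothetical helper name.

```shell
# Hypothetical helper: read audit records on stdin and print
# "PPID PID EXE" for every execve record (syscall=59 on x86_64).
execve_parents() {
    sed -n 's/.*syscall=59 .* ppid=\([0-9][0-9]*\) pid=\([0-9][0-9]*\) .* exe="\([^"]*\)".*/\1 \2 \3/p'
}

# Usage (reading the log normally requires root):
#   execve_parents < /var/log/audit/audit.log
```

If the parent is still running, ps -p followed by the reported PPID shows what it is.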

Matija Nalis