I would like to find an Ubuntu Linux 16.04 system command similar to strace to find out why my C++ program, ServiceController.exe, which calls

[execle  ("/usr/lib/mono/4.5/mono-service","/usr/lib/mono/4.5/mono-service",
         "./Debug/ComputationalImageClientServer.exe", 
          0, char const* EnvironmentPtr)]

mysteriously stops running after 90 seconds, where ComputationalImageClientServer.exe and ComputatationalImageClientServer.exe are C#/.NET 4.5 executables.

In contrast, when I run /usr/lib/mono/4.5/mono-service.exe ./Debug/ComputatationalImageVideoServer.exe at the command prompt,

it runs continuously, 24 hours a day, 7 days a week, at least.

Why can't the first example run continuously 24x7? How might I diagnose, debug, and fix this error?

open("Delaware_Client_Server.exe", O_RDONLY) = 3
pipe2([4, 5], O_CLOEXEC)                = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f743e4dca10) = 3509
close(5)                                = 0
fcntl(4, F_SETFD, 0)                    = 0
fstat(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
read(4, "", 4096)                       = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3509, si_uid=1000, si_status=1, si_utime=0, si_stime=0} ---
close(4)                                = 0
wait4(3509, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 3509
write(1, "\n", 1)                       = 1
write(1, "Process returned 256\n", 21)  = 21
Frank
  • 681
  • Is there a difference between 24x7 and continuous that I am not aware of? – Anthon May 28 '16 at 11:26
  • @Anthon, Thank you for your comment and nice edit. There is no difference between 24x7 and continuous for us. I am curious to find out why my C++ program, ServiceController.exe, which calls [execle ("/usr/lib/mono/4.5/monoservice", "/usr/lib/mono/4.5/mono-service", "./Debug/ComputationalImageClientServer.exe", 0, char const* Envp)], mysteriously stops running after 90 seconds. – Frank May 28 '16 at 14:15
  • @Anthon, I forgot to say that the executable, "./Debug/ComputationalImageClientServer.exe", is a C#/.NET 4.5 program that reads and writes a MySQL database. Thank you, – Frank May 28 '16 at 14:23
  • Try out strace first. Maybe it dies due to a signal. –  May 28 '16 at 15:45
  • @siblynx, Thank you for your message. What signals could cause it to die? – Frank May 29 '16 at 01:14
  • strace can show how the process ended its lifecycle. –  May 29 '16 at 05:34
  • @siblynx, What is the relationship between the strace output data and the C# program statements? Thank you. – Frank May 29 '16 at 10:19
  • @siblynx, You are absolutely correct. The culprit is SIGCHLD. What should I do next? – Frank May 31 '16 at 13:08
  • See comments under your own answer. –  May 31 '16 at 13:37

2 Answers

2

Use the GNU debugger, gdb, or something similar.

ojs
  • 932
  • @ojs, Thank you for your answer. How could gdb inspect a C# program? I know that MonoDevelop can debug C# programs if attached properly, which I do not know how to do in this case. – Frank May 28 '16 at 14:26
  • Sorry, but I can't be any more specific in my answer. I don't know that much about Mono. Perhaps asking on Stack Overflow would be more appropriate, since this is kind of a programming question. But I think you have to get a debugger to solve this. – ojs May 28 '16 at 14:50
0

The question "What's the difference between running a program as a daemon and forking it into background with '&'?"

suggests that SIGHUP may be the signal that causes ServiceController.exe to stop running after 90 seconds. Starting the program with nohup command & prevents this from happening.

With command &, your process will be killed by a SIGHUP signal when the parent dies.

Sysadmins have access to some workarounds, though.

On a bash system, you can use: (trap '' HUP; command) &

This opens a subshell, ignores the HUP signal (an empty trap string means "ignore"), and runs the subshell in the background with &.

Output might still get redirected to the wrong tty, or get lost. You may fix that with &>command.out, 1>output.out, or 2>errors.out

You might also have access, on most systems, to the nohup command. nohup simplifies this process greatly. It's quite standard, but I found many busybox embedded ARM distributions missing it. You just write: nohup command &

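..and you're done. If standard output is a terminal, output gets appended to nohup.out. nohup has no option to rename that file; to use a different one, redirect the output yourself, for example: nohup command > out.log 2>&1 &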
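
As a rough illustration of what the trap and nohup approaches above rely on: a signal disposition set to "ignore" is inherited across fork() and exec. Below is a minimal C++ sketch of that idea. The names ignore_sighup and exec_without_hup are hypothetical helpers made up for this example, not part of ServiceController.exe, and unlike nohup the sketch does not redirect any output.

```cpp
#include <signal.h>
#include <unistd.h>

// Ignore SIGHUP in the calling process. An ignored disposition survives
// fork() and exec*(), so a program started afterwards inherits it.
bool ignore_sighup() {
    struct sigaction sa {};
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, nullptr) == 0;
}

// Hypothetical helper: replace the current process image with `path`
// while SIGHUP is ignored. Returns -1 only if something failed.
int exec_without_hup(const char* path, char* const argv[], char* const envp[]) {
    if (!ignore_sighup()) {
        return -1;
    }
    execve(path, argv, envp);
    return -1;  // reached only if execve() itself failed
}
```

A launcher could call exec_without_hup() where it currently calls exec directly. This only helps if SIGHUP really is what stops the process, which is an assumption at this point.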

Frank
  • 681
  • @siblynx, Could you look at this answer and see if it is plausible? Thank you. – Frank May 30 '16 at 07:13
  • On my systems I do not care: if I do not need any output from the process, I just force it into daemon mode. I dislike the & shell operator because a process spawned this way still has a controlling tty and other attributes which I usually do not need at all. If I need output from the process, I just spawn a subshell in a fork, in the sh -c 'cmds &>out.log' way. My processes do not get HUPed because they have already lost their ttys and hold only the raw IO fds allocated by the shell or fork. fork.c –  May 31 '16 at 13:28
  • Now about SIGCHLD. If your process receives it, then there is a reason. But if you really cannot change the situation (you have no source code for the program you are running?), maybe you could try a program that forcibly ignores signals. How about sigign? sigign.c You specify the numeric values of the signals to ignore and then the command line to run with those signals ignored. See sigign.README –  May 31 '16 at 13:36
  • But I still think you should try to see what changes after the 90-second timeout when your process receives CHLD. Specifically, which process exited, or what else raised the CHLD signal. –  May 31 '16 at 13:47
  • @siblynx, I just added the strace output at the bottom of the original question for everyone to see. Evidently SIGCHLD is not being masked. Thank you. – Frank May 31 '16 at 14:21
  • So the child returned an exit code which the parent did not like? That's probably the reason why your program terminates. It terminates due to an internal condition. –  May 31 '16 at 15:56
  • @siblynx, Thank you for all of your help. I am trying to block SIGCHLD inside a C#/.NET 4.5 executable. Does that make a difference? I will be back here in an hour waiting to hear from you. – Frank May 31 '16 at 15:58
  • Well, if something calls signal(SIGCHLD, &sa_handler) again, then SIGCHLD will not be blocked; instead it will again be caught or otherwise processed. But from the strace lines you've posted I see that the program makes its decision based on the child's status. The output is incomplete, though: it does not show the full history of the parent down to exit_group. –  May 31 '16 at 16:03
  • And normally, SIGCHLD is simply ignored (see man 7 signal), so the decision about termination is indeed made by your running program. Do you have the source code? You can of course block SIGCHLD, but I do not think it will lead to normal program behavior. You should really trace why your program terminates and why its child returns an error to it. –  May 31 '16 at 16:13
  • @siblynx. I totally agree with the fact that the output I showed you is not complete. In fact the parent process log reveals that it cannot connect to the MySQL database and it halts the startup of the recorder. Thank you. – Frank May 31 '16 at 17:43
  • Good! Edit your own answer then. –  Jun 01 '16 at 02:46
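
For reference, the "Process returned 256" line in the strace output is consistent with a raw wait status being printed: on Linux an exit code of 1 is stored in the high byte of the status, which gives 256. Below is a minimal C++ sketch, assuming a launcher that forks, execs, and waits. The names describe_wait_status and run_and_wait are hypothetical, introduced only for illustration, and execve stands in for the execle call from the question.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turn a raw status from wait()/waitpid()/system() into a readable summary.
std::string describe_wait_status(int status) {
    if (WIFEXITED(status)) {
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        return "killed by signal " + std::to_string(WTERMSIG(status));
    }
    return "other (raw status " + std::to_string(status) + ")";
}

// Hypothetical launcher: fork, exec `path` in the child, and return the
// child's raw wait status (or -1 if fork() or waitpid() failed).
int run_and_wait(const char* path, char* const argv[], char* const envp[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execve(path, argv, envp);
        _exit(127);  // exec failed; 127 follows the shell convention
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return status;
}
```

Logging describe_wait_status(status) in the parent, rather than the raw number, shows directly whether the child exited on its own with an error code or was killed by a signal.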