2

I am working on a remote server, on which I run numerical simulations using the OpenFOAM computational fluid dynamics library. I built a collection of Python scripts to automate my parametric studies and they seem to be working well.

I connect to the server using SSH, and I launch the scripts in an interactive shell. Sometimes, under conditions I'm still unable to identify, the server closes the SSH session. I currently use the screen window manager as a workaround, but it still seems like an issue to me. Here is an example of the output I get:

<lots of output before that>
Dumping up_half1 faces to "final_up_half1.obj"
Dumping cyclic match as lines between face centres to "final_up_half0up_half1_match.obj"
Writing repatched mesh to 0

End

Killing PID 32536
Connection to hpc4 closed by remote host.
Connection to hpc4 closed.
➜  ~

Please note that the simulation is not completed: after the application that prints End on screen terminates, another one should start and perform some further processing.

So here's the question: What could be the cause of such disconnections?

M4urice
  • 121
  • I would contact the server admin, who actually might have a clue, instead of asking for wild speculation. – jw013 Apr 18 '14 at 16:20

2 Answers

1

You can work around the server disconnects by using nohup. nohup runs your command on the server so that it ignores the hangup signal (SIGHUP) and keeps running even if the SSH session is closed. By default it saves the stdout of your command to a file called nohup.out, but you can redirect it as you see fit. For example,

nohup ./simulation > output.txt &

will run ./simulation and put the output that would normally print to the screen in output.txt. Even if the ssh disconnects, ./simulation will continue running to completion.
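A slightly fuller sketch of the same idea, assuming a POSIX shell: it also captures stderr and records the process ID so the job can be found again after reconnecting. The `sh -c '...'` command is only a hypothetical stand-in for ./simulation, and the file name sim.pid is an assumption, not something from the question.

```shell
# Stand-in for ./simulation (hypothetical): writes one line to stdout and one to stderr.
# "2>&1" sends stderr to the same file as stdout.
nohup sh -c 'echo "step 1 done"; echo "a warning" >&2' > output.txt 2>&1 &

# Remember the PID so the job can be checked or stopped after logging back in.
echo $! > sim.pid

# Only for this demo: wait for the stand-in to finish, then show what was captured.
wait $!
cat output.txt
```

With a real long-running job you would skip the wait and follow progress with tail -f output.txt after logging back in.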

honi
  • 117
  • 1
    It is a workaround, and it is even inferior to screen. It does not prevent these disconnects from happening... – glglgl Apr 18 '14 at 18:21
1

Wild guess:

There is nothing wrong with your machines in particular, but you are in a network "secured" by a firewall which keeps track of TCP connections. When the firewall decides your connection has been idle for too long, it considers it dead. From then on the firewall no longer forwards TCP segments that belong to that connection because, from the firewall's point of view, there is no connection those segments could possibly belong to, and your SSH session will eventually time out.

To remedy your situation, you can make your SSH client send a keep-alive message every now and then, to remind the firewall that you have an active session on a remote host. You can do that with the ServerAliveInterval option of the OpenSSH client (see ssh_config(5)).
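A minimal sketch of such a client configuration, written to a scratch file so that an existing ~/.ssh/config is not touched. The host name hpc4 is taken from the output in the question; the 60-second interval, the ServerAliveCountMax value, and the file name ssh_config_keepalive are assumptions to adjust to your network.

```shell
# Write a keep-alive stanza to a scratch config file (hypothetical file name).
cfg="./ssh_config_keepalive"
cat > "$cfg" <<'EOF'
# Ask the server for a reply after 60 seconds without traffic (assumed value).
Host hpc4
    ServerAliveInterval 60
    # Give up after 3 unanswered keep-alives (assumed value).
    ServerAliveCountMax 3
EOF

# Show the stanza that was written.
cat "$cfg"
```

You could then try it for a single session with ssh -F ./ssh_config_keepalive hpc4, or copy the stanza into ~/.ssh/config once it proves useful. The same option can also be given directly on the command line as ssh -o ServerAliveInterval=60 hpc4.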

As you are using screen: I once had the same problem, but I accidentally fixed it when I added a clock to my hardstatus line, which makes screen update the hardstatus line automagically every minute. The regular redraw generates enough traffic to keep the connection from looking idle.

A roughly minimal working ~/.screenrc that accomplishes this should be:

hardstatus alwayslastline
hardstatus string '%=[%Y-%m-%d %c ]'

(adapted from Red Hat Magazine)

Bananguin
  • 7,984