12

This sends output to STDERR, but does not propagate Ctrl+C (i.e. Ctrl+C will kill ssh but not the remote sleep):

$ ssh localhost 'sleep 100;echo foo >&2'

This propagates Ctrl+C (i.e. Ctrl+C will kill ssh and the remote sleep), but sends STDERR to STDOUT:

$ ssh -tt localhost 'sleep 100;echo foo >&2'

How can I force the second to send STDERR output to STDERR, while still propagating Ctrl+C?

Background

GNU Parallel uses 'ssh -tt' to propagate Ctrl+C. This makes it possible to kill remotely running jobs. But data sent to STDERR should continue to go to STDERR at the receiving end.

Ole Tange
  • 35,514

2 Answers

5

I don't think you can get around that.

With -tt, sshd spawns a pseudo-terminal and makes the slave part the stdin, stdout and stderr of the shell that executes the remote command.

sshd reads what's coming from its (single) fd to the master part of the pseudo-terminal and sends that (via one single channel) to the ssh client. There is no second channel for stderr as there is without -t.

Moreover note that the terminal line discipline of the pseudo-terminal may (and will by default) alter the output. For instance the LF will be converted to CRLF over there and not on the local terminal, so you may want to disable output post-processing.

$ ssh  localhost 'echo x' | hd
00000000  78 0a                                             |x.|
00000002
$ ssh -t localhost 'echo x' | hd
00000000  78 0d 0a                                          |x..|
00000003
$ ssh -t localhost 'stty -opost; echo x' | hd
00000000  78 0a                                             |x.|
00000002
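
If changing the remote terminal settings is not an option, the extra CR could also be dropped on the receiving side. This is only a minimal sketch, assuming the output is plain text in which no carriage return needs to survive. `strip_cr` is a hypothetical helper name, and the `printf` stands in for the CRLF-terminated output of `ssh -t`:

```shell
# Hypothetical helper: delete every carriage return from stdin.
# Assumption: the stream is text and contains no CR that must be kept.
strip_cr() {
  tr -d '\r'
}

# printf stands in for the output of `ssh -t host 'echo x'` (x CR LF).
printf 'x\r\n' | strip_cr | od -An -tx1
```

Unlike `stty -opost`, this touches every CR in the stream, including ones the remote command wrote on purpose, so it is not suitable for binary data.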

A lot more things will happen on the input side (like the ^C character that will cause a SIGINT, but also other signals, the echo and all the handling involved in the canonical mode line editor).

You could possibly redirect stderr to a fifo and retrieve it using a second ssh:

ssh -tt host 'mkfifo fifo && cmd 2> fifo' &
ssh host 'cat fifo' >&2
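
The plumbing of that fifo approach can be tried locally before involving ssh. The following is a sketch under assumptions, not a drop-in solution: `cmd` is a hypothetical stand-in for the remote job, the background `cat` plays the role of the second ssh, and `demo_fifo_stderr` is an illustrative name:

```shell
# Hypothetical stand-in for the remote job: writes to both streams.
cmd() {
  echo "to stdout"
  echo "to stderr" >&2
}

demo_fifo_stderr() {
  dir=$(mktemp -d)              # private directory so the fifo name cannot clash
  fifo="$dir/stderr.fifo"
  mkfifo "$fifo"

  # Reader: plays the role of `ssh host 'cat fifo' >&2`.
  cat "$fifo" >&2 &
  reader=$!

  # Writer: plays the role of `ssh -tt host 'cmd 2> fifo'`.
  cmd 2> "$fifo"

  wait "$reader"                # reader gets EOF once the writer closes the fifo
  rm -rf "$dir"
}

demo_fifo_stderr
```

With two real ssh sessions the reader may start before `mkfifo` has run on the remote side; that ordering problem is not handled in this sketch.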

But best IMO would be to avoid using -t altogether. That's really only meant for interactive use from a real terminal.

Instead of relying on the transmission of a ^C to let the remote end know that the connection is closed, you could use a wrapper that does a poll() to detect the killed ssh or closed connection.

Maybe something like (simplified, you'll want to add some error checking):

LC_HUP_DETECTOR='
  use IO::Poll;
  $SIG{CHLD} = sub {$done = 1};
  $p = IO::Poll->new;
  $p->mask(STDOUT, POLLIN);
  $pid=fork; unless($pid) {setpgrp; exec @ARGV; die "exec: $!\n"}
  $p->poll;
  kill SIGHUP, -$pid unless $done;
  wait; exit ($?&127 ? 128+($?&127) : 1+$?>>8)
' ssh host 'perl -e "$LC_HUP_DETECTOR" some cmd'

The $p->mask(STDOUT, POLLIN) above may seem silly, but the idea is to wait for a hang-up event (for the reading end of the pipe on stdout to be closed). POLLHUP as a requested mask is ignored. POLLHUP is only meaningful as a returned event (to tell that the writing end has been closed).

We have to give a non-zero value for the event mask. If we use 0, perl doesn't even call poll. So here we use POLLIN.

On Linux, whatever you request, if the pipe becomes broken, poll() returns POLLERR.

On Solaris and FreeBSD, where pipes are bidirectional, when the reading end of the pipe (which is also a writing end there) is closed, it returns with POLLHUP (and POLLIN on FreeBSD, where you have to request POLLIN or else $p->poll() doesn't return).

I can't say how portable it is outside of those three operating systems.
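
The last line of the wrapper turns Perl's wait status into the exit code convention shells use: 128 plus the signal number when the job died from a signal, and otherwise the job's own exit code. A caller could interpret that code roughly as follows. This is a sketch: `describe_status` is a hypothetical helper, and a local `sh -c` stands in for the ssh call:

```shell
# Hypothetical helper: interpret an exit code that follows the
# "128 + signal number" convention.
# Assumption: the job itself never exits with a code above 128,
# otherwise the two cases cannot be told apart.
describe_status() {
  if [ "$1" -gt 128 ]; then
    echo "killed by signal $(( $1 - 128 ))"
  else
    echo "exited with code $1"
  fi
}

# Local stand-in for the ssh call: a child shell that sends itself SIGHUP.
status=0
sh -c 'kill -HUP $$' || status=$?
describe_status "$status"
```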

  • I like your idea, but I cannot make your wrapper detect any signals unless '-tt' is set. This works: parallel --tag -j1 'ssh -tt localhost perl/catch_wrap perl/catch_all_signals & sleep 1; killall -{} ssh' ::: {1..31}, but remove the '-tt' and then it does not work. – Ole Tange Jun 03 '14 at 08:47
  • @OleTange The purpose of the wrapper is for SIGHUP to be sent to the remote job when ssh dies (upon ssh connection hang-up). I don't know what your catch_all_signals does, but all it would get is that SIGHUP and only after the ssh connection has dropped (so if it prints anything on stdout, you won't see it). – Stéphane Chazelas Jun 03 '14 at 08:58
  • catch_all_signals logs all signals to a file, and as mentioned it works with '-tt', but fails without. In other words: It does not receive a SIGHUP from catch_wrap when ssh dies. – Ole Tange Jun 03 '14 at 09:15
  • Still only works with -tt after your edit. Please remember if you do not run the command through parallel, ssh will inherit the terminal you run it from. – Ole Tange Jun 04 '14 at 11:03
  • @OleTange, I can't reproduce, it works for me, have you tested it with the code I posted? Please post your catch_wrap and catch_all_signals somewhere so I can have a look. With -t, I expect it not to work. – Stéphane Chazelas Jun 04 '14 at 11:08
  • I can now get it working on Linux/Mint, but it fails on Solaris and other platforms: https://gist.github.com/ole-tange/e1b89c4c9419a5e52154 Ideas? – Ole Tange Jun 21 '14 at 19:39
  • I am now trying substituting the signal handler ($done=1) with 'exit'. It seems to work on Solaris. – Ole Tange Jun 21 '14 at 20:23
  • I have now worked a bit further on this. It seems to only work for Linux, Sco SYS V and Unixware.

    Correct behaviour: suse, debian, mandriva, scosysv, ubuntu, unixware, redhat, raspberrypi

    Finished, remote sleep not killed: tru64, hurd, miros, freebsd, openbsd, netbsd, qnx, dragonfly

    Finished, other: minix, ultrix

    Not finished: solaris, centos, openindiana, irix, aix, hpux

    With exit instead of $done=1. Exit value is wrong for all. All finished

    Finished, sleep killed: centos

    Finished, sleep not killed: irix, aix, openindiana, solaris, hpux

    – Ole Tange Jul 15 '14 at 22:36
1

To make it work on other platforms this became the final solution. It checks if the ssh client disconnected and thus the parent became pid 1:

$SIG{CHLD} = sub { $done = 1; };
$pid = fork;
unless($pid) {
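    # Put the job in its own process group so the whole group can be sent SIGHUP later
    setpgrp;
    # $bashfunc is not defined in this excerpt; it is prepended to the command line
    exec $ENV{SHELL}, "-c", ($bashfunc."@ARGV");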
    die "exec: $!\n";
}
do {
    # Parent is not init (ppid=1), so sshd is alive
    # Exponential sleep up to 1 sec
    $s = $s < 1 ? 0.001 + $s * 1.03 : $s;
    select(undef, undef, undef, $s);
} until ($done || getppid == 1);
# Kill HUP the process group if job not done
kill(SIGHUP, -${pid}) unless $done;
wait;
exit ($?&127 ? 128+($?&127) : 1+$?>>8)
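
To get a feel for the sleep schedule in that loop, the update rule can be replayed outside Perl. This is a sketch using awk, assuming `$s` starts at 0 (Perl treats the unset variable as 0 in the numeric comparison); `backoff_steps` is a hypothetical name:

```shell
# Hypothetical helper: replay the interval update from the Perl loop
# until the interval stops growing, then report what happened.
backoff_steps() {
  awk 'BEGIN {
    s = 0; n = 0; total = 0
    while (s < 1) {
      s = 0.001 + s * 1.03   # same update as in the Perl loop
      total += s             # the loop sleeps for the new value
      n++
    }
    printf "%d updates, final interval %.3f s, %.1f s slept so far\n", n, s, total
  }'
}

backoff_steps
```

The interval stops growing at the first value that is no longer below 1, so the steady-state sleep ends up slightly longer than one second.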
Ole Tange
  • 35,514