
The setup:

I have a PHP script (currently written for PHP 5.5, but the server it runs from has 7.4) that processes files containing lists of Linux servers, then runs a bash or perl script via ssh on each remote server in the following fashion:

exec("ssh -o StrictHostKeyChecking=no -p $connectivity_port $user@$server \"bash -s\" -- < $file $scriptargs 2>&1", $result, $exit_code);

This all works perfectly fine, but it takes a while depending on the code being run, even though locally there is almost nothing to process except the output of the scripts (there is a lot of logging, and some scripts write output to files local to the server the PHP is run from).

The Goal

I was wondering what the best/easiest method/tools would be, running from bash, to run the PHP script in parallel, making sure all output stays in the order of the servers in the lists (say, x servers at a time, maybe 10, to bring the execution time down).

From my research and version limitations, PHP itself does not seem to be the way to go, and bash also does not seem to fit the bill, but I am open to being wrong and willing to learn other methods.

solenoid
  • 131
  • The -- seems to be an error. – Hauke Laging Nov 29 '21 at 23:02
  • You could exec call a bash shell script which uses GNU parallel. https://net2.com/how-to-execute-commands-in-parallel-in-linux/ (Check Example 3; a sketch follows below.) – Michael D. Nov 29 '21 at 23:11
  • Looking through all these answers, going to try them over the next few days, surprised at how many options there are – solenoid Nov 30 '21 at 15:46
  • @HaukeLaging https://unix.stackexchange.com/questions/11376/what-does-double-dash-mean , which in this context is to prevent anything from the script being passed as an option – solenoid Nov 30 '21 at 15:55
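
For reference, the GNU parallel approach from the comment above might look roughly like this (a sketch only, not the commenter's code; serverlist.txt stands in for one of the question's server-list files, and the shell variables are carried over from the question):

# -j 10 caps the number of simultaneous jobs; --keep-order buffers each
# job's output and prints it in input order, matching the ordering requirement
parallel -j 10 --keep-order \
    "ssh -o StrictHostKeyChecking=no -p $connectivity_port $user@{} 'bash -s' -- < $file $scriptargs 2>&1" \
    :::: serverlist.txt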

4 Answers


In bash you would do something like this:

declare -r MAX_PARALLEL='5' WAITSEC='0.1'

i=0
server[i]=... port[i]=... user[i]=... command_file[i]=... scriptargs[i]=...
((i++))
server[i]=... port[i]=... user[i]=... command_file[i]=... scriptargs[i]=...
((i++))

count=$i
for ((i = 0; i < count; i++)); do
    # throttle: sleep until no more than MAX_PARALLEL jobs are running
    while [ "$(jobs -r | wc -l)" -gt "$MAX_PARALLEL" ]; do
        sleep "$WAITSEC"
    done
    (
        ssh -o StrictHostKeyChecking=no -p "${port[i]}" "${user[i]}@${server[i]}" \
            "bash -s" <"${command_file[i]}" "${scriptargs[i]}" >"output_file.$i" 2>&1
        echo "$?" >"exit_code.$i"
    ) &
done

Unfortunately there seems to be no trivial way to get the correct number of running jobs, so this only works correctly if no command line contains a newline.
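
A sketch of an alternative that sidesteps the job-counting caveat, assuming bash 4.3 or newer (where wait -n is available):

running=0
for ((i = 0; i < count; i++)); do
    if ((running >= MAX_PARALLEL)); then
        wait -n          # block until any one background job exits
        ((running--))
    fi
    (
        ssh -o StrictHostKeyChecking=no -p "${port[i]}" "${user[i]}@${server[i]}" \
            "bash -s" <"${command_file[i]}" "${scriptargs[i]}" >"output_file.$i" 2>&1
        echo "$?" >"exit_code.$i"
    ) &
    ((running++))
done
wait                     # wait for the remaining jobs to finish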

Hauke Laging
  • 90,279
  • It will run on all servers at the same time without limiting the concurrency, so in case there are many servers it might hit some load/network/CPU bottlenecks. – aviro Nov 30 '21 at 06:55
  • @aviro I didn't notice that requirement in the question. I have adapted my answer. – Hauke Laging Nov 30 '21 at 10:34

I have something like that running, of all things, on a Synology DS218.

In my case, the PHP script prepares a bash script with the various commands, then executes the script.

This can work this way because, in my case:

  • all servers are separate (I won't overload any of them)
  • an error on server 12 does not mean stopping, or skipping the servers after 12

If these requisites weren't satisfied, I'd have to do it differently.

But as long as they are,

#!/bin/bash

ssh server1 "command1" > output1 2> error1 &
ssh server2 "command2" > output2 2> error2 &
...
ssh serverN "commandN" > outputN 2> errorN &

# wait for all SSHs to complete
wait

At the end, all output files are reaped in numeric order, and deleted.
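
The reaping step is not shown above; a minimal sketch of it, assuming N numbered output/error files as in the script:

# print the collected output in numeric (= server) order, then clean up
for ((i = 1; i <= N; i++)); do
    cat "output$i" "error$i"
    rm -f "output$i" "error$i"
done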

LSerni
  • 4,560

I can offer two ways to do that.

xargs

Assuming you have a file that contains the list of hostnames separated by newlines, and that the user and port are the same for all connections, you could use xargs.

xargs -I '{}' -P <max-procs> --arg-file <INPUTFILE> bash -c "ssh -o StrictHostKeyChecking=no -p $connectivity_port $user@{} 'bash -s' < $file $scriptargs > $OUT_FOLDER/{}.log 2>&1"

or

cat <INPUTFILE> | xargs -I '{}' -P <max-procs> bash -c "ssh -o StrictHostKeyChecking=no -p $connectivity_port $user@{} 'bash -s' < $file $scriptargs > $OUT_FOLDER/{}.log 2>&1"

You can set up concurrency with the -P flag.

       --max-procs=max-procs
       -P max-procs
              Run up to max-procs processes at a time; the default is  1.   If
              max-procs  is 0, xargs will run as many processes as possible at
              a time.  Use the -n option with -P; otherwise chances  are  that
              only one exec will be done.

It will write the output of each command to $OUT_FOLDER/$HOST.log.
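
Since the goal is output in the order of the servers in the list, those per-host logs can be concatenated in input order once xargs has finished. A sketch, reusing the placeholder names from the commands above:

# print the collected logs in the same order as the input file
while IFS= read -r host; do
    cat "$OUT_FOLDER/$host.log"
done < "$INPUTFILE"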

If you have different user and port for each machine you can still use xargs, but that would be a bit more complex.
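
One way to do that (a sketch, assuming an input file with whitespace-separated host, port and user fields on each line; script.sh and logs/ are placeholders):

# -n 3 hands xargs three fields per invocation; they arrive in the inner
# shell as $0 (host), $1 (port) and $2 (user)
xargs -P 10 -n 3 --arg-file "$INPUTFILE" \
    bash -c 'ssh -o StrictHostKeyChecking=no -p "$1" "$2@$0" "bash -s" < script.sh > "logs/$0.log" 2>&1'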

pdsh

Another option is to use pdsh which can "issue commands to groups of hosts in parallel".

pdsh -R exec -w^<INPUT FILE> -f <max-procs> bash -c "ssh -o StrictHostKeyChecking=no -p $connectivity_port %u@%h 'bash -s' < $file $scriptargs 2>&1"

The -f here is similar to the -P flag in xargs.

exec    Executes an arbitrary command for each target host. The first of the pdsh remote arguments is the local command
        to execute, followed by any further arguments. Some simple parameters  are  substituted  on  the  command  line,
        including  %h  for  the target hostname, %u for the remote username, and %n for the remote rank [0-n] (To get a
        literal % use %%).  For example, the following would duplicate using the ssh module to run  hostname(1)  across
        the hosts foo[0-10]:
      pdsh -R exec -w foo[0-10] ssh -x -l %u %h hostname

   and this command line would run grep(1) in parallel across the files console.foo[0-10]:

      pdsh -R exec -w foo[0-10] grep BUG console.%h

-f number Set the maximum number of simultaneous remote commands to number. The default is 32.

It will dump the output of the commands prefixed with the HOSTNAME:

Here's an example.

$ pdsh -R exec -w host1,host2 bash -c "ssh  -o StrictHostKeyChecking=no -p 22 %u@%h 'bash -s' <<< 'echo Running script on %h with arguments: \${@}' arg1 arg2 arg3"
host1: Running script on host1 with arguments: arg1 arg2 arg3
host2: Running script on host2 with arguments: arg1 arg2 arg3
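
pdsh emits lines as they arrive, so output from different hosts can interleave; the dshbak utility that ships with pdsh regroups such HOSTNAME-prefixed output per host (the ... stands for the same bash -c command as in the example above):

pdsh -R exec -w host1,host2 ... | dshbak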
aviro
  • 5,532

You could use Perl with Parallel::ForkManager and IPC::Open2.

Usage:

cat list_of_servers.txt | perl para.pl /path/to/script.sh ARG1 ARG2

Code of para.pl:

#!/usr/bin/env perl
use v5.20;
use IPC::Open2 qw(open2);
use Parallel::ForkManager qw();
sub run_script_on_server {
    my ( $server, $script, @args ) = @_;
    say "$$ running script: $script on server: $server with args: @args";
    # TODO: replace with ssh invocation
    my $pid = open2( my $chld_out, my $chld_in, "bash", $script, @args );
    local $/ = undef;
    return <$chld_out>;
}
my $pm = Parallel::ForkManager->new(10);    
while ( my $server = <STDIN> ) {
    $pm->start and next;
    chomp $server;
    my $result = run_script_on_server( $server, @ARGV );
    say "$$ result from $server: $result";
    $pm->finish;
}
$pm->wait_all_children;    # without this the parent may exit before all children finish
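
The TODO above could be filled in along these lines (a sketch only; the StrictHostKeyChecking option and the bash -s pattern are carried over from the question, while the port and user parameters are assumptions):

sub run_script_on_server_ssh {
    my ( $server, $port, $user, $script, @args ) = @_;
    my $pid = open2( my $chld_out, my $chld_in,
        "ssh", "-o", "StrictHostKeyChecking=no", "-p", $port,
        "$user\@$server", "bash -s", "--", @args );
    open my $fh, '<', $script or die "cannot open $script: $!";
    print {$chld_in} do { local $/; <$fh> };   # stream the local script to the remote bash -s
    close $chld_in;                            # EOF tells the remote bash the script is complete
    local $/ = undef;
    return <$chld_out>;                        # slurp the remote output
}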
filiprem
  • 439