
I was testing how long it would take to pass 50,000 arguments from PHP to a bash script, and it turns out that I cannot even pass 1,000 arguments from PHP to a bash script at once. Or can I?

PHP:

    $array = fetch_results_from_working_pool_temp_table();
    $outputfile = "/var/www/html/outputFile";
    $pidfile = "/var/www/html/pidFile";
    $id = "";
    $array_check = array();

    foreach ( $array as $row => $column ) {
        $id .= $column ['id'];
        $id .= " ";
    }

    $cmd = "sudo /bin/bash /var/www/html/statistics/pass_all.sh {$id}";
    exec( sprintf ( "%s >> %s 2>&1 & echo $! >> %s", $cmd, $outputfile, $pidfile ) );

bash:

    #!/bin/bash
    for ip in "$@"
    do
        echo "${ip}"
    done

So my PHP passes arguments to the bash script, and the bash script prints them to outputFile along with any errors; pidFile holds the PID of the process launched by exec(). The command is not even being executed, because I see no process launched. Is there any limit on the number of arguments passed via exec(), either in PHP or in the Linux shell? I am running PHP 5.4 on Red Hat Linux 7. I want to run the processes using GNU Parallel, because PHP is single-threaded (there are libraries to work around this, but I would prefer to avoid them). Maybe I could write the arguments to a text file instead and exec a script that reads them from that file? Help!

**Update: my machine limits:**
    # getconf ARG_MAX
    2097152

    # ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 256634
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 4096
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
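
For reference, GNU xargs can also report the argument-length limits it will actually use on a given system, assuming GNU findutils is installed:

    # shows the limits used when building command lines; exact figures vary per system
    xargs --show-limits < /dev/null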

dwt.bar

3 Answers

2

On most systems, the kernel limits the total size of the arguments to the execve() syscall (command line arguments plus environment variables). On Linux, the limit is tied to the maximum stack size; with the default stack size limit of 8 MB you would usually get at least 2 MB in total. It also limits a single argument to 128 kB; see e.g. Is there a maximum to bash file name expansion (globbing) and if so, what is it? and Raise 128KiB limit on environment variables in Linux.

If PHP runs sh -c 'command line' when you call exec("command line") then the argument to -c could well exceed that 128 kB limit. The fact that the command line gets subsequently split into distinct words by the shell wouldn't help.
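
For example, a single oversized argument is enough to trigger the error (the exact threshold depends on the kernel page size; the sizes here are only illustrative):

    # one ~200 kB argument exceeds the ~128 kB per-argument limit, so execve() fails
    $ /bin/echo "$(printf 'x%.0s' {1..200000})"
    bash: /bin/echo: Argument list too long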

ilkkachu
  • That was very helpful, would you happen to know simplest way to go around it? – dwt.bar Dec 01 '20 at 19:45
  • 1
    @dwt.bar, well, actually, no. I tried to look it up, and found this question on SE: Run executable from php without spawning a shell. It doesn't seem to indicate it would be too easy – ilkkachu Dec 01 '20 at 20:59
  • Thanks @ilkkachu. I believe that instead of handing over too many arguments to a new process, one should make use of a shell built-in (echo) or iterate over the arguments with a control structure (for loop). "Without calling exec, there is no ARG_MAX limitation. That would explain why shell builtins are not restricted by ARG_MAX." So maybe I could save all the arguments to a file and read it line by line in the shell script that I launch via exec() in PHP. Source: http://shaoguangleo.github.io/2017/03/23/linux-command-line-arg-max/ – dwt.bar Dec 01 '20 at 21:10
  • 1
    @dwt.bar, saving the data to a temporary file is probably smart – ilkkachu Dec 01 '20 at 21:38
1

When you have this many arguments, you want to pass them to GNU Parallel via standard input (stdin) or via files.

I would do something like (untested):

    $f = popen("parallel", "w");
    fwrite($f, $commands);
    pclose($f);

This way you may be able to avoid the temporary file.
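
For example, something along these lines (also untested; the script path is the one from the question, and $targets is assumed to be the array of ids fetched from the database) writes one command per line, which is how parallel expects commands on stdin:

    // $targets is assumed to hold the ids/hosts fetched elsewhere
    $f = popen("parallel", "w");
    foreach ($targets as $t) {
        // one command per line; escapeshellarg() keeps unusual characters safe
        fwrite($f, "bash /var/www/html/statistics/pass_all.sh " . escapeshellarg($t) . "\n");
    }
    pclose($f);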

Ole Tange
0

So, with all your help, here is my solution.

PHP:

    function scan_targets() {

        $targetsFile = "[absolute path to the file]/targets";

        $array_with_targets = fetch_from_db(); // function that gets me all the targets
        $outputfile = "[absolute path to the file]/outputFile"; // output from the parallel script
        $pidfile = "[absolute path to the file]/pidFile"; // PID of the launched process
        $target = "";

        foreach ( $array_with_targets as $row => $column ) {
            $target .= $column ['id'];
            $target .= "\n"; // one target per line, so the shell script can read the file line by line
        }
        file_put_contents($targetsFile, $target);

        $cmd = "/bin/bash [absolute path to the file]/pass_targets.sh";
        exec( sprintf ( "%s >> %s 2>&1 & echo $! >> %s", $cmd, $outputfile, $pidfile ) );
    }

BASH:

    #!/bin/bash

    # read the targets file (one target per line) into an array
    targets_array=()
    while IFS= read -r field || [ -n "$field" ]; do
        targets_array+=("$field")
    done < "[absolute path to the file]/targets"

    parallel bash [absolute path to the file]/check.sh ::: "${targets_array[@]}"
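
Alternatively (untested), since the targets file holds one target per line, GNU Parallel can read it directly with :::: and the array is not needed at all:

    parallel bash [absolute path to the file]/check.sh :::: [absolute path to the file]/targets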

You can also run parallel with the -Dall option to get more context on what is happening. I managed to scan almost 40,000 hosts in about 7 hours. The web server adds all the targets to the file in seconds, and since exec() launches a background process, the user does not have to wait for the result (it is written to the output file).

The check.sh script also updates a MariaDB database record for each target as it goes.
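
check.sh itself is not shown here; a minimal sketch of the idea, assuming a ping-style check, with the database, table and column names being placeholders only, could look like this:

    #!/bin/bash
    # hypothetical check.sh: the real script is not part of this post, so the
    # reachability test, database, table and column names below are assumptions
    target="$1"

    if ping -c 1 -W 2 "$target" > /dev/null 2>&1; then
        status="up"
    else
        status="down"
    fi

    # credentials are assumed to come from ~/.my.cnf
    mysql scandb -e "UPDATE targets SET status='${status}' WHERE ip='${target}';"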

dwt.bar