The whole point of piping commands is to run them concurrently with one reading the output of the other. If you want to run them sequentially, and if we keep the plumbing metaphor, you'll need to pipe the output of the first command to a bucket (store it) and then empty the bucket into the other command.
But doing it with pipes means having two processes for the first command (the command and another process reading its output from the other end of the pipe to store in the bucket), and two for the second one (one emptying the bucket into one end of the pipe for the command to read it from the other end).
For the bucket, you'll need either memory or the file system. Memory doesn't scale well, and you'd still need the pipes to get the data in and out. The file system makes much more sense; that's what /tmp is for. Note that the disks are likely never to see the data anyway: it may not be flushed there until much later (after you've removed the temp file), and even if it is, it will likely still remain in memory (cached). And when it doesn't, that's when the data would have been too big to fit in memory in the first place.
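In its plainest form, the bucket approach could look like the sketch below. The cmd1 and cmd2 functions are stand-ins made up for illustration, and the trap shows the clean-up you have to take care of yourself when the temp file keeps its name:

```shell
# cmd1 and cmd2 are hypothetical stand-ins for the real commands.
cmd1() { printf '%s\n' banana apple cherry; }
cmd2() { sort; }

tmp=$(mktemp) || exit
trap 'rm -f -- "$tmp"' EXIT   # clean-up is our responsibility here

# cmd2 starts only after cmd1 has exited, and only if cmd1 succeeded.
cmd1 > "$tmp" && result=$(cmd2 < "$tmp")
printf '%s\n' "$result"
```

No pipe connects the two commands: each one reads or writes the file directly.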
Note that temporary files are used all the time in shells. In most shells, here documents and here strings are implemented with temp files.
In:
cat << EOF
foo
EOF
most shells create a temp file, open it for writing and for reading, delete it, fill it up with foo, and then run cat with its stdin duplicated from the fd open for reading. The file is deleted even before it is filled up (which gives the system a clue that whatever is written there doesn't need to survive a power loss).
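The same sequence of steps can be reproduced by hand. This is only a sketch of the idea, not what any particular shell does internally; the fd numbers 3 and 4 and the variable names are arbitrary choices:

```shell
tmp=$(mktemp) || exit
exec 3> "$tmp" 4< "$tmp"   # open once for writing (fd 3), once for reading (fd 4)
rm -f -- "$tmp"            # the name is gone; the open fds keep the data reachable

echo foo >&3               # fill up the now-nameless file
exec 3>&-                  # done writing

if [ -e "$tmp" ]; then still_there=yes; else still_there=no; fi
content=$(cat <&4)         # cat reads through the fd opened before the deletion
exec 4<&-
printf '%s %s\n' "$still_there" "$content"
```

The file no longer has a name when the data is written, yet cat can still read it back through fd 4.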
You could do the same here with:
tmp=$(mktemp) && {
rm -f -- "$tmp" &&
cmd1 >&3 3>&- 4<&- &&
cmd2 <&4 4<&- 3>&-
} 3> "$tmp" 4< "$tmp"
Then, you don't have to worry about clean-up as the file is deleted from the start. No need for extra processes to get the data in and out of buckets: cmd1 and cmd2 do it by themselves.
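To try that out with actual commands, the construct can be wrapped in a function. This is a sketch: run_seq, produce, fail and consume are names made up for the example, and the function expects two command or function names as arguments:

```shell
# run_seq is a hypothetical helper: run "$1", then feed its output to "$2".
run_seq() {
  tmp=$(mktemp) || return
  {
    rm -f -- "$tmp" &&
      "$1" >&3 3>&- 4<&- &&
      "$2" <&4 4<&- 3>&-
  } 3> "$tmp" 4< "$tmp"
}

produce() { printf '%s\n' 3 1 2; }
fail()    { echo partial; return 1; }
consume() { sort -n; }

ok=$(run_seq produce consume)                      # consume runs after produce has exited
skipped=$(run_seq fail consume; echo "status=$?")  # consume is never started
printf '%s\n' "$ok" "$skipped"
```

Because the two commands are joined with && rather than a pipe, a failure of the first one is not lost: the second command is skipped and the non-zero status is what the caller sees.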
If you wanted to store the output in memory, using a shell for that would not be a good idea. First, shells other than zsh can't store arbitrary data in their variables, so you'd need to use some form of encoding. And then, to pass that data around, you'd end up duplicating it in memory several times, if not writing it to disk when using a here-doc or here-string.
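A quick way to see the first problem is to push a NUL byte through a variable and count what comes back. This is a sketch that assumes it runs in a shell other than zsh (bash or dash, for instance); exactly how many bytes survive differs between shells:

```shell
# Byte count of the raw stream: "a", NUL, "b".
original=$(printf 'a\000b' | wc -c)

# Round-trip the same bytes through a variable.
# (bash warns on stderr about the ignored NUL byte, hence the redirection.)
{ var=$(printf 'a\000b'); } 2>/dev/null
stored=$(printf '%s' "$var" | wc -c)

echo "original=$((original)) stored=$((stored))"
```

The second count comes out smaller than the first, as the NUL byte (and in some shells everything after it) doesn't make it into the variable.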
You could use perl instead, for instance:
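perl -MPOSIX -e '
# Map the wait status in $? to a shell-style exit code (signal number | 128 if killed).
sub status() {return WIFEXITED($?) ? WEXITSTATUS($?) : WTERMSIG($?) | 128}
$/ = undef;                                  # slurp mode: read all input at once
open A, "-|", "cmd1" or die "open A: $!\n";  # run cmd1, reading its stdout
$out = <A>;                                  # the whole output, held in memory
close A;                                     # waits for cmd1 and sets $?
$status = status;
exit $status if $status != 0;                # do not run cmd2 if cmd1 failed
open B, "|-", "cmd2" or die "open B: $!\n";  # only now start cmd2
print B $out;
close B;                                     # waits for cmd2 and sets $?
exit status'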
sponge and other ways of buffering stdout (not duplicate, because of "cmd2 doesn't start running until cmd1 has completely finished") – Michael Homer Aug 24 '17 at 04:26
false | sponge | echo ok would still output ok with a zero return value. – Cyker Aug 25 '17 at 00:17
With cmd1 | cmd2, the two processes are started concurrently, always. Synchronization between them happens only through I/O, i.e. cmd2 may want to read something and blocks until cmd1 has outputted something. – Kusalananda Aug 26 '17 at 06:47