Some of the details in this answer assume that the user uses a shell other than zsh. The details differ slightly for the zsh shell due to its MULTIOS feature.
strings *.bin > bin.txt | sort -n bin.txt > logs1.txt
This runs strings *.bin and redirects the result to bin.txt. At the same time as strings starts, sort is started and sorts the file bin.txt. The pipe has no function at all in this pipeline other than allowing the two commands to run concurrently.
Usually, the pipe is used to transfer the standard output of the left side's command to the standard input of the right side's command, but since the output of strings is redirected to a file and sort reads from a file named on its command line, no data ever passes through the pipe.
Since both strings and sort are started concurrently, sort may possibly find the end of the bin.txt file before strings has finished writing the whole file. It's fairly random how much data sort will end up reading, if anything.
Correct use of the pipe would have looked like
strings -- *.bin | sort -n > logs1.txt
Here, strings writes directly to the input of sort rather than to a file, and sort reads from the output of strings instead of from a file.
The right-hand side of the pipe will be temporarily blocked if the left-hand side does not produce data fast enough, and the left-hand side will be temporarily blocked if the right-hand side can't consume the data fast enough. This way, the two utilities are synchronized, and you are guaranteed that sort will read the full output of strings.
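The blocking behaviour can be observed with a deliberately slow writer. The following is only a minimal sketch: slow_producer is a hypothetical stand-in for strings (it is not part of the original commands), and the file name sorted.txt is made up for the demonstration. Despite the pauses, sort receives every line, because it keeps reading until the writing end of the pipe is closed.

```shell
#!/bin/sh
# slow_producer is a hypothetical stand-in for "strings": it emits
# three numbers, pausing one second after each.
slow_producer() {
    for n in 3 1 2; do
        printf '%s\n' "$n"
        sleep 1
    done
}

# Work in a scratch directory so no existing files are touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# sort blocks on the pipe until slow_producer closes its end,
# so it sees all three lines regardless of the pauses.
slow_producer | sort -n > sorted.txt

cat sorted.txt
```

The file sorted.txt ends up containing 1, 2 and 3, one per line, no matter how slowly the left-hand side produces them.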
strings *.bin > bin.txt & sort -n bin.txt > logs1.txt
This suffers from the same issue as the previous command as both strings and sort are started concurrently. The & starts strings in the background and sort is started immediately after that. Both utilities write to or read from bin.txt independently of each other, and it's chance that decides how much of the file is written before sort reaches its end.
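The race can be made visible with a slow writer. This is a sketch under assumptions, not a reproduction of the original setup: slow_producer is a hypothetical stand-in for strings, and unsorted.txt and racy.txt are made-up file names. The intermediate file is created up front only so that sort cannot fail on a missing file.

```shell
#!/bin/sh
# slow_producer is a hypothetical stand-in for "strings": it emits
# three numbers, pausing one second after each.
slow_producer() {
    for n in 3 1 2; do
        printf '%s\n' "$n"
        sleep 1
    done
}

# Work in a scratch directory so no existing files are touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# Start from an empty file so that sort cannot fail on a missing file.
: > unsorted.txt

# Same shape as the command above: the writer runs in the background
# and sort starts immediately, without waiting for it.
slow_producer > unsorted.txt &
sort -n unsorted.txt > racy.txt

# Let the background writer finish before comparing the two files.
wait

echo "lines written by the producer: $(wc -l < unsorted.txt)"
echo "lines seen by sort:            $(wc -l < racy.txt)"
```

The producer always writes three lines in the end. How many of them sort saw depends on timing; with pauses this long it will usually be fewer than three.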
strings *.bin > bin.txt followed by sort -n bin.txt > logs2.txt.
Here, you manually synchronize the two utilities by allowing strings to finish writing to the intermediate file bin.txt before using sort to sort its contents. There is no issue, and you are guaranteed that sort will be able to read the complete output of strings from the file.
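If the two steps are to go on a single line, the same ordering can be expressed with &&, which starts the second command only after the first one has exited successfully. A minimal sketch, with the hypothetical slow_producer standing in for strings and made-up file names:

```shell
#!/bin/sh
# slow_producer is a hypothetical stand-in for "strings": it emits
# three numbers, pausing one second after each.
slow_producer() {
    for n in 3 1 2; do
        printf '%s\n' "$n"
        sleep 1
    done
}

# Work in a scratch directory so no existing files are touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# The writer runs to completion first; sort starts only afterwards,
# and only if the writer exited with status 0.
slow_producer > unsorted.txt && sort -n unsorted.txt > sorted.txt

cat sorted.txt
```

Unlike | and &, the && operator never runs the two commands at the same time, so the intermediate file is complete before it is read.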
Summary: Your first two commands do not synchronize the strings and sort utilities. The writing by strings is independent of the reading by sort. This means sort may find the end of the intermediate file before strings is finished writing all data. This, in turn, means you may get an incomplete end result. The amount of data your incomplete result will contain depends on chance.
The fact that the two utilities are started concurrently with each other also means that sort may even read a pre-existing bin.txt to the end before the shell even has time to truncate the file and start strings.
Solution: Write all data to the intermediate file first, then read from it, as in your third example. Or, allow the two utilities to communicate the data directly between themselves using the pipe, as in my suggestion above:
strings -- *.bin | sort -n > logs1.txt
Or, to keep a copy of the unsorted strings output for future reference:
strings -- *.bin | tee bin.txt | sort -n > logs1.txt
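The effect of tee can be checked on a small input. In this sketch, printf stands in for strings -- *.bin, and the file names unsorted.txt and sorted.txt are made up for the demonstration:

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# printf stands in for "strings -- *.bin" here.
# tee saves a copy of its input to unsorted.txt and also passes
# the same data on to sort through the pipe.
printf '%s\n' 3 1 2 | tee unsorted.txt | sort -n > sorted.txt

echo "unsorted copy:"
cat unsorted.txt
echo "sorted result:"
cat sorted.txt
```

The copy in unsorted.txt keeps the original order (3, 1, 2), while sorted.txt holds the same lines in numeric order (1, 2, 3). Both come from a single run of the producer.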
Further relevant reading here on U&L:
sort foo.txt | sort bar.txt, the pipe would still be written to, and if foo.txt was large enough, the pipeline would hang since nothing was reading from the pipe (small amounts of data would just disappear in the pipe buffers). – ilkkachu Jun 17 '22 at 08:50