18

After reading through some useful discussions about pipes, such as Get exit status of process that's piped to another and Exit when one process in pipe fails, I still can't avoid starting the second command when the first command fails. Am I missing a fundamental detail about pipes?

So for example

$ somecommand | tar -T - -czf /tmp/someProject.tar.gz

shouldn't create the almost empty tar.gz file if somecommand didn't work properly and produced just a few error messages instead of the expected file list.
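A self-contained way to see the behaviour (using an inline failing command as a stand-in for somecommand, and a throwaway path from mktemp instead of /tmp/someProject.tar.gz):

```shell
# Hypothetical stand-in for a failing somecommand: it writes an error
# message to stderr, exits non-zero, and produces no file list.
archive=$(mktemp -u).tar.gz
{ echo "somecommand: something went wrong" >&2; exit 1; } | tar -T - -czf "$archive"
# tar has already run on empty input, so the (almost empty) archive
# exists anyway:
ls -l "$archive"
```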

AdminBee
  • 22,803
stoqlt
  • 181
  • 4
    Are you open to workarounds that will still create the tarball but delete it immediately in the case where the command had failed? Something like set -o pipefail; somecommand | tar -T - -czf /tmp/someProject.tar.gz || rm /tmp/someProject.tar.gz? – terdon Jul 09 '21 at 12:35
  • 3
    Possible approach: ifne. – Kamil Maciorowski Jul 09 '21 at 12:54
  • So I shouldn't, and won't, expect the second command to wait. @terdon 's suggestion, however, is a nice solution. Thank you for all of your comments – stoqlt Jul 09 '21 at 13:28

5 Answers

28

Yes, there is a bit of a fundamental detail about pipes there.

The point of a pipeline is to run the two or more commands in parallel, which avoids having to store all the data in full and can save time in that all processes can work at the same time. This by definition means that the second command starts before the first exits, so the exit status for the first isn't available yet.
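You can see both points in one line: the right-hand command runs even though the left-hand command fails immediately, and the pipeline's exit status (without pipefail) is that of the last command:

```shell
# The right-hand side of the pipe runs even though the left-hand
# side fails at once; the pipeline reports the status of the last
# command, not the first.
false | echo "the second command is already running"
status=$?
echo "pipeline exit status: $status"
```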

The simple workaround is to use a temporary file instead. It shouldn't be much of a problem with storage here since we're passing just the list of file names, and not the data itself. E.g.:

tmp=$(mktemp)
if somecommand > "$tmp"; then
    tar -T - -czf /tmp/someProject.tar.gz < "$tmp"
fi
rm -f "$tmp"

Or indeed, as terdon comments, just let tar run, and remove the archive afterwards if somecommand failed. But if somecommand produces a partial but significant list of files before failing, creating the to-be-removed archive can still cause some amount of unnecessary I/O.
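A sketch of that cleanup approach; bash is assumed here because pipefail is not available in every /bin/sh, and false stands in for a failing somecommand:

```shell
#!/bin/bash
# Fail the whole pipeline if any stage fails (bash/ksh/zsh feature).
set -o pipefail
archive=$(mktemp -u).tar.gz
if ! false | tar -T - -czf "$archive"; then
    rm -f "$archive"    # the producer failed, drop the near-empty archive
fi
```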

Also, at least in GNU tar, by default -T does some processing of quotes and lines that look like command line options, so if you have nasty filenames, you may need to take that into account, or look into --verbatim-files-from, or --null. Similar issues might exist with other tar implementations.
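For instance, assuming GNU find and GNU tar, NUL-delimited names make the list safe for any filename, including ones with quotes, newlines, or leading dashes:

```shell
# Build a NUL-delimited file list first, then archive from it.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"
list=$(mktemp)
if find "$dir" -type f -print0 > "$list"; then
    # GNU tar warns about stripping the leading "/"; harmless here.
    tar --null -T "$list" -czf "$dir.tar.gz" 2>/dev/null
fi
rm -f "$list"
```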

ilkkachu
  • 138,973
12

It seems as if you are misunderstanding the nature of piped commands.

In a pipeline of commands, all commands are started in parallel (see here e.g.). That is why in your construct there is no way to have tar "wait" for successful completion of somecommand: tar reads the output of somecommand as it is produced.

There are a few workarounds you could apply to alleviate the situation:

  • buffer the output of somecommand in a temporary file and only run tar if the exit status of somecommand signals "success", and remove it afterwards (only feasible if you have sufficient storage space)
  • as mentioned by @terdon, use the pipefail option and remove the unusable tar.gz file if somecommand fails.
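A minimal illustration of what pipefail actually changes (bash assumed, since it is not in every /bin/sh):

```shell
#!/bin/bash
false | true
without=$?          # status of the last command only: 0
set -o pipefail
false | true
with=$?             # any failing stage now fails the whole pipeline: 1
echo "without pipefail: $without, with pipefail: $with"
```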
AdminBee
  • 22,803
2

The easiest way would be to use &&. Since you also want to capture the output, you can redirect it to a file first and use that file later.

somecommand > somefile && tar -T somefile -czf /tmp/someProject.tar.gz

If somecommand exits with anything other than 0, the tar will not be executed.
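The short-circuit behaviour is easy to check with stand-ins for the real commands (false plays a failing somecommand, true a succeeding one):

```shell
# && only runs its right-hand side if the left-hand side succeeded.
false && echo "never printed"
after_false=$?      # still 1: the echo was skipped

true && after_true=yes
echo "after true: $after_true"
```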

-2

While other answers are absolutely correct, a hackaround would look like

my-command | (sleep 1; tar cvf - output)

If my-command dies prematurely within 1 second, the second half of the pipe will get SIGPIPE and probably die too. You can obviously tune the delay with nanosleep, perl, etc.

unixux
  • 5
  • SIGPIPE is sent to the writer, not the reader, of a pipe. All you get in this example is an empty, or almost empty, stdin, no matter if you sleep a second or not. – Guntram Blohm Jul 11 '21 at 15:14
  • apart from the sleep not helping, the command is messed up here. tar f - would also tell it to write the archive file to stdout, which here is connected to the terminal. That's usually not very useful, and at least GNU tar refuses to do it. Without -T, tar would look for filenames on the command line, so here, output would be taken as the name of a file to add to the archive. – ilkkachu Jul 14 '21 at 10:59
-4

A pipe is not the way to do this. The && construct executes the second command only when the first command succeeds.

E.g. in a directory where there is no file named "goober":

$ ls goober | echo "done"
done
ls: cannot access 'goober': No such file or directory
$ ls goober && echo "done"
ls: cannot access 'goober': No such file or directory

In your example:

$ somecommand && tar -T - -czf /tmp/someProject.tar.gz

if somecommand fails, then tar will not run.

Wastrel
  • 151
  • 4
    In the question data flows from somecommand to tar and it's important it does. Your "solution" breaks this connection. – Kamil Maciorowski Jul 10 '21 at 13:45
  • If somecommand succeeds, then tar will run as intended. The OP was complaining about tar running and creating an empty file because it had nothing to do when somecommand failed. – Wastrel Jul 10 '21 at 13:49
  • 3
    My point is the OP wants tar to read the output of somecommand. That's why this other answer uses a regular file. Your answer passes nothing to tar (it will read from the terminal or whatever the stdin of the shell is; it will not read from somecommand). For now it's more useless than the code from the question because your tar will never do what the OP wants. – Kamil Maciorowski Jul 10 '21 at 14:00
  • Yes, he's going to have to change his tar command to specify what to tar -- which will be the output of somecommand. – Wastrel Jul 11 '21 at 13:40
  • 1
    ... only you will then need to store the output of somecommand somewhere, which brings us back to using a temporary file, as in the accepted answer. – AdminBee Jul 12 '21 at 07:16
  • Sure. And if somecommand is scp from another machine or wget from somewhere, that would be perfectly normal. But we don't know what somecommand does. ¯\_(ツ)_/¯ – Wastrel Jul 13 '21 at 13:35
  • 2
    Well, tar -T - means to get names of files to extract or add from stdin, so we can infer somecommand produces some list of filenames. With ... && tar -T - -czf foo.tar.gz, the stdin of tar would still be connected to the terminal, and it would block waiting for the user to enter the file names. I guess you could then copypaste them from what somecommand printed, but that's not very user friendly, and it's actually exactly what pipes are for... – ilkkachu Jul 14 '21 at 10:39