Piping errors to my function causes command to be ignored after 1 failure

Question

Context: I have a Bash Script that copies files

function log () {
    read IN
    if [ "$IN" == "" ]; then
        :
    else

        echo "$datetime"$'\t'"$IN" | tee -a logfile
    fi
}

function copy () {
    command cp -L --parents $@
}

...
copy -R /etc . 2>&1 | log
...

Problem: When cp -L --parents -R /etc 2>&1 is run manually, I get about 10 failures (broken Symlinks, expected), and the whole /etc has been copied.

But when the script is run, only 1 failure is reported, and /etc is only copied to the point where the 1 failure happened.

In the attempt to troubleshoot, all I did was to remove 2>&1 form the script and the copying went on as expected.

Question: Is my log function causing the trouble or is this some syntactical issue (albeit not script-breaking) in the way the script is written?

Yes, your copy was killed by a SIGPIPE, when trying to pipe its output to the stdin of echo IN | ..., which does not consume it. Better explain what that log function is supposed to do. — , Mar 23 '20 at 05:29
So to solve this I would need another way to pipe the command/append to the command? such as >>? — Timothy Wong, Mar 23 '20 at 05:31
There are other lines in the script that will also call on log i.e. a "universal" function to log stdout into logfile — Timothy Wong, Mar 23 '20 at 05:34
The log function doesn't make much sense - if the first line is empty after field splitting, you don't write anything, but if it is, you write IN to a file, and in either case you ignore the rest of the input. Is that really what it's intended to do? — muru, Mar 23 '20 at 05:36
Sort of. I actually am adding a timestamp to IN before writing it and that's why I'm doing a check. — Timothy Wong, Mar 23 '20 at 05:52
@mosvy sorry to bother -- but I just remembered that I actually need to write the extra stuff to the IN before writing and that's why I used IN in the first place... Any wqay around it while piping to tee? — Timothy Wong, Mar 23 '20 at 05:54
Related: Prepending a timestamp to each line of output from a command — Kusalananda, Mar 23 '20 at 06:30

score 2 · Answer 1 · answered Mar 23 '20 at 07:48

Your log function is the culprit. What it does is: read one line¹, and if the line is not empty, print a timestamp and then the line content. That's all it does: once it's processed one line, it returns.

When cp emits a first error message, the log function reads it and processes it. Since the log function then returns, the process at the right-hand side of the pipe exits, which causes the reading end of the pipe to close. When cp emits a second error message, it tries to write to a closed pipe, which causes it to die from a SIGPIPE signal. Standard error is line-buffered (by default, and cp doesn't try to change that), so buffering doesn't come into play.

To process all lines of input, you need read in a loop.

log () {
    while IFS= read -r IN; do
        echo "$datetime"$'\t'"$IN"
    done | tee -a logfile >&2
}

I also fixed the read call to IFS= read -r to actually read one line. I removed the special handling for empty lines which is pointless (there wouldn't be empty lines in the input). I presume you put it in to handle the case when the input is empty (zero lines of input), but the correct way to handle that is to check the return status of the read command. I also fixed log to print to its standard error, since it's used to process error messages.

See Prepending a timestamp to each line of output from a command for other ways to do this.

Beware that putting the command on the left-hand side of a pipe has a major downside: it causes the exit status to be ignored. So if cp fails, your script will continue happily. The errors will be logged somewhere, but subsequent commands will execute normally and nothing will alert you to the fact that you should go and read the logs. In bash, ksh or zsh, you can set the pipefail option in addition to set -e so that your script exits with an error status as soon as any command fails, even on the left-hand side of a pipeline.

set -o errexit -o pipefail
copy … |& log

Alternatively, use process substitution instead of a pipe to pipe the error output through another process. Process substitution has slightly different caveats than pipes; errors in log are effectively ignored, and the command may return before log has completed (in bash, log runs in the background).

set -e
copy … 2> >(log)

¹ _{Almost, you'd need IFS= read -r IN to never read more and not possibly mangle the line.}

Piping errors to my function causes command to be ignored after 1 failure

1 Answers1