Using dd is not safer, faster, or more reliable. In fact, here, it introduces two additional risks of failure. Neither risk is likely to be a problem in practice if people follow those instructions manually, but they would be significant bugs if the instructions were in an automated script.
Bug: race condition
Observe:
bash-5.0$ echo hello | tee >(sleep 1; echo done); echo next step
hello
next step
bash-5.0$ done
In bash, output process substitutions are asynchronous. When a command contains a process substitution >(…), bash doesn't wait for the process substitution to finish before moving on to the next command.
So when … | tee >(dd of=/dev/sda) | sha256sum returns, there may be data that's still in transit through dd. This is very unlikely to last long enough for a human to react and type another command, but it could break a script that runs some other command like eject or mount afterwards.
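One way a script could avoid this race is to replace the process substitution with a named pipe and a background job, which the shell can wait for explicitly. The following is a minimal sketch, not something taken from the original instructions: the function name tee_and_wait is made up, and the `sleep 1; cat; echo done` consumer is only a stand-in for a slow writer.

```bash
# Sketch: same shape as the demo above, but the consumer is a background
# job reading from a FIFO, so the script can wait for it explicitly.
tee_and_wait() {
    dir=$(mktemp -d) || return
    mkfifo "$dir/fifo" || return
    # Hypothetical slow consumer standing in for the device writer.
    ( sleep 1; cat >/dev/null; echo done ) <"$dir/fifo" &
    consumer=$!
    echo hello | tee "$dir/fifo"
    wait "$consumer"    # block until the consumer has really finished
    rm -r "$dir"
    echo next step
}

tee_and_wait
# hello
# done
# next step
```

With the explicit wait, "next step" can only be printed after the consumer has exited, so a following eject or mount would not run too early.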
Bug: missing error detection
Let's start with the nominal case where everything works.
bash-5.0$ head -c 1m </dev/zero | tee >(cat >/dev/null) | wc -c; echo $?
1048576
0
Now let's see what happens if the data writing command fails.
bash-5.0$ head -c 1m </dev/zero | tee >(false) | wc -c; echo $?
8192
0
The command has a success status because the exit status of a pipeline only depends on the right-hand side. The idea is that if you pipe a data producer into a data processor, it's the job of the data processor to detect failures. Unfortunately, this can only apply when the data format allows the data processor to detect failures, which is not the case in general, and in particular is not the case here.
Note that tee completely gave up once it failed to write to the pipe connected to false. Since false never read any data, the only data that made it through to wc -c is two PIPE_BUF-sized blocks (one that tee wrote to both pipes, and one that tee wrote only to the pipe to wc and failed to write to the pipe to false). Depending on the relative timing of false exiting vs tee writing to the pipes and wc consuming the data, it's possible that only one block, or none at all, made it through.
It's possible to detect the failure of tee by setting the pipefail option. (This possibility exists in ksh, in bash and in zsh but not in plain sh.)
bash-5.0$ set -o pipefail; head -c 1m </dev/zero | tee >(false) | wc -c; echo $?
8192
141
tee failed to write to a pipe, so it died of a SIGPIPE, and the corresponding shell status is 128 + the numerical value of SIGPIPE (which is 13 on Linux, hence 141). Thanks to the pipefail option, this causes the pipeline as a whole to exit with the same status.
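The effect of pipefail is easier to see with a pipeline whose failure is deterministic. The sketch below assumes bash, and the three function names are made up for the illustration. It also shows bash's PIPESTATUS array, which records the status of every stage of the last pipeline and is another way to find out which command failed.

```bash
# Each function body runs in a subshell (parentheses), so the option
# changes don't leak into the calling shell.

# Default behaviour: only the last command's status counts.
status_plain() (
    set +o pipefail
    status=0
    false | true || status=$?
    echo "$status"
)

# With pipefail: the rightmost non-zero status wins.
status_pipefail() (
    set -o pipefail
    status=0
    false | true || status=$?
    echo "$status"
)

# PIPESTATUS (bash-specific) lists the status of every stage.
status_per_stage() (
    set +o pipefail
    false | true
    echo "${PIPESTATUS[*]}"
)

status_plain        # prints 0
status_pipefail     # prints 1
status_per_stage    # prints 1 0
```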
Do note that the pipeline reflects the failure of tee, and not directly the failure of the command in the process substitution. If the command in the process substitution successfully reads all the data but does not process it successfully, the error will not be detected.
bash-5.0$ head -c 1m </dev/zero | tee >(cat >/dev/null; false) | wc -c; echo $?
1048576
0
wc -c processed all the data. cat >/dev/null; false simulates a command that didn't process all of its input correctly. Nonetheless the command's status indicates a success.
What this means in your real-world example is that if there's an error at the end of the data, for example because the target device is very slightly smaller than the image, this error will not be detected (except through an error message from dd).
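If a script really needs the writer's own exit status, one option is to run the writer as a background job fed through a named pipe, then collect its status with wait. This is a hedged sketch rather than a recommendation: write_and_check is a hypothetical name, and `cat >/dev/null; false` again stands in for a writer that reads everything but fails at the end.

```bash
write_and_check() {
    dir=$(mktemp -d) || return
    mkfifo "$dir/fifo" || return
    # Stand-in for the real writer: consumes all input, then fails.
    ( cat >/dev/null; false ) <"$dir/fifo" &
    writer=$!
    head -c 1048576 </dev/zero | tee "$dir/fifo" | wc -c
    status=0
    wait "$writer" || status=$?    # the writer's own exit status
    rm -r "$dir"
    return "$status"
}

if write_and_check; then
    echo "write succeeded"
else
    echo "write failed"
fi
```

Here wc -c still sees all the data, but the function as a whole reports the writer's failure instead of hiding it.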
Simple, correct solution
set -o pipefail
curl -L "$iso" | tee /dev/sda | sha256sum
Or, arguably simpler:
curl -L "$iso" | tee >/dev/sda >(sha256sum)
Note that without pipefail, this second command will succeed if curl fails. However, this failure is guaranteed to cause a wrong checksum.
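In a script, the checksum would normally be compared against a known value rather than just printed. Here is a minimal sketch under stated assumptions: bash with GNU sha256sum, a hypothetical helper named write_and_verify, printf standing in for curl, and a temporary file standing in for /dev/sda so that the example can be run harmlessly.

```bash
set -o pipefail

# Hypothetical helper: copy stdin to the target while hashing it, and
# succeed only if every stage succeeded and the hash matches.
# $1 = expected SHA-256 (hex), $2 = target file or device
write_and_verify() {
    local expected=$1 target=$2 actual
    actual=$(tee "$target" | sha256sum | cut -d' ' -f1) || return
    [ "$actual" = "$expected" ]
}

# Demo with stand-ins for the download and the device.
target=$(mktemp)
expected=$(printf 'hello\n' | sha256sum | cut -d' ' -f1)
if printf 'hello\n' | write_and_verify "$expected" "$target"; then
    echo "image written and verified"
else
    echo "write or checksum failed"
fi
rm -f "$target"
```

Because tee writes the target itself, a write error makes tee exit with a non-zero status, which pipefail then propagates to the caller.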
A general note on the usage of dd
"It seems like most tutorials that deal with putting iso files on installation media always use dd, which is why 'is dd still relevant these days?' doesn't answer my question"
Well, it did, more or less. Specifically, it answered the question of whether dd serves any purpose: it doesn't. It didn't cover the specific problems in using dd this particular way, which this time aren't actually due to dd itself.
The reason most tutorials use dd is that most tutorials use dd. It's a self-perpetuating legend. People use dd because they've seen it used elsewhere, even though they don't really understand why. Its syntax is unlike every other command and so it appears to be somewhat mysterious and powerful. But in dd of=/dev/sda, all the power is in /dev/sda and none in dd. It's just a pretentious, fragile way of writing cat >/dev/sda.