When redirecting input from a file, using `program <file` has several advantages over `cat file | program`. In addition to efficiency (as Sotto Voce noted), it also gives the program direct access to the file, which lets it do things other than just read it from beginning to end. Some examples:
It can read from the file out of order. If you use `cat hugefile | tail`, the `tail` program must read through the entire file to get to the end, but if you use `tail <hugefile`, it can use `lseek()` to skip to near the end of the file, and read backward until it has what it needs.
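A quick way to try this yourself is sketched below. The file name, the use of `seq`, and the file size are arbitrary choices for the demonstration, not anything from the question. Both forms print the same last line; the difference is only in how much work happens behind the scenes.

```shell
# Build a throwaway test file (name and size are arbitrary).
hugefile=$(mktemp)
seq 1 200000 > "$hugefile"

# Pipe: cat pushes every byte through the pipe before tail can see the end.
via_pipe=$(cat "$hugefile" | tail -n 1)

# Redirection: tail gets the file itself on stdin and is free to seek.
via_redirect=$(tail -n 1 < "$hugefile")

echo "pipe:     $via_pipe"
echo "redirect: $via_redirect"

rm -f "$hugefile"
```

Prefixing each command with `time` on a genuinely huge file should make the difference visible, though the exact numbers will depend on your system and on which `tail` implementation you have.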
It can read only part of the file without causing trouble. If you use `head <hugefile`, `head` will read what it needs, and then exit (which closes the file); no muss, no fuss. But if you use `cat hugefile | head`, when `head` exits and closes the pipe, `cat` will still be trying to push data into the now-destinationless pipe. To solve this, the system sends the `SIGPIPE` signal to `cat`. Many programs will print an error message when they get `SIGPIPE`; the versions of `cat` I've tested don't do that, but they do exit with an error status. If this is in a script that sets `-e` and `pipefail` modes (as in "Unofficial Bash Strict Mode"), bash will treat this as a fatal error and exit the script. (This sort of thing is why I don't recommend `set -e`.)
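Here is a minimal sketch of that failure mode. The bash-only parts (`PIPESTATUS`, `pipefail`) are wrapped in `bash -c` so the snippet behaves the same whatever shell you paste it into. The temp file and the variable names are made up for the demonstration, and the exact status `cat` reports may vary between systems.

```shell
# Throwaway input, comfortably larger than a typical pipe buffer.
hugefile=$(mktemp)
seq 1 200000 > "$hugefile"

# Exit status of cat when head quits early (PIPESTATUS is bash-only).
cat_status=$(bash -c 'cat "$1" | head -n 1 >/dev/null; echo "${PIPESTATUS[0]}"' _ "$hugefile")

# The same pipeline under "strict mode": does the script get past it?
strict_output=$(bash -c 'set -eo pipefail; cat "$1" | head -n 1 >/dev/null; echo survived' _ "$hugefile") || true

# Redirection: head is the only process involved, so nothing is left to fail.
head -n 1 < "$hugefile" > /dev/null
redirect_status=$?

echo "cat status in 'cat | head':  $cat_status"       # typically 141 = 128 + SIGPIPE (13)
echo "strict-mode script printed:  '$strict_output'"  # empty if the script was aborted
echo "head status in 'head <file': $redirect_status"

rm -f "$hugefile"
```

If the strict-mode line prints nothing, the script died at the pipeline and never reached its `echo`, which is exactly the surprise described above.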
It can access the file's properties. If you use `pv <hugefile | slowprocessor`, `pv` (the "pipe viewer" utility) will check the file's size, and give you a progress bar showing what percentage of the file has been sent so far, and also an estimated time to completion. But if you use `cat hugefile | pv | slowprocessor`, `pv` will have no idea how big the file is and only show the absolute amount that's been sent. (Note: `pv` does have a `-s` option that lets you explicitly tell it how big you think the file is.)
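You can see the difference from the program's side with a small shell-level analogy. `stdin_kind` below is a hypothetical helper written for this illustration, not a standard utility; a real program like `pv` would presumably use something like `fstat()` rather than shell tests. The sketch assumes `/dev/stdin` is usable in `[ ... ]` tests, which holds in bash and on Linux generally.

```shell
# Report what kind of object stdin is connected to.
# (stdin_kind is a made-up name for this example.)
stdin_kind() {
    if [ -f /dev/stdin ]; then
        echo "regular file"      # size, timestamps etc. are available
    elif [ -p /dev/stdin ]; then
        echo "pipe"              # just a byte stream, no size to ask about
    else
        echo "something else"
    fi
}

sample=$(mktemp)
echo "some data" > "$sample"

from_redirect=$(stdin_kind < "$sample")
from_pipe=$(cat "$sample" | stdin_kind)

echo "stdin via '<file':      $from_redirect"
echo "stdin via 'cat file |': $from_pipe"

rm -f "$sample"
```

With redirection the program is holding the file itself and can ask how big it is; behind `cat` it only sees an anonymous pipe.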
So overall, having direct access to the input file allows the program a lot more flexibility in how it uses the file.
Also, many programs (including all of the ones I've used as examples here) let you specify input files directly as command arguments (e.g. `tail hugefile` instead of either `tail <hugefile` or `cat hugefile | tail`). This gives the program even more information about, and control over, how it accesses the file. It also (again, for commands that support this) allows multiple input files (like `cat |` would). So for commands that support it, passing the file as an argument is usually preferred over either a pipe or input redirection.
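As a small illustration of the "more information" point, here is what `wc -l` reports in each case. The temp files and variable names are invented for the example, and the exact spacing of `wc` output differs between implementations.

```shell
a=$(mktemp)
b=$(mktemp)
printf 'one\ntwo\n' > "$a"
printf 'three\n' > "$b"

with_redirect=$(wc -l < "$a")    # wc sees only a stream, so it prints just a count
with_argument=$(wc -l "$a")      # wc knows the name and prints it next to the count
both_files=$(wc -l "$a" "$b")    # one line per file, plus a combined total

echo "redirect: $with_redirect"
echo "argument: $with_argument"
echo "two files:"
echo "$both_files"

rm -f "$a" "$b"
```

With `cat "$a" "$b" | wc -l` you would get only the combined count, with no way for `wc` to tell where one file ends and the next begins.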
[...] `tr ... < <(cat some-file)` or `cat some-file | tr ...` instead of just `tr ... < some-file`? Why? – muru Mar 15 '23 at 03:57
[...] `<`, redirecting the process substitution, which is a pathname. This makes it no more "special" than redirecting from some file, just like with redirecting from stuff beneath `/dev/fd`. – Kusalananda Mar 15 '23 at 06:38