The cat command can do things that the shell can't necessarily do (or at least, can't do easily). For example, suppose you want to print characters that might otherwise be invisible, such as tabs, carriage returns, or newlines. There *might* be a way to do so with only shell builtin commands, but I can't think of any off the top of my head. The GNU version of cat can do so with the `-A` argument or the `-v -E -T` arguments (I don't know about other versions of cat, though). You could also prefix each line with a line number using `-n` (again, I don't know whether non-GNU versions can do this).
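To make those flags concrete, here is a minimal sketch. It assumes GNU cat; `sample` is a made-up helper that just emits a tab, a carriage return, and two newlines.

```shell
# Hypothetical helper: emits text containing a tab, a carriage return, and newlines.
sample() {
    printf 'a\tb\r\nc\n'
}

# GNU cat: -A is shorthand for -v -E -T.
# Tabs show up as ^I, carriage returns as ^M, and each line end as $.
sample | cat -A
# a^Ib^M$
# c$

# Prefix every output line with its line number.
sample | cat -n
```

On a cat that lacks `-A`, the first command will most likely fail with a usage error rather than print anything useful.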
Another advantage of cat is that it can easily read multiple files. To do so, one can simply type `cat file1 file2 file3`. To do the same with a shell, things would get tricky, although a carefully-crafted loop could most likely achieve the same result. That said, do you really want to take the time to write such a loop, when such a simple alternative exists? I don't!
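For comparison, this is roughly what such a loop might look like. It is only a sketch: `shell_cat` is a name I made up, the file names are placeholders, and the loop is weaker than the real thing (it can't pass NUL bytes through, and it adds a newline to a final line that lacks one).

```shell
# shell_cat is a hypothetical helper, not a standard command.
shell_cat() {
    for f in "$@"; do
        # `|| [ -n "$line" ]` keeps a final line that has no trailing newline.
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$f"
    done
}

# Demo with three throwaway files (the names are placeholders).
dir=$(mktemp -d)
printf 'one\n' > "$dir/file1"
printf 'two\n' > "$dir/file2"
printf 'three\n' > "$dir/file3"

# Both commands print the three files in order.
cat "$dir/file1" "$dir/file2" "$dir/file3"
shell_cat "$dir/file1" "$dir/file2" "$dir/file3"
```

That is nine lines of function to do, less robustly, what cat does out of the box.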
Reading files with cat would probably use less CPU than the shell would, since cat is a pre-compiled program (the obvious exception is any shell that has a builtin cat). When reading a large group of files, this might become apparent, but I have never done so on my machines, so I can't be sure.
The cat command can also be useful for forcing a command to accept standard input in cases where it otherwise would not. Consider the following:
echo 8 | sleep
The number "8" will not be accepted by the "sleep" command, since it was never really meant to accept standard input. Thus, sleep will disregard that input, complain about a lack of arguments, and exit. However, if one types:
echo 8 | sleep $(cat)
Many shells will expand this to `sleep 8`, and sleep will wait for 8 seconds before exiting. You can also do something similar with ssh:
command | ssh 1.2.3.4 'cat >> example-file'
This command will append whatever "command" outputs to example-file on the machine with the address 1.2.3.4.
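Both tricks can be tried locally. The sketch below rests on a few assumptions: `args_from_stdin` is a helper name I invented, and `sh -c` stands in for the remote shell that ssh would start, so no network access is needed.

```shell
# args_from_stdin is a hypothetical helper: it runs the given command
# with whatever arrives on standard input appended as arguments.
args_from_stdin() {
    # $(cat) is deliberately unquoted so the input is split into separate words
    # (it is also subject to glob expansion, so feed it trusted input only).
    "$@" $(cat)
}

# Equivalent to running: sleep 1
echo 1 | args_from_stdin sleep

# Local stand-in for: command | ssh 1.2.3.4 'cat >> example-file'
# Here sh -c plays the role of the remote shell.
dir=$(mktemp -d)
echo first  | sh -c 'cat >> "$1"' sh "$dir/example-file"
echo second | sh -c 'cat >> "$1"' sh "$dir/example-file"

# The file now holds both lines, in order.
cat "$dir/example-file"
```

In both cases cat is the piece that turns "data on standard input" into something the next command can use, either as arguments or as a redirect target.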
And that's (probably) just scratching the surface. I'm sure I could find more examples of cat being useful if I wanted to, but this post is long enough as it is. So, I'll conclude by saying this: asking the shell to anticipate all of these scenarios (and several others) is not really feasible.
`lseek` is still defined behaviour and could cause a different outcome, the different blocking behaviour can be semantically meaningful, etc. It would be allowable to make the change if you knew what the other commands were and knew they didn't care, or if you just didn't care about compatibility at that level, but the benefit is pretty small. I do imagine the lack of benefit drives the situation more than the conformance cost. – Michael Homer Apr 11 '19 at 07:57
`cat` itself, though, or any other utility. It's also allowed to know how the other utilities that belong to the system work (e.g. it can know how the external `grep` implementation that came with the system behaves). This is completely viable to do, so it's entirely fair to wonder why they don't. – Michael Homer Apr 11 '19 at 08:04
`grep`. And `sed`. And `awk`. And `du`. And how many hundreds if not thousands of other utilities? – Andrew Henle Apr 11 '19 at 11:01
`ksh93`) which do implement some external commands internally. I believe they check that the command found by searching `$PATH` is the system command (`/bin/cat`) before using the internal copycat command. – jrw32982 Apr 11 '19 at 19:44
`cat "$MYFILE" | command1 | command2 > "$OUTPUT"` because it's defensive programming which makes explicit to even the most junior user exactly what's happening. – RonJohn Apr 12 '19 at 15:21
`sort "$MYFILE"` can split into a bunch of threads each reading and sorting a different subset of a file into a different temporary file, and then merge them together at the end; `cat "$MYFILE" | sort` is forced to read front-to-back. And `cat "$MYFILE" | tail` on a 5GB file needs to read the whole 5GB front-to-back to get to the end, but GNU `tail` is smart enough to jump to the end and read from there in 8KB chunks if given a seekable file handle (as by `tail "$MYFILE"` or `tail <"$MYFILE"`). – Charles Duffy Apr 12 '19 at 22:18
`<"$filename" sort` is safer than `sort "$filename"`, but using `cat` is precluding optimizations, giving your programs less information (can't look up the name of the input file when it's a completely separate program that has the real handle on it!), hiding information about failure cases, and otherwise making life worse for the program reading that content. – Charles Duffy Apr 12 '19 at 22:45
`<file sort` is just as valid as `sort <file`, and both let `sort` read direct from the real input file, and thus be able to seek/parallelize/etc. – Charles Duffy Apr 12 '19 at 22:46
`sort "$MYFILE" | command1 | command2` is a heck of a lot more straightforward than `<"$filename" sort`. – RonJohn Apr 13 '19 at 00:31
`<"$MYFILE"` will never treat the contents of the `MYFILE` variable as anything but a filename. BTW, all-caps variable names are in the namespace POSIX specifies for names meaningful to the OS and standard utilities, whereas lowercase variable names are guaranteed to be safe for application use. – Charles Duffy Apr 13 '19 at 00:46
`cat "$MYFILE" | command1 | command2 > "$OUTPUT"` is explicit about what's happening. Sure, it's less efficient, but efficiency isn't always the primary goal in programming. Lack of bugs and maintainability for years to come are quite often the higher priority. – RonJohn Apr 13 '19 at 00:52
`tail` -- piping from `cat` changes an O(1) algorithm into an O(n) one. – Charles Duffy Apr 13 '19 at 15:29
`cat "$MYFILE" | command1` is less bug-prone than `<"$MYFILE" command1`. Please substantiate -- with your formulation, `command1` can't tell if end-of-file was hit, or if there was an EIO or other read error; it loses its ability to do good error handling, and also loses the ability to look up the filename attached to the FD to include it in an error message. That's making your software less reliable and maintainable, not more. – Charles Duffy Apr 13 '19 at 15:36
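The seekability and error-handling points in the comments above can be checked with a small experiment. This is a sketch under assumptions: `stdin_kind` is a hypothetical helper, `/dev/stdin` is assumed to exist (as it does on Linux), and the shell is assumed to be running without `pipefail`.

```shell
# stdin_kind is a hypothetical helper that reports what standard input is
# attached to. It assumes /dev/stdin is available, as on Linux.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo pipe
    elif [ -f /dev/stdin ]; then
        echo "regular file"
    else
        echo other
    fi
}

tmp=$(mktemp)
printf 'x\n' > "$tmp"

cat "$tmp" | stdin_kind    # pipe: the reader cannot seek
stdin_kind < "$tmp"        # regular file: the reader can seek

# A missing input file. Without pipefail the pipeline's status comes from
# wc alone, so cat's failure goes unnoticed; the failed redirection does not.
missing="$tmp.does-not-exist"
if cat "$missing" 2>/dev/null | wc -l >/dev/null; then
    echo "pipeline: success reported"
fi
if ! { wc -l < "$missing"; } 2>/dev/null; then
    echo "redirection: failure reported"
fi
```

With a pipe on standard input, a program like tail has no choice but to read everything; with the file itself, it is free to seek.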