(and no, for once, this is not about the missing quotes around $(...)¹).
The $(<file) operator
That Korn shell operator (also supported by zsh and bash) is described at length at Understanding Bash's Read-a-File Command Substitution.
In short, it is functionally equivalent to $(cat < file), except that the shell reads the file itself instead of asking cat to do it and, except in bash, does so without even forking an extra process².
In bash, it's virtually the same as $(cat < file) where cat would be a builtin cat².
bash has the further limitation that it only works for stdin input file redirection, not for other forms of redirection such as $(<&3) or $(<<<foo).
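As a quick check of that equivalence, here is a minimal sketch. It assumes bash is installed and uses a hypothetical scratch file created with mktemp; both forms are run under bash explicitly because $(<file) is not a POSIX sh feature.

```shell
tmp=$(mktemp)                       # hypothetical scratch file
printf 'one\ntwo\n\n' > "$tmp"

# Same file read through cat and through the $(<file) operator.
with_cat=$(bash -c 'v=$(cat < "$1"); printf "<%s>" "$v"' bash "$tmp")
with_op=$(bash -c 'v=$(< "$1"); printf "<%s>" "$v"' bash "$tmp")

printf '%s\n%s\n' "$with_cat" "$with_op"
rm -f "$tmp"
```

Both variables should hold the same text, with the trailing newlines removed by the command substitution in either case.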
/dev/stdin
/dev/stdin, /dev/stdout, /dev/stderr and /dev/fd/x are special files that were added to various Unices in the 80s so that the file descriptors of a process could be referred to by name.
On those Unices, opening /dev/stdin (a character device file) gets you a file descriptor that is a duplicate of stdin (fd 0), the equivalent of doing dup(0)³.
When Linux added a similar feature in the 90s, the implementation was significantly different and incompatible.
On Linux, those /dev/std... and /dev/fd/x files are not special character device files but symbolic links to /proc/self/fd/x, and those in turn are magic symlinks to the file that is opened on fd x.
So, opening /dev/stdin there is not the same as dup(0): it opens the original file anew, assuming you have permission to do so, from the start (not at the offset stdin is currently pointing at within the file) and in the requested mode. That also means that the fd you get is independent from fd 0: reading, writing or seeking on it does not update stdin's offset within the file.
Cygwin copied the Linux way when it added a similar feature in the 2000s. Most if not all other Unices behave the original way (when they support those /dev/fd/x at all).
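The difference can be observed with a small experiment. This is only a sketch: the scratch file and the variable names are made up for the illustration, and the second result depends on which of the two /dev/stdin implementations the system uses.

```shell
sample=$(mktemp)                    # hypothetical scratch file
printf '1\n2\n3\n' > "$sample"

# Consume the first line, then read the rest through fd 0 itself.
via_fd0=$( { read -r first; cat; } < "$sample" )

# Same, but reopen stdin by name instead of using fd 0.
via_name=$( { read -r first; cat /dev/stdin; } < "$sample" )

printf 'via fd 0:       <%s>\n' "$via_fd0"
printf 'via /dev/stdin: <%s>\n' "$via_name"
rm -f "$sample"
```

Reading through fd 0 continues from line 2. On Linux and Cygwin, the /dev/stdin variant is expected to start over from line 1, while on systems with dup()-like semantics it should also continue from line 2.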
So why is it wrong?
Because $(</dev/stdin) opens /dev/stdin for reading and reads from the resulting file descriptor, as opposed to reading from stdin directly. On Linux and Cygwin, where that's not the same thing, you can easily end up not reading the right thing, or failing to read anything at all, while leaving stdin's offset untouched, so the rest of the script cannot tell that stdin has been read.
Consider these examples:
$ cat wrong
#! /bin/bash -
var=$(</dev/stdin)
printf 'I got: "%s"\n' "$var"
printf "This is how many bytes are left to read on stdin: "
wc -c
$ cat right
#! /bin/bash -
var=$(cat)
printf 'I got: "%s"\n' "$var"
printf "This is how many bytes are left to read on stdin: "
wc -c
$ cat file
1
2
3
4
5
$
$ ./wrong < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 10
$ ./right < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 0
See how, even though wrong did appear to read all the lines of stdin in that case, stdin looks as if it had not been consumed at all: wc -c was still able to read 10 bytes from it.
$ { read var; ./wrong; } < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 8
$ { read var; ./right; } < file
I got: "2
3
4
5"
This is how many bytes are left to read on stdin: 0
See how wrong got the first line of file even though it was invoked when the script's stdin was past that first line.
$ socat -u file:file exec:./wrong
./wrong: line 2: /dev/stdin: No such device or address
I got: ""
This is how many bytes are left to read on stdin: 10
$ socat -u file:file exec:./right
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 0
wrong was unable to open /dev/stdin, because it's a socket and you can't open() a socket.
$ chmod 600 file
$ sudo -u other_user ./wrong < file
./wrong: line 2: /dev/stdin: Permission denied
I got: ""
This is how many bytes are left to read on stdin: 10
$ sudo -u other_user ./right < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 0
right is just reading from fd 0, which was opened by me, but wrong is trying to reopen file as other_user, who doesn't have the right to do so.
On Linux/Cygwin, $(</dev/stdin) only works in a few simple cases: when stdin is open on a non-seekable file (like a pipe or some character devices such as ttys) that can be opened (not a socket, and where you have read permission). In some other cases, such as when stdin is open at the start of a seekable file you have permission to open, it may appear to work, but fails to consume the input.
The correct ways
As seen above:
var=$(cat)
is the correct way⁴. cat reads from its fd 0 (stdin) and writes to its fd 1, here a pipe, while the shell reads the output at the other end to fill $var.
cat is not the only command that does so, but it's the simplest one and, when not passed any option, it doesn't try to interpret the input as text and never modifies it.
In ksh93 or zsh, you can do var=$(<&0) instead (<&0 being a no-op, but you need at least one redirection), but in zsh, that's not an optimisation, as it just does var=$($NULLCMD <&0), $NULLCMD being cat by default.
For text input (text being meant not to contain NUL characters), with zsh or bash, you can do:
{ ! IFS= read -rd '' var; } < file
read reads up to the first NUL delimiter and returns success iff it has found a delimiter. Here, we don't expect it to find a delimiter, so we negate its exit status. That does mean that if the file can be opened but not read, we won't get the right exit status.
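To illustrate, here is a small sketch that assumes bash is installed and feeds a made-up input through a pipe rather than from a file. It reports the negated exit status and the length of what read stored.

```shell
# Hypothetical input: text with two trailing newlines and no NUL bytes.
result=$(printf 'a\nb\n\n' | bash -c '
  { ! IFS= read -rd "" var; }   # read fails at EOF (no NUL found), so ! turns that into success
  status=$?
  printf "%s %s" "$status" "${#var}"
')
printf 'status and length: %s\n' "$result"
```

The reported length covers all five input characters, since read, unlike command substitution, keeps the trailing newlines.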
Further considerations
Command substitution ($(cat)) and the $(<file) operator remove all trailing newline characters from the input. So technically, after var=$(cat), $var will not contain the whole input but the whole input minus the trailing newline characters.
For the whole input, you can do:
var=$(cat; ret=$?; echo . && exit "$ret")
ret=$? var=${var%.}
(with the exit status of cat preserved in $ret).
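The effect of that sentinel character can be checked with a short sketch. The scratch file and the variable names stripped and whole are made up for the illustration.

```shell
input=$(mktemp)                     # hypothetical scratch file
printf 'a\nb\n\n' > "$input"        # 5 bytes, two trailing newlines

# Plain command substitution: the trailing newlines are stripped.
{ stripped=$(cat); } < "$input"

# Sentinel trick: the appended "." keeps the newlines from being trailing,
# then the "." is removed from the variable afterwards.
{ whole=$(cat; ret=$?; echo . && exit "$ret"); ret=$?; } < "$input"
whole=${whole%.}

printf 'stripped: %s characters, whole: %s characters, cat status: %s\n' \
  "${#stripped}" "${#whole}" "$ret"
rm -f "$input"
```

With this input, stripped should end up 3 characters long and whole 5 characters long.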
Except with zsh, if there are NUL bytes in the input, they won't be preserved in $var as no other shell supports storing those in their variables.
$ printf 'a\0b' | ksh -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <a>$
$ printf 'a\0b' | mksh -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <ab>$
$ printf 'a\0b' | bash -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
bash: line 1: warning: command substitution: ignored null byte in input
Got: <ab>$
$ printf 'a\0b' | dash -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <ab>$
$ printf 'a\0b' | zsh -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <a\000b>$
¹ here the $(...) is used in the value of a scalar (not array) variable assignment, not in list context, so no split+glob can happen upon the expansion. So, while they wouldn't do any harm, adding quotes around the $(<...) makes no difference.
² another difference is that except in recent versions of zsh, read errors are silently ignored. var=$(</); echo "$? <$var>" for instance doesn't report an error, though bash (as opposed to ksh93 or mksh) does return with a non-zero exit status.
³ at least as long as the file is opened in a mode that is compatible with the mode in which the fd was opened. exec >/dev/stdin generally wouldn't work if stdin (fd 0) was opened in read-only mode for instance.
⁴ and is standard, in contrast to $(<file) which is only found in ksh/zsh/bash, and /dev/stdin which is not found on all Unices.