We've seen a few posts here lately that use this:

var=$(</dev/stdin)

in an attempt to read the shell's standard input into a variable.

That is, however, not the correct way to do it, at least on Linux-based systems and Cygwin.

Why? What are the correct ways?

1 Answer

(And no, for once, this is not about the missing quotes around $(...)¹.)

The $(<file) operator

That Korn shell operator (also supported by zsh and bash) is described at length at Understanding Bash's Read-a-File Command Substitution.

In short, it is functionally equivalent to $(cat < file), except that the shell reads the file itself instead of asking cat to do it and, except in bash, does so without even forking an extra process².

In bash, it's virtually the same as $(cat < file) where cat would be a builtin cat².

bash has the further limitation that it only works with a plain input file redirection on stdin, not with other forms of redirection such as $(<&3) or $(<<<foo).
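As a quick sanity check of the equivalence described above, the following sketch compares the two forms on a scratch file. The temporary file and the variable names are made up for illustration, and bash is invoked explicitly because $(<file) is not a POSIX sh feature.

```shell
# Scratch file; the name comes from mktemp and is only illustrative.
tmp=$(mktemp)
printf 'alpha\nbeta\n\n' > "$tmp"

# Portable form: an external cat reads the file.
via_cat=$(cat < "$tmp")

# ksh/zsh/bash form: the shell reads the file itself.
# Run through bash -c so the sketch also works when pasted into a plain sh.
via_shell=$(bash -c 'printf "%s" "$(<"$1")"' bash "$tmp")

if [ "$via_cat" = "$via_shell" ]; then
  echo 'same contents'
fi
rm -f -- "$tmp"
```

Both forms go through command substitution, so both drop the trailing newlines of the file.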

/dev/stdin

/dev/stdin, /dev/stdout, /dev/stderr and /dev/fd/x are special files that were added to various Unices in the '80s so that the file descriptors of a process could be referred to by name.

On those Unices, opening /dev/stdin (a character device file) got you a file descriptor that was a duplicate of stdin (fd 0), so the equivalent of doing dup(0)³.

When Linux added a similar feature in the '90s, the implementation was significantly different and incompatible.

On Linux, those /dev/std..., /dev/fd/x files are not special character device files but symbolic links to /proc/self/fd/x, and those in turn are magic symlinks to the file that is opened on fd x.

So, opening /dev/stdin there is not the same as dup(0); it opens the original file anew (assuming you have permission to do so), from the start (not at the offset stdin currently points to within the file), and in the requested mode. That also means that reading, writing or seeking on the fd you get, which is independent from fd 0, does not update stdin's offset within the file.
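To make the difference concrete, here is a small sketch (the scratch file and the variable names are only illustrative). It reads one line from stdin, then one line through a fresh open of /dev/stdin, then one more line from stdin. On Linux or Cygwin, one would expect the second read to return the first line again and the third read to return the second line. On systems where opening /dev/stdin behaves like dup(0), the three reads should return three consecutive lines.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

{
  IFS= read -r first                        # read line 1 through fd 0
  IFS= read -r via_dev_stdin < /dev/stdin   # a separate open() of /dev/stdin
  IFS= read -r via_fd0                      # back to fd 0
} < "$tmp"

printf 'first=%s via_dev_stdin=%s via_fd0=%s\n' \
  "$first" "$via_dev_stdin" "$via_fd0"
rm -f -- "$tmp"
```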

Cygwin copied the Linux way when it added a similar feature in the 2000s. Most if not all other Unices behave the original way (when they support those /dev/fd/x at all).

So why is it wrong?

Because $(</dev/stdin) opens /dev/stdin for reading and reads from the resulting file descriptor rather than from stdin directly. On Linux and Cygwin, where those are not the same thing, you can easily end up reading the wrong thing, or failing to read anything at all, and in either case leaving stdin untouched, so the rest of the script cannot tell that it has already been read.

Consider these examples:

$ cat wrong
#! /bin/bash -
var=$(</dev/stdin)
printf 'I got: "%s"\n' "$var"
printf "This is how many bytes are left to read on stdin: "
wc -c
$ cat right
#! /bin/bash -
var=$(cat)
printf 'I got: "%s"\n' "$var"
printf "This is how many bytes are left to read on stdin: "
wc -c
$ cat file
1
2
3
4
5
$
$ ./wrong < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 10
$ ./right < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 0

See how, even though wrong did appear to read all the lines of stdin in that case, stdin was left as if nothing had been consumed: wc -c was still able to read 10 bytes from it.

$ { read var; ./wrong; } < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 8
$ { read var; ./right; } < file
I got: "2
3
4
5"
This is how many bytes are left to read on stdin: 0

See how wrong got the first line of file even though it was invoked when the script's stdin was past that first line.

$ socat -u file:file exec:./wrong
./wrong: line 2: /dev/stdin: No such device or address
I got: ""
This is how many bytes are left to read on stdin: 10
$ socat -u file:file exec:./right
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 0

wrong was unable to open /dev/stdin, because it's a socket and you can't open() a socket.

$ chmod 600 file
$ sudo -u other_user ./wrong < file
./wrong: line 2: /dev/stdin: Permission denied
I got: ""
This is how many bytes are left to read on stdin: 10
$ sudo -u other_user ./right < file
I got: "1
2
3
4
5"
This is how many bytes are left to read on stdin: 0

right is just reading from fd 0, which was opened by me, but wrong is trying to reopen file as other_user, who doesn't have the right to do so.

On Linux/Cygwin, $(</dev/stdin) only works in a few simple cases: when stdin is open on non-seekable files (like pipes and some character devices such as ttys) that can be opened (so not sockets, and only where you have read permission). In some other cases, such as when stdin is positioned at the start of a seekable file you have permission to open, it may appear to work, but it fails to consume the input.

The correct ways

As seen above:

var=$(cat)

is the correct way⁴. cat reads from its fd 0 (stdin) and writes to its fd 1, here a pipe, while the shell reads the output at the other end to fill $var.

cat is not the only command that does so, but it's the simplest one and when not passed any option, it doesn't try to interpret the input as text and never modifies it.

In ksh93 or zsh, you can do var=$(<&0) instead (<&0 being a no-op, but you need at least one redirection), but in zsh, that's not an optimisation, as it just does var=$($NULLCMD <&0), $NULLCMD being cat by default.

For text input (text being meant not to contain NUL characters), with zsh or bash, you can do:

{ ! IFS= read -rd '' var; } < file

read reads up to the first NUL delimiter and returns success iff it has found a delimiter. Here, we don't expect it to find a delimiter, so we negate its exit status. That does mean that if the file can be opened but not read, we won't get the right exit status.
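As a rough illustration of that idiom, the sketch below slurps a scratch file into var and checks that the result is byte-for-byte identical to the file, trailing newlines included. The file name and the use of cmp for the comparison are choices made for this example only, and bash is invoked explicitly because read -d is not available in every sh.

```shell
tmp=$(mktemp)
printf 'first line\nsecond line\n\n' > "$tmp"

# $1 is the file to slurp. read hits end-of-file without seeing a NUL,
# hence the negated exit status. printf %s writes $var back out unchanged
# so that cmp can compare it with the original file.
bash -c '{ ! IFS= read -rd "" var; } < "$1" && printf "%s" "$var"' bash "$tmp" |
  cmp -s - "$tmp"
status=$?   # 0 if var held exactly the file contents

echo "cmp exit status: $status"
rm -f -- "$tmp"
```

Unlike command substitution, this approach keeps trailing newlines, which is why the comparison against the original file can succeed here.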

Further considerations

Command substitution ($(cat)) and the $(<file) operator remove all trailing newline characters from the input. So technically, after var=$(cat), $var will not contain the whole input but the whole input minus the trailing newline characters.

For the whole input, you can do:

var=$(cat; ret=$?; echo . && exit "$ret")
ret=$? var=${var%.}

(with the exit status of cat preserved in $ret).
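A minimal sketch of the difference, using a scratch file whose name and contents are made up for the example: the plain substitution loses the three trailing newlines, while the sentinel variant keeps them.

```shell
tmp=$(mktemp)
printf 'text\n\n\n' > "$tmp"      # 7 bytes: "text" plus three newlines

# Plain command substitution strips the trailing newlines.
stripped=$(cat < "$tmp")

# Append a sentinel "." after cat's output so the newlines are no longer
# trailing, then remove the sentinel again.
var=$(cat < "$tmp"; ret=$?; echo . && exit "$ret")
ret=$? var=${var%.}

printf 'stripped: %d characters, whole: %d characters, cat status: %d\n' \
  "${#stripped}" "${#var}" "$ret"
rm -f -- "$tmp"
```

The newline that echo adds after the sentinel is the only one removed by the command substitution, so the newlines of the input itself survive.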

Except with zsh, if there are NUL bytes in the input, they won't be preserved in $var, as no other shell supports storing those in its variables.

$ printf 'a\0b' | ksh -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <a>$
$ printf 'a\0b' | mksh -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <ab>$
$ printf 'a\0b' | bash -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
bash: line 1: warning: command substitution: ignored null byte in input
Got: <ab>$
$ printf 'a\0b' | dash -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <ab>$
$ printf 'a\0b' | zsh -c 'var=$(cat); printf "Got: <%s>\n" "$var"' | sed -n l
Got: <a\000b>$

¹ here the $(...) is used in the value of a scalar (not array) variable assignment, not in list context, so no split+glob can happen upon the expansion. So, while they wouldn't do any harm, adding quotes around the $(<...) makes no difference.

² another difference is that except in recent versions of zsh, read errors are silently ignored. var=$(</); echo "$? <$var>" for instance doesn't report an error, though bash (as opposed to ksh93 or mksh) does return with a non-zero exit status.

³ at least as long as the file is opened in a mode that is compatible with the mode in which the fd was opened. exec >/dev/stdin generally wouldn't work if stdin (fd 0) was opened in read-only mode for instance.

⁴ and is standard, in contrast to $(<file) which is only found in ksh/zsh/bash, and /dev/stdin which is not found on all Unices.

Toby Speight