
I thought I had a good handle on bash file redirection, and generally I try to avoid "useless use of cat", but I experienced some unexpected behaviour with a script and I would like to understand why it occurs.

Within a bash script, I execute:

somecommand < file1 > file2

My expectation was that file1 is safe and opened in a read-only manner. In practice, I found that file1 can be overwritten. How/why does this happen, and is there a way to prevent it without resorting to a cat?

If it's working how I imagine (the process ends up with a direct rw file descriptor?), it seems like it should be considered dangerous to redirect files this way, yet I've never seen this behaviour mentioned before.

To add some specifics from my case: the command in question is sops, which in the background is doing some GPG stuff. The GPG password prompt is sometimes being written to the file used for input, overwriting it. The complete command I used is:

sops --input-type json --output-type json -d /dev/stdin < ./secrets/file.json > ./secrets/file-decrypted.json

I have since switched to cat file1 | sops.. > file2 and everything works as expected. I would have said this was a "useless use of cat" - but it doesn't seem so useless anymore!


It seems to happen when gpg-agent is not running and it prompts for the password for the first time.
Lewis
  • I don't see how cat would change anything here. Are you sure it actually solves the issue? Your file1 can still be overwritten just like before. You said it only happens "sometimes", so maybe it just hasn't happened yet with your cat version? – terdon Oct 05 '20 at 15:23
  • Pretty sure. I can reliably reproduce it by killing my gpg agent. I don't know what funky things pinentry is doing behind the scenes, but it doesn't do it when I use the cat. – Lewis Oct 05 '20 at 15:29
  • The command line you've provided must not overwrite the input file under any circumstances. The fact that it's happening indicates that something else is doing that at a different time. – Artem S. Tashkinov Oct 05 '20 at 15:30
  • What Artem said. There's nothing in your command that would touch the input file, so something else must be doing it. Whether you open it with cat or < will make no difference. – terdon Oct 05 '20 at 15:31
  • @ArtemS.Tashkinov I am 100% sure nothing else is touching these files. I checked things very carefully before submitting the question as I didn't think it was possible either. :) – Lewis Oct 05 '20 at 15:33
  • Lewis, as it turns out (if I have read Stéphane's answer correctly), the problem arises because of /dev/stdin. The lack of /dev/stdin in the first code block can be confusing, as you would be safe with only the < file redirection (again, if I understood it correctly). – Quasímodo Oct 05 '20 at 16:39
  • @Quasímodo No, in essence the problem was my misunderstanding of the < redirection. I expected the program to receive a copy of the data, when in fact it is getting the "real thing". The /dev/stdin arg is not particularly relevant. – Lewis Oct 05 '20 at 17:12
  • There are a lot of corner cases where a "useless" use of cat is not only useful, but the only way out. The /dev/stdin argument is very relevant. But a UUoC does not always help: this will go into an infinite loop: echo text | tee /dev/fd/0. Can you guess why? As to the idea that redirections let a program receive "a copy of the data", I don't see how someone would happen upon it other than by reading bad stackexchange answers or other such "tended garden"/hivemind products ;-) –  Oct 06 '20 at 02:29
  • @user414777 No, /dev/stdin is absolutely not relevant. In this case it doesn't matter what is passed to sops as the -d argument, it is the file attached to fd0 that will be overwritten. This seems to be a bug/bad assumption in the included mozilla/gogpgagent. – Lewis Oct 06 '20 at 08:26
  • @Lewis even if that happens when you are not using /dev/stdin, it still happens because that script/program is opening /dev/stdin explicitly at some point (or one of its "aliases": /dev/fd/0, /proc/self/fd/0, /proc/<pid>/fd/0, etc). A process is not able to write to a file descriptor opened in read-only mode, as fd 0 is in cmd < file. –  Oct 06 '20 at 08:30
  • @user414777 Yes. I linked to the specific line of code that does this. – Lewis Oct 06 '20 at 08:32
  • @Lewis Sorry, I didn't notice. You should submit a bug report -- that's pretty broken. –  Oct 06 '20 at 08:43
  • FWIW, "useless use of cat" isn't a consensus problem. Removing cat in favor of redirection is an optimization that often comes at the cost of clarity; in many cases one should prefer clarity, even with a "useless" cat. – Reid Oct 06 '20 at 20:35

1 Answer


That's due to the way /dev/stdin (actually /proc/self/fd/0) is implemented on Linux (and Cygwin, but generally not other systems).

On Linux, opening /dev/stdin is not like doing a dup(0); it opens anew the same file that is open on fd 0. It doesn't share the open file description that fd 0 refers to (with its read-only mode), but gets a completely unrelated new open file description, with the mode specified in open().
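One way to observe that the two descriptions are unrelated is through the file offset, which belongs to the open file description. The following is a minimal sketch, assuming Linux with /proc mounted and bash as the shell; the temporary file and the variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch (assumes Linux with /proc mounted): dup vs. reopening /dev/stdin.
tmp=$(mktemp)
printf 'line1\nline2\n' > "$tmp"

# fd 3 is a duplicate of fd 0, so both share one open file description
# and therefore one offset: the second read continues after line1.
dup_result=$( { read -r _; read -r b <&3; printf '%s\n' "$b"; } < "$tmp" 3<&0 )

# Opening /dev/stdin creates a new open file description on the same file,
# with its own offset starting at 0: the second read sees line1 again.
reopen_result=$( { read -r _; read -r b < /dev/stdin; printf '%s\n' "$b"; } < "$tmp" )

rm -f "$tmp"
echo "dup:    $dup_result"
echo "reopen: $reopen_result"
```

On Linux this should report line2 for the duplicated descriptor and line1 for the reopened one. On systems where opening /dev/stdin behaves like dup(0), both reads would be expected to share the offset instead.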

So if sops -d /dev/stdin opens /dev/stdin in read+write mode and fd 0 was opened read-only on /some/file, /some/file will be opened in read+write mode.

Effectively, cmd /dev/stdin < file there is the same as cmd file < file. You'll find that /dev/stdin is just a symlink¹ to file:

/tmp$ namei -l /dev/stdin < file
f: /dev/stdin
drwxr-xr-x root     root     /
drwxr-xr-x root     root     dev
lrwxrwxrwx root     root     stdin -> /proc/self/fd/0
drwxr-xr-x root     root       /
dr-xr-xr-x root     root       proc
lrwxrwxrwx root     root       self -> 73569
dr-xr-xr-x stephane stephane     73569
dr-x------ stephane stephane   fd
lr-x------ stephane stephane   0 -> /tmp/file
drwxr-xr-x root     root         /
drwxrwxrwt root     root         tmp
-rw-r--r-- stephane stephane     file
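The practical consequence can be reproduced without sops. This is a minimal sketch, assuming Linux with /proc mounted; bash -c here merely stands in for any program that opens /dev/stdin for writing, and the temporary file and variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch (assumes Linux with /proc mounted). bash -c stands in for any
# program that opens /dev/stdin for writing; it is not what sops runs.
tmp=$(mktemp)
printf 'original\n' > "$tmp"

# Writing through fd 0 itself fails: that descriptor was opened O_RDONLY.
bash -c 'echo clobbered >&0' < "$tmp" 2>/dev/null
after_fd=$(cat "$tmp")

# Opening /dev/stdin by path yields a fresh, writable descriptor on the
# same file. The > redirection also truncates it (O_TRUNC).
bash -c 'echo clobbered > /dev/stdin' < "$tmp"
after_path=$(cat "$tmp")

rm -f "$tmp"
echo "after writing to fd 0:    $after_fd"
echo "after opening /dev/stdin: $after_path"
```

On Linux the first attempt should leave the file reading "original", while the second should replace its contents with "clobbered", even though the file was only ever redirected as input.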

It can get worse. If /dev/stdin were opened with O_TRUNC, the file would be truncated. If fd 0 was pointing to the reading end of a pipe and /dev/stdin was opened in write-only mode, you'd get the other end of the pipe.

But using:

cat file | cmd /dev/stdin

would guard against cmd overwriting file, as all cmd would see is the pipe. Even if it did open /dev/stdin in write-only mode, it couldn't get back to the file: it would just get the writing end of the pipe, and the only file descriptor on the reading end would be cmd's stdin.
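The protection offered by the pipe can be checked the same way. Again a minimal sketch under the same assumptions (Linux with /proc mounted, bash -c standing in for a program that opens /dev/stdin for writing, hypothetical file and variable names):

```bash
#!/usr/bin/env bash
# Sketch (assumes Linux with /proc mounted): with a pipe on fd 0,
# /dev/stdin resolves to the pipe rather than to the file.
tmp=$(mktemp)
printf 'original\n' > "$tmp"

# The write lands in the writing end of the pipe; the file is never
# opened by the right-hand side, so it cannot be modified from there.
cat "$tmp" | bash -c 'echo clobbered > /dev/stdin'
after_pipe=$(cat "$tmp")

rm -f "$tmp"
echo "after cat | cmd: $after_pipe"
```

The file should still read "original" afterwards. The cost is the extra cat process and the loss of seekability on stdin, which some programs rely on.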

Other OSes don't have the problem as opening /dev/stdin there is like doing a dup(0), so you get the same open file description and if you open with an incompatible mode, the open() system call just fails.


¹ technically, as noted by @user414777 in comments, /proc/<pid>/fd/<fd> are magic symlinks in that for instance they can reach into places that normal symlinks could not, but when it comes to opening them, past the path resolution stage, they act like normal symlinks in that you just open the target file

  • But isn't that what the OP is doing already? They're running sops [...] -d /dev/stdin < ./secrets/file.json, so they aren't passing the file as an argument. Isn't that equivalent to doing cat ./secrets/file.json |sops ...? – terdon Oct 05 '20 at 15:46
  • Example: </dev/null bash -c 'exec 30<>/dev/stdin; ls -l "/proc/$$/fd"'. The file descriptor 30 will be rw despite 0 being r-. Compare with exec 30>&0. – Kamil Maciorowski Oct 05 '20 at 15:47
  • @terdon, no, and that's the point on Linux (and Linux and Cygwin only), cmd -d /dev/stdin < file is really like cmd -d file < file which lets cmd do whatever it likes with file regardless of how file was open on stdin. – Stéphane Chazelas Oct 05 '20 at 15:51
  • I don't like any of this. Every howto on the Internet says you'd better avoid cating files as it's redundant and now it turns out stdin redirection is not safe. – Artem S. Tashkinov Oct 05 '20 at 15:52
  • @StéphaneChazelas So, it's only possible when the application gets a file descriptor as an argument? Not, when it's just < stdin ? – Artem S. Tashkinov Oct 05 '20 at 15:53
  • So you're saying that < file passes the file as input and not a stream of the file's contents? Wouldn't that mean that something like ls < file would work to run ls file? I'm obviously missing something, I know. Could you maybe elaborate a bit on your answer? – terdon Oct 05 '20 at 15:54
  • @ArtemS.Tashkinov The application can open its /dev/stdin (like Bash in my previous comment does) because it was programmed (hardcoded) to do so. It doesn't need to get it as argument. – Kamil Maciorowski Oct 05 '20 at 15:56
  • So in other words.. you need to trust a program not to use /dev/stdin? And so it is indeed potentially dangerous to be in the habit of redirecting a file for input this way? – Lewis Oct 05 '20 at 15:56
  • @terdon, < file doesn't pass a stream or file, it just opens file with O_RDONLY on fd 0. The application can only read from that fd. The problem is just with those special /dev/stdin / /dev/fd, /proc/self/fd symlinks which on Linux are just symlinks to the corresponding files without any relation to how the corresponding fds were opened. – Stéphane Chazelas Oct 05 '20 at 15:58
  • @Lewis You already "trust" programs not to read and publish your ~/.ssh/id_rsa. – Kamil Maciorowski Oct 05 '20 at 16:00
  • @KamilMaciorowski True, but somehow this feels different. I didn't realise my data was at risk when passed like this :) – Lewis Oct 05 '20 at 16:02
  • They're not "just symlinks". They're "magic" symlinks, which can be opened even when their target doesn't resolve or resolves to an inaccessible path. –  Oct 06 '20 at 02:34
  • It seems that the "magic-link" is now a technical term (/RESOLVE_NO_MAGICLINKS). Anyways, I suspect (though I haven't tested it) that such magic-links could be implemented by any fs, not just /proc (e.g. by a FUSE fs). –  Oct 06 '20 at 03:35
  • Just to follow up, in my case it appears the embedded mozilla-services/gogpagent is writing to /proc/self/fd/0. – Lewis Oct 06 '20 at 08:40
  • ".../proc/<pid>/fd/<fd> are magic symlinks ... but when it comes to opening them, past the path resolution stage, they act like normal symlinks in that you just open the target file" -- no, they're magic on opening too. That's how they can reach into places normal symlinks can't. E.g. if an fd points to a deleted file, calling readlink on the corresponding /proc/pid/fd/n link gives /some/file (deleted), and for pipes they give something like pipe:[36773091]. You can't use those as paths, but opening the magic links gives the file that was /some/file and the anonymous pipe. – ilkkachu Feb 20 '23 at 18:53