Will using the same filename for input in a sub-shell and for output in the parent shell cause a conflict?

Question

Referring to this answer of mine, in which I used the same tmp file both as the input to the process substitution and as the file the parent shell writes the output to: can the reading in the process substitution and the writing by the shell interfere with each other?

Based on the discussion in the comments and a similar post I found, it seems there should be no conflict, right?

grep -xvFf <(cut -d'/' -f1 tmp) ext >> tmp

Related discussions in comments:

This looks very elegant, but isn't tmp being read and written to at the same time? – Quasimodo

@Quasímodo no, the child shell opens tmp for reading only, and the redirection is done after grep has finished processing, though; the shell then opens tmp again for writing (but I'm not expert enough on this to confirm it 100%) – αғsнιη

@Quasímodo strace grep … shows that the file is opened by grep and closed once it finishes, so the shell writes the output to tmp only after grep is done, and the reading and writing will not happen at the same time – αғsнιη

grep will first read the whole patterns file, then parse the other file. So before writing any output, it has finished reading the patterns. No recycling of lines will happen. So it will append the output, I guess that even this would work as expected: grep -f tmp ext >> tmp. — thanasisp, Nov 15 '20 at 23:10
If you're on Linux, install the moreutils package and use sponge. Basically, you'd replace the >> tmp with | sponge -a tmp — , Nov 19 '20 at 07:39

ilkkachu · Accepted Answer · 2020-11-16T09:13:19.780

Depends on what you're doing.

For grep -f patterns, grep will pretty much have to read the patterns file up front, before starting to read the actual data file. Otherwise it couldn't know if the first line matches. So here, you're safe.
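A minimal sketch of the command from the question, run on made-up sample data (only the file names tmp and ext come from the question; the contents and the scratch directory are assumptions, and bash is assumed for the process substitution). grep has to see end-of-file on the patterns before it starts on ext, so cut has finished reading tmp before anything is appended to it:

```shell
#!/usr/bin/env bash
# Hypothetical sample data; only the names tmp and ext are from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'a/1' 'b/2' > tmp   # existing entries, keyed by the part before "/"
printf '%s\n' a b c d > ext       # candidate lines

# Append the lines of ext that are not already a key in tmp.
# The patterns arrive through the process substitution; >> opens tmp in append mode.
grep -xvFf <(cut -d'/' -f1 tmp) ext >> tmp

cat tmp
```

With this sample data, tmp ends up holding a/1, b/2, c and d, each appended line appearing exactly once.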

Of course, if you were to use a truncating redirection to the file, it might well get truncated before the command in the process substitution got to read it. But see below.

In general, with the output redirected in append mode, I'd be worried about looping the output back to the input. Trying out with GNU grep, it's smart enough to warn about this:

$ seq 99999 > foo.txt
$ grep ^1 foo.txt >> foo.txt 
grep: input file ‘foo.txt’ is also the output

But if we trick it with a process substitution, the command runs without the warning, and parts of the data get processed repeatedly:

$ grep ^1 <(cat foo.txt) >> foo.txt
$ grep -Fx 1933 foo.txt
1933
1933
1933
1933

There should of course be just two copies of 1933 there. Your mileage may vary.
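One way to sidestep the feedback loop is to let grep read from a copy that no longer changes. This is an assumption on my part rather than something tested above; the name snapshot and the use of mktemp are illustrative choices (sponge -a from moreutils, mentioned in the comments, would be another option):

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 99999 > foo.txt

# Copy the input first, so that grep reads a file nobody is appending to.
snapshot=$(mktemp)
cp foo.txt "$snapshot"
grep '^1' "$snapshot" >> foo.txt
rm -f "$snapshot"

# Count the lines that are exactly "1933": the original plus one appended copy.
copies=$(grep -cFx 1933 foo.txt)
echo "$copies"
```

Here the count is 2, since the appended data is never fed back into grep.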

For the truncating redirection, I tested with the below on Linux:

echo moi > hello.txt
cat <(cat hello.txt) > hello.txt >&2;

Here, if the redirection to hello.txt is processed before the inner cat runs, the result will be no output. On the other hand, if the cat in the process substitution runs first, it might get to read the file before it's truncated. Looping that a few times:

for x in {1..999}; do echo moi > hello.txt; cat <(cat hello.txt) 3> hello.txt; done

gives no output on my system if it's idle, but outputs moi anywhere from a few to a few dozen times if a simple busy loop is running at the same time. (The 3> redirection is there just to truncate the file without affecting the output.)
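To see how often the race goes either way on a given machine, the loop above can be wrapped in a counter. This is only a sketch: the variable names and the number of runs are made up, bash is assumed, and the count will differ between runs and systems, so no particular result is implied.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

runs=200   # arbitrary number of attempts
seen=0     # how many times the inner cat still found "moi" in the file

for ((i = 0; i < runs; i++)); do
    echo moi > hello.txt
    # 3> truncates hello.txt without touching stdout, as in the loop above.
    out=$(cat <(cat hello.txt) 3> hello.txt)
    if [ -n "$out" ]; then
        seen=$((seen + 1))
    fi
done

echo "inner cat read the file before truncation in $seen of $runs runs"
```

Running it once on an idle system and once alongside a busy loop should show the difference described above.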

@thanasisp, for something like foo <(cat hello.txt) > hello.txt I wouldn't be sure about the order. In principle, the manual you linked to says that expansions happen first (well, the redirection filename might come from an expansion), and that includes process substitutions. But on the other hand, process substitutions involve starting another asynchronous process, so there are no promises. The shell might well get to the redirection before the other process manages to start and read the file. But anything can happen, and does, e.g. on my system under load. See edit. — ilkkachu, Nov 16 '20 at 00:21
