0

In many answers, mainly about text-processing commands, I have seen commands such as sed, awk, and grep, among others, used both with STDIN redirection and with a file name argument that the command opens itself.

e.g.

$ sed -e 's|foo|bar|g' file # open file
$ sed -e 's|foo|bar|g' <file # open STDIN

or

$ grep 'PATTERN' file # open file
$ grep 'PATTERN' <file # open STDIN

Personally, I always use the open-file method, but I want to know when to use each one and when not to, and what the difference is.

tachomi
  • 7,592

4 Answers

1

It depends on the need. Here's a case where using filenames or piping from stdin makes a difference.

bash-4.1$ cat /etc/passwd /etc/group | wc -l
128
bash-4.1$ wc -l /etc/passwd /etc/group
  49 /etc/passwd
  79 /etc/group
 128 total
bash-4.1$ 

Also, standard input tends to not be very lseek(3)able, so if an application needs a file descriptor it can seek on (e.g. to rewind back to the beginning), that would probably rule out using standard input with it.
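Whether standard input can be seeked depends on what it is connected to: a pipe cannot be seeked, while a regular file attached with <file usually can. A minimal sketch for checking which kind a command receives, assuming the system provides /dev/stdin (Linux, macOS and the BSDs do); the function name stdin_kind is made up for this illustration:

```shell
# Hypothetical helper: report what standard input is connected to.
# Assumes /dev/stdin exists on this system.
stdin_kind() {
    if [ -f /dev/stdin ]; then
        echo "regular file"
    elif [ -p /dev/stdin ]; then
        echo "pipe"
    else
        echo "other"
    fi
}

tmp=$(mktemp)
printf 'a\nb\n' > "$tmp"

# Same data, two ways of delivering it on standard input.
kind_redirect=$(stdin_kind < "$tmp")
kind_pipe=$(cat "$tmp" | stdin_kind)

echo "redirect: $kind_redirect"
echo "pipe: $kind_pipe"
rm -f "$tmp"
```

So the seeking problem applies mainly to cat file | command, and less to command <file.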

thrig
  • 34,938
0

There is no difference in terms of the output.

$ grep 'PATTERN' file will open the file specified in argument 2 and will search for the pattern.

$ grep 'PATTERN' <file makes the shell open file and attach it to grep's STDIN (redirection is a feature of the shell, not of grep); grep then reads from STDIN.

I am not sure if there are exact benefits of one or the other, but I would continue using the former instead of the latter.

The latter is redundant, just as cat file | grep 'PATTERN' and cat file | sed -e 's|foo|bar|g' are redundant.

Peschke
  • 4,148
0

There are instances in which the two are not exactly equivalent, like:

$ wc -l ./script.sh
4948 ./script.sh
$ wc -l <./so
4948

In the first instance, wc knew which file was being processed and printed its name along with the line count. In the second, wc does not know which file it is processing; the input is anonymous.

In the specific command of your example with grep:

$ grep 'PATTERN' file
$ grep 'PATTERN' <file

There is no difference in the output. But with this, there is (if there is a match):

$ grep -H 'PATTERN' file
$ grep -H 'PATTERN' <file
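A small sketch to see that difference for yourself, assuming a grep that supports -H (GNU and BSD grep do; the option is not in POSIX). The temporary file and its contents are made up for the demonstration:

```shell
tmp=$(mktemp)
printf 'PATTERN here\n' > "$tmp"

# grep opens the file itself, so it can prefix the match with the file name.
by_name=$(grep -H 'PATTERN' "$tmp")

# grep only sees standard input, so it has to use a placeholder label instead.
by_stdin=$(grep -H 'PATTERN' < "$tmp")

echo "$by_name"
echo "$by_stdin"
rm -f "$tmp"
```

With GNU grep the second line is labelled (standard input) rather than with the file name.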

Also, in the case of the redirection <file, it is the shell which is doing the reading of the file, that might also have consequences on speed or size of buffering (for different commands).

  • The shell does not read file. The shell merely opens file then manipulates file descriptors (e.g. using dup2) so that file is available on descriptor 0 (standard in). The shell will then exec grep, replacing itself and allowing grep to read standard input as it pleases. – Barefoot IO Mar 04 '16 at 00:08
0

The difference is mostly in who opens the file. This may be important for security reasons -- the shell may have privileges that the launched program does not have.
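One way to observe who opens the file is to point both forms at a file that does not exist. This is only a sketch; the directory, the file name absent and the message started are made up for the demonstration:

```shell
dir=$(mktemp -d)
missing="$dir/absent"

# Redirection: the shell tries to open the file first. It fails,
# so the command is never started and prints nothing.
out_stdin=$( { sh -c 'echo started' < "$missing"; } 2>/dev/null ) || true

# File name argument: the command starts, then fails when it
# tries to open the file on its own.
out_name=$(sh -c 'echo started; cat "$1"' sh "$missing" 2>/dev/null) || true

echo "redirect: [$out_stdin]"
echo "argument: [$out_name]"
rmdir "$dir"
```

The same mechanism is why the permissions of the shell, not those of the launched program, decide whether <file can be opened.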

Using the STDIN method implies inheritance of the stream to the whole process tree created by the launched program. That may be useful in certain cases.
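A minimal sketch of that inheritance, with made-up file contents: the inner sh and the cat it launches share the same standard input, so cat continues where the read builtin left off.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# The shell's read consumes the first line; cat, a child process,
# inherits the same descriptor and receives the remaining lines.
out=$(sh -c 'read -r first; echo "parent read: $first"; cat' < "$tmp")

echo "$out"
rm -f "$tmp"
```

With a file name argument, only the process that was given the name has the file open, unless it deliberately passes it on.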

With the open file method, the launched program knows the filename more readily. Programs taking the filename route are likely to use the filename in their output, and the filename route may differ in other ways like performance: Programs often assume only basic stream access for STDIN (no seeking, no mmapability), and seekability for filename arguments.

Petr Skocik
  • 28,816