6

I have a directory that contains, among other files, 3 named pipes: FIFO, FIFO1, and FIFO11. If I try something like

grep mypattern *

in this directory, grep hangs forever on the named pipes, so I need to exclude them. Unexpectedly,

grep --exclude='FIF*' mypattern *

does not solve the problem; grep still hangs forever. However,

grep -r --exclude='FIF*' mypattern .

does solve the hanging problem (albeit with the undesired side effect of searching all the subdirectories).

I did some testing that shows that grep --exclude='FIF*' mypattern * works as expected if FIFO etc. are regular files, not named pipes.

Questions:

  1. Why does grep skip --excluded files in both cases when they are regular files, and skip --excluded named pipes in the recursive case, but not skip named pipes in the non-recursive case?
  2. Is there another way to format the exclusion that will skip these files in all cases?
  3. Is there a better way to accomplish what I'm after? (EDIT: I just discovered the --devices=skip flag in grep, so that's the answer to this part, but I'm still curious about the first two parts of the question.)
ras
  • 73

2 Answers

7

It seems grep still opens files named on the command line even if the --exclude glob tells it to skip them:

$ ll
total 4.0K
p-w--w---- 1 user user 0 Feb  7 16:44 pip-fifo
--w--w---- 1 user user 4 Feb  7 16:44 pip-file
lrwxrwxrwx 1 user user 4 Feb  7 16:44 pip-link -> file

(Note: none of these have read permissions.)

$ strace -e openat grep foo --exclude='pip*' pip-file pip-link pip-fifo
openat(AT_FDCWD, "pip-file", O_RDONLY|O_NOCTTY) = -1 EACCES (Permission denied)
grep: pip-file: Permission denied
openat(AT_FDCWD, "pip-link", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)
grep: pip-link: No such file or directory
openat(AT_FDCWD, "pip-fifo", O_RDONLY|O_NOCTTY) = -1 EACCES (Permission denied)
grep: pip-fifo: Permission denied
+++ exited with 2 +++

After granting read permissions, grep still opens the excluded files but does not read from them. For the FIFO, the openat call itself never returns (I had to interrupt it with ^C):

$ strace -e openat grep foo --exclude='pip*' pip-file pip-link pip-fifo
openat(AT_FDCWD, "pip-file", O_RDONLY|O_NOCTTY) = 3
openat(AT_FDCWD, "pip-link", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)
grep: pip-link: No such file or directory
openat(AT_FDCWD, "pip-fifo", O_RDONLY|O_NOCTTY^Cstrace: Process 31058 detached
 <detached ...>

$ strace -e openat,read grep foo --exclude='pip*' pip-file
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\25\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\r\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260`\0\0\0\0\0\0"..., 832) = 832
openat(AT_FDCWD, "pip-file", O_RDONLY|O_NOCTTY) = 3
+++ exited with 1 +++

$ strace -e openat,read grep foo --exclude='pipe*' pip-file
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\25\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\r\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260`\0\0\0\0\0\0"..., 832) = 832
openat(AT_FDCWD, "pip-file", O_RDONLY|O_NOCTTY) = 3
read(3, "foo\n", 32768)                 = 4
foo
read(3, "", 32768)                      = 0
+++ exited with 0 +++

Since openat wasn't called with O_NONBLOCK, opening the FIFO blocks until some process opens it for writing, so grep never reaches the part where it would exclude the file from reading.
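To see the blocking open without strace, here is a minimal sketch. It assumes GNU coreutils timeout is available, and the directory and file names (demo_dir, regular.txt) are made up for illustration. The first check needs no grep at all: a plain read-only open of a FIFO with no writer blocks. The second runs the grep command from the question under a time limit; whether it times out depends on the grep version, so I only print the status.

```shell
#!/bin/sh
# Scratch directory with one FIFO and one regular file (illustrative names).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkfifo FIFO
printf 'mypattern\n' > regular.txt

# A read-only open of a FIFO blocks until a writer opens it, so this
# redirection never completes and timeout(1) kills the shell (status 124).
open_status=0
timeout 2 sh -c ': < FIFO' || open_status=$?
echo "plain open of FIFO: status $open_status"

# The same situation inside grep: the exclusion is only checked after the
# open, so on affected versions this is expected to time out as well.
grep_status=0
timeout 2 grep --exclude='FIF*' mypattern FIFO regular.txt || grep_status=$?
echo "grep --exclude with FIFO on the command line: status $grep_status"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

A status of 124 from timeout means the command was still running when the limit expired, which is what a hang looks like here.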

Looking at the source code, I believe the flow is like this:

  1. If not recursive, call grep_command_line_arg on each file.
  2. That calls grepfile if not on stdin.
  3. grepfile calls grepdesc after opening the file.
  4. grepdesc checks for excluding the file.

When recursive:

  1. grepdirent checks for excluding the file before calling grepfile, so the failing openat never happens.
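The two paths above can be compared side by side. This is a rough sketch assuming GNU grep and coreutils timeout; the scratch directory and regular.txt are hypothetical names. It runs the recursive command from the question, then the non-recursive --devices=skip variant the asker mentions, each under a time limit so a hang cannot stall the script.

```shell
#!/bin/sh
# Scratch directory with the three FIFOs from the question plus one
# regular file (regular.txt is an illustrative name).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkfifo FIFO FIFO1 FIFO11
printf 'mypattern\n' > regular.txt

# Recursive case: the exclusion is applied before any open, so the FIFOs
# are never touched and grep returns promptly.
recursive_out=$(timeout 5 grep -r --exclude='FIF*' mypattern .) || true
echo "recursive:    $recursive_out"

# Non-recursive case: --devices=skip tells grep to skip FIFOs, devices
# and sockets even when they are named on the command line.
skip_out=$(timeout 5 grep --devices=skip mypattern *) || true
echo "devices=skip: $skip_out"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Note that grep's documented default already skips devices and FIFOs encountered while recursing with -r, so the recursive command would avoid the hang here even without the --exclude.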
  • Accepting this answer, as it describes the behavior and its cause exactly. For the sake of completeness, I will note that both grep --devices=skip mypattern * and the various solutions involving find solve the original problem. (I'll also note that this grep behavior seems like a suboptimal implementation!) – ras Feb 07 '19 at 18:50
1

Why not combine grep with find? Have find list only regular files and run grep on them:

find /path/to/dir -type f -exec grep pattern {} \;
  • Whilst the idea is good, the implementation is bad. I suggest find /path/to/dir -maxdepth 1 -type f -exec grep pattern /dev/null {} + instead. The -maxdepth 1 limits the search to the given directory (GNU find prefers it before tests such as -type). The change from \; to + makes find run grep on a batch of files, rather than running one grep per file. Adding /dev/null removes the small chance of a run with a single file, which would cause grep to change its output format. See https://unix.stackexchange.com/questions/275637/limit-posix-find-to-specific-depth if you lack maxdepth. – icarus Feb 07 '19 at 10:45
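For anyone who wants to try the find approach safely, here is a small sketch. It assumes a find that supports -maxdepth (GNU find does) and coreutils timeout; the directory layout and file names are invented for the demo. It builds a directory with a FIFO, a regular file, and a subdirectory, then checks that only the top-level regular file is searched.

```shell
#!/bin/sh
# Scratch directory: one FIFO, one top-level regular file, and one file
# in a subdirectory (all names are illustrative).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkfifo FIFO
mkdir sub
printf 'mypattern\n' > regular.txt
printf 'mypattern\n' > sub/nested.txt

# -type f keeps only regular files, so the FIFO is never handed to grep;
# -maxdepth 1 keeps find out of sub/; /dev/null forces grep to print the
# file name even when only one real file is passed.
find_out=$(timeout 5 find . -maxdepth 1 -type f -exec grep mypattern /dev/null {} +) || true
echo "$find_out"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

The timeout wrapper is only a safety net for the demo; it is not needed in normal use because grep never sees the FIFO.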