I have a lot of files that I need to sift through: specifically, I want to grep for certain keywords IN THE FILE, not the FILENAME. I have some 300+ files on a filesystem (so the files are spread across multiple directories), and some have whitespace in their names.

When I search for files using find

find -type f

and print the results, some of the file names are split across different lines, WHICH IS NOT GOOD.

How can I handle this?

2 Answers

Some implementations of the grep command like GNU grep can recurse directories on their own. If you're looking for string "blahblah" you can do this:

$ grep -r "blahblah" .

This will grep through all the files and directories recursively, starting at the current directory (.). Beware that some implementations, such as old versions of GNU grep, also follow symbolic links when descending the directory tree. The output shows each file name together with the line that matched the search pattern, one match per line. If you just want the file names without the matching content, add the -l switch to grep.

$ grep -rl "blahblah" .
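
To see how this copes with whitespace in names, here is a minimal sketch in a throwaway directory. The directory and file names are made up for illustration, and it assumes a grep that supports -r (GNU, BSD and BusyBox grep all do).

```shell
# Hypothetical sandbox: these directory and file names are invented for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/my docs/sub dir"
printf 'blahblah here\n' > "$demo/my docs/first file.txt"
printf 'nothing here\n'  > "$demo/my docs/sub dir/second file.txt"
printf 'more blahblah\n' > "$demo/my docs/sub dir/third file.txt"

# -r recurses into subdirectories, -l prints only the names of matching files.
# sort is only there to make the order of the listing predictable.
matches=$(cd "$demo" && grep -rl "blahblah" . | sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Each matching file is listed on its own line with the spaces in its name intact, because grep opens the files itself and no name is ever passed through the shell's word splitting.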

If you really want to use find, you can make use of its ability to execute commands against the files that it finds, using the -exec switch.

$ find . -type f -exec grep "blahblah" {} +

The + at the end is key: it tells find to collect the filenames it locates and call grep with as many of them on the command line as will fit, so grep is run as few times as possible. The filenames are placed where the braces, {}, are. It's easier to visualize what this does if we substitute the echo command in grep's place.

Example

Say I had the following sample data.

$ mkdir -p dir{1..3}
$ touch file{1..3}
$ touch dir{1..3}/file{A..C}

Now when I run the above find command using echo as our grep stand-in:

$ find . -type f -exec echo {} +
./file2 ./file1 ./dir2/fileA ./dir2/fileB ./dir2/fileC ./dir3/fileA ./dir3/fileB ./dir3/fileC ./file3 ./dir1/fileA ./dir1/fileB ./dir1/fileC

All those files were echoed to the screen by a single echo, so this method is very efficient: it calls echo or grep only the minimum number of times needed, passing as many filenames as it can to each call.
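
The batching can also be checked with file names that contain spaces. This is only a sketch: the file names are made up, the inline sh script is a stand-in for grep that reports how many arguments it received, and the \; form of -exec (which runs the command once per file) is added purely for comparison.

```shell
# Hypothetical sandbox with whitespace in the file names.
demo=$(mktemp -d)
touch "$demo/file one" "$demo/file two" "$demo/file three"

# With "+", find puts many names on one command line; each name stays a
# single argument. The inline script prints the number of arguments it got.
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} +)

# With "\;", the command is started once per file, so we count the runs.
per_file=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

echo "one batched call received $batched arguments"
echo "the per-file form ran $per_file times"

rm -rf "$demo"
```

With three files, the batched call should report 3 arguments from a single run, while the per-file form runs three times. In both cases find hands each name over as one argument, so the spaces cause no trouble.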

By using -type f above, we're only searching in regular files, whereas grep -r on its own would generally look in every type of file, such as fifos, sockets and devices (but not necessarily symlinks). You can pass a -D skip option to some versions of grep to avoid looking into devices, sockets and fifos.

slm
find . -type f -print0 | xargs -0 grep yourstring

The -print0 option to find and its counterpart, the -0 option to xargs, use the null byte as the separator, meaning your file names can contain anything you like and there will be no surprises.
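
As a rough check, the sketch below runs that pipeline over a few made-up file names, including one with an embedded newline. It assumes a find and xargs that support -print0 and -0, and a grep that supports -h to suppress file names. The inline sh script is just a stand-in that counts its arguments.

```shell
# Hypothetical sandbox: the file names are invented for the demo.
demo=$(mktemp -d)
printf 'yourstring\n' > "$demo/plain.txt"
printf 'yourstring\n' > "$demo/has space.txt"
# The line break below is part of the file name.
printf 'yourstring\n' > "$demo/has
newline.txt"
printf 'other text\n' > "$demo/no match.txt"

# Count the arguments that the command started by xargs receives.
args=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

# -h drops the file names, so every matching line is exactly one output line.
hits=$(find "$demo" -type f -print0 | xargs -0 grep -h yourstring | wc -l | tr -d ' ')

echo "xargs passed $args file names; grep found $hits matching lines"

rm -rf "$demo"
```

All four names arrive as four separate arguments, and the three files that contain the string each contribute one matching line, including the one whose name spans two lines.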

The actual problem is that the shell's internal field separator (stored in the IFS variable) contains the space character by default, along with tab and newline, so unquoted file names are split at spaces. You could change it to a newline or whatever you like, but the real solution is to use what I showed above.
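
The splitting is easy to reproduce. This is a small sketch with made-up file names; it assumes a POSIX-style shell such as sh or bash (zsh does not split unquoted expansions by default) and a temporary directory whose own path contains no whitespace.

```shell
# Hypothetical sandbox: one name with a space, one without.
demo=$(mktemp -d)
touch "$demo/my report.txt" "$demo/notes.txt"

# Unquoted command substitution: the shell splits find's output on IFS
# (space, tab and newline by default), so "my report.txt" becomes two words.
set -- $(find "$demo" -type f)
split_default=$#

# With IFS reduced to a newline, spaces no longer split the names.
# A name that itself contains a newline would still be broken apart.
old_ifs=$IFS
IFS='
'
set -- $(find "$demo" -type f)
split_newline=$#
IFS=$old_ifs

echo "default IFS: $split_default words; newline-only IFS: $split_newline words"

rm -rf "$demo"
```

Two files turn into three words under the default IFS and two words once IFS is a newline only. Since a newline is itself a legal character in a file name, changing IFS only moves the problem, which is why the null-separated pipeline is the more robust choice.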

Marki