
I have a simple script, most of which I understand; it's the find command that's unclear. I've got a lot of documentation, but it isn't making things much clearer. My thought is that it works like a for-loop: the currently found file is swapped in for {} and copied to $HOME/$dir_name. But how does the search with -path and -prune -o work? It's annoying to have such specific and relevant documentation and still not know what's going on.

#!/bin/bash
# The files will be searched for starting from the user's
# home directory and can only be backed up to a directory
# within $HOME

read -p "Which file types do you want to backup " file_suffix
read -p "Which directory do you want to backup to " dir_name

# The next line creates the directory if it does not exist
test -d $HOME/$dir_name || mkdir -m 700 $HOME/$dir_name

# The find command will copy files that match the
# search criteria, e.g. .sh. The -path, -prune and -o
# options are there to exclude the backup directory
# from the backup.
find $HOME -path $HOME/$dir_name -prune -o \
-name "*$file_suffix" -exec cp {} $HOME/$dir_name/ \;
exit 0

This is the documentation that I know I should be able to figure this out from.

-path pattern

File name matches shell pattern pattern. The metacharacters do not treat / or . specially; so, for example, find . -path "./sr*sc" will print an entry for a directory called ./src/misc (if one exists). To ignore a whole directory tree, use -prune rather than checking every file in the tree. For example, to skip the directory src/emacs and all files and directories under it, and print the names of the other files found, do something like this:

find . -path ./src/emacs -prune -o -print

From Findutils manual

-- Action: -exec command ; This insecure variant of the -execdir action is specified by POSIX. The main difference is that the command is executed in the directory from which find was invoked, meaning that {} is expanded to a relative path starting with the name of one of the starting directories, rather than just the basename of the matched file.

While some implementations of find replace the {} only where it appears on its own in an argument, GNU find replaces {} wherever it appears.

And

For example, to compare each C header file in or below the current directory with the file /tmp/master:

      find . -name '*.h' -execdir diff -u '{}' /tmp/master ';'
flerb

2 Answers


-path works exactly like -name, but applies the pattern to the entire pathname of the file being examined, instead of to the last component.
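To see the difference concretely, here is a minimal sketch in a throwaway directory. The layout (src/misc, docs/misc) is made up for illustration and is not from the question; the './sr*sc' pattern is the one from the quoted man page.

```shell
# Hypothetical sandbox layout, created under a temporary directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/misc" "$sandbox/docs"
touch "$sandbox/src/misc/notes.txt" "$sandbox/docs/misc"

# -name compares the pattern with the last path component only,
# so it finds every entry called "misc", wherever it lives.
by_name=$(cd "$sandbox" && find . -name 'misc' | sort)

# -path compares the pattern with the whole path as find prints it,
# and the wildcard is allowed to match across "/".
by_path=$(cd "$sandbox" && find . -path './sr*sc' | sort)

printf 'matched by -name:\n%s\n' "$by_name"
printf 'matched by -path:\n%s\n' "$by_path"
```

Here -name should report both ./docs/misc and ./src/misc, while -path reports only ./src/misc.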

-prune forbids descending below the found file, if it is a directory.

Putting it all together, the command

find $HOME -path $HOME/$dir_name -prune -o -name "*$file_suffix" -exec cp {} $HOME/$dir_name/ \;
  1. Starts looking for files in $HOME.
  2. If it finds a file matching $HOME/$dir_name it won't go below it ("prunes" the subdirectory).
  3. Otherwise (-o), if it finds a file matching *$file_suffix, it copies it into $HOME/$dir_name/.
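The three steps above can be tried safely in a sandbox. This is only a sketch: fake_home stands in for $HOME, the file names are invented, and -print replaces -exec cp so that nothing is actually copied.

```shell
# Hypothetical stand-ins for $HOME, $dir_name and $file_suffix.
fake_home=$(mktemp -d)
dir_name=backup
file_suffix=.sh

mkdir -p "$fake_home/projects" "$fake_home/$dir_name"
touch "$fake_home/a.sh" "$fake_home/projects/b.sh" "$fake_home/projects/readme.txt"
touch "$fake_home/$dir_name/old.sh"   # already sitting in the backup directory

# Same shape as the command in the question, run relative to fake_home,
# with -print in place of -exec cp: it lists what would be copied.
selected=$(cd "$fake_home" && \
    find . -path "./$dir_name" -prune -o -name "*$file_suffix" -print | sort)

printf '%s\n' "$selected"
```

The listing should contain ./a.sh and ./projects/b.sh but not ./backup/old.sh, because find never descends into the pruned directory.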

The idea seems to be to make a backup of some of the contents of $HOME in a subdirectory of $HOME. The part with -prune is obviously necessary in order to avoid making backups of backups...

AlexP
  • If I understand then: find will iterate through each and every directory in $HOME that it has permissions to go into, except $HOME/$dir_name, which it will not descend into (because the prune action will evaluate to true and the or will not be taken), searching for files that end with $file_suffix. Then as soon as it finds one, it will execute cp "found_file.sh" into $HOME/$dir_name ?

    Also, -path allows for a path to a file, and is useful when you want find to descend into directories and not just work in the current directory?

    – flerb Jul 07 '17 at 22:02
  • Your understanding is almost correct. -path works just as -name: it selects files. The difference is that -name matches a pattern to the file name, whereas -path matches a pattern to the full pathname. find always descends into subdirectories, unless prevented by -maxdepth or -prune etc. – AlexP Jul 07 '17 at 22:09
  • Oh! -path is being applied to $HOME/$dir_name -prune then, it's the order of commands that was messing me up, and -path is necessary for the prune command because it needs to match the full path of the pruned directory. – flerb Jul 07 '17 at 22:15
  • @Darren I'm not sure if that's quite accurate. -path $HOME/$dir_name is one action. It is a test that checks whether the path of the current file being examined matches whatever $HOME/$dir_name is. -prune is a separate action. I think the first sentence of your first comment accurately reflects how that works. – David Z Jul 08 '17 at 06:46
  • Would it be missing something to see it as a pipe? I swapped -prune with -print and think the flow is clear now : find $HOME | -path $HOME/$dir_name | -print – flerb Jul 08 '17 at 14:09
  • @Darren: It is not equivalent to a pipe. I think you are misunderstanding how find works. Essentially, it walks the subtree given as an argument; for each file it finds it evaluates the given expression, stopping when the expression is known to be true or false. By default it walks all the tree, but -maxdepth and -prune can prevent it from descending further. -path is a predicate; it evaluates to true if the full pathname of the current file being considered matches the pattern, and to false otherwise. The implicit connector is "and". – AlexP Jul 08 '17 at 14:15
  • I see this now by using find $PATH -print, and I can see how to use -print as a tool to investigate what's happening – flerb Jul 08 '17 at 14:25
  • should the -o really be there. it means or. no? i am finding with -o they are still being parsed. – mjs Nov 30 '19 at 15:04
  • @momomo: I don't understand. What is being parsed? And yes, the -o must definitely be there, because the default connector is "and", and we don't want that. – AlexP Nov 30 '19 at 18:30
  • if you are oring and you wish to exclude the path which is what prune is for .. then you are doing nothing really. you can not exclude and then or ... exclude and ... – mjs Dec 01 '19 at 11:01
  • @momomo: I'm afraid that you are confused about how -o and the implicit -a connectors work. Hint: they short-circuit. -o will stop evaluating the find expression as soon as it finds its left operand true; that is, if the -prune is executed then the evaluation stops, which means that the -name ... -exec part is evaluated only if the -prune was not executed. – AlexP Dec 01 '19 at 18:15
  • I was not getting that when I ran the command. The pruned directory was still being searched. But who knows, maybe I made a mistake. You seem sure. – mjs Dec 04 '19 at 21:51
  • That was a brilliant, simple, accurate explanation. Thank you. I feel like I just woke up, and now I know kung fu. Every other time I looked at a description of prune, I felt like I was drudging through a class at Tiger Schulman's Karate (which is not where you might expect to learn kung fu). – Gregg Leventhal Mar 02 '22 at 16:08
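The short-circuit behaviour discussed in the comments above can be checked with a small experiment. This is a sketch with invented directory names (skip, keep). The second case shows one way a pruned directory can still show up in the output; that is only a guess at a possible source of confusion, not something any commenter confirmed.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/skip" "$sandbox/keep"
touch "$sandbox/skip/x.sh" "$sandbox/keep/y.sh"

# 1. Explicit action on the right of -o: for ./skip the left side
#    (-path ... -prune) is true, so -name ... -print is never evaluated.
with_action=$(cd "$sandbox" && find . -path ./skip -prune -o -name '*.sh' -print | sort)

# 2. No action at all: find applies an implicit -print to the whole
#    expression, so the pruned directory itself is listed
#    (its contents still are not).
no_action=$(cd "$sandbox" && find . -path ./skip -prune -o -name '*.sh' | sort)

# 3. No -o: the implicit "and" chains all three tests, so only ./skip
#    gets as far as -name, and "skip" does not match *.sh.
with_and=$(cd "$sandbox" && find . -path ./skip -prune -name '*.sh' -print | sort)

printf '1: %s\n2: %s\n3: %s\n' "$with_action" "$no_action" "$with_and"
```

In none of the three cases is ./skip/x.sh reported, which is the effect of -prune.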

It is part of the find command, the -exec statement.

It allows you to interact with the file/directory found by the find command.

find $HOME -path $HOME/$dir_name -prune -o -name "*$file_suffix" -exec cp {} $HOME/$dir_name/ \;

find $HOME means find files/directories in $HOME

To understand -path <some_path>, see `find -path` explained

To understand -prune, see https://stackoverflow.com/questions/1489277/how-to-use-prune-option-of-find-in-sh

-o means OR: either the left side (-path <some_path> -prune) succeeds, or the right side (-name "*$file_suffix" -exec ...) is evaluated.
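Since the implicit "and" binds more tightly than -o, the grouping can be spelled out with escaped parentheses. A small sketch, using made-up file names and -print instead of -exec, to check that both spellings select the same files:

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/backup" "$sandbox/work"
touch "$sandbox/backup/old.sh" "$sandbox/work/new.sh" "$sandbox/top.sh"

# Grouping left to find's operator precedence.
implicit=$(cd "$sandbox" && \
    find . -path ./backup -prune -o -name '*.sh' -print | sort)

# The same expression with the grouping written out.
explicit=$(cd "$sandbox" && \
    find . \( -path ./backup -prune \) -o \( -name '*.sh' -print \) | sort)

printf 'implicit grouping:\n%s\n' "$implicit"
printf 'explicit grouping:\n%s\n' "$explicit"
```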

-exec means execute the command.

cp {} $HOME/$dir_name/ copies each matching file to $HOME/$dir_name/

\; means terminate the -exec command
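The backslash is there only so that the shell passes a literal ; through to find; quoting it as ';' (as in the man page example in the question) does the same job. A quick sketch with three made-up files, using echo as a harmless stand-in for cp:

```shell
sandbox=$(mktemp -d)
touch "$sandbox/a.sh" "$sandbox/b.sh" "$sandbox/c.sh"

# The command is run once per matching file, so each spelling of the
# terminator should produce one output line per file.
runs_backslash=$(find "$sandbox" -name '*.sh' -exec echo found {} \; | wc -l | tr -d ' ')
runs_quoted=$(find "$sandbox" -name '*.sh' -exec echo found {} ';' | wc -l | tr -d ' ')

printf 'with \\; : %s runs\n' "$runs_backslash"
printf "with ';': %s runs\n" "$runs_quoted"
```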

thecarpy
  • Good idea to quote '{}' to prevent word-splitting if the filename contains whitespace and avoid race conditions during resolution of the paths to the matched files (see man 1 find). – David C. Rankin Aug 03 '23 at 06:31
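A sketch of the quoting point, extended (as my own assumption, not something stated in the comment) to the shell variables in the command, since those are expanded by the shell before find ever sees them. The names fake_home, "my backups" and "hello world.sh" are invented for the test.

```shell
# Hypothetical stand-ins for $HOME, $dir_name and $file_suffix,
# deliberately containing spaces.
fake_home=$(mktemp -d)
dir_name='my backups'
file_suffix=.sh

mkdir -p "$fake_home/$dir_name" "$fake_home/my scripts"
touch "$fake_home/my scripts/hello world.sh"

# Every variable expansion is double-quoted so that the shell passes
# each path to find as a single argument; '{}' is quoted as suggested.
find "$fake_home" -path "$fake_home/$dir_name" -prune -o \
    -name "*$file_suffix" -exec cp '{}' "$fake_home/$dir_name/" \;

copied=$(ls "$fake_home/$dir_name")
printf 'copied: %s\n' "$copied"
```

With the quotes in place, the file with a space in its name should end up in the backup directory despite the spaces in both paths.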