Find command if filename doesn't exist in directory

Question

I want to display the folders that do not contain a certain file. The complication is that the file has the same name everywhere, but the case differs.

Case study: in the tools directory, some subdirectories contain a readme or README file and some do not. For example:

/toola/readme
/toolb/README
/toolc/ (does not have readme file)

I want the find command to display only the toolc folder. I tried this command:

find . -maxdepth 2 ! -name '*readme*' -o ! -name '*README*' | awk -F "/" '{print $$2}' | uniq

But it doesn't work. It displays all files, since toola doesn't have README and toolb doesn't have readme.

Kusalananda · Accepted Answer · 2018-11-24T14:22:28.323

You can't use find to look for files that do not exist. However, you may use find to look for directories, and then test whether the given filenames exist in those directories.

When using find to look for directories, make sure that you use -type d. Then test each of the found directories for the files README and readme.

Assuming the following directory hierarchy for some top-level directory projects:

projects/
|-- toola
|   |-- doc
|   |-- readme
|   `-- src
|-- toolb
|   |-- doc
|   `-- src
|-- toolc
|   |-- README
|   |-- doc
|   `-- src
`-- toold
    |-- doc
    `-- src

Using find to find the directories directly under projects that do not contain a README or readme file:

$ find projects -mindepth 1 -maxdepth 1 -type d \
    ! -exec test -f {}/README ';' \
    ! -exec test -f {}/readme ';' -print
projects/toolb
projects/toold

Here, we find any directory directly under projects and then use the test utility to determine which of the found directories contain neither of the two files.

This is exactly equivalent to

find projects -mindepth 1 -maxdepth 1 -type d \
    -exec [ ! -f {}/README ] ';' \
    -exec [ ! -f {}/readme ] ';' -print

Another formulation of the above:

find projects -mindepth 1 -maxdepth 1 -type d -exec sh -c '
    for pathname do
        if [ ! -f "$pathname/README" ] &&
           [ ! -f "$pathname/readme" ]; then
            printf "%s\n" "$pathname"
        fi
    done' sh {} +

Here, we let a small in-line shell script do the actual testing for the two files and print the pathnames of the directories that do not contain either of them. The find utility acts as a "pathname generator" that supplies directory pathnames for the in-line script to iterate over.
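To try the in-line script without touching real data, the example hierarchy can be rebuilt in a scratch directory. This is only a sketch: the scratch location comes from mktemp -d (widely available, but not POSIX), the variable names tmpdir and missing are made up for the illustration, and sort is added only to make the order predictable.

```shell
# Rebuild the example tree in a throwaway directory.
tmpdir=$(mktemp -d)
for tool in toola toolb toolc toold; do
    mkdir -p "$tmpdir/projects/$tool/doc" "$tmpdir/projects/$tool/src"
done
touch "$tmpdir/projects/toola/readme" "$tmpdir/projects/toolc/README"

# Run the same in-line script as above and capture what it prints.
missing=$(
    cd "$tmpdir" &&
    find projects -mindepth 1 -maxdepth 1 -type d -exec sh -c '
        for pathname do
            if [ ! -f "$pathname/README" ] &&
               [ ! -f "$pathname/readme" ]; then
                printf "%s\n" "$pathname"
            fi
        done' sh {} + | sort
)
printf '%s\n' "$missing"

# Clean up the scratch tree.
rm -rf "$tmpdir"
```

With the tree above this should list projects/toolb and projects/toold, matching the output of the first find command.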

In fact, if the directory structure is like this, we may choose to not use find at all:

for pathname in projects/*/; do
    if [ ! -f "$pathname/README" ] &&
       [ ! -f "$pathname/readme" ]; then
        printf '%s\n' "$pathname"
    fi
done

Note the trailing slash in the projects/*/ pattern. It's this that makes the pattern only match directories (or symbolic links to directories).

A difference between doing it this way and using find is that the above shell loop will exclude hidden directories under projects and will include symbolic links to directories.

In all cases, we iterate over the pathnames of directories, and we test for the non-existence of the two filenames.

The only caveat is that the -f test will also be true for a symbolic link to a regular file.
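If symbolic links should not count as a readme, one possible variation is to combine -f with a negated -L test. The helper name has_real_file and the scratch tree below are assumptions made for the illustration, not part of the commands above.

```shell
# has_real_file PATH: succeed only if PATH is a regular file
# and is not itself a symbolic link.
has_real_file() {
    [ -f "$1" ] && [ ! -L "$1" ]
}

# Scratch tree: toola has a real README, toolb only has a symlink to it.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/toola" "$tmpdir/toolb"
touch "$tmpdir/toola/README"
ln -s ../toola/README "$tmpdir/toolb/README"

result=$(
    cd "$tmpdir" &&
    for dir in */; do
        if ! has_real_file "${dir}README" &&
           ! has_real_file "${dir}readme"; then
            printf '%s\n' "${dir%/}"
        fi
    done
)
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Here toolb is reported as lacking a readme, because its README is only a symbolic link; with a plain -f test it would not be reported.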

Related: Understanding the -exec option of `find`

Thanks for using -exec for its intended use and not as a weaker xargs.. — pipe, Oct 24 '18 at 12:51

fra-san · Answer 2 · 2019-01-29T17:48:05.110

While I vote for the clear and elegant solution by Kusalananda, I would add that this kind of task looks like an operation on sets, which find alone does not handle well. Indeed, it has to bring in external tools by using -exec.

A different approach could be using a compare/diff tool. For instance, assuming you have access to GNU find and a shell supporting process substitution (e.g. bash), and that you have no newline characters in your paths:

comm -2 -3 <(find ./tool* -maxdepth 0 -type d | sort) \
<(find ./tool* -iname "readme" -printf "%H\n" | sort)

Where:

comm compares two sorted files line by line; the options -2 -3 let it remove from its output results that are only in the second file or in both files.
-printf "%H\n" lets find print only the starting point under which the file was found, followed by a new line (we have to match the -maxdepth 0 option that defines the other list).

Tested with the tree:

$ find ./tool* -printf "%p %y\n" | sort
./toola d
./toola/doc d
./toola/readme f
./toola/src d
./toolb d
./toolb/doc d
./toolb/src d
./toolc d
./toolc/doc d
./toolc/README f
./toolc/src d
./toold d
./toold/doc d
./toold/doc/readme f
./toold/src d

The command above gives:

./toolb
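The set difference performed by comm -2 -3 can be seen in isolation with two small hand-written lists. This is only an illustration: the file names all and with_readme and the list contents are made up to mirror the test tree, and temporary files are used in place of process substitution so that it also runs in a plain sh.

```shell
tmpdir=$(mktemp -d)

# First set: every candidate directory (comm expects sorted input).
printf '%s\n' ./toola ./toolb ./toolc ./toold | sort > "$tmpdir/all"

# Second set: directories under which some readme was found.
printf '%s\n' ./toola ./toolc ./toold | sort -u > "$tmpdir/with_readme"

# -2 -3 (or -23) keeps only the lines unique to the first file.
without_readme=$(comm -23 "$tmpdir/all" "$tmpdir/with_readme")
printf '%s\n' "$without_readme"

rm -rf "$tmpdir"
```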

Stéphane Chazelas · Answer 3 · 2018-10-24T06:44:30.043

With zsh:

set -o extendedglob # for (#i) for case insensitive matching

all_projects=(projects/*(-/))
typeset -aU projects_with_readme # -U for unique
projects_with_readme=(projects/*/(#i)readme(:h))
projects_without_readme=(${all_projects:|projects_with_readme})

echo Projects with READMEs:
printf ' - %s\n' $projects_with_readme
echo Projects without READMEs:
printf ' - %s\n' $projects_without_readme

You can change the (#i)readme to (#i)*readme* to account for files called README.txt or 000README, or the (:h) to (-.:h) to only consider readme files that are regular after symlink resolution (exclude directories, broken links and other special types of files).
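In a shell without the (#i) flag, a similar case-insensitive match can be approximated with bracket patterns. The following is a rough sh sketch under that assumption, not a translation of the zsh code; the scratch tree and the variable names are invented for the example.

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/projects/toola" "$tmpdir/projects/toolb" "$tmpdir/projects/toolc"
touch "$tmpdir/projects/toola/readme" "$tmpdir/projects/toolb/ReadMe"

without_readme=$(
    cd "$tmpdir" &&
    for dir in projects/*/; do
        found=no
        # Each bracket pair matches one letter in either case.
        for candidate in "$dir"[Rr][Ee][Aa][Dd][Mm][Ee]; do
            # An unmatched pattern is left as literal text in sh,
            # so confirm that the candidate really exists.
            if [ -f "$candidate" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
)
printf '%s\n' "$without_readme"

rm -rf "$tmpdir"
```

Only projects/toolc is printed here, since toola and toolb each hold a readme in some mix of cases.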

score 0 · Answer 4 · answered Oct 24 '18 at 05:31

This is a partial answer, but there was too much to write as a comment. There are several things wrong with this command.

Firstly, your logic is wrong. You probably want -a instead of -o. Your command:

find . -maxdepth 2 ! -name '*readme*' -o ! -name '*README*'

will find files that (do not have readme in them) OR (do not have README in them). If you simply run your command, you will see that it returns all files in your tree. Hence, you could use

find . -maxdepth 2 ! -name '*readme*' -a ! -name '*README*'

Secondly, you probably don't even need this construct. If you have a read of man find, you can see that there is an option called -iname, which is a case-insensitive version of -name. Hence, you can do

find . -maxdepth 2 ! -iname '*readme*'

Finally, if you run these commands, you can see that they return anything that doesn't have the string in its name. Hence, the parent directories, including toola and toolb, will appear, because they don't have the string in their own names. This is expected: -name tests only the name of each entry that find visits, and the readme is just a file within the directory, not part of the directory's name.
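The difference between the two operators can be checked on a small scratch tree. This is only a sketch: the tree mirrors the one in the question, and the names tmpdir, with_or and with_and are made up for the illustration.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/toola" "$tmpdir/toolb" "$tmpdir/toolc"
touch "$tmpdir/toola/readme" "$tmpdir/toolb/README"

# -o: every name fails at least one of the two patterns,
# so nothing is filtered out.
with_or=$(cd "$tmpdir" &&
    find . -maxdepth 2 ! -name '*readme*' -o ! -name '*README*' | LC_ALL=C sort)

# -a: a name must fail both patterns, so the readme files drop out,
# but the directories toola and toolb are still listed.
with_and=$(cd "$tmpdir" &&
    find . -maxdepth 2 ! -name '*readme*' -a ! -name '*README*' | LC_ALL=C sort)

printf 'with -o:\n%s\n' "$with_or"
printf 'with -a:\n%s\n' "$with_and"

rm -rf "$tmpdir"
```

The -a form lists ., ./toola, ./toolb and ./toolc, which shows why neither variant answers the original question on its own.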

Thanks, Sparhawk. Actually, I already tried those commands, using '-a' and '-iname', and it still doesn't work — daffodil, Oct 24 '18 at 05:43
This looks like it would match all other files though (main.c, for example) — Chris Davies, Oct 24 '18 at 06:23

Michael Prokopec · Answer 5 · 2018-11-24T15:25:22.470

I would use the following:

find -type d -maxdepth 2 -not -name '*readme*' | awk -F "/" '{print $$2}' | uniq

edited Nov 24 '18 at 15:25

answered Nov 24 '18 at 14:53

Note that the user wants to find all directories that do not contain a file called readme or README. – Kusalananda Nov 24 '18 at 15:01

score 0 · Answer 6 · answered May 03 '22 at 10:31

Using find + sed

find missing-foo 2>&1 > /dev/null  | sed 's@find: ‘\([^’]*\).*@\1@'

Find prints to stderr if the file doesn't exist. Send stderr into the pipe, discard stdout, and filter with sed. Ugly and probably brittle, but works for me.
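Since the wording and quoting of find's error message can vary with the locale and the find implementation, a less fragile way to list missing names is to test each one directly. This is a different technique from the find + sed pipeline above, shown only as a sketch; the file names and the variable missing are placeholders.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/present-foo"

missing=$(
    cd "$tmpdir" &&
    for path in present-foo missing-foo; do
        # -e is true for any existing file; -L also catches broken symlinks.
        if [ ! -e "$path" ] && [ ! -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
)
printf '%s\n' "$missing"

rm -rf "$tmpdir"
```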

CervEd
