25

I noticed recently that POSIX specifications for find do not include the -maxdepth primary.

For those unfamiliar with it, the purpose of the -maxdepth primary is to restrict how many levels deep find will descend. -maxdepth 0 results in only command line arguments being processed; -maxdepth 1 would only handle results directly within the command line arguments, etc.

How can I get the equivalent behavior to the non-POSIX -maxdepth primary using only POSIX-specified options and tools?

(Note: Of course I can get the equivalent of -maxdepth 0 by just using -prune as the first operand, but that doesn't extend to other depths.)
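
As a minimal sketch of that -maxdepth 0 equivalent (the scratch directory and the demo/sub/file tree are made up for illustration):

```shell
# Hypothetical scratch tree: demo/sub/file
tmp=${TMPDIR:-/tmp}/find-depth0.$$
mkdir -p "$tmp/demo/sub" && cd "$tmp" || exit 1
: > demo/sub/file

# -prune comes first in the expression, so find never descends into demo:
# only the command-line argument itself is processed.
depth0=$(find demo -prune -print)
printf '%s\n' "$depth0"
```

Here only demo itself should be listed, never demo/sub or demo/sub/file.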

Wildcard
  • 36,499
  • @StevenPenny, FreeBSD's -depth -2, -depth 1... approach could be seen as better than GNU's -maxdepth/-mindepth – Stéphane Chazelas Dec 14 '16 at 16:33
  • @StéphaneChazelas either way - POSIX find should have one or the other; else it is crippled – Zombo Dec 14 '16 at 17:07
  • 1
    At least for -maxdepth/-mindepth, there are reasonable alternatives (note that -path is a recent addition to POSIX). The alternatives for -timexy or -mtime -3m (or -mmin -3) are a lot more cumbersome. Some like -execdir/-delete have no reliable alternative. – Stéphane Chazelas Dec 14 '16 at 17:18
  • 2
    @StevenPenny, feel free to log a ticket at http://austingroupbugs.net/ to request it be added. I've seen things get added without the need for a sponsor when there was a strong justification. A probably better course of action would be to get as many implementations as possible to add it first, so POSIX would just have to specify the existing behaviour, which is generally less contentious. – Stéphane Chazelas Dec 14 '16 at 17:22
  • @StéphaneChazelas in my case I ended up just naming the files directly, but thank you; I might file a ticket if this comes up again – Zombo Dec 14 '16 at 17:31

3 Answers

25

@meuh's approach is inefficient, as his -maxdepth 1 equivalent still lets find read the contents of directories at level 1, only to ignore them afterwards. It will also not work properly with some find implementations (including GNU find) if some directory names contain sequences of bytes that don't form valid characters in the user's locale (as with file names in a different character encoding).

find . \( -name . -o -prune \) -extra-conditions-and-actions

is the more canonical way to implement GNU's -maxdepth 1.

Generally though, it's depth 1 you want (-mindepth 1 -maxdepth 1) as you don't want to consider . (depth 0), and then it's even simpler:

find . ! -name . -prune -extra-conditions-and-actions
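
A small sketch of the two depth-1 forms side by side, run against a made-up scratch tree (the directory and file names are assumptions for illustration only):

```shell
# Hypothetical scratch tree: a/b/c, a/inner.txt, top.txt
tmp=${TMPDIR:-/tmp}/find-depth1.$$
mkdir -p "$tmp/a/b/c" && cd "$tmp" || exit 1
: > top.txt
: > a/inner.txt

# Equivalent of -maxdepth 1: "." is let through, everything else is pruned.
max1=$(find . \( -name . -o -prune \) -print | LC_ALL=C sort)

# Equivalent of -mindepth 1 -maxdepth 1: "." fails "! -name ." and is
# skipped; every other entry is pruned, then printed.
only1=$(find . ! -name . -prune -print | LC_ALL=C sort)

printf 'maxdepth 1:\n%s\n' "$max1"
printf 'depth 1 only:\n%s\n' "$only1"
```

Neither listing should contain ./a/b or ./a/inner.txt, and only the first one should contain the starting point itself.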

For -maxdepth 2, that becomes:

find . \( ! -path './*/*' -o -prune \) -extra-conditions-and-actions
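
A sketch of that -maxdepth 2 form on a scratch tree with plain ASCII names (the tree is hypothetical):

```shell
# Hypothetical scratch tree: a/b/c/deep.txt
tmp=${TMPDIR:-/tmp}/find-depth2.$$
mkdir -p "$tmp/a/b/c" && cd "$tmp" || exit 1
: > a/b/c/deep.txt

# Paths with fewer than two slashes fail -path './*/*', so the negation
# lets them through; ./a/b matches, is pruned and is still printed.
max2=$(find . \( ! -path './*/*' -o -prune \) -print | LC_ALL=C sort)
printf '%s\n' "$max2"
```

The listing should stop at ./a/b, since find never descends far enough to see ./a/b/c.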

And that's where you run into the invalid character issues.

For instance, if you have a directory called Stéphane but that é is encoded in the iso8859-1 (aka latin1) charset (0xe9 byte), as was most common in Western Europe and the Americas up until the mid-2000s, then that 0xe9 byte is not a valid character in UTF-8. So, in UTF-8 locales, the * wildcard (with some find implementations) will not match Stéphane, as * is 0 or more characters and 0xe9 is not a character.

$ locale charmap
UTF-8
$ find . -maxdepth 2
.
./St?phane
./St?phane/Chazelas
./Stéphane
./Stéphane/Chazelas
./John
./John/Smith
$ find . \( ! -path './*/*' -o -prune \)
.
./St?phane
./St?phane/Chazelas
./St?phane/Chazelas/age
./St?phane/Chazelas/gender
./St?phane/Chazelas/address
./Stéphane
./Stéphane/Chazelas
./John
./John/Smith

My find (when the output goes to a terminal) displays that invalid 0xe9 byte as ? above. You can see that St<0xe9>phane/Chazelas was not pruned.

You can work around it by doing:

LC_ALL=C find . \( ! -path './*/*' -o -prune \) -extra-conditions-and-actions

But note that that affects all the locale settings of find and any application it runs (like via the -exec predicates).

$ LC_ALL=C find . \( ! -path './*/*' -o -prune \)
.
./St?phane
./St?phane/Chazelas
./St??phane
./St??phane/Chazelas
./John
./John/Smith

Now I really get a -maxdepth 2, but note how the é in the second Stéphane, properly encoded in UTF-8, is displayed as ??: the 0xc3 0xa9 bytes of the UTF-8 encoding of é are considered two individual undefined characters in the C locale and are not printable there.

And if I had added a -name '????????', I would have gotten the wrong Stéphane (the one encoded in iso8859-1).
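
To try the LC_ALL=C workaround yourself, something like the following sketch can recreate a latin1-encoded directory name. The scratch tree is made up, and it assumes the file system accepts file names that are not valid UTF-8 (some refuse them):

```shell
tmp=${TMPDIR:-/tmp}/find-latin1.$$
mkdir -p "$tmp" && cd "$tmp" || exit 1

# "St\351phane" is Stéphane with é as the single iso8859-1 byte 0xe9,
# which is not a valid character in UTF-8.
latin1=$(printf 'St\351phane')
mkdir -p "$latin1/Chazelas/age" John/Smith/age

# In the C locale every byte is a character, so * also matches 0xe9 and
# both second-level directories get pruned. Count what is found deeper.
deeper=$(LC_ALL=C find . \( ! -path './*/*' -o -prune \) \
  -path './*/*/*' -print | wc -l)
printf 'entries found below depth 2: %s\n' "$deeper"
```

Dropping LC_ALL=C and running the same command in a UTF-8 locale is a quick way to check whether your find implementation is affected.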

To apply to arbitrary paths instead of ., you'd do:

find some/dir/. ! -name . -prune ...

for -mindepth 1 -maxdepth 1 or:

find some/dir/. \( ! -path '*/./*/*' -o -prune \) ...

for -maxdepth 2.

I would still do a:

(cd -P -- "$dir" && find . ...)

First, because that makes the paths shorter, which makes it less likely to run into "path too long" or "arg list too long" issues; but also to work around the fact that find can't support arbitrary path arguments (except with -f with FreeBSD find), as it will choke on values of $dir like ! or -print...
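
A sketch of the cd approach wrapped in a function; list_depth1 is a hypothetical helper name, and the directory deliberately called -print is only there to show an argument that find itself would misparse:

```shell
# Hypothetical helper: list what is directly inside "$1", the
# -mindepth 1 -maxdepth 1 equivalent. The body is a subshell, so the
# caller's working directory is left untouched.
list_depth1() (
  cd -P -- "$1" && find . ! -name . -prune -print
)

tmp=${TMPDIR:-/tmp}/find-cd.$$
mkdir -p "$tmp/-print/sub/deeper" && cd "$tmp" || exit 1
: > ./-print/file

# Passing "-print" straight to find would be parsed as a predicate;
# going through cd -P -- sidesteps that.
listed=$(list_depth1 -print | LC_ALL=C sort)
printf '%s\n' "$listed"
```

The reported paths are relative to the directory that was entered (./file, ./sub), which is what keeps them short.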


The -o in combination with negation is a common trick to run two independent sets of -condition/-action in find.

If you want to run -action1 on files meeting -condition1 and independently -action2 on files meeting -condition2, you cannot do:

find . -condition1 -action1 -condition2 -action2

As -action2 would only be run for files that meet both conditions.

Nor:

find . -condition1 -action1 -o -condition2 -action2

As -action2 would not be run for files that meet both conditions.

find . \( ! -condition1 -o -action1 \) -condition2 -action2

works as \( ! -condition1 -o -action1 \) would resolve to true for every file. That assumes -action1 is an action (like -prune, -exec ... {} +) that always returns true. For actions like -exec ... \; that may return false, you may want to add another -o -something where -something is harmless but returns true like -true in GNU find or -links +0 or ! -name '' or -name '*' (though note the issue about invalid characters above).
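
As a concrete sketch of that pattern, with -type d as -condition1 and -name '*.txt' as -condition2, both reported through -exec ... {} + (which always returns true). The scratch tree, including a directory that happens to be called notes.txt, is made up for illustration:

```shell
# Hypothetical scratch tree: notes.txt/ (a directory), src/readme.txt, src/main.c
tmp=${TMPDIR:-/tmp}/find-or.$$
mkdir -p "$tmp/notes.txt" "$tmp/src" && cd "$tmp" || exit 1
: > src/readme.txt
: > src/main.c

# First group: true for every file, and runs the "dir:" action on directories.
# Second part: independently runs the "txt:" action on names ending in .txt.
report=$(find . \
  \( ! -type d -o -exec printf 'dir: %s\n' {} + \) \
  -name '*.txt' -exec printf 'txt: %s\n' {} + | LC_ALL=C sort)
printf '%s\n' "$report"
```

./notes.txt meets both conditions, so it should show up once under each label, which neither of the two naive forms would achieve.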

  • 2
    Someday I will run into a bunch of Chinese files and I'll be very glad I've read your many answers about locale and valid characters. :) – Wildcard Dec 14 '16 at 23:37
  • 2
    @Wildcard, you (and even more so a Chinese person) are more likely to run into problems with British, French... file names than with Chinese file names, as Chinese file names are more often encoded in UTF-8 than file names in alphabetical scripts, which can generally be covered by a single-byte charset, the norm up until relatively recently. There are other multi-byte charsets to cover Chinese characters, but I'd expect Chinese people would have switched to UTF-8 earlier than westerners, as those charsets have a number of nasty issues. See also the edit for an example. – Stéphane Chazelas Dec 15 '16 at 09:22
  • How does your boolean expression equivalent for -maxdepth 2 work? Assume, I have PATH_1=/a/b/c/d and PATH_2=/a/b/c/d/e, and I'm going -maxdepth 2 from /a/b, and d and e are files, not dirs.

    ! negates, so it would negate the outcome of the -path primary. With -path '/a/b/*/*', PATH_1 matches and is true then negated to false. The -o causes us to go -prune, but d is a file so nothing happens. The path is then printed to stdout.

    PATH_2 doesn't match -path '/a/b/*/*' and is false, then negated to true. What happens now?

    – Ungeheuer Aug 17 '21 at 23:56
  • Apologies, the comment above looks like a wall of text. I tried adding line breaks with SHIFT+ENTER and it didn't work out :( I appreciate your help in understanding this. – Ungeheuer Aug 17 '21 at 23:57
  • @Ungeheuer, not sure what you mean: if d is not a directory, then find will never come across a d/e file. – Stéphane Chazelas Aug 18 '21 at 06:09
  • My bad! Wasn't paying attention PATH_1=/a/b/c/d, PATH_2=/a/b/c/e/f. e is a directory at the same level as d, and f is the regular file. – Ungeheuer Aug 18 '21 at 20:31
  • 1
    @Ungeheuer, -prune returns true (like GNU find's -true) with the added side effect that if applied on a file of type directory, find will not descend into it. So if you do find /a/b/., /a/b/./c/e will match -path '*/./*/*' and so -prune will be applied to it and find will not look for files into it, so will never come across /a/b/./c/e/f – Stéphane Chazelas Aug 19 '21 at 06:38
9

You can use -path to match a given depth and prune there. Eg

find . -path '*/*/*' -prune -o -type d -print

would be maxdepth 1, as * matches the ., */* matches ./dir1, and */*/* matches ./dir1/dir2 which is pruned. If you use an absolute starting directory you need to add a leading / to the -path too.
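
A quick sketch to check that matching on a scratch tree (dir1/dir2/dir3 is a made-up layout):

```shell
# Hypothetical scratch tree: dir1/dir2/dir3
tmp=${TMPDIR:-/tmp}/find-path.$$
mkdir -p "$tmp/dir1/dir2/dir3" && cd "$tmp" || exit 1

# ./dir1/dir2 is the first path containing two slashes, so it matches
# '*/*/*' and is pruned; because of -o it is not printed either.
dirs=$(find . -path '*/*/*' -prune -o -type d -print | LC_ALL=C sort)
printf '%s\n' "$dirs"
```

Only . and ./dir1 should be listed.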

meuh
  • 51,383
  • Hmmm, tricky. Couldn't you just remove one layer of /* from the end of the pattern, take out the -o operator, and get the same result? – Wildcard Apr 11 '16 at 15:33
  • No, because * matches / as well, so the dir a/b/c/d/e would fit -path */*, sadly. – meuh Apr 11 '16 at 15:42
  • But a/b/c/d/e would never be reached, because -prune would be applied to a/b.... – Wildcard Apr 11 '16 at 15:46
  • 1
    Sorry, I misread that -prune and -o were removed. If you keep the -prune the problem is that the */* will not match anything at a level above the maxdepth, eg the single directory a. – meuh Apr 11 '16 at 15:55
0

I ran into an issue where I needed a way to limit depth when searching multiple paths (instead of just .).

For example:

$ find dir1 dir2 -name myfile -maxdepth 1

This led me to an alternate approach using -regex. The gist is:

-regex '(<list of paths | delimited>)/<filename>'

So, the above would be:

$ find dir1 dir2 -name myfile -regextype awk -regex '(dir1|dir2)/myfile' # GNU
$ find -E dir1 dir2 -name myfile -regex '(dir1|dir2)/myfile' # MacOS BSD

Without a filename:

$ find dir1 dir2 -name myfile -maxdepth 1 # GNU

-regex '(<list of paths | delimited>)/<anything that's not a slash>$'

$ find dir1 dir2 -name myfile -regextype awk -regex '(dir1|dir2)/[^/]*$' # GNU
$ find -E dir1 dir2 -name myfile -regex '(dir1|dir2)/[^/]*$' # MacOS BSD

Finally, for -maxdepth 2 the regex changes to: '(dir1|dir2)/([^/]*/){0,1}[^/]*$'
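
One way to check which paths that last expression accepts, without depending on any particular find's -regex support, is to feed sample paths to grep -E. This is only a sketch: the sample paths are made up, and the explicit ^ anchor is added because -regex in GNU and BSD find has to match the whole path, whereas grep matches anywhere in the line.

```shell
re='^(dir1|dir2)/([^/]*/){0,1}[^/]*$'

# Candidate paths one, two and three levels below dir1, plus a path
# under a directory that is not in the list.
kept=$(printf '%s\n' dir1/myfile dir1/a/myfile dir1/a/b/myfile \
  dir3/myfile dir2/x | grep -E "$re")
printf '%s\n' "$kept"
```

The three-level path and the dir3 path should be filtered out, which mirrors a depth limit of 2 below the listed directories.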

Alissa H
  • 101
  • 1
    This question asks for a standard (as in POSIX) solution though. Also -maxdepth would work with multiple search paths. – Kusalananda Dec 19 '18 at 08:56