10

I have a directory ~/Documents/machine_learning_coursera/.

The command

find . -type d -name '^machine'

does not find anything

I also tried

find . -type d -regextype posix-extended -regex '^machine'

so as to match the beginning of the string, and got nothing.

I tried also with -name:

find . -type d -regextype posix-extended -regex -name '^machine'

and got the error:

find: paths must precede expression: `^machine'

What am I doing wrong here?

Quasímodo
moth

3 Answers

23

find's -name takes a shell/glob/fnmatch() wildcard pattern, not a regular expression.

GNU find's non-standard -regex extension does take a regexp (old-style Emacs type by default), but it is applied to the full path (like the standard -path, which also takes wildcards), not just the file name, and it is implicitly anchored at both ends.

So to find files of type directory whose name starts with machine, you'd need:

find . -name 'machine*' -type d

Or:

find . -regextype posix-extended -regex '.*/machine[^/]*' -type d

(for that particular regexp, you don't need -regextype posix-extended as the default regexp engine will work as well)

Note that for the first one to match, the name of the file also needs to be valid text in the locale, while for the second one, it's the full path that needs to be valid text.
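To try both forms side by side, here is a minimal sketch that assumes GNU find and builds a throwaway tree; the directory names under the temporary directory are made up for illustration and are not from the question.

```shell
# Build a scratch tree (hypothetical names, for illustration only).
tmp=$(mktemp -d)
mkdir -p "$tmp/machine_learning_coursera/week1" "$tmp/other/machine_notes"

# -name compares a wildcard pattern with the last path component only.
by_name=$(cd "$tmp" && find . -name 'machine*' -type d | LC_ALL=C sort)

# -regex (GNU find) must match the whole path, e.g. ./other/machine_notes,
# so the pattern has to account for the leading directories.
by_regex=$(cd "$tmp" && find . -regex '.*/machine[^/]*' -type d | LC_ALL=C sort)

printf 'by -name:\n%s\n' "$by_name"
printf 'by -regex:\n%s\n' "$by_regex"
```

Both commands should list the same two directories; neither reports week1, because [^/]* stops the regexp from running past the next slash.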

11

The argument for -name is much like a shell glob, so it's implicitly bound to the start and end of the filename it's matching, and you need * to indicate there may be more to come:

find . -type d -name 'machine*'

For your alternatives, -regex is documented (see man find) as "a match on the whole path, not a search", also implicitly bound to the start and end of the match, so you'd need

find . -type d -regextype posix-extended -regex '(.*/)?machine[^/]*'

When you say that you "tried also with -name", you forgot that -regex requires a parameter, so it used -name as that parameter and then choked on the unexpected '^machine'
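The optional (.*/)? prefix only matters when the starting directory is something other than ".". The following sketch assumes GNU find (for -regextype) and a hypothetical scratch tree, and starts the search from the directory itself so that its path contains no slash at all.

```shell
# Scratch tree (hypothetical names, for illustration only).
tmp=$(mktemp -d)
mkdir -p "$tmp/machine_learning_coursera/week1"

# The starting point is reported as "machine_learning_coursera", with no "./".
with_optional=$(cd "$tmp" && find machine_learning_coursera -type d \
    -regextype posix-extended -regex '(.*/)?machine[^/]*')

# Without the optional group, the pattern requires at least one slash.
without_optional=$(cd "$tmp" && find machine_learning_coursera -type d \
    -regextype posix-extended -regex '.*/machine[^/]*')

printf 'with (.*/)?:    %s\n' "$with_optional"
printf 'without (.*/)?: %s\n' "$without_optional"
```

With the optional group the top-level directory is matched; without it nothing is printed, since the whole path has to match and there is no slash for .*/ to consume.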

Chris Davies
  • "The argument for -name is much like a shell glob": Is it in any way different to a shell glob? I had always thought it was a shell glob, just one that is handled by find and not by the shell. – terdon Jan 26 '21 at 10:03
  • @terdon it doesn't handle braces, which are supported by some shells such as bash – Chris Davies Jan 26 '21 at 10:04
  • not that I know of, just learning some bash. thanks! – moth Jan 26 '21 at 10:04
  • '(.*/)?machine.*' would also match on files inside those directories as it matches on ./machine/some/dir as well. The (, and )? are unnecessary as all file paths (except . itself) start with ./. – Stéphane Chazelas Jan 26 '21 at 11:27
  • @StéphaneChazelas ah yes, thank you. I've modified the .* appropriately. I've left the (.*/) because it will match even when someone else changes the starting directory to something other than . such that machine matches at the top level – Chris Davies Jan 26 '21 at 12:27
  • well, braces {} aren't really part of glob patterns, but a different thing, regardless of the find man page confusing them... {a,b} expands to a, b even if those files don't exist, unlike what happens with a glob. – ilkkachu Jan 26 '21 at 12:37
  • @ilkkachu if the man pages confuse them what hope have learners got? – Chris Davies Jan 26 '21 at 13:05
6

The -name test of find does not take regular expressions; it takes file globs. The ^ has no special meaning in globs, so your command is looking for directories literally named ^machine. You want this, which will find all directories whose name starts with machine:

find . -type d -name 'machine*'
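A quick way to see that ^ is taken literally is to create a directory that really is called ^machine. This is only a sketch with hypothetical directory names in a scratch location:

```shell
# Scratch tree (hypothetical names, for illustration only).
tmp=$(mktemp -d)
mkdir "$tmp/machine_learning_coursera" "$tmp/^machine"

# '^machine' is an ordinary name here: it matches only the directory
# whose name is exactly those eight characters.
caret=$(cd "$tmp" && find . -type d -name '^machine')

# 'machine*' matches names that begin with "machine"; "^machine" does not.
star=$(cd "$tmp" && find . -type d -name 'machine*')

printf 'caret pattern: %s\n' "$caret"
printf 'star pattern:  %s\n' "$star"
```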

Your other attempt failed because -regex, like -name, is not a flag that you enable: it needs a regular expression as its argument. Here the next word, -name, was taken as that regular expression, which left '^machine' as a stray argument, so find complained. Also, note that the -regex test will try to match the entire path, not just the name. The right way to do what you want using the -regex test would be something like:

find . -maxdepth 1 -type d -regex '.*/machine.*' 
terdon
  • globbing is filename generation, a feature of the shell that takes a wildcard pattern and uses it to generate a list of file paths that match them using a special algorithm. Note that ^ is a zsh extendedglob operator (negation there) – Stéphane Chazelas Jan 26 '21 at 11:30
  • @StéphaneChazelas yes, in bash also, but not in the basic glob expansion used by find as far as I know. And while globbing is indeed a shell feature, it is also a find feature and find uses a very similar mechanism (identical?) to regular, simple file globbing. Or is that wrong? – terdon Jan 26 '21 at 11:38
  • find doesn't generate a list of files, its -name just does pattern matching. It descends the directory tree regardless of whether you use -name or any other condition predicate. A shell expanding a foo*/*bar glob, based on that pattern will list the current directory, find the directory files that match foo* and then list those to find the files that match *bar for instance. Shell globbing, the case construct and find's -name/-path all use shell wildcards (also often called glob patterns, as that comes from the glob utility from Unix V1), but they are very different things. – Stéphane Chazelas Jan 26 '21 at 12:00
  • @StéphaneChazelas So, this pattern matching (any official term for it?) done by find command in find . -type d -name 'machine*' differs from its regex variant in 2 ways - a) usage of -regex option b) we are limited to use shell wildcards only when regex option not used. – Number945 Jan 28 '21 at 18:38
  • @Number945 -name 'machine*' will match machine, machine1, machineasdkajhsd;jahsdd and anything else starting with machine. If it were a regex, machine* would only match machin, machine, machinee, machineeeeeee etc. – terdon Jan 28 '21 at 18:40
  • @Number945, there's no -regex option, there are -name <wildcard-pattern>, -path <wildcard-pattern>, -regex <regexp> predicates. On BSDs, there is a -E option that has the same effect as GNU's -regextype posix-extended option predicate. – Stéphane Chazelas Jan 28 '21 at 18:41
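The difference between machine* as a wildcard and as a regexp, discussed in the comments above, can be checked without find at all. In this sketch the shell's case construct stands in for the wildcard matching done by -name, and grep -E with explicit anchors stands in for an anchored extended regexp; the helper names glob_match and regex_match are made up for illustration.

```shell
# Wildcard match, using the same pattern syntax that -name accepts.
glob_match() {
    case $1 in
        machine*) echo yes ;;
        *)        echo no ;;
    esac
}

# Anchored extended-regexp match: "machin" followed by zero or more "e".
regex_match() {
    if printf '%s\n' "$1" | grep -Eq '^machine*$'; then
        echo yes
    else
        echo no
    fi
}

for name in machin machine machineee machine1; do
    printf '%-10s glob=%-3s regex=%s\n' \
        "$name" "$(glob_match "$name")" "$(regex_match "$name")"
done
```

As a wildcard the pattern rejects machin and accepts machine1; as a regexp it is the other way round.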