2

I am using the find and grep commands.

I am getting quite confused about when multiple options joined by "or" (the -o flag) need grouping parentheses, and when they do not.

When using find, grouping parentheses seem necessary:

find $fdir ( -name *.texi -o -name *.org )

When using grep, grouping parentheses are not used:

grep --include "*.texi" --exclude "*.org"
Jeff Schaller
Pietru

4 Answers

14

Most programs, including grep, don't treat parentheses in their arguments specially. If you did this:

grep "(" --include "*.texi" --exclude "*.org" ")"

grep would treat the first ( as the pattern to search for, and the last ) as a filename.(*) Same as if they were foo and bar instead. So, you can't group options to grep.


But here's the thing: -name, -type, -o, and ( etc. aren't options to find. It does take some options, namely -P/-H/-L, which affect symlink processing, but -name and the rest aren't among them. Instead, they're part of the search expression, which is a thing specific to find. (**)

Emphasis on expression there. When you give find the expression ( -name *.texi -o -name *.org ) it's more like the C-like expression

( patternmatch(filename, "*.texi") || patternmatch(filename, "*.org") )

than anything else. And find evaluates that expression for each file it sees. If you had e.g. this instead:

( -name *.texi -o -name *.org ) -printf something

You'd need the parens, because without them:

-name *.texi -o -name *.org -printf something

would be the same as

-name *.texi -o -name *.org -a -printf something

because there's an implied and between atoms unless -o is given, and then the expression would be

patternmatch(...) || patternmatch(...) && printf(...)

and the and operation binds tighter than the or operation, exactly in the same way it does in pretty much all programming languages, and in the same way multiplication binds tighter than addition. And find can't know what you wanted, because it supports arbitrary expressions.(***) So, in this case, it wouldn't work like you want without the parens.
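The precedence difference can be seen in a small sketch using a throwaway directory. The file names a.texi, b.org and c.txt are made up for illustration, and -print stands in for the -printf action used above:

```shell
# Scratch directory with one file of each kind (names are hypothetical).
tmp=$(mktemp -d)
touch "$tmp/a.texi" "$tmp/b.org" "$tmp/c.txt"

# Without parens this parses as: -name '*.texi' -o ( -name '*.org' -a -print ),
# so only the .org file ever reaches -print.
without=$(find "$tmp" -name '*.texi' -o -name '*.org' -print | sort)

# With parens, -print applies to a match on either pattern.
with=$(find "$tmp" \( -name '*.texi' -o -name '*.org' \) -print | sort)

printf 'without parens:\n%s\n' "$without"
printf 'with parens:\n%s\n' "$with"

rm -rf "$tmp"
```

The first command should list only b.org, the second both a.texi and b.org.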


As others noted, the command you have doesn't need parens, since if there are no "actions" (-print, -exec etc.) in the find expression, it defaults to printing matching filenames, and also implicitly puts parentheses around the expression.

So,

find "$fdir" -name "*.texi" -o -name "*.org"

acts like

find "$fdir" \( -name "*.texi" -o -name "*.org" \) -print

but if you explicitly put the -print there, you also need to explicitly put the parentheses to get the processing order right. See: `find` with multiple `-name` and `-exec` executes only the last matches of `-name`


Going back to grep: grep doesn't take parens, and doesn't need them, since it doesn't process expressions. It has no concept of nesting or operators like and and or in general. Instead, it has hard-coded behaviours. With --include and --exclude, I think it tries to fulfil both the include and exclude rules at the same time. (Or, at least one of the individual --include rules and none of the individual --exclude rules.) But with multiple search patterns, it's enough to match one, or another. Both of these are static rules: you can't give it a more complicated expression of which patterns should match.
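A minimal sketch of the include/exclude behaviour, assuming a grep that supports -r, --include and --exclude (GNU grep does). The file names are hypothetical. One caveat worth hedging: recent GNU grep manuals say that when an --include and an --exclude both match the same file, the last matching option wins, so the order of the options can matter; here the --exclude comes last.

```shell
# Scratch directory; every file contains the search word.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/notes.texi"
printf 'hello\n' > "$tmp/draft.texi"
printf 'hello\n' > "$tmp/notes.org"

# Searched: files that match an --include glob and no --exclude glob.
# notes.org matches no --include; draft.texi is caught by the --exclude.
matched=$(grep -rl --include='*.texi' --exclude='draft*' hello "$tmp" | sort)
printf '%s\n' "$matched"

rm -rf "$tmp"
```

Only notes.texi should be listed. There is no way to write something like "(include A or include B) and not C" as an expression; grep just applies its fixed rule to the collected globs.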


(* GNU grep would take the middle ones as options, other implementations might take them as filenames too, as the non-option argument earlier stopped option processing. Also, you need to quote or escape the parens to prevent their special meaning to the shell; that's unrelated to what grep does with them.)

(** In the same way that it's specific to grep that the first non-option argument is a pattern, and only the rest are filenames, or that the last argument to mv is a destination while the others are files to move, and it's specific to git what it does with whatever arguments it takes. The tools do different things, so they have to use the command line arguments in different ways.)

(*** Someone once said that evaluating expressions is the main thing find does. That is to say, it doesn't find filenames to print them, it goes through a tree of files to evaluate an expression on them. Printing and running external commands is just a side-effect.)

ilkkachu
  • Then I have to change my plan when using grep because I end up with enclosing parentheses when using "${isufx[@]}". I got grep -hir "${esufx[@]}" "${isufx[@]}" "${ictx[@]}" "$@". Got to simplify my code with a different school of thought. – Pietru Jul 26 '21 at 10:34
  • 1
    @Pietru, the parens on grep also wouldn't mean anything, since it doesn't handle arbitrary expressions. It just collects all the include and exclude rules together, and does what it does. Which I think boils down to something like (match on any include) and not (match on any exclude) – ilkkachu Jul 26 '21 at 10:44
  • 2
    @Pietru some of the confusion might be naming-related — {} aren’t parentheses, they’re curly braces (or brackets). – Stephen Kitt Jul 26 '21 at 19:34
  • @Pietru, I'm not sure if this is what you were thinking, but since Stephen mentioned it, let's clarify that the braces that are part of the array syntax in "${array[@]}" are in no way related to what goes to the command (e.g. find's {} for -exec), and neither do the parentheses in a command substitution ($(...)) have anything to do with the parens in a find expression. You can use opts=-qv; grep "${opts}" hello filename, or opts=(-q -v); grep "${opts[@]}" hello filename, even if grep itself doesn't process parens or braces. – ilkkachu Jul 27 '21 at 12:56
7

There’s no general rule: what a given command recognises as arguments depends on the command. grep doesn’t use parentheses for grouping arguments.

For find, parentheses are only needed when the precedence of and versus or needs to be overridden; this is similar to the use of parentheses in mathematics. In your example, they are not needed, because the default precedence gives the expression the same overall meaning:

find "$fdir" -name '*.texi' -o -name '*.org'
terdon
Stephen Kitt
  • Seems to be necessary with find $fdir -name *.texi -o -name *.org "$@" – Pietru Jul 26 '21 at 11:36
  • 3
    @Pietru never use patterns unquoted like that. If you have any files matching the patterns in the current directory, then the patterns will be expanded to the matching files by the shell before launching find, so find will never see the pattern but only its expansion in the top level directory. Always quote: find "$fdir" -name "*.texi" -o -name "*.org" "$@". – terdon Jul 26 '21 at 11:57
  • @Pietru yes, if you add arguments with "$@" you need to ensure that the resulting overall expression still reflects your intent; find can’t guess that for you — it doesn’t even know that the first part is hard-coded and the second isn’t. – Stephen Kitt Jul 26 '21 at 19:32
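On the quoting point raised in the comments above, a small sketch of what the shell does to an unquoted pattern before the command ever runs. printf stands in for find here so the arguments can be seen, and the file names are made up:

```shell
# Scratch directory containing two files that match the glob.
tmp=$(mktemp -d)
touch "$tmp/a.texi" "$tmp/b.texi"

# Unquoted: the shell replaces the glob with the matching names first.
unquoted=$(cd "$tmp" && printf '<%s>' -name *.texi)

# Quoted: the pattern arrives intact, which is what find needs.
quoted=$(cd "$tmp" && printf '<%s>' -name '*.texi')

printf '%s\n%s\n' "$unquoted" "$quoted"
# prints:
# <-name><a.texi><b.texi>
# <-name><*.texi>

rm -rf "$tmp"
```

With the unquoted form, find would receive -name a.texi b.texi, which is not the expression that was intended.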
6

The reason for the confusion is that there is no standard on how command-line parameters are interpreted by programs. In general, interpretation of parameters is left to the programmer, though the GNU coding standard recommends that programs use the getopt() and getopt_long() functions (from the GNU C library) for that purpose.

That means that the interpretation of the parentheses to define operator precedence is a function of find, and not of the shell used to invoke find. The programmers of grep "simply" didn't implement their options-parsing algorithm that way, so grep wouldn't understand this notation in the first place.

Note, however, that parentheses have a special meaning in the shell: They denote that the enclosed content is a command to be run in a sub-shell. So, the command as you posted it should actually not work; as mentioned by @terdon, you have to escape the parentheses (like \( ... \)) in order to have the shell pass them to find in the first place.
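As a rough illustration, either backslash-escaping or quoting makes the shell hand each parenthesis to the command as an ordinary one-character argument. printf is used in place of find so the arguments are visible:

```shell
# Each argument is printed inside <...> so the boundaries are clear.
escaped=$(printf '<%s>' \( -name x \))
quoted=$(printf '<%s>' '(' -name x ')')

printf '%s\n%s\n' "$escaped" "$quoted"
# prints:
# <(><-name><x><)>
# <(><-name><x><)>
```

Both forms are equivalent as far as the receiving program is concerned; what it then does with a lone "(" argument is up to that program.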

AdminBee
0

I think the manual has some insights on how find uses parentheses that can help you:

2.12 Combining Primaries With Operators

Operators build a complex expression from tests and actions. The operators are, in order of decreasing precedence:

( expr )

Force precedence. True if expr is true.

! expr
-not expr

True if expr is false. In some shells, it is necessary to protect the ‘!’ from shell interpretation by quoting it.

expr1 expr2
expr1 -a expr2
expr1 -and expr2

And; expr2 is not evaluated if expr1 is false.

expr1 -o expr2
expr1 -or expr2

Or; expr2 is not evaluated if expr1 is true.

expr1 , expr2

List; both expr1 and expr2 are always evaluated. True if expr2 is true. The value of expr1 is discarded. This operator lets you do multiple independent operations on one traversal, without depending on whether other operations succeeded. The two operations expr1 and expr2 are not always fully independent, since expr1 might have side effects like touching or deleting files, or it might use ‘-prune’ which would also affect expr2.

find searches the directory tree rooted at each file name by evaluating the expression from left to right, according to the rules of precedence, until the outcome is known (the left hand side is false for -and, true for -or), at which point find moves on to the next file name.
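The short-circuit rule for -o can be sketched like this, with made-up file names in a scratch directory:

```shell
# Scratch directory with two files (names are hypothetical).
tmp=$(mktemp -d)
touch "$tmp/a.texi" "$tmp/b.org"

# For a.texi the left side of -o is true, so -print is never evaluated.
# For everything else, including the directory itself, the left side is
# false and find goes on to evaluate -print.
out=$(find "$tmp" -name '*.texi' -o -print | sort)
printf '%s\n' "$out"

rm -rf "$tmp"
```

The output should contain the scratch directory and b.org, but not a.texi. This is the same mechanism the common `-name X -prune -o -print` idiom relies on.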

  • 3
    Note that the question doesn't only ask about find, but also grep, and why parentheses are used with find sometimes, but not with e.g. grep. – Kusalananda Jul 26 '21 at 10:24
  • Indeed, as the other answers address the question fully, I thought that adding some information on how find uses parentheses would be helpful. I was doubting if it should be a comment. I trust your opinion on this matter. – schrodingerscatcuriosity Jul 26 '21 at 10:30