I am using the following command to count the files whose names contain sv or end in .json in a given directory on a remote server:

nbs_files=`ssh -q -i ${sshkey} ${user}@${server} "find ${path}/ -maxdepth 1 -mindepth 1 -type f -name '*sv*' -o -name '*.json' -exec basename {} \; | wc -l"` 

This command returns only the number of .json files, even though files with sv in their names exist in ${path}.

When I remove the -o -name '*.json' part, the command works as expected and returns the number of files with sv in their names.

Does anyone know how I can modify the command to count both the files containing sv in their names and the files with the .json extension?

AdminBee
  • 22,803
rainman
  • 149

1 Answer

The problem is and/or operator precedence in the find expression. Specifically, the implicit AND between adjacent tests has higher precedence than the OR (-o) between the two name tests. So the test expression gets parsed as:

    -maxdepth 1 -mindepth 1 -type f -name '*sv*'
OR
    -name '*.json' -exec basename {} \;

...and since the -name '*.json' is the only one that's part of the same branch as -exec, the -exec only runs for json files.

The solution is to override the normal precedence with explicit parentheses around the -name tests:

nbs_files=$(ssh -q -i ${sshkey} ${user}@${server} "find ${path}/ -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -exec basename {} \; | wc -l")
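
The difference can be reproduced locally, without ssh. The following is only a minimal sketch: the scratch directory and the file names are made up for illustration, and -print stands in for -exec basename {} \; since both are actions that bind to the expression in the same way.

```shell
#!/bin/sh
# Scratch directory with hypothetical file names (illustration only).
tmpdir=$(mktemp -d)
touch "$tmpdir/a_sv_1.txt" "$tmpdir/b_sv_2.csv" "$tmpdir/c.json" "$tmpdir/other.txt"

# No grouping: -print belongs only to the -name '*.json' branch,
# so files matching '*sv*' are matched but never printed.
without_parens=$(find "$tmpdir" -maxdepth 1 -mindepth 1 -type f -name '*sv*' -o -name '*.json' -print | wc -l)

# Grouped: -print applies to files matching either name test.
with_parens=$(find "$tmpdir" -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -print | wc -l)

echo "without parentheses: $without_parens"
echo "with parentheses:    $with_parens"

rm -rf "$tmpdir"
```

With these four files, the ungrouped form counts only c.json, while the grouped form counts the two sv files and c.json.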

BTW, I also took the liberty of replacing the backticks with $( ) -- they're the more modern option, are easier to read, and don't have the same weird escaping anomalies that backticks have. See this question and BashFAQ #82.
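
As a rough illustration of those escaping differences (a sketch, not taken from the linked references; the variable names are made up): inside backticks, a backslash before $, ` or \ is consumed before the command runs, and nested substitutions need escaped backticks, whereas $( ) passes its contents through unchanged.

```shell
#!/bin/sh
# Backslashes: $( ) leaves the quoted argument untouched...
modern=$(printf '%s' '\\')
# ...while backticks first turn \\ into \ before the command is run.
legacy=`printf '%s' '\\'`

# Nesting: $( ) nests without extra escaping...
nested_modern=$(echo "outer $(echo inner)")
# ...while the inner backticks must be escaped.
nested_legacy=`echo "outer \`echo inner\`"`

printf 'modern: %s\n' "$modern"
printf 'legacy: %s\n' "$legacy"
printf '%s\n' "$nested_modern" "$nested_legacy"
```

Here modern holds two backslashes and legacy holds one; both nested forms produce the same string, but only the backtick version needs the extra escaping.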

  • Of course one has to wonder what directory names the OP had to warrant running basename before counting the results. If there are no embedded newlines, then removing the -exec basename {} \; will give the same results but without the cost of running a process per matching file. If the OP is using GNU find (likely, given the linux tag), then nbs_files=$(ssh -q -i ${sshkey} ${user}@${server} "find ${path}/ -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -printf '%f\n' | wc -l") will avoid the process per matching file even with embedded newlines in the directory names. – icarus Jul 05 '20 at 07:02
  • @icarus, though filenames with newlines will affect the count. You don't really need to print the actual filenames if all you're going to do is pipe to wc. Just something like find ... -printf '\n' | wc -l would do – ilkkachu Jul 05 '20 at 15:14
  • @ilkkachu The point is that to reproduce the behavior you do need to print the filenames, you can't just print a newline for each file. Note I am not saying that the behavior is the desired one, I strongly suspect it is not, but your solution answers a different problem. – icarus Jul 05 '20 at 18:33
  • @icarus, well, they did say "retrieve the number of files", so I thought they wouldn't want names with newlines to count as two. Or was there something else I missed? – ilkkachu Jul 05 '20 at 18:37
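
The two counting behaviours discussed in these comments can be compared side by side. This is only a sketch: it assumes GNU find (for -printf), and the scratch directory and file names are hypothetical.

```shell
#!/bin/sh
# Assumes GNU find; -printf is not available in every find implementation.
tmpdir=$(mktemp -d)
# One ordinary file and one file whose name contains an embedded newline.
touch "$tmpdir/plain.json" "$tmpdir/$(printf 'two\nlines').json"

# Printing the names: the embedded newline adds an extra line to the count.
by_name=$(find "$tmpdir" -maxdepth 1 -mindepth 1 -type f -name '*.json' -printf '%f\n' | wc -l)

# Printing one bare newline per match: each file counts exactly once.
by_file=$(find "$tmpdir" -maxdepth 1 -mindepth 1 -type f -name '*.json' -printf '\n' | wc -l)

echo "lines when printing names: $by_name"
echo "lines when printing '\\n':  $by_file"

rm -rf "$tmpdir"
```

With these two files, the first count is 3 and the second is 2, so the choice depends on whether the original line-based behaviour or a true file count is wanted.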