0

This is a continuation of Storing array as Environment variable to non-interactive shell - Unix & Linux Stack Exchange. Kusalananda asked me to post this as a separate question.


I want to find all markdown files that contain a match for a particular regex, and then sort the output by the number of matches in each file, in increasing order.

GREP(){
export Regex
xargs -0 -I "{}" bash -c 'grep "${GrepOption[@]}" --only-matching --with-filename --extended-regexp --regexp="${Regex}" "${1}" 2> /dev/null | wc -l | xargs printf "${1}:%s\n" ' _ "{}" \;
}

find . -name "*.md" -print0 | GREP | grep -v ':0$' | sort -n -r -k2 -t:

The user supplies "${Regex}". GrepOptions is an array of options that depends on the user input. For example, it can be GrepOptions=("--ignore-case"), or it can be some other array of grep options.

But I am unable to make the array GrepOptions available in the environment of the bash -c shell that xargs starts.

Any suggestions?

Porcupine
  • Why does grepOptions need to be an array? Doesn't the problem go away if you just use GrepOptions="--ignore-case --something-else" and then run bash -c 'grep $grepOptions ...'? Do you control the values that can be stored there? – terdon Feb 26 '21 at 16:45
  • Because sometimes there might not be any grepOptions. And then that will result in this grep "" – Porcupine Feb 26 '21 at 16:52
  • Yes, exactly. So? That isn't a problem, and even if it were, you would have exactly the same problem if using an array. – terdon Feb 26 '21 at 16:54
  • Your suggestion: /bin/grep: unrecognized option '--with-filename --line-number' – Porcupine Feb 26 '21 at 16:55
  • Yes, you quoted the variable. This is one of the very rare cases where you should not quote (assuming you control the variable's contents). – terdon Feb 26 '21 at 17:00
  • But, in a general case a solution with quoting is better. – Porcupine Feb 26 '21 at 17:14
  • I hope it's not just a simple typo of GrepOptions (in the text) versus GrepOption (in the code)? – Jeff Schaller Feb 26 '21 at 18:18
  • I see that you export Regex, but I don't see an export of GrepOption(s); is it exported? – Jeff Schaller Feb 26 '21 at 18:18
  • @JeffSchaller It's an array. Can't be an environment variable. – Kusalananda Feb 26 '21 at 18:38

2 Answers

1

The issue is that you want to use GrepOptions, an array, as an environment variable in your code. You can't do that since arrays can't be exported.
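To see the limitation for yourself, here is a minimal sketch, assuming bash is the shell on both sides. The variable values are only placeholders. The scalar Regex reaches the child shell, while the array is silently left out of the environment even though export does not complain.

```shell
#!/usr/bin/env bash
# Placeholder values, for illustration only.
Regex='foo'
GrepOptions=(--ignore-case --line-number)

# export accepts both names without error, but bash only places
# the scalar in the environment of child processes.
export Regex GrepOptions

# Ask a child bash what it can see: the value of Regex and the
# number of elements in GrepOptions.
child_view=$(bash -c 'printf "%s %s" "${Regex-unset}" "${#GrepOptions[@]}"')
printf '%s\n' "$child_view"
```

The child should report the regex followed by an element count of zero, which is why the grep call in the question ends up running without any of the user's options.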

Instead, you will have to pass the options to your bash -c script as arguments, along with the pathname that you want to run grep on.

Below, I've taken it a bit further and also pass the regular expression, and more than a single found pathname, and I'm doing this from -exec in find rather than with xargs.

I'm using -- in the call of the bash -c script to delimit the user options from the pathnames.

find . -name '*.md' -type f -exec bash -c '
    re=$1; shift
    while [[ $1 != "--" ]]; do
        opts+=( "$1" )
        shift
    done; shift
    for pathname do
        printf "%s:" "$pathname"
        grep -o -E -e "$re" "${opts[@]}" -- "$pathname" |
        wc -l | tr -d "[:blank:]"
    done | grep -v ":0$"' bash "$user_regex" "${user_options[@]}" -- {} + |
sort -t : -k2,2n

This finds all regular files with a filename suffix of .md in the current directory or below. For batches of such files, a bash script is executed which takes a user-supplied extended regular expression ($user_regex), some user-supplied options for the grep command (user_options, an array), along with the batch of pathnames.

The in-line script picks out the regular expression and the user options and then proceeds to loop over the found files, running grep on each and counting the number of lines that are returned.
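To look at the argument handling in isolation, here is a small sketch of the same parsing loop without find or grep. The function name parse_demo and the file names a.md and b.md are made up for the illustration. The script only reports what it picked out of its argument list.

```shell
# Hypothetical helper: parse_demo REGEX [OPTION...] -- PATHNAME...
parse_demo() {
    bash -c '
        re=$1; shift                 # first argument: the regular expression
        opts=()
        while [[ $1 != "--" ]]; do   # everything up to "--" is an option
            opts+=( "$1" )
            shift
        done
        shift                        # drop the "--" delimiter itself
        # whatever is left in "$@" are the pathnames
        printf "re=%s opts=%s files=%s\n" "$re" "${#opts[@]}" "$#"
    ' bash "$@"
}

parsed=$(parse_demo 'foo|bar' --ignore-case -- a.md b.md)
printf '%s\n' "$parsed"

# With no options at all, the loop body simply never runs.
parsed_empty=$(parse_demo 'foo' -- a.md)
printf '%s\n' "$parsed_empty"
```

Because each option arrives as its own positional parameter, quoting is preserved and an empty option list needs no special casing.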

The pathname of each file is outputted with this number at the end, after a : character.

Output lines indicating no matches are weeded out, and the overall result is sorted numerically on the count.

Due to the way pathnames are being treated by this code, it will not support pathnames containing newlines or colons. The user_options array also can't contain a lone double dash.
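If you want to try the approach on throwaway data, the following sketch wraps the command in a function and runs it on three small files in a temporary directory. The function name count_matches, its argument order, and the file contents are assumptions made for this example only. The options are taken from the function's own "$@" rather than from an array, which behaves the same way.

```shell
# Hypothetical wrapper: count_matches DIR REGEX [GREP_OPTION...]
count_matches() {
    dir=$1
    re=$2
    shift 2
    find "$dir" -name '*.md' -type f -exec bash -c '
        re=$1; shift
        opts=()
        while [[ $1 != "--" ]]; do
            opts+=( "$1" )
            shift
        done; shift
        for pathname do
            printf "%s:" "$pathname"
            grep -o -E -e "$re" "${opts[@]}" -- "$pathname" |
            wc -l | tr -d "[:blank:]"
        done | grep -v ":0$"' bash "$re" "$@" -- {} + |
    sort -t : -k2,2n
}

# Throwaway test data: a.md has three case-insensitive matches,
# b.md has one, and c.md has none.
demo_dir=$(mktemp -d)
printf 'foo\nFoo foo\n' > "$demo_dir/a.md"
printf 'foo\n'          > "$demo_dir/b.md"
printf 'bar\n'          > "$demo_dir/c.md"

result=$(cd "$demo_dir" && count_matches . 'foo' --ignore-case)
printf '%s\n' "$result"

rm -rf "$demo_dir"
```

With this data, b.md should be listed before a.md, and c.md should not appear at all.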

Kusalananda
-1
bash -c 'grep ${grepOptions+$grepOptions}  ....'
  • grepOptions is a scalar shell variable holding the options to grep, space-separated.
  • This makes use of ${var+alternative}: if the variable is set, its value is used; if it is unset or empty, the unquoted expansion yields nothing and disappears during word splitting.
  • It relies on the fact that scalar shell variables can be exported.
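A minimal sketch of this, assuming you control the contents of grepOptions (the unquoted expansion is split on whitespace and also subject to filename globbing, so an option containing a space or a wildcard character would not survive). The input text and option values are placeholders.

```shell
# Options stored as one space-separated string and exported.
grepOptions='--ignore-case --count'
export grepOptions

# The child shell splits the unquoted expansion into separate options.
with_opts=$(printf 'Foo\nfoo\nbar\n' |
    bash -c 'grep ${grepOptions+$grepOptions} -e foo')
printf '%s\n' "$with_opts"

# With the variable unset, the expansion vanishes and grep
# runs with no extra options.
unset grepOptions
without_opts=$(printf 'Foo\nfoo\nbar\n' |
    bash -c 'grep ${grepOptions+$grepOptions} -e foo')
printf '%s\n' "$without_opts"
```

The first call counts the case-insensitive matching lines, while the second prints only the line that matches foo exactly.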
guest_7