
I'm creating PowerShell and Bash scripts to standardise our usage of the former's Get-ChildItem and Select-String and the latter's grep.

As part of the Bash script, I'm taking command-line arguments, parsing the comma-delimited values for the filename includes (plural), and trying to pass them to grep's --include= option, but I'm encountering all sorts of difficulties.

Initially, I was trying to use brace expansion, but I abandoned this because (1) I couldn't get it to work and (2) I read that this technically isn't supported by grep and the proper solution is to use multiple includes anyway.

Now I'm trying to use multiple includes, which I have managed to get working, but only if the value doesn't contain a space. If it does, the script does nothing, presumably because the values aren't quoted. I haven't been able to get a quoted version working, even though copying and pasting the output of $grepstring into the shell works fine.

Here's a simplified version of the script:

#!/bin/bash

include="$1"

if [[ $include == *","* ]]; then
    IFS=',' read -r -a includearray <<< "$include"

    includemulti=""

    firstloop="yes"

    for element in "${includearray[@]}"
    do
        # Trim leading and trailing whitespace
        element="${element## }"
        element="${element%% }"

        if [[ "$firstloop" == "yes" ]]; then
            firstloop="no"
            includemulti+="--include=$element"
            # includemulti+="--include=\"$element\""
            # includemulti+="--include='"$element"'"
            # includemulti+='--include="'$element'"'
            # includemulti+='--include="'"$element"'"'
            # includemulti+="--include='$element'"
        else
            includemulti+=" --include=$element"
            # includemulti+=" --include=\"$element\""
            # includemulti+=" --include='"$element"'"
            # includemulti+=' --include="'$element'"'
            # includemulti+=' --include="'"$element"'"'
            # includemulti+=" --include='$element'"
        fi
    done

    grep -ERins $includemulti "<pattern>" "<path>"

    grepstring="grep -ERins $includemulti \"<pattern>\" \"<path>\""
    echo $grepstring
else
    grep -ERins --include="$include" "<pattern>" "<path>"
fi

Does work:

bash ~/test.sh 'Hello*.txt, *.sh'

bash ~/test.sh 'Hello W*.txt'

Does not work:

bash ~/test.sh 'Hello W*.txt, *.sh'

I'm starting to wonder if it's just easier to call grep multiple times with one include for each one...

  • "I was trying to use brace expansion, but I abandoned this because […] I read that this technically isn't supported by grep" - it's nothing to do with grep. A suitable shell will expand a brace pattern and supply the result as a set of arguments to whatever command is present. – Chris Davies Jul 03 '23 at 17:10
  • Your includemulti variable splits, hence the issue. – annahri Jul 03 '23 at 20:55
  • see also How can we run a command stored in a variable? (which also contains the case where it's just the command arguments and explains why it happens) – ilkkachu Jul 05 '23 at 12:13
  • you can brace expand e.g. --include={foo,bar} into --include=foo --include=bar. just that combining that with any sort of a variable is difficult to impossible in Bash (but maybe easier in other shells, though possibly still fishy) – ilkkachu Jul 05 '23 at 12:15

2 Answers


Analysis

When your input is 'Hello W*.txt, *.sh', the unquoted $includemulti expansion is split on spaces, so it becomes three words:

  • --include=Hello
  • W*.txt
  • --include=*.sh

If you add set -x in your script before the grep command you'll see exactly how it gets executed and confirm what I said:

+ grep -ERins --include=Hello 'W*.txt' '--include=*.sh' <pattern> <path>

Even if you change the includemulti+= line and add quotes around the elements:

includemulti+=" --include=\"$element\""

It won't help, because bash will still use the spaces as word delimiters:

+ grep -ERins '--include="Hello' 'W*.txt"' '--include="*.sh"' <pattern> <path>
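The splitting can be reproduced without grep at all. The following is only a sketch: count_args is a hypothetical helper (not part of the original script) that prints how many arguments it received, and set -f is used so the glob characters are not expanded against the current directory.

```shell
#!/bin/bash
# count_args is a hypothetical helper for illustration: it prints its argument count.
count_args() { echo "$#"; }

includemulti='--include=Hello W*.txt --include=*.sh'

set -f                                      # turn off globbing for the demo
unquoted=$(count_args $includemulti)        # unquoted: split on every space
quoted=$(count_args "$includemulti")        # quoted: the whole string is one word
set +f

# An array keeps each element as exactly one word, spaces included.
includearray=('--include=Hello W*.txt' '--include=*.sh')
as_array=$(count_args "${includearray[@]}")

echo "unquoted=$unquoted quoted=$quoted array=$as_array"
```

Neither the unquoted string nor the quoted string gives grep the two arguments it needs; only the array expansion does.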

Solution 1

One possible solution that requires fewer changes to your script would be to add the quotes around the elements and put the eval builtin before your grep command. From the bash man pages:

eval [arg ...]

The args are read and concatenated together into a single command. This command is then read and executed by the shell, and its exit status is returned as the value of eval. If there are no args, or only null arguments, eval returns 0.

So if you add eval before your grep command, effectively it would be like running:

bash -c 'grep -ERins --include="Hello W*.txt" --include="*.sh" <pattern> <path>'

And with set -x before the grep command you'll see two lines, the second one is the one actually getting executed:

+ eval grep -ERins '--include="Hello' 'W*.txt"' '--include="*.sh"' <pattern> <path>
++ grep -ERins '--include=Hello W*.txt' '--include=*.sh' <pattern> <path>
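Because eval re-parses the string as shell code, an element that itself contains a double quote or a command substitution would be interpreted rather than passed through. One possible way to harden the string-building approach, shown here only as a sketch, is to let bash's printf '%q' do the escaping. The count_args helper and the elements array are assumptions made for the demonstration.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Stand-in for the parsed, trimmed include values.
elements=('Hello W*.txt' '*.sh')

includemulti=""
for element in "${elements[@]}"; do
    # %q escapes the value so that eval reads it back as a single word
    includemulti+=" --include=$(printf '%q' "$element")"
done

# eval re-parses the string; each escaped element arrives as one argument
nargs=$(eval "count_args $includemulti")
echo "arguments seen after eval: $nargs"
```

In the real script, count_args would be replaced by the grep invocation.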

Solution 2

This is the more elegant solution. Instead of your loop, you modify the includearray array variable:

# Remove leading space from every element in the array
includearray=("${includearray[@]## }")
# Remove trailing space from every element in the array
includearray=("${includearray[@]%% }")
# Add --include= as the prefix of every element in the array
includearray=("${includearray[@]/#/--include=}")

And then your grep command will look as follows:

grep -ERins "${includearray[@]}" "<pattern>" "<path>"

When you do it that way, you don't need to surround your elements with quotes, since every element in the includearray array will be treated as a single word (regardless of any spaces it has).
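To see what the three array transformations produce, here is a small sketch that runs them on the failing input from the question and prints each resulting element in brackets. The sample value is taken from the question; nothing is passed to grep.

```shell
#!/bin/bash
include='Hello W*.txt, *.sh'

# Split on commas only; spaces stay inside the elements
IFS=',' read -r -a includearray <<< "$include"

includearray=("${includearray[@]## }")             # drop one leading space
includearray=("${includearray[@]%% }")             # drop one trailing space
includearray=("${includearray[@]/#/--include=}")   # prefix every element

# One bracketed line per element makes the word boundaries visible
printf '[%s]\n' "${includearray[@]}"
```

This prints [--include=Hello W*.txt] and [--include=*.sh] on separate lines, which shows that the space stays inside the first element.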

So your final code is:

#!/bin/bash

include="$1"

if [[ $include == *","* ]]; then
    IFS=',' read -r -a includearray <<< "$include"

    # Remove leading space from every element in the array
    includearray=("${includearray[@]## }")
    # Remove trailing space from every element in the array
    includearray=("${includearray[@]%% }")
    # Add --include= as the prefix of every element in the array
    includearray=("${includearray[@]/#/--include=}")

    grep -ERins "${includearray[@]}" "<pattern>" "<path>"
else
    grep -ERins --include="$include" "<pattern>" "<path>"
fi

aviro

Rather than building a string and applying eval to it, which is prone to unexpected errors due to quoting, whitespace, and other issues, you can use an array to build the set of arguments to grep:

#!/bin/bash
#
includesList="$1"
IFS=, read -ra includes <<<"$includesList"
# echo "! includesList=$includesList, includes=(${includes[@]}) !" >&2
[ 0 -eq "${#includes[@]}" ] && { echo "ERROR: Missing includes" >&2; exit 1; }

args=()
for include in "${includes[@]}"
do
    include="${include## }"
    include="${include%% }"
    args+=('--include' "$include")
done

echo "! args=(${args[@]}) !" >&2

grep -EIins "${args[@]}" "<REpattern>" "<path>"

Usage:

chmod a+rx code

./code 'trick'
./code 'trick,house,truck.sh'
./code 'Hello W*.txt, *.sh'

If you uncomment the two debugging echo statements you may be initially confused by the lack of quotes. The values themselves do not contain quotes to mark them out, so with your last example you'll get this:

! includesList=Hello W*.txt, *.sh, includes=(Hello W*.txt  *.sh) !
! args=(--include Hello W*.txt --include *.sh) !

Because it's a simplistic debugging statement, it's not possible to see visually which of the args values are which. There are actually four array elements: --include twice, along with Hello W*.txt and *.sh. A more sophisticated "print array" routine could use a printf '%q' approach to output values suitably quoted, but I felt that was overkill here:

{ printf '! args=('; printf "'%q' " "${args[@]}"; printf ') !\n'; } >&2
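As an end-to-end check of the array approach, the sketch below builds a throwaway directory and searches it recursively. It assumes a grep that supports -R and --include (GNU grep does). The file names and the search word needle are made up for the demonstration.

```shell
#!/bin/bash
# Assumption: grep supports -R and --include (GNU grep does).
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Three sample files; only the first two should match the include globs
echo 'needle' > "$tmpdir/Hello World.txt"
echo 'needle' > "$tmpdir/run.sh"
echo 'needle' > "$tmpdir/notes.md"

IFS=, read -ra includes <<< 'Hello W*.txt, *.sh'
args=()
for include in "${includes[@]}"; do
    include="${include## }"
    include="${include%% }"
    args+=(--include "$include")
done

# -l lists only the names of files containing a match
matches=$(grep -ERils "${args[@]}" 'needle' "$tmpdir" | sort)
echo "$matches"
```

Only Hello World.txt and run.sh should be listed; notes.md is filtered out by the includes.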
Chris Davies
  • Thanks, roaima. I've copy and pasted your code, and I get the same echo output, but, again, the actual grep still doesn't work / output anything, unfortunately. – mythofechelon Jul 05 '23 at 10:33
  • @mythofechelon I'm sorry to hear that. It works for me with my examples. You did fix up <REpattern> (previously <pattern>) and <path> for the grep itself...? – Chris Davies Jul 05 '23 at 11:53