
I want to pass multiple patterns to grep, and I have several candidate ways of applying the -e option to each pattern.

The patterns are stored in the array ptrn. I wonder whether blanks in the array elements could be misinterpreted, because -e might not be passed as an argument separate from its pattern.

The possibilities are

Possibility 1

mptrn=$( printf -- ' -e %s' "${ptrn[@]}" )
grep -E "$mptrn" -- "$flnm"

Possibility 2

for i in "${!ptrn[@]}"; do
  ptrn[$i]="-e ${ptrn[$i]}"
done
grep -E "${ptrn[@]}" -- "$flnm"

Possibility 3

eptrn=()
for i in "${!ptrn[@]}"; do
  eptrn+=("-e" "${ptrn[$i]}")
done
grep -E "${eptrn[@]}" -- "$flnm"

What problems could each of these solutions have?

Vera

2 Answers

Let's say ptrn=(" a b" " 333 22 1 ").

Answer 1:

mptrn=$( printf -- ' -e %s' "${ptrn[@]}" )
grep -E "$mptrn" -- "$flnm"

The entire output of printf is assigned to mptrn as a single string, and this format cannot distinguish between array elements once they contain space characters. mptrn will contain " -e  a b -e  333 22 1 " (with both a leading and a trailing space). Because the expansion is quoted, grep receives it as one argument, and because that argument starts with a space rather than a hyphen, grep treats it as the pattern itself. It then searches the given files for that whole string as a single pattern, which is not what you're trying to achieve. To make a similar approach work, you can split on a custom IFS character.

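# Save IFS, remembering whether it was set at all
[ "${IFS+x}" = x ] && OLD_IFS=$IFS
# Use DEL as the separator; this assumes no pattern contains that character
IFS=$(printf '\177')
mptrn=$(printf "-e${IFS}%s${IFS}" "${ptrn[@]}")
# $mptrn is deliberately unquoted so that it is split on IFS; set -f keeps
# the shell from glob-expanding pattern characters such as * or [
set -f
grep -E $mptrn -- "$flnm"
set +f   # assumes globbing was enabled beforehand
if [ "${OLD_IFS+x}" = x ]; then IFS=$OLD_IFS; else unset IFS; fi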

Answer 2:

for i in "${!ptrn[@]}"; do
  ptrn[$i]="-e ${ptrn[$i]}"
done
grep -E "${ptrn[@]}" -- "$flnm"

This answer is close to the desired result, except that each pattern gains an extra leading space, because everything after -e within the same argument is taken as the pattern. The resulting command looks like this:

grep -E "-e  a b" "-e  333 22 1 " -- file

This can be solved by simply removing the space: ptrn[$i]=-e${ptrn[$i]}.
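The effect of that extra space can be checked with a small test. The following is only a sketch: the sample input lines and the array names spaced and attached are made up for illustration, and it assumes no pattern is empty (an empty element would leave a bare -e that swallows the next argument).

```shell
#!/usr/bin/env bash
# Hypothetical sample data: three input lines and two patterns.
ptrn=(" a b" "22 1")
sample=$(printf '%s\n' 'x a b' 'xa b' '333 22 1')

# Variant with "-e " (space included): the space becomes part of each pattern.
spaced=()
for p in "${ptrn[@]}"; do spaced+=("-e $p"); done
spaced_out=$(printf '%s\n' "$sample" | grep -E "${spaced[@]}")

# Variant with "-e" attached directly: the pattern is passed unchanged.
attached=()
for p in "${ptrn[@]}"; do attached+=("-e$p"); done
attached_out=$(printf '%s\n' "$sample" | grep -E "${attached[@]}")

printf 'with space:\n%s\n' "$spaced_out"
printf 'attached:\n%s\n' "$attached_out"
```

With the space, the first pattern becomes two spaces followed by "a b", so the line "x a b" no longer matches; with -e attached, it does.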


Answer 3:

eptrn=()
for i in "${!ptrn[@]}"; do
  eptrn+=("-e" "${ptrn[$i]}")
done
grep -E "${eptrn[@]}" -- "$flnm"

This answer is correct. The resulting grep command will be:

grep -E -e " a b" -e " 333 22 1 " -- file
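
To see how the shell actually splits the arguments in the first and third approaches, one can print them instead of running grep. This is a minimal sketch; show_args is a hypothetical helper introduced only for this illustration, and it prints each argument it receives in square brackets.

```shell
#!/usr/bin/env bash
# show_args (hypothetical helper): print every argument wrapped in [ ].
show_args() { printf '[%s]' "$@"; printf '\n'; }

ptrn=(" a b" " 333 22 1 ")

# Approach 1: the whole printf output becomes one quoted word.
mptrn=$(printf -- ' -e %s' "${ptrn[@]}")
one=$(show_args "$mptrn")

# Approach 3: each -e and each pattern stays a separate array element.
eptrn=()
for p in "${ptrn[@]}"; do
  eptrn+=(-e "$p")
done
three=$(show_args "${eptrn[@]}")

printf '%s\n' "$one" "$three"
# prints:
# [ -e  a b -e  333 22 1 ]
# [-e][ a b][-e][ 333 22 1 ]
```

The first line shows a single argument, while the second shows four, which is what grep needs in order to see two separate patterns.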
don_aman

Combine them into one regex. For example, if you want to match any of the patterns, you can use the regex alternation operator | as an OR.

First, create a join_by function in bash:

join_by () { 
    local d="$1";
    shift;
    printf '%s' "$1";
    shift;
    printf '%s' "${@/#/$d}"
}

This is a fairly simplistic clone of Perl's built-in join function. It uses the first argument as the delimiter and joins all remaining arguments with it, without adding an annoying leading or trailing delimiter.

Then use it like this:

$ p=(a b c d e)
$ re=$(join_by '|' "${p[@]}")
$ echo $re
a|b|c|d|e

You can then use $re with grep -E (alternation requires Extended Regular Expressions, "ERE").

grep -E "$re" filename
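
As a quick sanity check, the joined regex can be compared against the multiple -e form on some sample input. This is only a sketch: the patterns and input lines are invented for illustration, and join_by is repeated so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# join_by as defined above: first argument is the delimiter.
join_by() {
    local d="$1"
    shift
    printf '%s' "$1"
    shift
    printf '%s' "${@/#/$d}"
}

# Hypothetical patterns and input, chosen only for this demonstration.
patterns=("foo" "ba[rz]" "qu+x")
sample=$(printf '%s\n' 'foo' 'bar' 'bat' 'quux' 'qx')

# One alternation regex built from the array.
re=$(join_by '|' "${patterns[@]}")
joined_out=$(printf '%s\n' "$sample" | grep -E "$re")

# The same patterns passed as separate -e options.
eargs=()
for p in "${patterns[@]}"; do eargs+=(-e "$p"); done
multi_out=$(printf '%s\n' "$sample" | grep -E "${eargs[@]}")

printf 'regex: %s\n' "$re"
printf '%s\n' "$joined_out"
```

Both forms select the lines foo, bar and quux here. Note that joining treats every array element as a regex fragment, so an element that itself contains an unescaped | adds further alternatives.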

If you need to do more complicated boolean operations (e.g. a && b && ! (c || d || e)), then use perl or awk. bash is a terrible choice of language for data processing.
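
For instance, the boolean expression mentioned above could be written in awk roughly as follows. This is a sketch with made-up input lines, and the single-letter regexes a to e simply stand in for real patterns.

```shell
#!/usr/bin/env bash
# Hypothetical input lines for the demonstration.
sample=$(printf '%s\n' 'a b' 'a b c' 'b only' 'ab')

# Keep lines that match a AND b, but none of c, d or e.
result=$(printf '%s\n' "$sample" | awk '/a/ && /b/ && !(/c/ || /d/ || /e/)')

printf '%s\n' "$result"
# prints:
# a b
# ab
```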

cas
  • Reading the documentation, it looks like the "-E" option only enables ERE, and one is still required to use -e to define patterns, as it shows -e PATTERN. – Vera Mar 30 '23 at 10:20
  • -e is not required with grep unless you use multiple separate patterns. grep foo filename is exactly the same as grep -e foo filename. Also, grep -e foo -e bar filename is the same as grep -E 'foo|bar' filename. – cas Mar 30 '23 at 10:53
  • And thus mptrn=$(printf '|%s' "${ptrn[@]}") ; mptrn="${pa:1}" ; grep -E "$mptrn" filename should be perfectly acceptable. Correct? – Vera Mar 30 '23 at 12:07
  • close enough, assuming you meant mptrn=${mptrn:1} instead of ${pa:1}. dunno where the $pa variable came from. – cas Mar 31 '23 at 01:07
  • btw, I don't know if they make easy sense to you or not, but I find that the abbreviations you use (ptrn, flnm) make your code much harder to read - vowel-less words are readable but take more cognitive effort. I'd advise using really short variable names (like p, mp, f) for temporary/"throw away" variables and full names (pattern, filename. pluralised if they're array vars) for variables that need to be in use for longer. – cas Mar 31 '23 at 01:14