50

I have the code

file="JetConst_reco_allconst_4j2t.png"
if [[ $file == *_gen_* ]];
then
    echo "True"
else
    echo "False"
fi

I test if file contains "gen". The output is "False". Nice!

The problem is when I substitute "gen" with a variable testseq:

file="JetConst_reco_allconst_4j2t.png"
testseq="gen"
if [[ $file == *_$testseq_* ]];
then
    echo "True"
else
    echo "False"
fi

Now the output is "True". How can that be, and how do I fix it?

Viesturs

4 Answers

39

Use the =~ operator to make regular expression comparisons:

#!/bin/bash
file="JetConst_reco_allconst_4j2t.png"
testseq="gen"
if [[ $file =~ $testseq ]];
then
    echo "True"
else
    echo "False"
fi

This tests whether the value of $file contains a match for $testseq.

user@host:~$ ./string.sh
False

If I change testseq="Const":

user@host:~$ ./string.sh
True

But be careful about what you put in $testseq. If its value contains regex metacharacters (such as [0-9]), it is interpreted as a regular expression rather than as a literal string, so it may match more than you intend.
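A minimal sketch of that caveat, assuming bash 3.2 or later (where quoting the right-hand side of =~ makes it match literally). The value of testseq and the two result variables are made up for illustration:

```shell
#!/bin/bash
file="JetConst_reco_allconst_4j2t.png"
testseq="[0-9]"   # hypothetical value that looks like a regex character class

# Unquoted: $testseq is interpreted as a regular expression,
# so any digit in $file produces a match.
if [[ $file =~ $testseq ]]; then regex_result="True"; else regex_result="False"; fi

# Quoted: "$testseq" is matched as a literal string,
# and the file name does not contain the text "[0-9]".
if [[ $file =~ "$testseq" ]]; then literal_result="True"; else literal_result="False"; fi

echo "regex:   $regex_result"
echo "literal: $literal_result"
```

Here the unquoted test reports True because the file name contains digits, while the quoted test reports False.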

Zebiano
34

You need to interpolate the $testseq variable in one of the following ways:

  • $file == *_"$testseq"_* (here the value of $testseq is treated as a fixed string)

  • $file == *_${testseq}_* (here the value of $testseq is treated as a pattern).

Otherwise, the _ immediately after the variable's name is taken as part of the name (it is a valid character in a variable name).
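The difference between the two forms only shows up when the value contains glob characters. A small sketch, assuming bash and a made-up value for testseq (the result variable names are also hypothetical):

```shell
#!/bin/bash
file="JetConst_reco_allconst_4j2t.png"
testseq="r?co"    # hypothetical value containing a glob character

# Quoted expansion: the value is compared as a fixed string, so "?" is literal.
if [[ $file == *_"$testseq"_* ]]; then fixed="True"; else fixed="False"; fi

# Braced, unquoted expansion: the value is used as a pattern, so "?" matches "e".
if [[ $file == *_${testseq}_* ]]; then pattern="True"; else pattern="False"; fi

echo "fixed string: $fixed"
echo "pattern:      $pattern"
```

The fixed-string test reports False (the file name has no literal "_r?co_"), while the pattern test reports True because "_r?co_" matches "_reco_".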

Kusalananda
24
file="JetConst_reco_allconst_4j2t.png"
testseq="gen"

case "$file" in
    *_"$testseq"_*) echo 'True'  ;;
    *)              echo 'False'
esac

Using case ... esac is one of the simplest ways to perform a pattern match in a portable way. It works like a "switch" statement in other languages (bash, zsh, and ksh93 also allow you to do fall-through in various incompatible ways). The patterns used are the standard file name globbing patterns.

The issue you are having is due to the fact that _ is a valid character in a variable name. The shell will thus see *_$testseq_* as "*_ followed by the value of the variable $testseq_ and an *". The variable $testseq_ is undefined, so it will be expanded to an empty string, and you end up with *_*, which obviously matches the $file value that you have. You may expect to get True as long as the filename in $file contains at least one underscore.
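To see what the shell actually builds from that pattern, one can expand it inside double quotes and print it. The sketch below also shows, as an optional safeguard not mentioned above, that the nounset option (set -u) turns the silent empty expansion into an error. The variable names expanded and nounset are illustrative only:

```shell
testseq="gen"
unset testseq_    # make sure no variable of this name exists

# The shell reads the longest valid name after "$", which is testseq_ here,
# so the unset variable expands to an empty string.
expanded="*_$testseq_*"
echo "$expanded"

# With nounset enabled, expanding the unset variable is an error,
# so the mistake is reported instead of silently matching.
# The subshell keeps the option and the failure contained.
if ( set -u; : "$testseq_" ) 2>/dev/null; then
    nounset="no error"
else
    nounset="error"
fi
echo "$nounset"
```

The first echo prints *_*, which is the pattern the original test ended up using.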

To properly delimit the name of the variable, use "..." around the expansion: *_"$testseq"_*. This uses the value of the variable as a string. If you want to use the value of the variable as a pattern, use *_${testseq}_* instead.

Another quick fix is to include the underscores in the value of $testseq:

testseq="_gen_"

and then just use *"$testseq"* as the pattern (for a string comparison).
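Put together with the case statement, that variant might look like the following sketch (the result variable is only there for illustration):

```shell
file="JetConst_reco_allconst_4j2t.png"
testseq="_gen_"   # underscores are now part of the value

case "$file" in
    *"$testseq"*) result='True'  ;;   # quoted, so the value is a fixed string
    *)            result='False' ;;
esac
echo "$result"
```

Since no variable name is followed directly by an underscore any more, the original ambiguity cannot occur.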

Kusalananda
  • So the shell will be looking for a variable $testseq_ and not find it and substitute it with an empty string. – Viesturs Jun 13 '17 at 14:39
  • 1
    @Viesturs That is the heart of the issue, yes. – Kusalananda Jun 13 '17 at 14:39
  • 1
    For a substring search it should be *"$testseq"* for case like for [[...]] (except for zsh unless you enable globsubst) – Stéphane Chazelas Mar 28 '19 at 11:05
  • Simpler than [ "${str##*substr*}" ] || echo True ? –  Nov 28 '19 at 20:18
  • 1
    @Isaac In terms of reading and understanding what's happening, yes. It's also easy to extend one test with more test cases without getting an "if-then-elif-then-elif" spaghetti. Although testing a single string the way you show (whether a string disappears in a substitution) is shorter. – Kusalananda Nov 28 '19 at 20:54
6

For a portable way to test whether a string contains a substring, use:

file="JetConst_reco_allconst_4j2t.png";       testseq="gen"

[ "${file##*$testseq*}" ] || echo True Substring is present

Or use "${file##*"$testseq"*}" to avoid interpreting glob characters in the value of testseq.
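The same idea can be wrapped in a small function. This is only a sketch: the name contains is hypothetical, and it compares the stripped result against the original string rather than testing for emptiness, so that an empty first argument is not reported as containing a non-empty substring:

```shell
# contains STRING SUBSTRING
# Succeeds if SUBSTRING occurs in STRING (hypothetical helper).
contains() {
    # If SUBSTRING is present, removing the longest prefix that matches
    # *SUBSTRING* strips the whole string, so the result differs from $1.
    # The inner quotes keep glob characters in $2 literal.
    # Edge case: two empty arguments are reported as "not contained".
    [ "${1##*"$2"*}" != "$1" ]
}

file="JetConst_reco_allconst_4j2t.png"
if contains "$file" "gen";  then echo "gen: True";  else echo "gen: False";  fi
if contains "$file" "reco"; then echo "reco: True"; else echo "reco: False"; fi
```

With the file name above, the first test prints "gen: False" and the second prints "reco: True".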

  • You'd need something like [ "${file##$testseq}" != "$file" ] because in dash that is Remove Largest Prefix Pattern. – Noel Grandin Feb 26 '20 at 10:24
  • No, @NoelGrandin there is no change on most shells (including dash), the Largest Prefix Pattern will be the whole string if the variable subpattern ($testseq) value is contained inside the $file value. Try: dash -c 'file="JetConst_reco_allconst_4j2t.png"; testseq="reco"; echo "=${file##*"$testseq"*}="' to confirm that dash will remove the whole string. –  Feb 26 '20 at 18:53