On what basis does the role of asterisk keep changing?

CASE 1: 
var1=abcd-1234-defg 
echo ${var1#*-*}        # RESULT: 1234-defg

CASE 2: 
stringZ=abcABC123ABCabc 
echo `expr match "$stringZ" '\(abc[A-Z]*.2\)'`  # RESULT: abcABC12

When and how is the role of asterisk decided?

CASE 3:
path_name="/home/bozo/ideas/thoughts.for.today"
echo ${path_name##/*/}  # RESULT : thoughts.for.today

In this case I mistook / for an escape character, i.e., I thought it was escaping the special meaning of *. I was wrong. So how are the roles of these special characters decided, and by whom?

CASE 4: 
var1=abcd--1234-defg 
echo ${var1#*-*}        # RESULT: -1234-defg (I was expecting 1234-defg)

CASE 4 is similar to CASE 1, except that the string contains a double dash (abcd--). I was expecting 1234-defg, but only the part up to the first dash was removed, just as in CASE 1.

This is how I interpreted *-* in CASE 4:

The shell would match everything from the start of var1 until it finds -, -- or ---.

Why is my interpretation in the context of CASE 4 incorrect?

2 Answers

That's because * has a different meaning in your tests.

In case 1, case 3 and case 4, it is used as a shell pattern (glob) wildcard. In case 2, it is a regular expression metacharacter (a quantifier, also known as the Kleene star).

In pattern matching, * matches any string, including the empty string. For example, a* matches any string that starts with a, such as a, aa and ab, but not b.

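In a regular expression, the * quantifier matches zero or more occurrences of the preceding token. For example, a* matches zero or more a characters: the empty string, a, aa, aaa. Because the empty string counts as a match, an unanchored a* also finds a match in ab and in b.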
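
The two meanings can be compared side by side. The following is only a sketch: glob_match and regex_match are hypothetical helper names, and grep -E with an anchored pattern is an assumption used here to test the regex against the whole string (the question used expr match instead).

```shell
# Glob: * stands alone and matches any string, so a* means "starts with a".
glob_match() {    # hypothetical helper name
    case $1 in
        a*) echo yes ;;
        *)  echo no ;;
    esac
}

# Regex: * repeats the preceding token, so ^a*$ means "zero or more a's only".
regex_match() {   # hypothetical helper name
    if printf '%s\n' "$1" | grep -Eq '^a*$'; then echo yes; else echo no; fi
}

for s in a aa ab b ''; do
    printf '%-4s glob a*: %-3s  regex ^a*$: %s\n' \
        "'$s'" "$(glob_match "$s")" "$(regex_match "$s")"
done
```

Here a and aa match both, ab matches only the glob, the empty string matches only the anchored regex, and b matches neither.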

With this in mind, *-* in your case 4 means "any string that contains a -", because it is used as a glob pattern. The interpretation in your question treats the second * as if it repeated the preceding -, which is how it would behave in a regular expression.

So for abcd--1234-defg, the shortest prefix matching *-* is abcd-, and the longest match is the whole string. ${var1#*-*} is a parameter expansion that removes the shortest prefix of $var1 matching *-*, so you get -1234-defg.
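
As a small illustration using the string from case 4, the shortest and longest prefix removals can be compared directly. The variable names shortest, longest and both are chosen here for the example, and the last line shows just one possible way to get the 1234-defg you expected, by writing both dashes into the pattern.

```shell
var1=abcd--1234-defg

# '#' removes the shortest prefix matching *-* : the first * matches "abcd",
# the - matches the first dash, and the trailing * matches nothing.
shortest=${var1#*-*}

# '##' removes the longest prefix matching *-* : the trailing * takes the rest.
longest=${var1##*-*}

# Spelling out both dashes removes everything up to and including "--".
both=${var1#*--}

printf '%s\n' "shortest: $shortest" "longest: $longest" "both: $both"
```

The first line printed is -1234-defg, the second is empty, and the third is 1234-defg.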

cuonglm

There are two things to understand:

  1. In a glob, * matches zero or more of any character

  2. The form ${var1#*-*} removes the shortest match.

Thus, ${var1#*-*} will only remove up to the first dash because that is the shortest match.

For completeness, note ${var1##*-*} would remove the longest match.

Examples

In each case below, the shortest matching prefix is removed:

$ var1=abcd-1234-defg 
$ echo ${var1#*}
abcd-1234-defg
$ echo ${var1#*-}
1234-defg
$ echo ${var1#*-*}
1234-defg
$ echo ${var1#*-*-}
defg
$ echo ${var1#*-*-*}
defg

Contrast the above with the ## case which removes the longest matching prefix:

$ echo ${var1##*-}
defg
$ echo ${var1##*-*}

$ 

Documentation

From man bash:

${parameter#word}
${parameter##word}

Remove matching prefix pattern. The word is expanded to produce a pattern just as in pathname expansion. If the pattern matches the beginning of the value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the # case) or the longest matching pattern (the ## case) deleted. If parameter is @ or *, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list. [Emphasis added.]
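
The part of the quote about @ can be tried out with a short function. This is a sketch that assumes bash; strip_dirs is a hypothetical helper name and the second path is only sample input. According to the quoted text, the same per-element behaviour applies to an array subscripted with @ or *.

```shell
# Hypothetical helper: print each argument with its longest "*/" prefix removed.
# With "$@" as the parameter, the removal is applied to every positional
# parameter in turn, as described in the quoted man page text.
strip_dirs() {
    printf '%s\n' "${@##*/}"
}

strip_dirs /home/bozo/ideas/thoughts.for.today /etc/hosts
```

In bash this prints thoughts.for.today and hosts on separate lines. The pattern */ works here because * in a parameter expansion pattern also matches slashes.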

John1024