
This question focuses on POSIX-compliant shell scripts.

It is common to increment a variable by:

i=3
: $(( i += 1 ))
echo "return code = $?"        # return code = 0
echo "i = $i"                  # i = 4

Is the command $(( i += 1 )) called the "side-effect" of :?

(I read somewhere that : is equivalent to true. I tried replacing : with true or false, and it still works. With false, the return code is 1, as expected.)

Why is the value of $i incremented successfully, while a plain assignment in the same "side-effect" position does not work?

a='four'
: a='five'
echo "return code = $?"        # return code = 0
echo "a = $a"                  # a = four

Sometimes I see scripts that seem to run multiple commands within one command. Most often, people use this structure to set the IFS variable, as in IFS='' read -r REPLY. Is this the same structure, and is it also called a "side-effect"? In the following case, however, every assignment takes effect in the current shell:

a=6 b=7 c=8
echo "a = $a"                  # a = 6
echo "b = $b"                  # b = 7
echo "c = $c"                  # c = 8

Is this structure called a "side-effect"? Where is it documented, or where can I learn about it? I cannot find this structure in the POSIX documentation below.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09

terdon
midnite
  • Arithmetic expansion is described in https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04, and it says pretty clearly: "All changes to variables in an arithmetic expression shall be in effect after the arithmetic expansion, as in the parameter expansion "${x=value}"." I don't know if I would call it a side effect - it seems to be the primary and intended effect of the arithmetic expansions you have used. – muru May 27 '23 at 11:51
  • @muru, well, most expansions don't assign new values, and we do a=${b%x} or whatever, and could do i=$(( i+1 )) too. Of course the arithmetic expressions borrow from C, where there's no separate assignment statement, just the operators. If one comes from there, it's natural that way, if from somewhere else, it might be surprising. Anyway, the assignments are side effects in the sense that it's not just about the evaluated value there. (i++ being the worst case, of course) – ilkkachu May 27 '23 at 11:58
  • @ilkkachu hmm, no, I wouldn't call the change in value in the case in the question a side effect though, it's using +=. How can assignment to the target of an assignment operator be a side effect? If it were something like, say, j = ++i or something, sure, I'd say an assignment to i is a side effect of the whole expression, but not for operators like +=. – muru May 27 '23 at 13:12
  • @muru, yes, usually when one uses =, the assignment is considered the main effect. But all of =, += and ++ evaluate to some value as expressions too, in C and in shell arithmetic. In that sense = is no different. f(a = b + 3) is as valid as f(a++), even though the first would likely gather more disapproving comments... – ilkkachu May 27 '23 at 14:59

2 Answers


The : is called the null command. You can find its documentation in man bash:

: [arguments]

No effect; the command does nothing beyond expanding arguments and performing any specified redirections. The return status is zero.

Or, if you are running bash, in the output of help :, which reads:

$ help :
:: :
    Null command.

    No effect; the command does nothing.

    Exit Status:
    Always succeeds.

And it is in the POSIX specs here:

NAME

colon - null utility

SYNOPSIS

: [argument...]

DESCRIPTION

This utility shall only expand command arguments. It is used when a command is needed, as in the then condition of an if command, but nothing is to be done by the command.

Now, what you call the side effect is the result of what is described in the first quote above: "the command does nothing beyond expanding arguments". The : is a command, so anything after that is an argument. The $(( )) is called "arithmetic expansion", documented in man bash as follows:

Arithmetic Expansion

Arithmetic expansion allows the evaluation of an arithmetic expression and the substitution of the result.

The format for arithmetic expansion is:

$((expression))

The expression is treated as if it were within double quotes, but a double quote inside the parentheses is not treated specially. All tokens in the expression undergo parameter and variable expansion, command substitution, and quote removal. The result is treated as the arithmetic expression to be evaluated. Arithmetic expansions may be nested.

[. . .]

So $(( i += 1 )) means "increment i by one and return the result". However, this needs to be evaluated, to be expanded, and that's why you see it used with :. Since : will expand its arguments, : $(( i += 1 )) means i will be incremented by one. If you were to try to run $(( i += 1 )) alone, that would first be expanded to 4 (in your example) and then the shell would try to execute 4 as a command and return an error:

$ $(( i += 1 ))
bash: 4: command not found

But running : $(( i += 1 )) makes the shell do two things: first, it performs the arithmetic expansion, which sets i to 4; then it executes : 4, the null command, which returns no error because : ignores its arguments after they have been expanded:

$ : 4
$ 
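As a minimal sketch of the point above (the variable names i and j are only illustrative), the same expansion can be used once purely for its assignment and once for both its assignment and its value:

```shell
i=3

# The expansion yields 4 and stores it in i; ':' receives the 4 and ignores it.
: "$(( i += 1 ))"
echo "i = $i"            # i = 4

# Here the expanded value is used as well: i becomes 5 and j receives that 5.
j=$(( i += 1 ))
echo "i = $i, j = $j"    # i = 5, j = 5
```

In both cases the change to i happens while the word is being expanded, before the command itself runs.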

All this said, I don't really know why you would want to do this instead of the simpler form, which is also POSIX:

$ i=3
$ i=$(( i + 1 ))
$ echo "$i"
4

Or, in bash, but not in POSIX shells:

$ i=3
$ (( i += 1 ))
$ echo "$i"
4

The next point here is that variable=value command is a different beast, and treated in a different way. The relevant section of the POSIX specs here is 2.9.1 Simple Commands, and the most pertinent parts of that are:

A "simple command" is a sequence of optional variable assignments and redirections, in any sequence, optionally followed by words and redirections, terminated by a control operator.

When a given simple command is required to be executed [. . .], the following expansions, assignments, and redirections shall all be performed from the beginning of the command text to the end:

  1. The words that are recognized as variable assignments or redirections according to Shell Grammar Rules are saved for processing in steps 3 and 4.

  2. The words that are not variable assignments or redirections shall be expanded. If any fields remain following their expansion, the first field shall be considered the command name and remaining fields are the arguments for the command.

  3. Redirections shall be performed as described in Redirection.

  4. Each variable assignment shall be expanded for tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal prior to assigning the value.

and:

Variable assignments shall be performed as follows:

[. . .]

  • If the command name is not a special built-in utility or function, the variable assignments shall be exported for the execution environment of the command [...].

[. . .]

  • If the command name is a special built-in utility, variable assignments shall affect the current execution environment. Unless the set -a option is on (see set), it is unspecified:

  • Whether or not the variables gain the export attribute during the execution of the special built-in utility

  • Whether or not export attributes gained as a result of the variable assignments persist after the completion of the special built-in utility

So when the shell reads a command in order to execute it, it first looks for anything that is a variable assignment (foo=bar) ahead of the command name and assigns the relevant value. When the command is a "normal" command, the assignment only affects the environment of that command, which is why this works:

$ foo=bar sh -c 'echo "foo is $foo"'
foo is bar
$ echo "foo is $foo"
foo is 

The variable assignment worked and was present in the execution environment of the sh -c command, but it is not present in the parent shell which launched the command. This is just part of how commands are handled by the shell, and is a way to set a variable for a single command only without affecting your shell beyond that.
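The same mechanism also works as a temporary override when the variable already has a value in the parent shell. A small sketch, using a hypothetical variable named greeting:

```shell
greeting=hi    # ordinary shell variable in the current shell

# The prefix assignment is exported to this one child process only.
greeting=hello sh -c 'echo "child sees:  $greeting"'    # child sees:  hello

# The current shell still has its own value.
echo "parent sees: $greeting"                            # parent sees: hi
```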

For a special built-in such as :, the POSIX text quoted above says the assignment shall affect the current execution environment; only the handling of the export attribute is unspecified. So in a conforming shell, var=foo : leaves $var set after the command has finished. Not every shell does this in its default mode, though, and different shells behave in different ways:

$ cat foo.sh 
foo=bar :
echo "foo is $foo"
$ awk -F'/' '/^\//{print $NF}' /etc/shells | sort -u | 
  while read shell; do 
    echo "==== $shell ===="; 
    "$shell" foo.sh; 
done
==== bash ====
foo is 
==== dash ====
foo is bar
==== fish ====
foo is 
==== ksh ====
foo is bar
==== mksh ====
foo is bar
==== rbash ====
foo is 
==== sh ====
foo is bar
==== yash ====
foo is bar
==== zsh ====
foo is 

In any case, all this is precisely why you see while IFS= read ... and constructs like that: you don't want to mess with your $IFS variable, so you change it just to run the specific command for which you need it to have a different value.
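A short sketch of that idiom, assuming a POSIX sh in which read is a regular (not special) built-in; the variable names are made up for the example:

```shell
saved_ifs=$IFS

# With the default IFS, read strips the leading blanks from the line.
read -r trimmed <<'EOF'
    indented text
EOF

# With IFS empty for this one command only, the blanks are kept.
IFS= read -r verbatim <<'EOF'
    indented text
EOF

printf 'trimmed:  [%s]\n' "$trimmed"     # trimmed:  [indented text]
printf 'verbatim: [%s]\n' "$verbatim"    # verbatim: [    indented text]

# The prefix assignment did not stick: IFS still has its previous value.
[ "$IFS" = "$saved_ifs" ] && echo "IFS unchanged"
```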

Finally, the reason this works:

a=6 b=7 c=8

is that there is no command name there, only variable assignments, so these variables are indeed assigned in the current shell.
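This is also why : a='five' in the question changes nothing: a word of the form name=value counts as an assignment only before the command name, and after it the word is an ordinary argument. A minimal sketch (the values are arbitrary):

```shell
a='four'

# No command name on this line: the word is an assignment and takes effect.
a='five'
echo "a = $a"    # a = five

# After a command name, the same kind of word is just an argument to ':'.
: a='six'
echo "a = $a"    # a = five

# printf shows what such an argument looks like after quote removal.
printf 'argument: %s\n' a='six'    # argument: a=six
```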

terdon
  • Just as a side note. I prefer : $(( i += 1 )) over i=$(( i + 1 )) because I prefer the variable i only appears once in the statement. It is less likely to make mistakes when scripts and variables become more complicated. – midnite May 27 '23 at 12:24
  • "Whether or not the assignment survives past the command is unspecified.". It is, except for functions. See the link to the spec in the OP. Note it makes a difference there whether the command is a special builtin or not, and as it happens, : is a special builtin and true is not. See also Which is more idiomatic in a bash script: `|| true` or `|| :`? – Stéphane Chazelas May 27 '23 at 14:42
  • @midnite I had an error here. My original test script (foo.sh above) had true. I changed that to : when ilkkachu pointed out another error I had, but like an idiot, didn't actually save the file after making the change, so I thought that all shells behaved the same way with : as well. Turns out they don't, and as Stéphane explained in his answer, POSIX shells and some others apparently do indeed keep the value. See the output for dash, ksh, mksh, yash and sh above. Sorry about that. – terdon May 27 '23 at 20:18

Is the command $(( i += 1 )) called the "side-effect" of :?

No. Incrementing the value of i is the side-effect of evaluating the expression (not a command) $(( i += 1 )).

In other words if i is 3, then the expression $(( i += 1 )) expands to the value 4, but it also has the side-effect of storing 4 back into i.

In your case, you're evaluating the expression only for its side-effect; you don't care about the value it expands to. But it still does expand to something, and you couldn't just write $(( i += 1 )) as a command because that would error with 4: command not found. So we need to throw that value away somehow. That's when we turn to the : command, which does nothing and ignores its arguments. : $(( i += 1 )) expands to : 4, which is a command that successfully does nothing. But as a side effect of its arguments having been expanded, i now contains 4.

hobbs