
I'm having trouble understanding how Bash manages sub-shell creation and the related scoping issues. I hope someone can bring some coherence to my thoughts on this matter.

The first thing I don't get is the way variables are handled in sub-shells. I thought variables weren't inherited unless they had the export flag set. However, this doesn't seem to be true for sub-shells created by grouping:

$ n=3
$ ( pstree $$ ; echo $n )
bash---bash---pstree
3

If, instead, I create the sub-shell manually everything goes as expected:

$ export ppid=$$
$ bash
$ pstree $ppid
bash---bash---pstree
$ echo $n

Also some grouping doesn't actually create a sub-shell:

$ ( pstree $$ )
bash---pstree

And some grouping that shouldn't, in fact does:

$ pstree $$ &
[1] 1685
$ bash---pstree
{ pstree $$; } &
[1] 1687
$ bash---bash---pstree

This seems a bit messy to me, with a lot of special cases to keep track of. And it gets more messy. Consider process substitution:

$ cat <(pstree $$)
bash-+-bash---pstree
    `-cat

Here, as I understand it, bash executed cat in a sub-process, giving it a FIFO to read from. It then forked a sub-shell, giving it the other end of the FIFO. The sub-shell ran pstree.

Consider now:

$ pstree $$ > >(cat)  # There is some non determinism involved, output may be different
bash---pstree---bash---cat
#or sometimes
bash---pstree---bash

Here bash seems to do something different. It forks a sub-shell, the sub-shell forks another sub-shell, and the situation becomes: bash---bash(1)---bash(2)

bash(1) (which got the write end of the FIFO) execs pstree, and bash(2) (with the read end of the FIFO) runs cat in a sub-process.

So in the first case the sub-process gets executed by a sub-shell of the primary shell. In the second by a shell sub-process of the primary command.

I think the second output (the one without cat) appears because pstree may run before cat has been started.

  • What are the other outputs of pstree $$ > >(cat)? – ctrl-alt-delor Feb 09 '20 at 09:58
  • Note that $$ will not be updated for subshells. Use $BASHPID instead. – Kusalananda Feb 09 '20 at 10:39
  • For a quirk related to > >(...), see this. –  Feb 09 '20 at 13:14
  • Your Q is too broad and touches on too many issues. It also makes some unwarranted assumptions that I'll try to correct: a) variables have to be exported in order to be inherited through exec, not through fork; b) the fact that bash runs subshells in separate processes is an implementation artifact, and bash is free to take liberties with it: for instance, (pwd) is identical to pwd, because bash optimizes (pwd) to (exec pwd); c) there isn't any grand design principle behind the shell language; its operations grew "organically", and they're not always consistent. –  Feb 09 '20 at 13:19

1 Answer


I will attempt to cover most of it.

When you create a sub-shell with (), $(), or <(), bash calls fork(). This creates a new child process that is a copy of the parent, differing in its pid, its ppid, and the return value of fork() (0 in the child; the child's positive pid in the parent; a negative value on error, in which case no child is created). The sub-shell therefore has the same state and the same variables, exported or not.
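The fork behaviour described above can be checked with a small sketch. The variable names (n, in_subshell, after_subshell) are made up for illustration; the point is only that a non-exported variable is visible inside a sub-shell, and that assignments made there stay in the child's copy.

```shell
# n is a plain shell variable; it is never exported.
n=3

# A ( ... ) sub-shell is a fork of the current shell, so it still sees n.
in_subshell=$( ( echo "$n" ) )

# An assignment inside the sub-shell changes the child's copy only.
( n=99 )
after_subshell=$n

echo "sub-shell saw n=$in_subshell"        # prints: sub-shell saw n=3
echo "parent still has n=$after_subshell"  # prints: parent still has n=3
```

This matches the first transcript in the question, where echo $n inside ( ... ) printed 3 without any export.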

When you run bash as a command, the shell calls fork() and then exec("bash") (actually one of the exec variants). This replaces the cloned bash with a new image that starts running from the beginning. Hence non-exported variables are gone, and the config files are re-read.
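A rough way to see the fork-then-exec case is to start a new bash with bash -c and ask it about two variables, one exported and one not. The names demo_plain and demo_exported are hypothetical, and the sketch assumes neither is already present in the environment.

```shell
# demo_plain is a shell variable only; demo_exported is placed in the environment.
demo_plain=3
export demo_exported=4

# bash -c starts a freshly exec'd bash; only the environment survives exec.
# ${var-unset} expands to the word "unset" when var is not set.
seen_plain=$(bash -c 'echo "${demo_plain-unset}"')
seen_exported=$(bash -c 'echo "${demo_exported-unset}"')

echo "demo_plain in new bash: $seen_plain"        # prints: demo_plain in new bash: unset
echo "demo_exported in new bash: $seen_exported"  # prints: demo_exported in new bash: 4
```

This is the same effect as the second transcript in the question, where echo $n printed an empty line in the manually started bash.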

When you use &, bash calls fork() to put the command in the background. It seems to fork one more time than needed, but it is probably using the forked bash to help manage the jobs. (Why not? Forking is cheap-ish.)
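One way to observe that a backgrounded { ...; } group runs in a forked child, without relying on pstree, is to assign a variable inside it. This is only a sketch with made-up names (n, after_bg, after_fg): the same group changes the parent's variable in the foreground, but not when it is followed by &.

```shell
n=3

# Backgrounded group: runs in a forked child, so the assignment is lost.
{ n=99; } &
wait
after_bg=$n

# Foreground group: runs in the current shell, so the assignment sticks.
{ n=99; }
after_fg=$n

echo "after background group: n=$after_bg"  # prints: after background group: n=3
echo "after foreground group: n=$after_fg"  # prints: after foreground group: n=99
```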

For the last case, pstree $$ > >(cat) (this took me longer to work out):
Bash forks, as it always does, to run a new process. Before calling exec it needs to set up the redirections for stdin/out/err (in this case just stdout). To do so it has to run cat in a sub-shell, so it does. Now cat is a child of the new bash, and the new bash is a child of the old one. Next, the new bash calls exec and becomes pstree.

  • Thanks for the answers. Regarding the non-determinism, sometimes I get bash---pstree---bash instead of bash---pstree---bash---cat. In my opinion it is because pstree may run before cat is created. – Jorge Lopez Feb 09 '20 at 10:49
  • Yes, I think your explanation is correct: cat has not started yet. Process creation, and the order in which processes run, is non-deterministic. This does not normally matter. You only see it because you are trying to introspect it. – ctrl-alt-delor Feb 09 '20 at 12:04
  • You are looking at the implementation. You should look at what the effect of the commands is. With Unix the implementation is closely tied to the behaviour, so looking at it can help; however, it is not what matters when trying to understand how to use a system. Top-down vs. bottom-up learning: going from the top (human centred) to the bottom (machine centred) is the better way to learn what a system does and how to use it. Going the other way will help you better understand how it does it. I recommend starting at the top and going down (learn what), then coming back up (learn how). – ctrl-alt-delor Feb 09 '20 at 12:11