Is the behavior of bash -c "" documented?

Question

It's well known that when running bash -c "COMMAND" (at least in the versions commonly found on Linux) with a single command that contains no metacharacters (other than space, tab or newline), the bash -c process does not fork. Instead, as an optimization, it replaces itself by executing COMMAND directly with the execve system call, so only one process results.

$ pid=$$; bash -c "pstree -p $pid"
bash(5285)───pstree(14314)

If there's any metacharacter (such as redirection) or more than one command (which requires a metacharacter anyway), bash will fork for every command it executes.

$ pid=$$; bash -c ":; pstree -p $pid"
bash(5285)───bash(28769)───pstree(28770)
$ pid=$$; bash -c "pstree -p $pid 2>/dev/null"
bash(5285)───bash(14403)───pstree(14404)

Is this an undocumented optimization feature (which means it's not guaranteed), or is it documented somewhere and guaranteed?

Note: I assume that not all versions of bash behave like that, and that on some versions that do, it's just considered an implementation detail and not guaranteed. But I wonder whether there are at least some bash versions that explicitly support this and document the conditions for it. For instance, if there's a single ; character after the command, without any second command, bash will still execve without forking.

$ pid=$$; bash -c "pstree -p $pid ; "
bash(17516)───pstree(17658)
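The checks above can also be scripted without pstree. The following is only a sketch: probe is a hypothetical helper, and sh -c 'echo $$' stands in for pstree as a command that prints its own PID. It compares that PID with the PID of the bash -c process itself. What it prints for each script shape depends on the bash version, so no expected output is shown.

```shell
#!/usr/bin/env bash
# probe SCRIPT (hypothetical helper): run `bash -c SCRIPT` in the background
# and report whether the command inside ended up with the same PID as the
# `bash -c` process. SCRIPT must print exactly one PID: that of the command
# it runs.
probe() {
    local out bash_pid cmd_pid
    out=$(mktemp) || return 1
    bash -c "$1" >"$out" &
    bash_pid=$!            # PID of the `bash -c` process
    wait "$bash_pid"
    read -r cmd_pid <"$out"
    rm -f "$out"
    if [ "$cmd_pid" = "$bash_pid" ]; then
        echo exec          # bash replaced itself with the command
    else
        echo fork          # bash stayed around as the command's parent
    fi
}

# sh -c 'echo $$' prints its own PID; it stands in for pstree here.
cmd='sh -c "echo \$\$"'

# The four script shapes from the examples above.
for script in "$cmd" ":; $cmd" "$cmd 2>/dev/null" "$cmd ; "; do
    printf '%-35s %s\n' "$script" "$(probe "$script")"
done
```

Running this under different bash versions is one way to see which shapes a particular build treats as eligible for the optimization.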

Background to my question

As I mentioned, this behavior is quite well known [1, 2] among experienced bash users, and I've been familiar with it for a long time.

Some days ago I encountered the following comment to Interactive bash shell: Set working directory via commandline options where @dave_thompson_085 wrote:

bash automatically execs (i.e. replaces itself with) the last (or only) command in -c.

I responded that it's only true if there's a single command. But then I wondered: Are there versions of bash where the last command is execed rather than forked, even if there's another command before it? And in general, are there cases where this behavior is guaranteed? Do certain bash versions expose (and elaborate on) this feature outside of the source code?

Additional references

@ConstantineA.B. Maybe reliable is the wrong word; can I trust it will always happen under the conditions I specified in the question? I mean, without me having to explicitly run it with exec (i.e. bash -c "exec <COMMAND>")? — aviro, Jan 10 '24 at 10:58
No, you can't. On my Arch, with bash version 5.2.21, the command bash -c ":; pstree -p $pid" doesn't fork but behaves the same as bash -c "pstree -p $pid". — terdon, Jan 10 '24 at 11:09
Does this answer your question? Why is there no apparent clone or fork in simple bash command and how it's done? — muru, Jan 10 '24 at 11:44
You do? Then it should be obvious that it is not at all standard. What do you mean by trustworthy? Have you looked at the answers there, in particular the one that looks at the bash source code? — muru, Jan 10 '24 at 12:00
Mm, I don't know your context, and how exactly it matters if the shell just replaces itself with the last command or not, but if you want it to do that, I suppose you could just explicitly run exec somecommand..., and if you don't want it to happen, you could always put some extra command after the one you want to make sure is forked. — ilkkachu, Jan 10 '24 at 13:38

Chris Down · Accepted Answer · 2024-01-10T23:14:52.023

Using exec is an implementation detail -- if it's not documented, then the behaviour is not guaranteed in any version. Maybe it's useful to go a little over why we tend not to document these kinds of things based on my experience developing other widely used software (the Linux kernel).

For any widely used piece of software, one generally tries to describe only the features, operation, and standard behaviours. One typically avoids delving into the specifics of internal optimisations for fear of creating downstream dependencies on that behaviour.

As one example from my own field of work, I work on kernel memory management, and we don't seek to document in great detail (for example) exactly how reclaim prioritisation internals work, or exactly when things like LRU scanning are performed. Documenting that would make future decisions on optimisations and other changes much more complicated to make, because now we have to consider if we are breaking some intangible, unknown downstream dependency.

Similarly, in the case of bash or any other complex software system, internal mechanisms like the decision to use exec instead of forking a new process can be subject to change, even at very short notice. Imagine there are new performance optimisations, security considerations, or compatibility issues that suddenly require not using exec. Do we really want to be bound by contract to continue operating in that way? Of course not.

In bash, and most other widely used projects, there's a strong emphasis on maintaining a stable and well-defined external interface that contains only the things deemed necessary to be exposed, allowing the internal implementation to evolve as needed. This approach ensures that the software remains robust, secure, and efficient, without sacrificing the ability to innovate and improve.

So to answer your question, no, this isn't documented, and you shouldn't rely on it. It's also highly unlikely that this or other similar cases would be documented, because adding constraints on internals would make development a lot more rigid.
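If the process layout does matter, the two forms suggested in the comments on the question make the outcome explicit instead of leaving it to the optimisation. The following is a minimal sketch of that idea, not a statement about bash internals: the variable names are made up for illustration, and sh -c 'echo $PPID' is just a stand-in command that reports its parent's PID.

```shell
#!/usr/bin/env bash
# PID of this shell; BASHPID stays correct inside bash subshells, and $$ is
# the fallback when the sketch is run by a shell without BASHPID.
self=${BASHPID:-$$}
out=$(mktemp) || exit 1

# Explicit exec: bash is replaced by sh, so sh's parent is this shell.
bash -c 'exec sh -c "echo \$PPID"' >"$out"
read -r ppid_with_exec <"$out"

# A trailing command: bash has to stay alive to run `:` afterwards, so sh's
# parent is the intermediate bash rather than this shell.
bash -c 'sh -c "echo \$PPID"; :' >"$out"
read -r ppid_with_trailer <"$out"
rm -f "$out"

if [ "$ppid_with_exec" = "$self" ]; then
    echo "explicit exec: no intermediate bash"
fi
if [ "$ppid_with_trailer" != "$self" ]; then
    echo "trailing command: intermediate bash kept"
fi
```

Neither form depends on whether a given bash version happens to skip the fork: exec is a documented builtin, and a shell that still has a command left to run cannot replace itself.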

As it happens, dash has been known to go back and forth on that (skipping the fork for the last command in inline script). ksh used to skip the fork even when there was a trap on EXIT installed which meant it was not run then. — Stéphane Chazelas, Jan 10 '24 at 13:15
Excellent answer, but I can't resist mentioning Hyrum's law ;-) — kostix, Jan 11 '24 at 15:45
@kostix While that's true (and we do try to avoid breaking those cases in kernel), it at least affords more freedom than making everything explicitly ABI :-) — Chris Down, Jan 11 '24 at 22:12
