11

Hopefully this question is not too generic. I am very new to shell scripting and I come from a computer architecture/non-scripting programming background. I have noticed that the scripts at my work are rarely written with a subshell wrapped around the entire script. In the scripts I write, I wrap the body in a subshell whenever I can, since it keeps my script from messing with other scripts that call mine (just in case). Is this not common practice because of some overhead associated with the approach? I am having a hard time finding an answer online.

Example:

#!/bin/bash
( #Start of subshell
echo "Some stuff here"
) #End of subshell
  • How are those other scripts calling your script? E.g., are they using source, or are they executing your script? – thrig May 19 '16 at 21:32
  • Coming from a programming background, there is a post on this site that should be mandatory introductory reading for shell scripting; it covers the conceptual differences between bash and, e.g., C. The true answer to this "overhead" question is really a non-answer: if you're worried about performance overhead, you shouldn't be using a shell script. – Wildcard May 19 '16 at 21:36
  • Could you please clarify what you mean by "a subshell around the entire script"? somefunction() ( ... ) (notice the parens instead of curlies) to specify that somefunction should always create a subshell is not uncommon, however I don't think there's any need to enclose actual scripts in parentheses. – Petr Skocik May 19 '16 at 21:36
  • @thrig That's a good point. ( source somescript ) tends to win some milliseconds over bash somescript, and it makes sense where you want the same level of isolation that bash somescript offers (see the sketch after these comments). – Petr Skocik May 19 '16 at 21:40
  • @PSkocik Added an example – LinuxLearner May 19 '16 at 23:00
  • @LinuxLearner I've never seen that before. I imagine it's only useful if you want to source that script with source somescript and achieve the same effect as ( source somescript ). Apart from that, it'll very slightly slow down classical execution of the script (as opposed to sourcing it) and may make things a little confusing. I think it's a questionable pattern. – Petr Skocik May 19 '16 at 23:07
  • Take a look also at process substitution vs pipeline and python vs shell loops if you're curious about performance in shells in general. – Sergiy Kolodyazhnyy Oct 31 '18 at 00:33
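To make the distinction discussed in these comments concrete, here is a minimal sketch (the file somescript and the variable FOO are made up for illustration) of what each calling style can change in the caller's environment:

# somescript (hypothetical): just sets a variable
FOO=changed

# In the calling shell:
FOO=original
source ./somescript      # runs in the current shell: FOO is now "changed"
FOO=original
bash ./somescript        # runs in a child process: caller's FOO stays "original"
FOO=original
( source ./somescript )  # runs in a forked subshell: same isolation as above,
                         # but skips the exec of a new bash binary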

2 Answers

11

Subshells do have overhead.

On my system, the minimal fork-exec cost (running a program from disk when the file isn't cold) is about 2 ms, and the minimal fork cost is about 1 ms.

With subshells, you pay only the forking cost, as no file needs to be exec'd. As long as the number of subshells stays reasonably low, 1 ms is quite negligible in human-facing programs. I believe humans can't notice anything that happens faster than 50 ms, and that's how long modern scripting-language interpreters tend to take just to start (I'm talking Python and Ruby in rvm here), with recent Node.js taking around 100 ms.
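If you want to see the two costs separately, here is a rough sketch (assuming /bin/true is where the external true binary lives on your system):

time for((i=0;i<1000;i++)); do /bin/true; done   # fork + exec per iteration
time for((i=0;i<1000;i++)); do (true); done      # fork only: a subshell running a builtin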

However, the cost does add up in loops, and then you might want to replace, for example, the rather common backtick or $() pattern, where you return something from a function by printing it to stdout for the parent shell to capture, with bashisms like printf -v (or use a fast external program to process the whole batch).
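As a sketch of that replacement (the function and variable names here are invented for illustration):

# Subshell-per-call pattern: every $(...) forks
get_greeting() { echo "hello"; }
greeting=$(get_greeting)

# printf -v pattern: the function assigns in the current shell, no fork
set_greeting() { printf -v greeting '%s' "hello"; }
set_greeting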

The bash-completion package specifically avoids this subshell cost by returning values via passed variable names, using a technique described at http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
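As a minimal sketch of the general idea (not necessarily the exact technique at that link), using a bash 4.3+ nameref; set_result and myvar are invented names:

# Caller passes the *name* of the variable to fill (bash >= 4.3)
set_result() {
    local -n ref="$1"   # ref becomes an alias for the caller's variable
    ref="some value"    # assigns to the caller's variable, no subshell
}
set_result myvar
echo "$myvar"           # prints: some value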


Comparing

time for((i=0;i<10000;i++)); do echo "$(echo hello)"; done >/dev/null 

with

time for((i=0;i<10000;i++)); do echo hello; done >/dev/null 

should give you a good estimate of your system's forking overhead.
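Dividing the difference between the two wall-clock times by 10000 gives an approximate per-fork cost on your machine.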

Petr Skocik
  • Thank you. This is actually really useful. I am writing scripts on old laptops. If I am writing a script that is called multiple times, I'll avoid using subshells. Even if the cost is negligible, I'm trying to develop good coding practices from the start. – LinuxLearner May 19 '16 at 23:12
  • If you are writing scripts for old laptops, give Perl 5 a real look. Perl will work on almost any old system, and if you stick to 5.008 or 5.010 syntax you have access to massively more speed and power than shell. I did the shell route for too many years, for good reasons, but in the end, due in part to always needing to get function returns via subshells, and the lack of complex data structures, I decided that if I need either consistently, I will use Perl. The forking cost is massive when you hit real looping; it's not to be underestimated. Perl is roughly 2x faster. – Lizardx Oct 30 '18 at 18:47
  • I recently replaced some (e.g.) $(echo $text | cut ...) calls that extract substrings from text with ${text##substr} calls, and it dramatically reduced execution time. – user208145 Oct 31 '18 at 00:30
  • Thanks for demonstrating that benchmarking with the time command. Really eye-opening! – jmrah Jun 27 '20 at 13:23
2

Running the excellent code provided by PSkocik on my system showed a negligible difference.

However, this example really hits home: native parameter expansion vs a command substitution in a subshell:

MyPath="path/name.ext"
# this takes forever: each iteration forks a subshell and execs basename
time for((i=0;i<10000;i++)); do echo "$(basename "${MyPath}")"; done >/dev/null
# this takes over 100x less time: pure parameter expansion, no fork
time for((i=0;i<10000;i++)); do echo "${MyPath##*/}"; done >/dev/null
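For the same reason, the dirname analogue can use parameter expansion too. A sketch (note the expansion behaves differently from dirname when the path contains no slash):

time for((i=0;i<10000;i++)); do echo "$(dirname "${MyPath}")"; done >/dev/null
time for((i=0;i<10000;i++)); do echo "${MyPath%/*}"; done >/dev/null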
Spioter