
For benchmarking, I ran the command:

for i in {1..100000000}; do 
  echo "$i" line >> file
done
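An arithmetic for-loop avoids the problem entirely: the counter is evaluated one value at a time, so bash never materializes the full list. A minimal sketch of the alternative, using a small count and a temporary file in place of the original 100000000 and `file`:

```shell
# Arithmetic for-loop: the counter is computed one step at a time,
# so no 100-million-element word list is ever built in memory.
# n and the temp file are illustrative stand-ins for the original values.
n=1000
out=$(mktemp)
for (( i = 1; i <= n; i++ )); do
  echo "$i line"
done > "$out"
wc -l < "$out"
```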

Bash expanded the braces and stored the list 1 2 3 4 5 6 ... 100000000 in memory.

I assumed this would somehow be deallocated at some point. After all, it's temporary. A couple of days have since passed, and the bash process is still occupying 17.9 GB of memory.

Can I force bash to clear this temporary data? I cannot use unset, because I don't know the variable name (and obviously unset i doesn't help).

Of course, a solution is to close the shell and open a new one.

I also asked about this on the bash mailing list and got a helpful reply from Chet Ramey:

That's not a memory leak. Malloc implementations need not release memory back to the kernel; the bash malloc (and others) do so only under limited circumstances. Memory obtained from the kernel using mmap or sbrk and kept in a cache by a malloc implementation doesn't constitute a leak. A leak is memory for which there is no longer a handle, by the application or by malloc itself.

malloc() is basically a cache between the application and the kernel. It gets to decide when and how to give memory back to the kernel.
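Chet's point is easy to observe from the shell itself. The sketch below (with a deliberately smaller count than the original) samples the shell's resident set size before the expansion, after it, and again after unset; the last two stay close together because the allocator keeps the freed pages cached rather than returning them to the kernel. `ps -o rss=` is assumed to be available (it is on Linux and the BSDs).

```shell
# Resident set size (KB) of this shell, sampled three times.
rss() { ps -o rss= -p "$$"; }

before=$(rss)
for i in {1..1000000}; do :; done   # brace expansion builds the full list
after=$(rss)
unset i                             # frees the variable, not the cached pages
freed=$(rss)

echo "before=$before after=$after after-unset=$freed"
```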

Sebastian
  • Also, for larger lists, use for (( i=1 ; i<=100000000 ; i++ )) ; do – choroba Sep 10 '14 at 11:32
  • Yes @choroba, I learned that (it's linked). Now I'm just curious if it's possible to tell bash to deallocate the variable. – Sebastian Sep 10 '14 at 11:34
  • Smells like a bug. I would report this to the bash maintainers. – phemmer Sep 10 '14 at 12:43
  • I don't believe you have to close the shell, per se, but you might replace it. Maybe try: exec ${0#-} $-? – mikeserv Sep 10 '14 at 14:01
  • This is not an issue with bash, but with how operating systems handle memory allocation. Memory allocated to a process is typically not reclaimed by the OS, but kept by the process to be reused later, rather than constantly returning memory to and reallocating memory from the OS. – chepner Sep 13 '14 at 18:45
  • @patrick see the edit to my question for the reply from the bash-users mailinglist. It's actually not a bash issue at all. cheers – Sebastian Sep 17 '14 at 07:16
  • Completely unrelated: as a poor student still suffering with Sandybridge and 4 GB of RAM, I'm completely awed by nonchalantly having that amount of memory occupied while still using the system as normal, however commercial/server-specced the system is. Is this how poor people feel when they see rich people literally burning their money? – Oxwivi Sep 17 '14 at 12:09
  • @Oxwivi student here too! But I have access to a machine used for CAE applications with 192 GiB RAM. It is, however, idle atm. Cheers! – Sebastian Sep 17 '14 at 12:13
  • Hahah, that's still pretty incredible. Keep up the good job "studying" on that system, mate! – Oxwivi Sep 17 '14 at 12:22

2 Answers

6

So, I did this thing in testing and, yeah, it consumes a lot of memory. I pointedly used a smaller number as well. I can imagine that bash hogging those resources for days on end could be a little irritating.

ps -Fp "$$"; : {1..10000000}; ps -Fp "$$"

UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 32601  4241  0  3957  3756   4 08:28 pts/1    00:00:00 bash -l
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 32601  4241 59 472722 1878712 4 08:28 pts/1   00:00:28 bash -l

As you can see, there is a significant impact on the process's consumed resources. Well, I'll try to clear that, but - as near as I can tell - it will require replacing the shell process with another.

First, I'll set a marker variable just to show it comes with me. Note: this is not exported.

var='just
testing
'\''
this stuff
'\'''

Next I'll exec $0. This is the same kind of thing that long-running daemons must do occasionally to refresh their state. It makes sense here.

FIRST METHOD: HERE-DOCUMENT

I'll use the current shell to build a heredoc input file-descriptor for the newly execed shell process that will contain all of the current shell's declared variables. Probably it can be done differently but I don't know all of the proper commandline switches for bash.

The new shell is invoked with the -l (login) switch, which sees to it that your profile/rc files are sourced per usual, plus whatever other shell options are currently set and stored in the special shell parameter $-. If you feel a login shell is not the correct way to go, then using the -i switch instead should at least get the rc file run.

exec "${0#-}" "-l$-" 3<<ENV
$(set)
ENV

Ok. That only took a second. How did it work?

. /dev/fd/3 2>/dev/null
echo "$var"; ps -Fp "$$"

just
testing
'
this stuff
'
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 32601  4241 12  4054  3800   5 08:28 pts/1    00:00:29 bash -lhimBH

Just fine, as it would seem. The heredoc on <&3 waits on the new shell process's input until it is read, so I do so with . and source it. It will likely contain some default read-only variables which have already been set in the new shell by its rc files, so there will be a few errors, but I dump those to 2>/dev/null. After that, as you can see, I have all of the old shell process's variables here with me, including my marker $var.

SECOND METHOD: ENVIRONMENT VARIABLES

After doing a google or two on the matter, I think this may be another way worth considering. I initially considered this, but (apparently erroneously) discounted it based on the belief that there was a kernel-enforced length limit on a single environment variable's value, something like ARG_MAX or LINE_MAX (which likely will affect this) but smaller for a single value. What I was correct about, though, is that an execve call will fail when the total environment is too large. So this method should be preferred only when you can guarantee that your current environment is small enough to allow for the exec call.
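The relevant limits can be checked with getconf; this sketch is just a quick way to see how much headroom execve leaves on a given system (the names are POSIX, the values vary):

```shell
# ARG_MAX bounds the combined size of argv and the environment
# handed to execve; exceeding it makes the exec fail with E2BIG.
getconf ARG_MAX

# Compare against the current environment's size in bytes:
env | wc -c
```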

In fact, this is different enough that I'll do it all again in one go.

ps -pF "$$"; : {1..10000000}; ps -pF "$$"

UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 26296  4241  0  3957  3788   3 14:28 pts/1    00:00:00 bash -l
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 26296  4241 38 472722 1878740 3 14:28 pts/1   00:00:11 bash -l

One thing I failed to do on the first go-round is migrate shell functions. Not counting keeping track of them yourself (which is probably the best way), to the best of my knowledge there is no shell-portable way to do this. bash does allow for it, though, as declare -f works in much the same way for functions that set does for shell variables portably. To do this as well with the first method you need only add ; declare -f to set in the here-document.
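The variable-plus-function migration can be tried without replacing the current shell: capture the output of set and declare -f, then restore both in a child bash, which stands in here for the exec'd shell. Errors from re-assigning read-only variables are discarded, just as in the here-document method. The greet function and marker variable below are made up for the demonstration:

```shell
# A made-up function and variable to carry across.
greet() { echo "hello from a migrated function"; }
marker='carried across'

# Capture variables (set) and functions (declare -f), then restore
# them in a child bash; the exec in this answer does the same thing
# to the real shell process.
state=$(set; declare -f)
bash -c 'eval "$1" 2>/dev/null; greet; echo "$marker"' migrate "$state"
```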

My marker variable will remain the same, but here's my marker function:

chk () {
    printf '###%s:###\n%s\n' \
        \$VAR "${var-NOT SET}" \
        PSINFO "$(ps -Fp $$)" \
        ENV\ LEN "$(env | wc -c)"
}

And so rather than feeding the new shell a file-descriptor, I will instead hand it two environment variables:

varstate=$(set) fnstate=$(declare -f) exec "${0#-}" "-l$-"

Ok. So I've just replaced the running shell, so now what?

chk
bash: chk: command not found

Of course. But...

{   echo '###EVAL/UNSET $FNSTATE###'
    eval "$fnstate"; unset fnstate
    chk
    echo '###EVAL/UNSET $VARSTATE###'
    eval "$varstate"; unset varstate
    chk
}

OUTPUT

###EVAL/UNSET $FNSTATE###
###$VAR:###
NOT SET
###PSINFO:###
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 26296  4241 10  3991  3736   1 14:28 pts/1    00:00:12 bash -lhimBH
###ENV LEN:###
6813
###EVAL/UNSET $VARSTATE###
bash: BASHOPTS: readonly variable
bash: BASH_VERSINFO: readonly variable
bash: EUID: readonly variable
bash: PPID: readonly variable
bash: SHELLOPTS: readonly variable
bash: UID: readonly variable
###$VAR:###
just
testing
'
this stuff
'
###PSINFO:###
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
mikeserv 26296  4241 10  4056  3772   1 14:28 pts/1    00:00:12 bash -lhimBH
###ENV LEN:###
2839
mikeserv
  • that is some mighty kung-fu you're capable of, sir. Thanks for the answer, I can test it only tomorrow. – Sebastian Sep 10 '14 at 16:23
  • alright, I tested it and it worked. The variable $var was preserved in bash 4.3.24, but not in bash 4.1.2. The memory was cleared in both versions. – Sebastian Sep 11 '14 at 14:28
  • @Sebastian - Weird, that - the version difference. It could be more reliably done with a tmp file as well. like set >/tmp/state; exec ...; . /tmp/state. It should work in any shell on a system with the /dev/fd/[num] links, but bash, unfortunately, as your own question can testify, often subtly does the unexpected. – mikeserv Sep 11 '14 at 14:35
  • you are right, using the temp file in bash version 4.1.2 preserves $var. – Sebastian Sep 11 '14 at 14:40
  • @Sebastian - if you would humor me - I wonder - in the older version - what of command exec ... 3<<ENV...? Do you still lose $var? – mikeserv Sep 11 '14 at 14:42
  • sorry for the confusion, does the edit of the comment answer your question? – Sebastian Sep 11 '14 at 14:44
  • @Sebastian - no, I got that the tmp file would work, but I was curious if perhaps the here document might as well if the shell's parser saw command first rather than exec. – mikeserv Sep 11 '14 at 14:51
  • yes, running command exec ${0#-} -l$- 3<<ENV ... will lose $var. – Sebastian Sep 11 '14 at 14:56
  • @Sebastian - dang... thanks for trying though. The older version must be dropping 3< or else its exec doesn't properly interpret the 3<< redirection. You're a pretty good sport, you know that? In any case, I am pleased that you found the solution to your liking. If you ever find yourself writing a while : loop into a shell script which you intend to run indefinitely, maybe remember this and instead consider how you might arrange to have the script more robustly refresh its state with exec $0 at a reasonable interval. – mikeserv Sep 11 '14 at 14:59
  • @Sebastian - I think the second method described in my update would work for either bash version. – mikeserv Sep 11 '14 at 21:39
0

No. Bash will never return memory it allocates for any purpose to the operating system. (But please correct me if I'm wrong.)

However, bash will re-use the memory for other purposes if necessary, and if not, the kernel will swap it out, so it won't actually be in RAM.
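That reuse is observable: run the same large expansion twice and the second pass barely moves the shell's resident set, because it recycles the pages cached by the first. A sketch, with a smaller count than the question's:

```shell
# Resident set size (KB) of this shell.
rss() { ps -o rss= -p "$$"; }

start=$(rss)
for i in {1..1000000}; do :; done   # first pass: pages are allocated
first=$(rss)
for i in {1..1000000}; do :; done   # second pass: cached pages are reused
second=$(rss)

echo "start=$start first=$first second=$second"
```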