
I have a general question, which might stem from a misunderstanding of how processes are handled in Linux.

For my purposes I am going to define a 'script' as a snippet of bash code saved to a text file with execute permissions enabled for the current user.

I have a series of scripts that call each other in tandem. For simplicity's sake I'll call them scripts A, B, and C. Script A carries out a series of statements and then pauses; it then executes script B, pauses again, and then executes script C. In other words, the series of steps is something like this:

Run Script A:

  1. Series of statements
  2. Pause
  3. Run Script B
  4. Pause
  5. Run Script C

I know from experience that if I run script A until the first pause, then make edits in script B, those edits are reflected in the execution of the code when I allow it to resume. Likewise if I make edits to script C while script A is still paused, then allow it to continue after saving changes, those changes are reflected in the execution of the code.

Here is the real question, then: is there any way to edit Script A while it is still running? Or is editing impossible once its execution begins?

4 Answers

26

In Unix, most editors work by creating a new temporary file containing the edited contents. When the edited file is saved, the original file is deleted and the temporary file is renamed to the original name. (There are, of course, various safeguards to prevent data loss.) This is, for example, the style used by sed or perl when invoked with the -i ("in-place") flag, which is not really "in-place" at all. It should have been called "new place with old name".

This works well because Unix assures (at least for local filesystems) that an opened file continues to exist until it is closed, even if it is "deleted" and a new file with the same name is created. (It's not coincidental that the Unix system call to "delete" a file is actually called "unlink".) So, generally speaking, if a shell interpreter has some source file open, and you "edit" the file in the manner described above, the shell won't even see the changes since it still has the original file open.

[Note: as with all standards-based comments, the above is subject to multiple interpretations and there are various corner-cases, such as NFS. Pedants are welcome to fill the comments with exceptions.]
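The rename behaviour described above can be seen without involving a shell interpreter at all. This is only a minimal sketch: the file names are made up for the demo, and file descriptor 3 stands in for whatever descriptor a shell would hold on its script.

```shell
# Sketch: an open descriptor keeps seeing the old file after a
# rename-style "edit". All file names are invented for this demo.
dir=$(mktemp -d)
printf 'echo old\n' > "$dir/script"

exec 3< "$dir/script"                    # hold the original open, as a shell would

printf 'echo new\n' > "$dir/script.tmp"  # the "editor" writes a new temporary file...
mv "$dir/script.tmp" "$dir/script"       # ...and renames it over the old name

from_fd=$(cat <&3)                       # read through the descriptor opened earlier
from_name=$(cat "$dir/script")           # read through the name
exec 3<&-                                # close the descriptor; the old file can now go away

echo "open descriptor sees: $from_fd"
echo "name now refers to:   $from_name"
rm -rf "$dir"
```

The descriptor still yields the original contents, while the name already refers to the new file.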

It is, of course, possible to modify files directly; it's just not very convenient for editing purposes, because while you can overwrite data in a file, you cannot delete or insert without shifting all following data, which would imply quite a lot of rewriting. Furthermore, while you were doing that shifting, the contents of the file would be unpredictable and processes which had the file open would suffer. In order to get away with this (as with database systems, for example), you need a sophisticated set of modification protocols and distributed locks; stuff which is well beyond the scope of a typical file editing utility.
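For illustration, here is a rough sketch of a direct modification, assuming a throwaway file whose name and contents are invented for the demo. It overwrites exactly four bytes with dd and conv=notrunc, so nothing after them has to move and the file keeps its inode.

```shell
# Sketch: overwrite bytes in place without creating a new file.
dir=$(mktemp -d)
f="$dir/data"                       # hypothetical demo file
printf 'alpha\nbeta\ngamma\n' > "$f"
inode_before=$(ls -i "$f" | awk '{print $1}')

# "alpha\n" occupies bytes 0-5, so "beta" starts at offset 6.
# conv=notrunc stops dd from truncating the file after the written bytes.
printf 'BETA' | dd of="$f" bs=1 seek=6 conv=notrunc 2>/dev/null

inode_after=$(ls -i "$f" | awk '{print $1}')
contents=$(cat "$f")
echo "$contents"
rm -rf "$dir"
```

Replacing "beta" with anything longer or shorter would require rewriting everything that follows it, which is the inconvenience described above.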

So, if you want to edit a file while it's being processed by a shell, you have two options:

  1. You can append to the file. This should always work.

  2. You can overwrite the file with new contents of exactly the same length. This may or may not work, depending on whether the shell has already read that part of the file or not. Since most file I/O involves read buffers, and since all the shells I know read an entire compound command before executing it, it is pretty unlikely that you can get away with this. It certainly wouldn't be reliable.

I don't know of any wording in the POSIX standard which actually requires the possibility of appending to a script file while the file is being executed, so it might not work with every POSIX-compliant shell, much less with the current offering of almost- and sometimes-POSIX-compliant shells. So YMMV. But as far as I know, it does work reliably with bash.
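A much smaller sketch of the append case, assuming bash is on the PATH; the script name self.sh is made up for the demo. The script appends one command to its own file while running, and bash picks that command up once it has finished the lines it already had.

```shell
# Sketch: a bash script that appends a command to itself while running.
dir=$(mktemp -d)
cat > "$dir/self.sh" <<'EOF'
echo first
echo 'echo appended' >> "$0"
echo second
EOF

out=$(bash "$dir/self.sh")   # run with bash specifically; other shells may differ
echo "$out"
rm -rf "$dir"
```

With bash this prints first, second, and then appended. Note that the script file is permanently one line longer after each run.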

As evidence, here's a "loop-free" implementation of the infamous 99 bottles of beer program in bash, which uses dd to overwrite and append (the overwriting is presumably safe because it substitutes the currently executing line, which is always the last line of the file, with a comment of exactly the same length; I did that so that the end result can be executed without the self-modifying behaviour.)

#!/bin/bash
if [[ $1 == reset ]]; then
  printf "%s\n%-16s#\n" '####' 'next ${1:-99}' |
  dd if=/dev/stdin of=$0 seek=$(grep -bom1 ^#### $0 | cut -f1 -d:) bs=1 2>/dev/null
  exit
fi

step() {
  s=s
  one=one
  case $beer in
    2) beer=1; unset s;;
    1) beer="No more"; one=it;;
    "No more") beer=99; return 1;;
    *) ((--beer));;
  esac
}
next() {
  step ${beer:=$(($1+1))}
  refrain |
  dd if=/dev/stdin of=$0 seek=$(grep -bom1 ^next\  $0 | cut -f1 -d:) bs=1 conv=notrunc 2>/dev/null
}
refrain() {
  printf "%-17s\n" "# $beer bottles"
  echo echo ${beer:-No more} bottle$s of beer on the wall, ${beer:-No more} bottle$s of beer.
  if step; then
    echo echo Take $one down, pass it around, $beer bottle$s of beer on the wall.
    echo echo
    echo next abcdefghijkl
  else
    echo echo Go to the store, buy some more, $beer bottle$s of beer on the wall.
  fi
}
####
next ${1:-99}   #
rici
  • When I run this, it starts with "No more", then continues to -1 and into the negative numbers indefinitely. – Daniel Hershcovich Sep 04 '13 at 06:50
  • If I do export beer=100 before running the script, it works as expected. – Daniel Hershcovich Sep 04 '13 at 07:10
  • @DanielHershcovich: quite right; sloppy testing on my part. I think I fixed it; it now takes an optional count parameter. A better and more interesting fix would be to automatically reset if the parameter doesn't correspond with the cached copy. – rici Sep 04 '13 at 16:14
23

bash goes to great lengths to make sure it reads commands just before executing them.

For instance in:

cmd1
cmd2

The shell reads the script in blocks, so it will likely read both commands at once, interpret the first one, and then seek back to the end of cmd1 in the script and read the script again to get cmd2 and execute it.

You can easily verify it:

$ cat a
echo foo | dd 2> /dev/null bs=1 seek=50 of=a
echo bar
$ bash a
foo

(Though, looking at the strace output for that, it seems to do fancier things, such as reading the data several times and seeking back, than when I tried the same a few years ago, so my statement above about lseeking back may no longer apply to newer versions.)

If however you write your script as:

{
  cmd1
  cmd2
  exit
}

The shell will have to read up to the closing }, store that in memory and execute it. Because of the exit, the shell will not read from the script again so you can edit it safely while the shell is interpreting it.
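As a rough check of the brace-group approach, here is a sketch (assuming bash is available; the file name wrapped.sh is invented) in which the script overwrites its own file halfway through and still finishes with the commands it had already read.

```shell
# Sketch: a script wrapped in { ...; exit; } survives being overwritten mid-run.
dir=$(mktemp -d)
cat > "$dir/wrapped.sh" <<'EOF'
{
  echo start
  echo 'echo replaced' > "$0"
  echo end
  exit
}
EOF

out=$(bash "$dir/wrapped.sh")    # the whole group is parsed before it runs
after=$(cat "$dir/wrapped.sh")   # what the file contains afterwards
echo "$out"
rm -rf "$dir"
```

The run prints start and end; the file itself ends up containing only echo replaced, which takes effect on the next run.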

Alternatively, when editing the script, make sure you write a new copy of the script. The shell will keep reading the original one (even if it's deleted or renamed).

To do that, rename the-script to the-script.old, copy the-script.old back to the-script, and edit the new the-script.
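The rename-then-copy step could be wrapped in a small helper. This is just a sketch: edit_copy is a hypothetical name, and the inode comparison is only there to show that the original file lives on under the .old name while the-script becomes a fresh file.

```shell
# edit_copy (hypothetical helper): keep the original file under NAME.old
# and leave a fresh copy under NAME that is safe to edit.
edit_copy() {
  mv -- "$1" "$1.old" &&
  cp -p -- "$1.old" "$1"     # -p preserves mode, so the copy stays executable
}

dir=$(mktemp -d)
printf 'echo hello\n' > "$dir/the-script"
orig_inode=$(ls -i "$dir/the-script" | awk '{print $1}')

edit_copy "$dir/the-script"

old_inode=$(ls -i "$dir/the-script.old" | awk '{print $1}')
new_inode=$(ls -i "$dir/the-script" | awk '{print $1}')
echo "original inode $orig_inode is now the-script.old ($old_inode)"
echo "the-script is a new file ($new_inode)"
rm -rf "$dir"
```

A shell that already had the-script open keeps reading the file that is now called the-script.old, so the copy can be changed freely.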

8

There is really no safe way to modify the script while it is running because the shell can use buffering to read the file. In addition, if the script is modified by replacing it with a new file, shells will typically only read the new file after performing certain operations.

Often, when a script is changed while executing, the shell ends up reporting syntax errors. This is due to the fact that, when the shell closes and reopens the script file, it uses the byte offset into the file to reposition itself on return.

ash
4

You could get around this by setting a trap in your script, and then using exec to pick up the new script contents. Note, however, that the exec call starts the script from scratch and not from the point it had reached in the running process, so script B will get called again (and so forth).

#! /bin/bash

CMD="$0"
ARGS=("$@")

trap reexec 1

reexec() {
    exec "$CMD" "${ARGS[@]}"
}

while : ; do sleep 1 ; clear ; date ; done

This will continue to display the date on the screen. I could then edit my script and change date to echo "Date: $(date)". After saving that, the running script still just displays the date. However, if I send the signal that the trap is set to capture, the script will exec (replace the currently running process with the specified command), which here is the command $CMD with the arguments saved in ARGS. You can do this by issuing kill -1 PID, where PID is the PID of the running script, and the output changes to show Date: before the date command output.

You could store the "state" of your script in an external file (in, say, /tmp), and read its contents to know where to "resume" when the program is re-exec'ed. You could then add an additional trap on the termination signals (SIGINT/SIGQUIT/SIGTERM; SIGKILL cannot be trapped) to clean up that tmp file, so that when you restart after interrupting "Script A" it will start from the beginning. A stateful version would be something like:

#! /bin/bash

trap reexec 1
trap cleanup 2 3 15   # SIGINT, SIGQUIT, SIGTERM; SIGKILL (9) cannot be trapped

CMD="$0"
ARGS=("$@")
statefile='/tmp/scriptA.state'
EXIT=1

reexec() { echo "Restarting..." ; exec "$CMD" "${ARGS[@]}"; }
cleanup() { rm -f "$statefile"; exit $EXIT; }
run_scriptB() { /path/to/scriptB; echo "scriptC" > "$statefile"; }
run_scriptC() { /path/to/scriptC; echo "stop" > "$statefile"; }

while [ "$state" != "stop" ] ; do

    if [ -f "$statefile" ] ; then
        state="$(cat "$statefile")"
    else
        state='starting'
    fi

    case "$state" in
        starting)         
            run_scriptB
        ;;
        scriptC)
            run_scriptC
        ;;
    esac
done

EXIT=0
cleanup
Drav Sloan
  • I've fixed that issue by capturing $0 and $@ at the start of the script and using those variables in the exec instead. – Drav Sloan Aug 29 '13 at 00:46