How to read the whole shell script before executing it?

Question

Usually, if you edit a script, all running instances of the script are prone to errors.

Example:

sleep 20
echo test

If you execute this script, bash will read the first line (say 10 bytes) and go to sleep. When it resumes, the script may have different contents starting at the 10th byte. Bash might start executing from the middle of a different line, in an entirely different if branch. The running script will be broken.

So, how to read the whole shell script before executing it, so that later edits do not affect the running instance?

Maybe there is a way to wrap all the contents in a function or something, so the shell will read the whole script first? But what about the last line where you invoke the function, will it be read till EOF? Maybe omitting the last \n would do the trick? Maybe a subshell () will do? I'm not very experienced with it, please help! — VasyaNovikov, Dec 21 '16 at 07:59
@maulinglawns if the script has contents like sleep 20 ;\n echo test ;\n sleep 20 and I start editing it, it may misbehave. For example, bash could read the first 10 bytes of the script, understand the sleep command and go to sleep. After it resumes, there would be different contents in the file starting at 10 bytes. — VasyaNovikov, Dec 21 '16 at 08:01
So, what you are saying is that you are editing a script that is executing? Stop the script first, do your edits, and then start it again. — , Dec 21 '16 at 08:03
@maulinglawns yes, that's basically it. The problem is, it's not convenient for me to stop the scripts, and it's hard to always remember to do that. Maybe there is a way to force bash reading the whole script first? — VasyaNovikov, Dec 21 '16 at 08:05
Could you briefly stop the scripts running? For example, could you copy a script to a new file (leaving the script running), edit the new copy of the script, then stop the original, replace it with the new, and restart? — John N, Dec 21 '16 at 08:25
@JohnN unfortunately, that wouldn't be possible in my case. Thinking about that.. I have to test how bash behaves if I replace the file, not edit it. Will it follow the file by name, or by reference? By reference (inode?) would be more expected, so that could be a partial way out for me. — VasyaNovikov, Dec 21 '16 at 08:36
it follows the file (inode) as per mmap(2), so anything that replaces the file instead of overwriting its contents is safe. — Jasen, Dec 21 '16 at 09:10

Stéphane Chazelas · Answer 1 · 2016-12-21T22:35:03.847

Yes, shells, and bash in particular, are careful to read the file one line at a time, so it works the same as when you use it interactively.

You'll notice that when the file is not seekable (like a pipe), bash even reads one byte at a time to be sure not to read past the \n character. When the file is seekable, it optimises by reading full blocks at a time, but seeks back to just after the \n.

That means you can do things like:

bash << \EOF
read var
var's content
echo "$var"
EOF

Or write scripts that update themselves. Which you wouldn't be able to do if it didn't give you that guarantee.

Now, it's rare that you want to do things like that and, as you found out, that feature tends to get in the way more often than it is useful.

To avoid it, you could make sure you don't modify the file in place: for instance, modify a copy and then move the copy into place, as sed -i, perl -pi and some editors do.
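As a minimal sketch of the "modify a copy, then move it into place" approach: the helper name replace_script and its arguments are hypothetical, and mktemp is assumed to be available (it is not specified by POSIX, though it is common). The temporary file is created in the same directory as the target so that mv is a rename rather than a copy.

```shell
#!/bin/sh
# Hypothetical helper: replace_script TARGET NEW_CONTENT_FILE
# Writes the new contents to a temporary file next to TARGET, then renames
# it over TARGET. A shell that already has the old file open keeps reading
# the old inode and never sees the new contents.
replace_script() {
  target=$1
  new_content=$2
  # Same directory, hence same filesystem, so mv is a plain rename.
  tmp=$(mktemp "$(dirname "$target")/.tmp.XXXXXX") || return 1
  cat "$new_content" > "$tmp" &&
    chmod u+x "$tmp" &&
    mv -f "$tmp" "$target" || { rm -f "$tmp"; return 1; }
}
```

Usage would look like replace_script ./myscript.sh ./myscript.sh.new, where both file names are placeholders.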

Or you could write your script like:

{
  sleep 20
  echo test
}; exit

(note that it's important that the exit be on the same line as }; though you could also put it inside the braces just before the closing one).

or:

main() {
  sleep 20
  echo test
}
main "$@"; exit

The shell will need to read the script up until the exit before starting to do anything. That ensures the shell will not read from the script again.

That means the whole script will be stored in memory though.

That can also affect the parsing of the script.

For instance, in bash:

export LC_ALL=fr_FR.UTF-8
echo $'St\ue9phane'

Would output that U+00E9 encoded in UTF-8. However, if you change it to:

{
  export LC_ALL=fr_FR.UTF-8
  echo $'St\ue9phane'
}

The \ue9 will be expanded in the charset that was in effect at the time that command was parsed which in this case is before the export command is executed.

Also note that if the source aka . command is used, with some shells, you'll have the same kind of problem for the sourced files.

That's not the case of bash though whose source command reads the file fully before interpreting it. If writing for bash specifically, you could actually make use of that, by adding at the start of the script:

if [[ ! $already_sourced ]]; then
  already_sourced=1
  source "$0"; exit
fi

(I wouldn't rely on that though, as future versions of bash could change that behaviour, which can currently be seen as a limitation; bash and AT&T ksh are the only POSIX-like shells that behave like that as far as I can tell. The already_sourced trick is also a bit brittle, as it assumes that variable is not in the environment, not to mention that it affects the content of the BASH_SOURCE variable.)

@VasyaNovikov, there seems to be something wrong with SE at the moment (or at least for me). There were only a couple of answers when I added mine, and your comment seems to have only turned up now even though it says it was posted 16 minutes ago (or maybe it's just me losing my marbles). Anyway, note the extra "exit" that is needed here to avoid problems when the size of the file increases (as noted in the comment I've added to your answer). — Stéphane Chazelas, Dec 21 '16 at 13:17
Stéphane, I think I've found another solution. It is to use }; exec true. This way, there is no requirement on newlines at end of file, which is friendly to some editors (like emacs). All tests that I could think of work correctly with }; exec true — VasyaNovikov, Nov 23 '17 at 21:30
@VasyaNovikov, not sure what you mean. How is it better than }; exit? You're also losing the exit status. — Stéphane Chazelas, Nov 23 '17 at 21:53
As mentioned at a different question: it is common to first parse the whole file and then execute the compound statement in case the dot command (. script) is used. — schily, Jun 13 '18 at 16:24
@schily, yes I mention that in this answer as a limitation of AT&T ksh and bash. Other POSIX-type shells don't have that limitation. — Stéphane Chazelas, Jun 13 '18 at 21:12
It is a bash deviation compared to genetic shells. Please do not try to judge a type of behavior that has been in effect for 40 years. Using aliases inside shell scripts is a questionable method so this is definitely not a limitation. — schily, Jun 13 '18 at 22:23

meuh · Answer 2 · 2016-12-21T09:27:54.667

You simply need to delete the file (ie copy it, delete it, rename the copy back to the original name). In fact many editors can be configured to do this for you. When you edit a file and save a changed buffer to it, instead of overwriting the file it will rename the old file, create a new one, and put the new contents in the new file. Hence any running script should continue without problems.

By using a simple version control system like RCS which is readily available for vim and emacs, you get the dual advantage of having a history of your changes, and the checkout system should by default remove the current file and recreate it with the correct modes. (Beware of hard-linking such files of course).

edited Dec 21 '16 at 09:27

answered Dec 21 '16 at 08:52

"delete" isn't actually part of the process. If you want to make it properly atomic, you do a rename over the destination file -- if you have a delete step, there's a risk of having your process die after the delete but before the rename, leaving no file in place at all (or a reader try to access the file in that window, and find neither old nor new versions available). – Charles Duffy Dec 22 '16 at 17:37
@CharlesDuffy: If an open file is deleted (asynchronously), doesn't it survive until it's closed (as stated in unlink(2))? – musiphil Sep 20 '21 at 23:39
@musiphil, whether you unlink the file or rename a new inode to have the old name, the effect is the same -- the old inode remains valid until no references to it exist. What part of my above comment do you believe contradicts that? The race condition is around the directory entry, not the inode. – Charles Duffy Sep 21 '21 at 00:11

VasyaNovikov · Answer 3 · 2023-10-10T05:40:20.163

Use:

{
  ... your code ...
exit
}

Bash will read the whole {} block before executing it, and the exit directive will make sure nothing will be read outside of the code block.
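One way to check this claim is a small experiment, sketched here under some assumptions: bash is on the PATH, the file name demo.sh is arbitrary, and the sleep durations are only chosen so that the overwrite happens while the script is still running.

```shell
#!/bin/sh
# Sketch: overwrite a brace-wrapped script in place while it is running.
dir=$(mktemp -d) || exit 1

# The script under test: the whole body sits inside { ...; exit; }.
cat > "$dir/demo.sh" <<'EOF'
{
  sleep 2
  echo original
  exit
}
EOF

bash "$dir/demo.sh" > "$dir/out" &            # start it in the background
sleep 1                                       # let bash parse the block
printf 'echo overwritten\n' > "$dir/demo.sh"  # in-place overwrite, same inode
wait                                          # wait for the script to finish

result=$(cat "$dir/out")
echo "$result"
rm -rf "$dir"
```

If the block was read in full before execution started, the output should come from the original contents rather than from the overwritten file.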

For scripts that are "sourced" rather than executed, use:

{
  ... your code ...
return 2>/dev/null || exit
}

edited Oct 10 '23 at 05:40

answered Dec 21 '16 at 10:33

What I found is that it doesn't see EOF and stop reading the file, but it gets tangled up in its "buffered stream" processing and ends up seeking past the end of the file, which is why it looks fine if the size of the file increases by not much, but looks bad when you make the file more than twice as big as before. I'll report a bug to the bash maintainers shortly. – Stéphane Chazelas Dec 21 '16 at 14:17
bug reported now, see also patch. – Stéphane Chazelas Dec 21 '16 at 14:52
Comments are not for extended discussion; this conversation has been moved to chat. – terdon Dec 22 '16 at 11:19

Jasen · Answer 4 · 2016-12-21T18:22:07.157

Proof of concept. Here's a script that modifies itself:

cat <<EOF >/tmp/scr
#!/bin/bash
sed  s/[k]ept/changed/  /tmp/scr > /tmp/scr2

# this next line overwrites the on-disk copy of the script
cat /tmp/scr2 > /tmp/scr
# this line ends up changed.
echo script content kept
EOF
chmod u+x /tmp/scr
/tmp/scr

We see the changed version printed.

This is because bash keeps a file handle open to the script, so changes to the file are seen immediately.

If you don't want to update the in-memory copy, unlink the original file and replace it.

One way to do that is by using sed -i.

sed -i '' filename

Proof of concept:

cat <<EOF >/tmp/scr
#!/bin/bash
sed  s/[k]ept/changed/  /tmp/scr > /tmp/scr2

# this next line unlinks the original and creates a new copy.
sed -i ''  /tmp/scr

# now overwriting it has no immediate effect
cat /tmp/scr2 > /tmp/scr
echo script content kept
EOF

chmod u+x /tmp/scr
/tmp/scr

If you are using an editor to change the script, enabling the "keep a backup copy" feature may be all that's needed to cause the editor to write the changed version to a new file instead of overwriting the existing one.
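To find out whether a given editor or tool replaces the file or overwrites it in place, one option is to compare inode numbers before and after the save. The sketch below uses hypothetical helper names (inode_of, replaces_file) and relies only on ls -i and awk. One caveat: a tool that unlinks the file and then creates a new one could be handed the same inode number again by some filesystems, so an unchanged number is not absolute proof of an in-place write.

```shell
#!/bin/sh
# Hypothetical helper: print the inode number of FILE using `ls -i`.
inode_of() {
  ls -i -- "$1" | awk '{print $1}'
}

# Hypothetical helper: replaces_file FILE COMMAND [ARGS...]
# Runs COMMAND and returns 0 if FILE's inode number changed (the file was
# replaced), or 1 if it stayed the same (the file was overwritten in place).
replaces_file() {
  file=$1
  shift
  before=$(inode_of "$file")
  "$@"
  after=$(inode_of "$file")
  [ "$before" != "$after" ]
}
```

For example, replaces_file script.sh "$EDITOR" script.sh && echo "file was replaced" would run the editor on a placeholder script.sh and report the result after it exits.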

No, bash doesn't open the file with mmap(). It's just careful to read one line at a time as needed, just like when it's getting the commands from a terminal device when interactive. — Stéphane Chazelas, Dec 21 '16 at 12:54

user1133275 · Answer 5 · 2016-12-21T14:31:48.560

Wrapping your script in a block {} is likely the best option but requires changing your scripts.

F=$(mktemp) && cp test.sh "$F" && bash "$F"; rm -f "$F"

would be the second best option (assuming tmpfs). The disadvantage is that it breaks $0 if your scripts use that.
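The copy-and-run one-liner could be wrapped in a small function that passes arguments through and preserves the exit status. This is only a sketch: the name run_copy is hypothetical, bash and mktemp are assumed to be installed, and $0 inside the script will still be the temporary path.

```shell
#!/bin/sh
# Hypothetical helper: run_copy SCRIPT [ARGS...]
# Runs a private temporary copy of SCRIPT so that later edits to the
# original cannot affect the running instance.
run_copy() {
  script=$1
  shift
  copy=$(mktemp) || return 1
  cp -- "$script" "$copy" || { rm -f -- "$copy"; return 1; }
  bash "$copy" "$@"
  status=$?            # remember the script's exit status
  rm -f -- "$copy"     # clean up the temporary copy
  return "$status"
}
```

For example, run_copy test.sh arg1 arg2 runs the copy with the same arguments and returns whatever the script returned.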

Using something like F=test.sh; tail -n $(cat "$F" | wc -l) "$F" | bash is less ideal because it has to keep the whole file in memory and breaks $0.

Touching the original file should be avoided so that the last-modified time, read locks, and hard links are not disturbed. That way you can leave an editor open while running the file, rsync won't needlessly checksum the file for backups, and hard links function as expected.

Replacing the file on edit would work but is less robust, because it can't be enforced on other scripts or users, and one might forget. And again, it would break hard links.

anything that makes a copy would work. tac test.sh | tac | bash — Jasen, Dec 21 '16 at 08:56
