When a shell script starts with #!, that first line is a comment as far as the shell is concerned. However, the first two characters are meaningful to another part of the system: the kernel. The two characters #! are called a shebang. To understand the role of the shebang, you need to understand how a program is executed.
Executing a program from a file requires action from the kernel. This is done as part of the execve system call. The kernel needs to verify the file permissions, free the resources (memory, etc.) associated with the executable file currently running in the calling process, allocate resources for the new executable file, and transfer control to the new program (and more things that I won't mention). The execve system call replaces the code of the currently running process; there's a separate system call, fork, to create a new process.
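The difference between the two calls can be observed from a shell. This is only a rough sketch: the exec builtin makes the shell call execve without forking first, so the process ID stays the same, whereas a normal command invocation forks first and the command gets a new PID. The sh -c wrappers and the variable names are scaffolding for the illustration.

```shell
# exec replaces the program running in the current process,
# so both lines show the same PID.
same=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec, the outer shell forks a child and the child calls execve,
# so the inner shell reports a different PID.
# The trailing ":" keeps the inner sh from being the last command, which
# some shells would otherwise exec directly as an optimization.
forked=$(sh -c 'echo $$; sh -c "echo \$\$"; :')

printf 'exec only:\n%s\n' "$same"
printf 'fork, then exec:\n%s\n' "$forked"
```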
To run a file directly, the kernel has to support the format of the executable file. This file has to contain machine code, organized in a way that the kernel understands. A shell script doesn't contain machine code, so it can't be executed this way.
The shebang mechanism allows the kernel to defer the task of interpreting the code to another program. When the kernel sees that the executable file begins with #!, it reads the next few characters and interprets the first line of the file (minus the leading #! and optional space) as a path to another file (plus arguments, which I won't discuss here). When the kernel is told to execute the file /my/script, and it sees that the file begins with the line #!/some/interpreter, the kernel executes /some/interpreter with the argument /my/script. It's then up to /some/interpreter to decide that /my/script is a script file that it should execute.
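Here's a small experiment along those lines. It assumes that /bin/echo exists on your system and that your temporary directory allows execution (i.e. it isn't mounted noexec); the file name script and the variable names are made up for the illustration.

```shell
# Hypothetical demo file whose "interpreter" is /bin/echo (assumed to exist).
dir=$(mktemp -d)
script="$dir/script"
printf '#!/bin/echo\nthis line is never looked at by echo\n' > "$script"
chmod +x "$script"

# The kernel runs /bin/echo with the script's path as its argument,
# so what gets printed is the path, not the contents of the file.
out=$("$script")
printf '%s\n' "$out"

rm -rf "$dir"
```

Since echo just prints its arguments, the output is the path that was passed to execve. A real interpreter would open that path and run the code inside.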
What if a file neither contains native code in a format that the kernel understands nor starts with a shebang? Well, then the file isn't executable, and the execve system call fails with the error code ENOEXEC (Executable format error).
This could be the end of the story, but most shells implement a fallback feature. If the kernel returns ENOEXEC, the shell looks at the content of the file and checks whether it looks like a shell script. If the shell thinks the file looks like a shell script, it executes it by itself. The details of how it does this depend on the shell. You can see some of what's happening by adding ps $$ in your script, and more by watching the process with strace -p1234 -f -eprocess where 1234 is the PID of the shell.
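As a rough sketch of the ps trick, the following creates a shebangless script and runs it from the current shell. The file name foo.sh matches the examples in this answer; the temporary directory and the marker message are my own scaffolding, and I'm assuming a ps that accepts the POSIX -p and -o options. What ps prints depends on the shell you run this from, so I'm not showing an expected output.

```shell
dir=$(mktemp -d)

# No shebang line: the kernel refuses this file with ENOEXEC,
# and the calling shell falls back to interpreting it itself.
printf 'echo "ran without a shebang"\nps -p $$ -o pid= -o args=\n' > "$dir/foo.sh"
chmod +x "$dir/foo.sh"

# Run it and capture what the process executing the script looks like.
out=$("$dir/foo.sh" 2>&1)
printf '%s\n' "$out"

rm -rf "$dir"
```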
In bash, this fallback mechanism is implemented by calling fork but not execve. The child bash process clears its internal state by itself and opens the new script file to run it. Therefore the process that runs the script is still using the original bash code image and the command line arguments passed when you originally invoked bash. AT&T ksh behaves in the same way.
% bash --norc
bash-4.3$ ./foo.sh
  PID TTY      STAT   TIME COMMAND
21913 pts/2    S+     0:00 bash --norc
Dash, in contrast, reacts to ENOEXEC by calling /bin/sh with the path to the script passed as an argument. In other words, when you execute a shebangless script from dash, it behaves as if the script had a shebang line with #!/bin/sh. Mksh and zsh behave in the same way.
% dash
$ ./foo.sh
  PID TTY      STAT   TIME COMMAND
21427 pts/2    S+     0:00 /bin/sh ./foo.sh
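To compare the shells installed on your own machine, something like the following loop may help. It's only a sketch: the list of shell names is my guess at what might be installed, the tried counter is bookkeeping for the illustration, and the results can differ between shell versions.

```shell
dir=$(mktemp -d)

# A shebangless script that reports the command line of the process running it.
printf 'ps -p $$ -o args=\n' > "$dir/foo.sh"
chmod +x "$dir/foo.sh"

tried=0
for shell in bash dash ksh mksh zsh; do
    # Skip shells that are not installed on this machine.
    command -v "$shell" >/dev/null 2>&1 || continue
    tried=$((tried + 1))
    printf '%s: ' "$shell"
    # The trailing ":" stops the shell from exec'ing the script as its last command.
    "$shell" -c '"$1"; :' sh "$dir/foo.sh"
done

rm -rf "$dir"
```

A shell that interprets the script in a forked copy of itself shows its own command line, while a shell that hands the script to /bin/sh shows /bin/sh followed by the script path.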