What happens when I give the shell a "command" versus a script?

Question

A variant of this question has been asked a few times (here and here, for instance) but I'm afraid that the answers either didn't fully capture my question here or else perhaps assumed more than I know.

I will ask my question by example but, roughly, what I am trying to understand is (1) how the shell discerns between executables and scripts and, if it does, (2) if there are any differences in terms of what happens next when it does.

Suppose in my working directory is a shell script script and an executable (I am taking "executable" to mean "machine code" in binary, but perhaps that's not right) exe. Let's suppose I'm interacting with the bash shell, and let's suppose script is such that both bash and tcsh can run it. Further, assume that a priori the first line does not begin with a shebang #!....

1. Suppose I enter script at the command line prompt. How does bash figure out that this is a script and not an executable, and what does it do once it figures it out? (I think the answer is "create a new process which is, in particular, a new shell (i.e. a subshell) which is bash in this case (because I have no shebang) and then execute the commands in the script therein", but I am not sure.)
2. Now suppose I enter exe at the command line prompt. How does bash figure out that this is an executable and not a script, and what does it do once it figures it out? (I think the answer is "create a new process and wait for the executable to finish", but I am not sure.)
3. Now suppose I modify script to include #! /bin/tcsh in the first line. How does bash figure out that this is a script (does the shebang change anything from the answer to (1)?) and not an executable, and what does it do once it figures it out? (I think the answer is "create a new process which is, in particular, a new shell (i.e. a subshell) which is tcsh in this case (because of the shebang) and then execute the commands in the script therein", but I am not sure.)

Stéphane Chazelas · Accepted Answer · 2024-02-02T21:05:24.610

Before all, it's the system that executes commands.

With this shell code:

cmd and its args

The shell looks up cmd in aliases, functions, builtins and executable files (regular files which you have permission to execute) in directories listed in $PATH (not the current working directory unless it happens to be in $PATH which would be bad practice).
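To make the lookup concrete, here is a small sketch you can paste into a POSIX-style shell. The directory and the command name hello_cmd_demo are made up for the illustration, and it assumes mktemp -d is available and that the temporary directory allows executing files.

```shell
# Hypothetical command name and throwaway directory, for illustration only.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "from file"\n' > "$demo_dir/hello_cmd_demo"
chmod +x "$demo_dir/hello_cmd_demo"

# The directory is not in $PATH yet, so the lookup finds nothing.
before=$(command -v hello_cmd_demo || echo "not found")

# Once the directory is listed in $PATH, the executable file is found and run.
PATH="$demo_dir:$PATH"
from_path=$(hello_cmd_demo)

# A function with the same name is found before the file in $PATH.
hello_cmd_demo() { echo "from function"; }
from_function=$(hello_cmd_demo)
unset -f hello_cmd_demo

echo "$before / $from_path / $from_function"
```

Only the middle call ends up in execve(); the function call is handled entirely inside the shell.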

If the latter, it calls the execve() system call usually in a child process with 3 arguments:

The path to the file (/path/to/cmd)
A list of arguments (["cmd", "and", "its", "args", 0])
A list of var=value strings for each of the exported variables.

execve() then handles the file based on its type:

if it's an ELF binary executable, it will load/map it (or sections thereof) in memory along with possibly a dynamic linker which will take care of loading more shared libraries... and start running it.
if it starts with #!, it (not the shell) will interpret the rest of the line as another file to execute and will pass the path of the file as an extra argument to that command. (if it's #! foo bar, it will do the equivalent of execve("foo", ["foo" or "cmd", "bar", "/path/to/cmd", "and", "its", "args"], env)).
systems may support a number of different native executable file formats instead of or in addition to ELF, and some can also be configured to associate interpreters to patterns matching the start of the file (see binfmt_misc on Linux).
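One way to see the #! argument rewriting for yourself is to use an "interpreter" that just prints its arguments. This is only a sketch: it assumes /bin/echo exists on your system, and the file name cmd and the word extra are arbitrary.

```shell
demo_dir=$(mktemp -d)

# The "interpreter" is /bin/echo with one shebang argument, "extra".
# The second line is never interpreted by anything; echo does not read the file.
printf '#!/bin/echo extra\nthis line is never interpreted\n' > "$demo_dir/cmd"
chmod +x "$demo_dir/cmd"

# The system ends up running: /bin/echo extra /path/to/cmd and its args
out=$("$demo_dir/cmd" and its args)
echo "$out"
```

The output is the shebang argument, then the path of the script, then the original arguments, which matches the execve("foo", [..., "bar", "/path/to/cmd", "and", "its", "args"], env) rewrite described above.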

The process memory will have been mostly wiped in the process and execve() never returns as the process is now running the code in the executable.
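The "never returns" part can be observed from the shell with the exec builtin, which asks the shell to call execve() in its own process rather than in a child. A minimal sketch, assuming an sh and an echo executable are available in $PATH:

```shell
# The inner sh is replaced by echo, so the second command is never reached.
out=$(sh -c 'exec echo replaced; echo "never printed"')
echo "$out"
```

Only "replaced" is printed: once the execve() succeeds, there is no sh process left to run the rest of the command line.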

If the file format is not recognised, execve() will return -1 (indicating failure) with ENOEXEC as the error code.

In that case, POSIX shells, but also things like the execlp() C function (not syscall) or env/find -exec... (or generally all things that are meant to execute commands in the POSIX toolchest) are required to treat that as a sh script (though possibly after having checked it looks like one using some heuristics to avoid running sh on some random stuff).

Most shells do that by executing sh on it as if there was a #! /path/to/the/standard/sh - shebang. Some which are POSIX compliant sh implementations or have a POSIX sh mode do it by interpreting the files themselves in a child process.
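As a quick sketch of that fallback, assuming you run it from a POSIX-style shell such as bash or dash (the file name script is arbitrary):

```shell
demo_dir=$(mktemp -d)

# No #! line: execve() on this file fails with ENOEXEC.
printf 'echo "args: $*"\n' > "$demo_dir/script"
chmod +x "$demo_dir/script"

# The shell recovers from the failure and has the file interpreted as sh code,
# either by running sh on it or by interpreting it itself in the child process.
out=$("$demo_dir/script" one two)
echo "$out"
```

Which of the two strategies is used depends on the shell, but the visible result is the same: the file is treated as a sh script and receives the arguments.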

So for your three cases:

shebang-less script, run with ./script rather than script which would likely run /usr/bin/script instead: the shell runs execve("./script", ["./script", 0], env) in a child process, and execve() fails with ENOEXEC because it's not in a known format, so the shell will (optionally) look at it, see it looks like it could be in sh syntax and run execve("/bin/sh", ["sh", "-", "./script"]) (with variations on the argv[0], or -) or interpret it itself (and the parent shell process waits for its termination).
./exe: the shell does execve("./exe", ["./exe", 0], env) in a child process which succeeds and the parent waits for its termination.
./script with a #! /bin/tcsh shebang: the shell does execve("./script", ["./script", 0], env) which also succeeds as the system recognises it as a script. execve() will in turn run /bin/tcsh with ./script as argument. And the parent shell waits for its child as usual.

Thank you so much for this answer (and for your comments on my other question). A couple follow-ups if possible: 1) To confirm, the shell looks up the command in precisely the order you gave in the first paragraph, right? Perhaps I don't understand the difference between child process and subshell though 2) You mention that the execution of the command occurs "usually in a child process". Does that mean that the shell first asks the kernel to create a subshell, whereafter the parent shell passes the requisite info to said subshell which then calls execve() with the info you mentioned? — EE18, Feb 04 '24 at 21:43
@EE18, first, be aware that Stéphane is quite an expert, and he will use precise terminology. "Subshell" doesn't apply when it's the operating system creating processes to launch a program. "Subshell" is a shell-specific term: for bash see Command Execution Environment — glenn jackman, Feb 05 '24 at 04:14
@EE18 about order in your 1), that's a bit of an approximation, alias and keywords are handled at an early stage of syntax parsing, it's probably best not to call them command lookup. In some shells, special builtins are looked up before functions... Like Glenn says, subshell is a functional shell concept. Forking a child process is how shells can (and often do but don't have to) implement it. But how shells fork processes to do what they have to do is a detail of implementation, it's too much of an approximation to conflate subshell and child process. — Stéphane Chazelas, Feb 05 '24 at 07:03
Understood, mostly. Thank you very much to both of you, @glennjackman and Stephane! — EE18, Feb 05 '24 at 15:10

Kaz · Answer 2 · 2024-02-02T21:00:43.010

When you give the shell a command like

alpha beta gamma

after the command line is put through all the expansions, the shell has to determine what kind of command alpha is. If it is not an alias, a defined function or a builtin, it has to be an external command. In that case, it will search for that program in the directories indicated by PATH and if it finds it, it will execute it using the operating system (fork + exec).

A script doesn't have to begin with a #! line. If a file is executable, and doesn't have a header indicating it to be of a particular type, it will be processed with the shell.

How that works in GNU/Linux is that the execve system call fails on that file: it has no recognized format. The library transparently re-tries another execve, using the shell. Here is a strace snapshot of that happening:

32056 execve("./command", ["./command"], 0xbfb71504 /* 27 vars */) = -1 ENOEXEC (Exec format error)
32056 execve("/bin/sh", ["/bin/sh", "./command"], 0xbfb71504 /* 27 vars */) = 0

Here, ./command is a file that contains echo foo as its one and only line. It has execute permissions.

The program being traced made what it thinks is a single call to the execvp function. That function internally invokes the execve kernel system call. The first call to it fails to execute the script directly, so execvp tries again, this time with /bin/sh executable as argument zero.

This is actually not just a Linux thing; it is required by POSIX, which says:

One common historical implementation is that the execl(), execv(), execle(), and execve() functions return an [ENOEXEC] error for any file not recognizable as executable, including a shell script. When the execlp() and execvp() functions encounter such a file, they assume the file to be a shell script and invoke a known command interpreter to interpret such files. This is now required by POSIX.

Dispatch of shell scripts that don't have a #! line is thus transparent to applications which use the p-suffixed (PATH-searching) exec functions. Your shell is either doing that, or else implements similar logic itself. (That is to say, it's possible the shell is implementing its own PATH search and then using execve, and recovering similarly from its failure.)
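You can try the library-level fallback with a program that is not a shell. The sketch below uses env and assumes your env is implemented with execvp() or similar. On a system whose C library does not implement the fallback, the exec fails instead, so both outcomes are handled.

```shell
demo_dir=$(mktemp -d)

# Same kind of file as in the strace example: one line, no #!, executable.
printf 'echo foo\n' > "$demo_dir/command"
chmod +x "$demo_dir/command"

# env is not a shell, so any recovery from ENOEXEC happens below it.
if out=$(env "$demo_dir/command" 2>/dev/null); then
    result="ran: $out"
else
    result="exec failed"
fi
echo "$result"
```

If the fallback is in place, the script runs and prints foo even though no shell was involved in launching it.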
