Neither execve nor any other kernel code ever calls the _start function (the entry point of an executable, whatever it's called).
That's because they run in different contexts; think of them as running on different machines.
What happens is that the kernel arranges for the execve system call, upon returning to user mode, to have the IP (instruction pointer) register set to point to the beginning of the _start function, and the SP (stack pointer) register set to point to argc, which is immediately followed by the argv and env pointer lists, so the effect from the point of view of user mode is as if someone had called the _start function as:
_start(argc, argv0, argv1, ..., NULL, env0, env1, ..., NULL)
in a calling convention where all arguments are passed on the stack.
Of course, before that, the kernel has taken care of copying the argv + env strings to the right place, mapping the segment containing the _start function, etc.
Notice that the argv + env strings are all packed together in a single chunk, e.g.
"prog\0arg1\0arg2\0VAR1=foo\0VAR2=bar\0"
The virtual addresses where the argv part of that chunk begins and ends are accessible via the /proc/PID/stat file; quoting from the procfs(5) manpage:
(48) arg_start %lu (since Linux 3.5) [PT]
Address above which program command-line arguments
(argv) are placed.
(49) arg_end %lu (since Linux 3.5) [PT]
Address below program command-line arguments (argv)
are placed.
Writing to that address range modifies what appears in the ps output:
$ sleep 3600 3600 3600 3600 3600 3600 3600 &
[2] 4927
$ awk '{print $48,$49,$49-$48-1}' /proc/4927/stat
140735402952841 140735402952882 40
$ printf 'Somebody set up us the bomb Main screen turn on\0' | dd bs=1 count=40 of=/proc/4927/mem seek=140735402952841 conv=notrunc
40+0 records in
40+0 records out
40 bytes copied, 0.000229779 s, 174 kB/s
$ ps 4927
PID TTY STAT TIME COMMAND
4927 pts/4 S 0:00 Somebody set up us the bomb Main screen
execve() has no idea what you are speaking about. That a C language program begins execution (as far as the programmer is concerned) with a call to main() is a feature of the C language. Other languages have other conventions. It's the job of the linker to arrange it so that the startup code for the C language runtime calls main(). All execve() does is load the image and start running the process at the actual entry point of the executable, as specified by the linker which created the executable image. – AlexP Feb 02 '19 at 17:12
[...] retval = exec_binprm(bprm); (line 1819). – AlexP Feb 02 '19 at 17:18
[...] execve(), exec_binprm is a very high level function, which loads the executable file and dynamic linker among other things. That doesn't answer my question of how execve() then calls the startup routine crt0 and then main() of the program from the executable file. – Tim Feb 02 '19 at 19:08
[...] e_entry in the ELF header. – AlexP Feb 02 '19 at 20:00
[...] e_entry store the address of main() of the executable, or the startup routine crt0? – Tim Feb 02 '19 at 21:43
[...] e_entry to the address of the first machine instruction to be executed. Where this machine instruction comes from depends on the programming language and runtime library. For programs written in a higher-level language, it is an entry point in the runtime library which is responsible for setting things up and calling the main program according to the conventions of that language. I confess that I have never been curious to find out what's the name of the library routine which calls main() for C language programs. – AlexP Feb 02 '19 at 23:33
[...] __libc_start_main(). You can override it from a preloaded library and exec another binary instead. But that's not the entry point -- it's itself called from the _start function, which is the default entry point. – Feb 03 '19 at 05:55