
With a program like

int main()
{
   return 0;
}
  • and you statically link, will some library on your system be linked into the final binary?
  • and you dynamically link, will a library be loaded when it's run?

In essence, is a library always required for even the simplest programs, and if so, why? I ask because I thought the canonical entry point for anything that wants to be executed is actually _start (which I thought was in a library, namely glibc). Maybe I don't understand what _start really does with regard to setting things up, so any pointers there would be helpful too.

  • Existence proof: the Linux kernel is a C program that does not load a library. – Jörg W Mittag Dec 14 '19 at 07:53
  • Note that a static library is in principle just a collection of object files. Instead of linking with a library (and leaving it to the linker to extract the functions you need from the library) you can simply link against the object files directly. Since object files are simply compiled or assembled source files, you can also compile (or assemble) the files yourself first if you have the sources. You don't need any binaries. (I'm not perfectly sure whether that's different with the runtime lib, because it interacts with the operating system at a low level -- I'm curious to hear comments). – Peter - Reinstate Monica Dec 14 '19 at 11:44

1 Answer


If you want to write your program in standard portable C, you of course need some runtime that calls the main() function in the first place.

But if you don't care about that, you can dispense with any library and make system calls directly via inline assembly. E.g. for x86-64:

$ cat q.c
#include <sys/syscall.h>
void _start(void){
        __asm__( "syscall" : : "D"(0), "a"(SYS_exit) );
}
$ cc -O2 -static -nostdlib -nostartfiles -Wall q.c -o q
$ strace ./q
execve("./q", ["./q"], 0x7fffc72d8d20 /* 39 vars */) = 0
exit(0)                                 = ?
+++ exited with 0 +++

You have to do at least one system call, namely _exit(2), unless exit-by-crashing is acceptable for a "simplest program", in which case an empty file will do, too ;-):

$ > foo.c
$ cc -static -nostdlib -nostartfiles -Wall foo.c -o ./foo
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
$ ./foo
Segmentation fault

I thought the canonical entry point for anything that wants to be executed is actually _start

There's nothing canonical about it; _start is just the default symbol name the linker uses for the entry point. You can point it elsewhere with the -e sym option (-Wl,-e,sym with gcc).

  • That's very useful information, I want to give the correct answer, but could you clarify on my question a little: I'm guessing from your answer there's a runtime.so or .a or something like that which gets linked and/or loaded. – Andrew Cina Dec 13 '19 at 23:18
  • Yes, there is. At least the crt*.o objects will have to be linked in, because they contain the code which ultimately calls main (via a __libc_start_main() wrapper on Linux) –  Dec 13 '19 at 23:32
  • Awesome clear answer. The in line assembly and -nostdlib -nostartfiles were really nice examples to help understand too. – Andrew Cina Dec 13 '19 at 23:39
  • @mosvy Why are the crt*.o files object files and not .a or .so files. I thought, as they typically come from libc, that they would be compiled into that format as other lib files seem to be. – Andrew Cina Dec 14 '19 at 00:05
  • The *.a files are archives which contain *.o files. You can see their content with ar tf. They could've put those in an archive too, but they just didn't. As to the *.so files, they're completely different: they're for dynamically linked executables. No part of their content is stored in the executable, only a reference to them, so that the dynamic linker (ELF interpreter) can load them at run time. –  Dec 14 '19 at 00:15
  • @AndrewCina you can also read here about why that crt* code is necessary instead of the kernel just using main as the entry point, even for statically linked executables (basically: unpacking of the argc + argv and calling static constructors + destructors for c++ code) –  Dec 14 '19 at 00:36
  • I think trying to link the OP's main.o into an executable with -nostdlib or equivalent will reveal which symbols would be linked from e.g. glibc. – Peter - Reinstate Monica Dec 14 '19 at 11:48
  • @Peter no symbols will be linked in that case; the linker will set the entry point to some default value, and the "program" will work just like my 2nd example. But if you compile it with -nodefaultlibs (which will link in the crt*.o ini + fini files from libc but not the rest of the library) you will see that the crt*.o code calls through the __libc_start_main and __libc_csu_init hooks, but you can override those yourself, via either statically or dynamically linked code. –  Dec 14 '19 at 13:00
  • @Peter that is a feature of glibc, also available in musl and uclibc (the latter with a different name __uClibc_main, if they haven't changed it recently). On other systems, the _start glue code will call directly into main(). –  Dec 14 '19 at 13:13
  • @mosvy I found that you can actually have an entry-point at main and have it execute successfully so long as a call to exit is made. – Andrew Cina Dec 18 '19 at 22:04
  • @AndrewCina that's exactly what I've told you: the name doesn't matter. You can set the entry to whatever you like, e.g. change _start to main and compile it with -Wl,-e,main. –  Dec 18 '19 at 22:56
  • @mosvy, yes you did. I confused myself regarding "portable c" in your first comment. Does having to use -e if main should be your entry point make it not "portable"? Also, if crt.o files are merely object files, even though they are generated from libc, can't you argue they are not library files? Finally, if crt.o is linked against your program, it must always be statically linked, no? And would the answer to my original question, specifically regarding dynamically linking, mean that no other libraries are actually loaded in that example? – Andrew Cina Dec 18 '19 at 23:53
  • @AndrewCina Yes. Assuming anything about how main is called and how its return value is handled is completely unportable. Even the modern Unices are very different (e.g. the __libc_start_main trick doesn't work on *BSD or Solaris). –  Dec 18 '19 at 23:59
  • *.o files can only be linked statically. Compiling a simple standard C example like int main(){} on Linux will statically link in the crt*.o files, and either link dynamically to the standard C library (libc.so.*) or pull in quite a lot of stuff from it (just compile it with cc -static -xc - <<<'int main(){}' -o a.out and analyze the executable with objdump -xd a.out). FWIW, even a dynamically linked executable will statically link in stuff from the elf-init.o file from libc.a. –  Dec 19 '19 at 00:22