
Computer Systems: A Programmer's Perspective (CSAPP, 3rd edition) says on p. 733:

7.9 Loading Executable Object Files

To run an executable object file prog, we can type its name to the Linux shell’s command line:

linux> ./prog

Since prog does not correspond to a built-in shell command, the shell assumes that prog is an executable object file, which it runs for us by invoking some memory-resident operating system code known as the loader. Any Linux program can invoke the loader by calling the execve function, which we will describe in detail in Section 8.4.6

and on p. 736, on dynamic linking:

7.10 Dynamic Linking with Shared Libraries

Once we have created the library, we would then link it into our example program in Figure 7.7:

linux> gcc -o prog2l main2.c ./libvector.so

This creates an executable object file prog2l in a form that can be linked with libvector.so at run time. The basic idea is to do some of the linking statically when the executable file is created, and then complete the linking process dynamically when the program is loaded. It is important to realize that none of the code or data sections from libvector.so are actually copied into the executable prog2l at this point. Instead, the linker copies some relocation and symbol table information that will allow references to code and data in libvector.so to be resolved at load time.

When the loader loads and runs the executable prog2l, it loads the partially linked executable prog2l, using the techniques discussed in Section 7.9. Next, it notices that prog2l contains a .interp section, which contains the path name of the dynamic linker, which is itself a shared object (e.g., ld-linux.so on Linux systems). Instead of passing control to the application, as it would normally do, the loader loads and runs the dynamic linker. The dynamic linker then finishes the linking task by performing the following relocations:

  • Relocating the text and data of libc.so into some memory segment
  • Relocating the text and data of libvector.so into another memory segment
  • Relocating any references in prog2l to symbols defined by libc.so and libvector.so

The above dynamic linking case is the "static loading, dynamic linking" case in Stephen Kitt's reply:

static loading, dynamic linking: the linker is /usr/bin/ld again, but with shared libraries (.so); the loader is the binary’s interpreter, e.g. /lib64/ld-linux-x86-64.so.2 (which maps to /lib/x86_64-linux-gnu/ld-2.24.so currently) on Debian 9 on 64-bit x86, itself loaded by the kernel, which also loads the main executable;

The difference is that CSAPP seems to say that the loader is (the kernel code behind) execve() and the linker is ld-linux.so. As I read it, ld does no linking at build time, and the actual linking is done by ld-linux.so at load time.

What is the linker and what is the loader in dynamic linking?

Thanks.

alexh
Tim
  • Why was this downvoted 4 times? The only issue I see is that CSAPP wasn't properly introduced as an acronym. Otherwise, this seems like a totally valid question for this SE site. – the_endian Nov 23 '22 at 20:29

2 Answers


The highlighting here:

The dynamic linker then finishes the linking task

misses the important word, “finishes”. The linker, ld, starts the linking task, performing as much as possible at build-time and preparing the data structures required to finish it.

The loader can then finish the linking task as it loads the program and the libraries it needs: matching up symbols, and performing the necessary relocation.

In CSAPP’s terminology, the dynamic loader is the in-kernel ELF loader, and the dynamic linker is ld-linux.so.

The GNU C library’s own documentation refers to ld.so as the dynamic linker/loader. ld.so (or ld-linux.so) performs quite a lot of loading itself, as well as linking, so applying both terms to it is accurate — the kernel loads the executable itself and its interpreter (the dynamic linker/loader), and the interpreter loads all the other required libraries.
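To make the "interpreter" part concrete, here is a minimal sketch (not how the kernel does it internally) that reads the interpreter path out of an ELF file the way a loader would find it: by looking for the PT_INTERP program header, which points at the contents of the .interp section. The helper name find_interp is made up for this example, and the sketch assumes a 64-bit little-endian ELF file; anything else is rejected.

```python
import struct
import sys

PT_INTERP = 3  # program header type that names the interpreter


def find_interp(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch handles only 64-bit little-endian ELF")
    # ELF64 header: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            # ELF64 program header: p_offset at +8, p_filesz at +32.
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # no interpreter: typically a statically linked binary


if __name__ == "__main__":
    # Inspect the running Python binary; skip politely on non-ELF platforms.
    with open(sys.executable, "rb") as f:
        try:
            print(find_interp(f.read()))
        except ValueError as exc:
            print("skipped:", exc)
```

On a typical 64-bit x86 Linux system this should print a path like the /lib64/ld-linux-x86-64.so.2 mentioned in the question, but the exact path depends on the distribution and architecture.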

See How programs get run: ELF binaries for details of how this all works on Linux systems.

Stephen Kitt
  • Does the manpage of ld.so say that ld.so is both the loader and linker of a shared library, during dynamic linking? – Tim Sep 27 '20 at 23:35
  • Does your reply https://unix.stackexchange.com/a/476783/674 need some change? – Tim Sep 27 '20 at 23:42
  • ld.so and ld-linux.so are the same thing. The manpage says that ld.so is both the loader and linker of dynamically-linked executables. I don’t think my existing answer needs to be changed just because CSAPP uses different terminology. – Stephen Kitt Sep 28 '20 at 04:37
  • "In CSAPP’s terminology, the dynamic loader is the in-kernel ELF loader" (I think you meant static loader by "dynamic loader" https://unix.stackexchange.com/a/476783), and "The GNU C library’s own documentation refers to ld.so as the dynamic linker/loader." In dynamic linking (with static loading), is the loader both the in-kernel ELF loader and ld.so? Are they the same thing? What is the in-kernel ELF loader? – Tim Sep 28 '20 at 09:48
  • Does execve() perform static loading of shared libraries (needed by the executable which is its first argument), by invoking ld-linux.so? – Tim Sep 28 '20 at 09:56
  • Yes, static loader in Jeff Darcy’s classification, sorry (it gets confusing for everyone). The kernel loader is binfmt_elf.c in Linux. It can load statically-linked binaries on its own, but for dynamically-linked binaries it needs ld-linux.so’s help. execve doesn’t perform any static loading of libraries, it delegates all that to the dynamic linker. – Stephen Kitt Sep 28 '20 at 11:44

The dynamic linker is called ld.so. You can find its configuration under /etc/ld.so*. Most of the configuration has to do with the paths where .so files are searched for.

ld has to make sure that all the referenced functions exist in the shared libraries before it finishes linking an executable. (Technically it is not required to do so, but it does. That being said, if you copy an executable from one computer to another, the .so files there could be different, with new functions added and old ones removed, and the binary may then fail to run.)

When the linker (ld) creates a binary which expects a shared library, it keeps the needed symbols in the binary's dynamic symbol table (.dynsym), along with relocation entries recording where their addresses must be filled in. That's what the dynamic linker (ld.so) uses to finish the linking at run time. It searches for the symbols in the corresponding .so files and saves their addresses wherever required (for functions, this is usually a table of jump slots, so that each function only has to be resolved once).

And of course, when you strip a binary, those special symbols do not get removed.

To see the list of libraries that will be loaded, you can run ldd against your executable. In particular, it shows the full path of each .so selected to resolve the symbols. You can change the search path with the LD_LIBRARY_PATH environment variable, which lets you test against different or uninstalled .so files. (Build systems such as cmake and automake also use the linker's -rpath option, which has a similar effect, except that the path is saved directly in the binary.)
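As a complement to ldd, a running process on Linux can check what actually got mapped by reading /proc/self/maps. The following is only a sketch: mapped_objects is a made-up helper name, and it assumes the usual six-column layout of the maps file (address range, permissions, offset, device, inode, path). The output does not say who mapped each file, it only lists them.

```python
import os


def mapped_objects(maps_text):
    """Return the distinct file paths found in /proc/<pid>/maps text, in order."""
    seen = []
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            # Skip pseudo-entries such as [stack] or [heap].
            if path.startswith("/") and path not in seen:
                seen.append(path)
    return seen


if __name__ == "__main__":
    maps_file = "/proc/self/maps"
    if os.path.exists(maps_file):
        with open(maps_file) as f:
            for path in mapped_objects(f.read()):
                print(path)
    else:
        print("no /proc/self/maps on this system")
```

For a dynamically linked program the list normally includes the executable itself, the dynamic linker (ld-linux.so or similar) and each shared library in use.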

Which part exactly loads the files and kickstarts them, I'm not exactly sure. execve() probably does not implement all of that logic directly, but it is certainly close to the lowest-level function that knows how to run an executable.
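For illustration, here is roughly what a shell does to run a program, sketched in Python on a POSIX system: fork, then call execve() in the child. The helper name run_with_execve is made up for this example. Everything that happens after the execve() call (mapping the executable, starting its interpreter if it has one) is outside the calling program's control.

```python
import os
import sys


def run_with_execve(path, argv, env):
    """Fork, replace the child with `path` via execve, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: on success execve never returns; on failure Python raises OSError.
        try:
            os.execve(path, argv, env)
        except OSError:
            os._exit(127)
    # Parent: wait for the child and decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # child was killed by a signal


if __name__ == "__main__":
    # Run a second Python interpreter that does nothing and exits.
    code = run_with_execve(
        sys.executable, [sys.executable, "-c", "pass"], dict(os.environ)
    )
    print("child exit code:", code)
```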

So in effect, dynamic linking is an extremely simple process compared to the standard linker:

  1. load the .so
  2. search the .so for symbols
  3. save symbol address

Done. This is why it's so fast.

Note: Further dynamism can be achieved using dlopen(), but it sounds like you're not talking about that part.
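For completeness, the same three steps (load the .so, look up a symbol, keep its address) can be seen in miniature with explicit run-time loading. This is a sketch in Python, whose ctypes module goes through dlopen() and dlsym() on POSIX systems; it assumes the C math library can be located, and call_cos_via_dlopen is a made-up name for the example.

```python
import ctypes
import ctypes.util


def call_cos_via_dlopen(x):
    """Load the C math library at run time and call its cos() function."""
    # Step 1: load the .so. find_library returns a name such as "libm.so.6";
    # if it returns None, CDLL(None) falls back to the symbols already
    # loaded into this process.
    lib = ctypes.CDLL(ctypes.util.find_library("m"))
    # Step 2: search the library for the symbol (a dlsym lookup).
    cos = lib.cos
    # Step 3: the address is now held in `cos`; declare the C signature
    # double cos(double) so the call passes and returns doubles.
    cos.argtypes = [ctypes.c_double]
    cos.restype = ctypes.c_double
    return cos(x)


if __name__ == "__main__":
    print(call_cos_via_dlopen(0.0))
```

The difference from ordinary dynamic linking is only in who drives it: here the program asks for the library by name at a time of its choosing, whereas ld.so does the equivalent work automatically at startup for every library the executable lists as needed.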

Alexis Wilke
  • If I am correct, ldd is to report what shared libraries a partially linked executable file needs. It is not a linker. – Tim Sep 27 '20 at 23:36
  • ldd uses the dynamic linker, ld.so (or ld-linux.so), but it isn’t itself the dynamic linker. – Stephen Kitt Sep 28 '20 at 05:45
  • @Tim & Stephen, I've edited my post with the correction. I've applied a few other corrections too. – Alexis Wilke Sep 28 '20 at 19:13