2

How do functions like getenv(3) access the environment when my program doesn't have any references to the environment?

Rui F Ribeiro
  • I was about to tell you "system call", but then I looked it up and found this: https://stackoverflow.com/questions/46623018/does-a-program-make-a-system-call-to-get-the-value-of-an-environment-variable-in – Philip Couling Jan 26 '19 at 22:55
  • 1
    @couling that Stack Overflow answer is wrong; the 3rd argument to main() (unlike the environ global symbol) is non-standard –  Jan 26 '19 at 23:36
  • @mosvy careful with the phrase "non-standard". There are many standards, maybe specify which standards it's not defined in? – Philip Couling Jan 26 '19 at 23:47
  • @couling POSIX (extra junk here for the stupid min length) –  Jan 26 '19 at 23:50
  • 1
    @mosvy now that I read your other comments and re-read the answer it's NOT wrong. Of course, main is just something invented by your compiler's runtime support libraries. As far as the OS is concerned, the interface is somewhat different. The same principle still stands, however. The environment is passed to the newly created program after execve on its stack. – Philip Couling Jan 27 '19 at 01:05
  • @couling it's still wrong; that's not how they're passed to the program's entry point (as pointers to 2 NULL-terminated lists), but as separate pointers to strings (i.e. argv0, argv1, argv2, ... NULL, env0, env1, ... NULL); I'll try to edit the answer into something less misleading, in the meantime you can have a look here –  Jan 27 '19 at 01:27
  • @mosvy the other answer doesn't mention any structure for the way it's passed to the program entry point. In fact it explicitly avoids inferring this with the phrase "the interface is somewhat different". – Philip Couling Jan 27 '19 at 11:20
  • Don't forget that the environment is inherited. Change it in a shell and subsequent commands receive it. If a program changes what is in the environment data, it can pass it to child processes with those changes. – Roger L. Feb 04 '19 at 00:34

1 Answer

13

Your program doesn't have a reference to the environment, but a whole copy of it.

The command line arguments and environment strings (as they were passed to the execve(2) system call) are all packed together and copied into the address space of the process [1].

In a typical implementation [2], two NULL-terminated lists of pointers to them (representing the argument list and the environment) are made available on the stack to the entry point of the program (_start), where the startup code (run before main()) will point the char **environ global variable to the beginning of the latter.
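The layout described above can be sketched in C. This is only an illustration, assuming the glibc-style arrangement from footnote [2], where the environment pointers start right after the NULL that terminates argv; the helper names env_after_argv and environ_follows_argv are made up for this example.

```c
#include <stddef.h>

extern char **environ;   /* set by the startup code before main() runs */

/* Hypothetical helper: given argc and argv as they sit on the initial
   stack, return where the environment pointer list would begin, i.e.
   one slot past argv's terminating NULL. */
char **env_after_argv(int argc, char **argv)
{
    return argv + argc + 1;
}

/* Hypothetical helper: returns 1 if environ still points at the list
   that directly follows argv, 0 otherwise (for instance after the
   list has been moved by setenv(3)). */
int environ_follows_argv(int argc, char **argv)
{
    return environ == env_after_argv(argc, argv);
}
```

Calling environ_follows_argv(argc, argv) at the very top of main() should return 1 on a glibc/Linux system, though other C libraries or platforms may arrange things differently.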

The getenv(3) function simply walks that environ list, comparing the name in each NAME=value entry in turn.
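A minimal sketch of that lookup, assuming nothing beyond the environ variable itself. The name my_getenv is hypothetical, and real implementations differ in details such as locking:

```c
#include <stddef.h>
#include <string.h>

extern char **environ;   /* NULL-terminated list of "NAME=value" strings */

/* Hypothetical stand-in for getenv(3): scan environ for an entry whose
   name matches exactly and return a pointer to the text after '='. */
char *my_getenv(const char *name)
{
    size_t len = strlen(name);

    if (environ == NULL || len == 0)
        return NULL;

    for (char **ep = environ; *ep != NULL; ep++) {
        /* Match the name, then require '=' so "FOO" does not match "FOOBAR=". */
        if (strncmp(*ep, name, len) == 0 && (*ep)[len] == '=')
            return *ep + len + 1;
    }
    return NULL;
}
```

Note that the returned pointer points into the environment string itself rather than to a copy, which is why no system call is involved.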

When a new entry has to be added to the environment (as with setenv(3)), the environ list is relocated elsewhere, since the original array has no room to grow.
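One way to observe the relocation is to compare environ before and after a setenv(3) call. This is a sketch only: setenv_moves_environ is a made-up helper, and whether the pointer actually changes depends on the C library and on whether the variable already existed.

```c
#include <stdlib.h>

extern char **environ;

/* Hypothetical helper: returns 1 if setting the variable moved the
   environ array, 0 if environ stayed where it was, and -1 if
   setenv() itself failed. */
int setenv_moves_environ(const char *name, const char *value)
{
    char **before = environ;

    if (setenv(name, value, 1) != 0)
        return -1;
    return environ != before;
}
```

Calling it with a name that is not yet in the environment is the interesting case, since that is when the list has to grow.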

[1] On Linux, the start addresses of the argument strings and the environment strings are available as the 48th and 50th fields (arg_start and env_start) of /proc/PID/stat; see procfs(5).

[2] In glibc, _start will pop argc, point argv to the top of the stack, and __libc_start_main will set __environ (an alias for environ) to argv + argc + 1.
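The fields mentioned in footnote [1] can be read from a running program. The following is a Linux-only sketch; proc_stat_field is a hypothetical helper, and it assumes the field numbering documented in procfs(5).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fetch field number `field` (1-based, must be >= 3)
   of /proc/self/stat as an unsigned long.
   Returns 0 on success, -1 on any error. */
int proc_stat_field(int field, unsigned long *out)
{
    char buf[4096];
    FILE *fp = fopen("/proc/self/stat", "r");

    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    /* Field 2 (comm) is parenthesised and may contain spaces, so start
       tokenising after the last ')'. The next token is field 3. */
    char *p = strrchr(buf, ')');
    if (p == NULL || field < 3)
        return -1;
    p++;

    int cur = 3;
    for (char *tok = strtok(p, " \n"); tok != NULL;
         tok = strtok(NULL, " \n"), cur++) {
        if (cur == field) {
            *out = strtoul(tok, NULL, 10);
            return 0;
        }
    }
    return -1;
}
```

Passing 48 and 50 yields arg_start and env_start; printing them next to argv[0] and environ[0] with %p is a quick way to compare them on a given system.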

ilkkachu
  • 1
    Could you be clearer on this point: by "known location" do you mean static memory address? If not, how does the OS tell the C library (or similar) what this address is? – Philip Couling Jan 26 '19 at 23:51
  • 1
    On Linux at least, they're put at the end of the stack, so they're accessible via registers in the program's entry point (_start). You can get the actual addresses from the 48th-51st fields (arg_start-env_end) of the /proc/PID/stat file, see procfs(5). –  Jan 27 '19 at 00:41
  • 1
    Could you edit your answer to include this? Putting a pointer on the stack prior to starting the program is substantially different to a "known location". – Philip Couling Jan 27 '19 at 01:01
  • I'd appreciate it if you could consider https://unix.stackexchange.com/questions/498425/what-are-differences-between-do-execve-and-start-copying-the-command-line – Tim Feb 03 '19 at 15:00