70

https://www.centos.org/docs/5/html/5.2/Deployment_Guide/s3-proc-self.html says

The /proc/self/ directory is a link to the currently running process.

There are always multiple processes running concurrently, so which process is "the currently running process"?

Does "the currently running process" have anything to do with which process is currently running on the CPU, considering context switching?

Does "the currently running process" have nothing to do with foreground and background processes?

Tim

6 Answers

84

This has nothing to do with foreground and background processes; it only has to do with the currently running process. When the kernel has to answer the question “What does /proc/self point to?”, it simply picks the currently-scheduled pid, i.e. the currently running process (on the current logical CPU). The effect is that /proc/self always points to the asking program's pid; if you run

ls -l /proc/self

you'll see ls's pid; if you write code which uses /proc/self, that code will see its own pid, and so on.
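As a minimal sketch of the "code sees its own pid" case (Linux only; the function name `self_pid_via_proc` is just an illustration, not anything from the answer), a Python program can read the link and compare it with getpid():

```python
import os


def self_pid_via_proc():
    # readlink() on /proc/self yields the decimal pid of the process
    # that makes the call (requires a Linux procfs mounted at /proc).
    return int(os.readlink("/proc/self"))


if __name__ == "__main__":
    print("readlink(/proc/self):", self_pid_via_proc())
    print("getpid():            ", os.getpid())
```

Both lines should print the same number, because the process asking is the Python interpreter itself.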

Stephen Kitt
  • 20
    This is "accurate" in a sense, but not meaningful to someone who doesn't understand the kernel's concept of "current". A better answer would be that it's the process making the system call with /proc/self as part of the pathname in one of its arguments. – R.. GitHub STOP HELPING ICE Dec 29 '16 at 20:57
  • 1
    @R.. that's what ilkkachu's answer highlights, feel free to upvote that one — I did. – Stephen Kitt Dec 30 '16 at 11:47
  • If self means the current process that is scheduled on the logical CPU, why aren't there multiple self entries on a multi-core system? – Darkov Mar 21 '20 at 18:56
  • 3
    @Darkov in the same way that there’s only one word for “self” in English even though you and I have distinct selves which exist simultaneously. The kernel always knows which process is asking about /proc/self. – Stephen Kitt Mar 22 '20 at 11:57
  • Yeah, I was surprised that ls runs in a separate process. Sounds expensive. Is bash's philosophy to run every command in a separate process? – Ivan_a_bit_Ukrainivan Jul 01 '22 at 09:34
  • 1
    @Ivan_Bereziuk it’s not bash’s philosophy, it’s the philosophy of pretty much all operating systems with multiple processes (including single-tasking OSs like DOS): a process which wants to run another program, and regain control once the second program has finished, needs to run it in a separate process. Process creation is cheap on Linux. – Stephen Kitt Jul 01 '22 at 10:08
44

The one that accesses the symlink (calls readlink() on it, or open() on a path through it). It would be running on the CPU at the time, but that's not relevant. A multiprocessor system could have several processes on the CPU simultaneously.
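To illustrate that the answer depends only on who makes the call, here is a rough sketch (Linux only; `child_view` is a made-up helper name) in which a parent starts a second Python process and both resolve the same path:

```python
import os
import subprocess
import sys


def child_view():
    # Start a separate Python process that reports what /proc/self
    # resolves to from its own point of view.
    code = "import os; print(os.readlink('/proc/self'))"
    proc = subprocess.Popen(
        [sys.executable, "-c", code], stdout=subprocess.PIPE, text=True
    )
    out, _ = proc.communicate()
    # proc.pid is the child's pid as known to the parent.
    return proc.pid, int(out.strip())


if __name__ == "__main__":
    spawned, seen = child_view()
    print("parent resolves /proc/self to", os.readlink("/proc/self"))
    print("child pid", spawned, "resolved /proc/self to", seen)
```

The same pathname gives two different targets, one per caller, regardless of which CPU either process happened to be running on.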

Foreground and background processes are mostly a shell construct, and there's no unique foreground process either, since all shell sessions on the system will have one.

ilkkachu
37

The wording could have been better but then again any wording you try to compose to express the idea of self reference is going to be confusing. The name of the directory is more descriptive in my opinion.

Basically, /proc/self/ represents the process that's reading /proc/self/. So if you try to open /proc/self/ from a C program then it represents that program. If you try to do it from the shell then it is the shell etc.

But what if you have a quad core CPU capable of running 4 processes simultaneously, for real, not multitasking?

Then each process will see a different /proc/self/ for real without being able to see each other's /proc/self/.

How does this work?

Well, /proc/self/ is not really a folder. It is a device driver that happens to expose itself as a folder if you try to access it. This is because it implements the API necessary for folders. The /proc/self/ directory is not the only thing that does this. Consider shared folders mounted from remote servers, mounted USB thumb drives, or Dropbox. They all work by implementing the same set of APIs that make them behave like folders.

When a process tries to access /proc/self/ the device driver will generate its contents dynamically by reading data from that process. So the files in /proc/self/ do not really exist. It's kind of like a mirror that reflects back on the process that tries to look at it.

Is it really a device driver? You sound like you're oversimplifying things!

Yes, it really is. If you want to be pedantic it's a kernel module. But if you check out usenet postings on the various Linux developers channels most kernel developers use "device driver" and "kernel module" interchangeably. I used to write device drivers, err... kernel modules, for Linux. If you want to write your own interface in /proc/, say for example you want a /proc/unix.stackexchange/ filesystem that returns posts from this website you can read about how to do it in the venerable "Linux Device Drivers" book published by O'Reilly. It's even available as softcopy online.

slebetman
  • 8
    /proc/self is not a device driver, but is instead part of a kernel-exposed filesystem called procfs. – Chris Down Dec 28 '16 at 14:58
  • 1
    @ChrisDown: Yes, but it's implemented as a kernel module, which is Linux's version of a device driver. There's even an example implementation of a /proc-based driver in the venerable book "Linux Device Drivers". I should know, I implemented one in college. I probably could have used the term "kernel module" instead, but "device driver" is what most people are familiar with and I don't want to give the misleading impression that there's a significant difference between "kernel module" and "device driver" apart from terminology. – slebetman Dec 28 '16 at 15:52
  • 9
    @slebetman well, procfs isn't a module per se, it can only be built in, never built as a module. If you want to split hairs, the hair to split is that it's a filesystem driver, not a device driver – hobbs Dec 29 '16 at 05:25
  • Surely /proc/self is a symlink, not a folder? It's nothing more than a “virtual” relative symlink that points to the pid of whichever process accesses it, or rather, to whichever process is currently scheduled on the logical core from which it's accessed which comes down to the same. – Zorf Nov 10 '23 at 18:35
  • @Zorf It's not a symlink. A symlink is something completely different. Instead it is a kernel module that implements all the interfaces for folders. In fact it cannot be implemented as a symlink because symlinks cannot point to more than one thing. On the other hand, /proc/self points to the data of the current process. And there are multiple "current" processes running in parallel (unless you're using an old single-core CPU) – slebetman Nov 11 '23 at 13:33
  • That's why I said it's a “virtual” special magical symlink. As far as the filesystem is concerned, it definitely appears as a symlink: “readlink” can be called on it and “ls -ld /proc/self” shows it as a symlink. It's a magical symlink that, to the calling process, always appears to contain the decimal pid of that process as a relative target, but whatever process accesses the path “/proc/self” sees a symlink. – Zorf Nov 12 '23 at 10:52
14

It's whichever process happens to be accessing /proc/self or the files/folders therein.

Try cat /proc/self/cmdline. You will get, surprise surprise, cat /proc/self/cmdline, (actually, instead of a space there will be a null character between the t and the /) because it will be the cat process accessing this pseudofile.

When you do an ls -l /proc/self, you will see the pid of the ls process itself. Or how about ls -l /proc/self/exe; it will point to the ls executable.

Or try this, for a change:

$ cp /proc/self/cmdline /tmp/cmd
$ hexdump -C /tmp/cmd
00000000  63 70 00 2f 70 72 6f 63  2f 73 65 6c 66 2f 63 6d  |cp./proc/self/cm|
00000010  64 6c 69 6e 65 00 2f 74  6d 70 2f 63 6d 64 00     |dline./tmp/cmd.|
0000001f

or even

$ hexdump -C /proc/self/cmdline 
00000000  68 65 78 64 75 6d 70 00  2d 43 00 2f 70 72 6f 63  |hexdump.-C./proc|
00000010  2f 73 65 6c 66 2f 63 6d  64 6c 69 6e 65 00        |/self/cmdline.|
0000001e

As I said, it is whichever process happens to be accessing /proc/self or the files/folders therein.
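The cat example can be checked programmatically. The following is only a sketch (Linux only; it assumes a `cat` binary on the PATH, and `cmdline_of_cat` is a hypothetical helper name): it runs cat on /proc/self/cmdline and splits the NUL-terminated arguments.

```python
import subprocess


def cmdline_of_cat():
    # cat is the process that opens /proc/self/cmdline, so the kernel
    # reports cat's own argument vector, not the caller's.
    raw = subprocess.run(
        ["cat", "/proc/self/cmdline"], capture_output=True, check=True
    ).stdout
    # Each argument is terminated by a NUL byte; drop the empty piece
    # that split() leaves after the final terminator.
    return raw, raw.split(b"\0")[:-1]


if __name__ == "__main__":
    raw, args = cmdline_of_cat()
    print(raw)
    print(args)
```

The raw bytes match the hexdump pattern shown above: the program name, a NUL, the argument, and a trailing NUL.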

3

/proc/self is syntactic sugar. It's a shortcut for concatenating /proc/ and the result of the getpid() syscall (available in bash as the special parameter $$). It can get confusing, though, in shell scripting, as many of the statements invoke other processes, each with its own PID... PIDs that refer, more often than not, to dead processes. Consider:

root@vps01:~# ls -l /proc/self/fd
total 0
lrwx------ 1 root root 64 Jan  1 01:51 0 -> /dev/pts/0
lrwx------ 1 root root 64 Jan  1 01:51 1 -> /dev/pts/0
lrwx------ 1 root root 64 Jan  1 01:51 2 -> /dev/pts/0
lr-x------ 1 root root 64 Jan  1 01:51 3 -> /proc/26562/fd
root@vps01:~# echo $$
593

'/bin/ls' will evaluate the path to the directory, resolving it as /proc/26562, since that's the PID of the process (the newly created /bin/ls process) that reads the contents of the directory. But by the time the next process in the pipeline runs, in the case of shell scripting, or by the time the prompt comes back, in the case of an interactive shell, that path no longer exists and the output refers to a nonexistent process.
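A small sketch of that stale-PID effect (Linux only; `pid_seen_by_child` is an illustrative name, and the check assumes the kernel has not already reused the pid, which is very unlikely in so short a window):

```python
import os
import subprocess
import sys


def pid_seen_by_child():
    # The child resolves /proc/self for itself and prints the result;
    # run() waits for it to exit before returning.
    code = "import os; print(os.readlink('/proc/self'))"
    done = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return int(done.stdout)


if __name__ == "__main__":
    pid = pid_seen_by_child()
    # The child has exited and been reaped, so its /proc entry is gone.
    print("child saw pid", pid, "- still in /proc?", os.path.exists(f"/proc/{pid}"))
```

By the time the parent looks at the reported pid, the directory it named no longer exists, just like the /proc/26562/fd target in the ls output.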

This only applies to external commands, however (ones that are actual executable program files, as opposed to being built into the shell itself). So, you'll get different results if you, say, use filename globbing to obtain a list of the contents of the directory, rather than passing the path name to the external process /bin/ls:

root@vps01:~# ls /proc/self/fd
0  1  2  3
root@vps01:~/specs# echo /proc/self/fd/*
/proc/self/fd/0 /proc/self/fd/1 /proc/self/fd/2 /proc/self/fd/255 /proc/self/fd/3

In the first line, the shell spawned a new process, '/bin/ls', by forking and calling exec(), passing "/proc/self/fd" as argv[1]. '/bin/ls', in turn, opened the directory /proc/self/fd and read, then printed, its contents as it iterated over them.

The second line, however, uses glob() behind the scenes to expand the list of filenames; these are passed as an array of strings to echo. (Usually implemented as an internal command, but there's often also a /bin/echo binary... but that part's actually irrelevant, since echo is only dealing with strings it never feeds to any syscall related to path names.)

Now, consider the following case:

root@vps01:~# cd /proc/self/fd
root@vps01:~# ls
0  1  2  255

Here, the shell, the parent process of /bin/ls, has made a subdirectory of /proc/self its current directory. The /proc/self symlink was resolved once, when the shell called chdir(), so the shell's working directory is really /proc/$$/fd. /bin/ls inherits that working directory, and relative pathnames are evaluated against it. So this time, /bin/ls behaves similarly to echo /proc/$$/fd/*.
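The cd case can be reproduced outside the shell. This is a sketch under the assumption that chdir() resolves the /proc/self symlink at the time of the call and that child processes inherit the working directory (Linux only; `fds_seen_from_cwd` is a hypothetical name, and an `ls` binary on the PATH is assumed):

```python
import os
import subprocess


def fds_seen_from_cwd():
    old = os.getcwd()
    try:
        # The symlink is resolved here, in this process, so the working
        # directory becomes /proc/<this pid>/fd rather than "/proc/self/fd".
        os.chdir("/proc/self/fd")
        resolved = os.getcwd()
        # ls inherits the working directory, so "." still refers to the
        # parent's descriptor directory, not to one belonging to ls.
        listing = subprocess.run(
            ["ls"], capture_output=True, text=True
        ).stdout.split()
    finally:
        os.chdir(old)
    return resolved, listing


if __name__ == "__main__":
    resolved, listing = fds_seen_from_cwd()
    print("cwd after chdir:", resolved)
    print("ls output:", listing)
```

The working directory is restored afterwards so the sketch has no lasting effect on the calling program.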

-2

As the shell invokes programs like ls in separate processes, /proc/self will show up as a symlink to nnnnn, where nnnnn is the process ID of the ls process. As far as I know, commonly used shells have no builtin for reading symlinks, but Perl has:

perl -e 'print "/proc/self link: ",readlink("/proc/self")," - pid $$\n";'

So /proc/self behaves as a symlink, but the procfs filesystem makes it "magically" process-aware.

LHP