While the Stack Overflow question seemed to be enough at first, I understand from your comments why you may still have doubts about this. To me, this is exactly the kind of critical situation that arises when the two UNIX subsystems (processes and files) communicate.
As you may know, UNIX systems are usually divided into two subsystems: the file subsystem, and the process subsystem. Now, unless it is instructed otherwise through a system call, the kernel should not have these two subsystems interact with one another. There is however one exception: the loading of an executable file into a process's text regions. Of course, one may argue that this operation is also triggered by a system call (execve), but this is usually known to be the one case where the process subsystem makes an implicit request to the file subsystem.
Because the process subsystem naturally has no way of handling files (otherwise there would be no point in dividing the whole thing in two), it has to use whatever the file subsystem provides to access files. This also means that the process subsystem is subject to whatever measures the file subsystem takes regarding file modification and deletion. On this point, I would recommend reading Gilles' answer to this U&L question. The rest of my answer is based on this more general one from Gilles.
The first thing that should be noted is that internally, files are only accessible through inodes. If the kernel is given a path, its first step will be to translate it into an inode to be used for all other operations. When a process loads an executable into memory, it does so through its inode, which has been provided by the file subsystem after translation of a path. An inode may be associated with several paths (links), and programs may only delete links. In order to delete a file and its inode, userland must remove all existing links to that inode, and ensure that it is completely unused. When these conditions are met, the kernel will automatically delete the file from disk.
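The relationship between links and inodes can be observed from userland. The following is a minimal sketch, assuming a POSIX system whose temporary directory supports hard links; the file names and the `link_demo` helper are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Give one inode two names, then remove one of the names."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")      # hypothetical file names
        second = os.path.join(d, "second")

        with open(first, "w") as f:
            f.write("payload")

        os.link(first, second)                # second path, same inode
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        links_before = os.stat(first).st_nlink

        os.unlink(first)                      # removes one link, not the file
        links_after = os.stat(second).st_nlink
        with open(second) as f:
            content = f.read()

    return same_inode, links_before, links_after, content


if __name__ == "__main__":
    print(link_demo())
```

Both paths report the same inode number, the link count drops from 2 to 1 after the unlink, and the contents stay readable through the remaining name.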
If you have a look at the replacing executables part of Gilles' answer, you'll see that depending on how you edit/delete the file, the kernel will react/adapt differently, always through a mechanism implemented within the file subsystem.
- If you try strategy one (open/truncate to zero/write or open/write/truncate to new size), you'll see that the kernel won't bother handling your request. You'll get an error 26: Text file busy (ETXTBSY). No consequences whatsoever.
- If you try strategy two, the first step is to delete your executable. However, since it is being used by a process, the file subsystem will kick in and prevent the file (and its inode) from being truly deleted from disk. From this point, the only way to access the old file's content is to do it through its inode, which is what the process subsystem does whenever it needs to load new data into text sections (internally, there is no point in using paths, except when translating them into inodes). Even though you've unlinked the file (removed all its paths), the process can still use it as if you'd done nothing. Creating a new file with the old path doesn't change anything: the new file will be given a completely new inode, which the running process has no knowledge of.
As Gilles' answer puts it, strategies 2 and 3 are safe for executables as well: although running executables (and dynamically loaded libraries) aren't open files in the sense of having a file descriptor, they behave in a very similar way. As long as some program is running the code, the file remains on disk even without a directory entry.
- Strategy three is quite similar since the mv operation is an atomic one. When source and destination are on the same filesystem, mv relies on the rename system call, which POSIX requires to be atomic: any other process looking up the name sees either the old file or the new one, never a missing or half-replaced entry. Again, there is no alteration of the old file's inode: the new file has its own inode, and already-running processes will have no knowledge of it, even though it has taken over one of the old inode's links.
Gilles' answer makes the same point: with strategy 3, the step of moving the new file to the existing name removes the directory entry leading to the old content and creates a directory entry leading to the new content. This is done in one atomic operation, so this strategy has a major advantage: if a process opens the file at any time, it will either see the old content or the new content. There's no risk of getting mixed content or of the file not existing.
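Strategy three can be tried out with ordinary files. This is only a sketch, assuming a POSIX system: an open file object stands in for the already-running process, and the names `prog`, `prog.new` and `rename_demo` are invented for the example.

```python
import os
import tempfile


def rename_demo():
    """Replace a file by renaming a new one over it while it is in use."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "prog")         # hypothetical names
        staging = os.path.join(d, "prog.new")

        with open(target, "w") as f:
            f.write("old")
        with open(staging, "w") as f:
            f.write("new")

        old_inode = os.stat(target).st_ino
        reader = open(target)                    # stand-in for a running process
        os.rename(staging, target)               # rename(2): atomic replacement

        new_inode = os.stat(target).st_ino
        seen_by_reader = reader.read()           # still backed by the old inode
        reader.close()
        with open(target) as f:
            seen_by_newcomer = f.read()          # path now leads to the new inode

    return old_inode != new_inode, seen_by_reader, seen_by_newcomer


if __name__ == "__main__":
    print(rename_demo())
```

The reader that opened the file before the rename still gets "old", while anything that opens the path afterwards gets "new", and the two inode numbers differ.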
Recompiling a file: when using gcc (and the behaviour is probably similar for many other compilers), you are using strategy 2. You can see that by running strace on your compiler's processes:
stat("a.out", {st_mode=S_IFREG|0750, st_size=8511, ...}) = 0
unlink("a.out") = 0
open("a.out", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
chmod("a.out", 0750) = 0
- The compiler detects that the file already exists through the stat and lstat system calls.
- The file is unlinked. Here, while it is no longer accessible through the name a.out, its inode and contents remain on disk for as long as they are being used by already-running processes.
- A new file is created and made executable under the name a.out. This is a brand new inode, and brand new contents, which already-running processes don't care about.
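The same sequence of system calls can be replayed by hand to watch the old inode survive. The sketch below assumes a POSIX system and a local filesystem; a read-only file descriptor plays the role of the running process, and `rebuild_demo` is a made-up helper, not anything gcc provides.

```python
import os
import tempfile


def rebuild_demo():
    """Replay stat / unlink / open(O_CREAT|O_TRUNC) / chmod on a file in use."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "a.out")
        with open(path, "w") as f:
            f.write("old build")
        os.chmod(path, 0o750)

        running = os.open(path, os.O_RDONLY)     # stand-in for a running process

        mode = os.stat(path).st_mode & 0o777     # stat("a.out", ...)
        os.unlink(path)                          # unlink("a.out")
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o666)
        os.write(fd, b"new build")
        os.close(fd)
        os.chmod(path, mode)                     # chmod("a.out", 0750)

        # The old inode has lost its name (link count is typically 0 on
        # local Linux filesystems) but is still readable through the fd.
        links_left = os.fstat(running).st_nlink
        distinct = os.fstat(running).st_ino != os.stat(path).st_ino
        old = os.read(running, 100)
        os.close(running)
        with open(path, "rb") as f:
            new = f.read()

    return links_left, distinct, old, new


if __name__ == "__main__":
    print(rebuild_demo())
```

The descriptor opened before the unlink keeps returning the old contents, while the path a.out now resolves to a different inode holding the new ones.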
Now, when it comes to shared libraries, the same behaviour will apply. As long as a library object is used by a process, it will not be deleted from disk, no matter how you change its links. Whenever something has to be loaded into memory, the kernel will do it through the file's inode, and will therefore ignore the changes you made to its links (such as associating them with new files).
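Libraries are brought into a process through memory mappings, so a plain file mapping gives a rough picture of what happens. The following sketch assumes a POSIX system; `libdemo.so` is just a name for an ordinary data file here, not a real shared object, and `mapping_demo` is a hypothetical helper.

```python
import mmap
import os
import tempfile


def mapping_demo():
    """Map a file, replace it on disk by unlink + recreate, read the mapping."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "libdemo.so")     # hypothetical name, plain data
        with open(path, "wb") as f:
            f.write(b"old library bytes")

        with open(path, "rb") as f:
            # The mapping refers to the inode, not to the path.
            mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

        os.unlink(path)                          # drop the old link
        with open(path, "wb") as f:              # new inode under the same name
            f.write(b"new library bytes")

        from_mapping = mapped[:]                 # pages come from the old inode
        mapped.close()
        with open(path, "rb") as f:
            from_path = f.read()

    return from_mapping, from_path


if __name__ == "__main__":
    print(mapping_demo())
```

The mapping keeps yielding the old bytes even though the path now leads to the new file, which is the behaviour described above for pages that are loaded on demand.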
if they are read-only copies of something already on disc (like an executable, or a shared object file), they just get de-allocated and are reloaded from their source, so I got the impression that if your binary is huge, then if part of your binary goes out of RAM, but is then needed again it is "reloaded from source" - so any changes in the .(s)o file will be reflected during execution. But of course I may have misunderstood - which is why I am asking this more specific question – texasflood Mar 03 '15 at 22:33
No, it only loads the necessary pages into memory. This is demand paging.
So I was actually under the impression that what I asked for cannot be guaranteed. – texasflood Mar 03 '15 at 22:34