6

Please see the following figure showing the memory layout of a process:


When a process calls fork() and a new task_struct is allocated, what happens to the addresses of the process? In other words: imagine there is a single process, so the figure above applies. Now suppose I call fork(). What happens?

Michael Mrozek
Dervin Thunk

4 Answers

3

After fork(), you have two processes running the same program. The kernel can either copy the whole address space up front or use copy-on-write. In the latter case, the text section will normally remain shared by both processes, and writable pages (data, heap, stack) are copied only if one of the processes modifies them.

2

When a process is forked, Linux does a very minimal amount of copying and uses a copy-on-write scheme. Copy-on-write means that as long as both processes (the parent and the child) only read, they read from the exact same pages of physical memory. Once one of them writes to a page, that page is copied and is no longer shared.

The programs themselves do not notice any of this, because the kernel maintains a page table for each process. When the process accesses virtual address 0xbeef, the page table set up by the kernel translates it to an actual location in physical memory. This indirection is necessary because the program stores addresses in variables; after a fork it can't know whether or where the data behind them has been moved, so all those stored addresses have to remain valid.
This is also what enables swap to work. The kernel can take the physical memory holding the data and write it out to disk, but the program will still reference address 0xbeef, and the kernel will fault the data back in and map the address to it again.

So the absolute minimum that the kernel ends up copying is the page table that does this address mapping, and the task structure (which covers open files, process state, pending signals, etc.).

phemmer
1

As each process is assigned its own address space ("virtual memory"), the addresses will probably remain the same, but they map to different physical addresses through the translation table (once a page has been modified). From the process's point of view, nothing happens to the addresses it uses and sees.

njsg
0

The whole address space is cloned. In other words, that picture describes both processes after the fork. After that, the processes diverge from one another as they each change things in different ways.

psusi
  • I'm slow today. So, you mean to say that this memory layout is for all processes, meaning that when you fork, each new process will point to some portion of each segment. For instance, program1 will have address 0x1 in the text segment, while program2 will have address 0xA and so on for the diff segments. One "memory layout" but several processes inside? Is that it? – Dervin Thunk Feb 10 '12 at 18:46
  • @DervinThunk, each process has its own address space that is separate from the others, except for the kernel space. Each one looks roughly like that picture. All processes share the same kernel space. – psusi Feb 10 '12 at 22:11
  • ok, so in the pic it's just one huge process (~3 Gigs). Thanks. – Dervin Thunk Feb 10 '12 at 23:03
  • @DervinThunk, on a 32-bit CPU, the address space is 4 GB. Typically the kernel reserves the upper 1 GB, as the map shows. The rest is specific to the process. – psusi Feb 11 '12 at 02:09
  • This is not entirely correct. The whole address space is not cloned. Linux uses a copy-on-write mechanism so that an absolute minimum of data is duplicated on fork. See the notes section of man 2 fork – phemmer Feb 11 '12 at 02:32
  • @Patrick, it's a matter of perspective really. The virtual address space is cloned, even though physically the pages are not copied until one of the processes modifies them. – psusi Feb 11 '12 at 03:16