
Linux Kernel Subsystems

According to Anatomy of the Linux kernel, the Linux Kernel has five subsystems: Process Management, Memory Management, Network, VFS, and Device.

Two of them are essential: 1. Process Management 2. Memory Management

Do the essential Linux Kernel Subsystems exist to handle Network, VFS, and Device?

What is the purpose of Process Management and Memory Management in the Linux Kernel?

I am also trying to find the purpose of Linux Kernel Subsystems like VFS, Network, and Device, which exist beside other subsystems like Process Management and Memory Management.

Is it logical to have VFS, Network, and Device as subsystems in the Linux Kernel if they are handled by Process Management and Memory Management?

If a user runs a program in user-space, then Process Management and Memory Management come into the picture. Can this program have something associated with VFS, Network, or Device?

How do Process Management and Memory Management fit with the other Linux Kernel Subsystems?

    Could you condense this into a single question? – Kusalananda Jan 19 '19 at 11:58
  • E.g. I don't know where your "essential" comes from. I suppose processes and memory management are essential, but VFS and devices are also 100% essential. – sourcejedi Jan 19 '19 at 13:18
  • I've attached an image that says Process Management as a name of one of the kernel subsystem. – Arshad Nazeer Jan 19 '19 at 15:34
  • Okay. So, all the subsystems of a Linux Kernel are equally important. – Arshad Nazeer Jan 19 '19 at 15:47
  • @ArshadNazeer if you are asking me to confirm your wording, then not quite. You can build Linux without networking support at all, and you will still be able to run some programs. (But less than you might think :-). https://cateee.net/lkddb/web-lkddb/NET.html ). Alternatively, if you want to correct or clarify to your question, then please use the "edit" link underneath it :-). – sourcejedi Jan 19 '19 at 16:20
  • @sourcejedi here is the link: https://www.ibm.com/developerworks/library/l-linux-kernel/ – Arshad Nazeer Jan 19 '19 at 19:07
  • There are plenty of books (Linux Kernel Development by Robert Love is considered by many to be a classic) or you can check web resources (for example http://tldp.org/LDP/tlk/tlk.html) - those will give you more complete answers. – peterph Jan 19 '19 at 23:34

1 Answer


Note: The Linux kernel does not officially define or use the term "Process Management". It is okay to think about "Process Management", but people might disagree about exactly what it includes.

This answer got a bit more complicated, because I wanted to give you some common definitions and key words to look for. Sadly, some words are used to mean different things in different places.

Please read with care, and actively double-check how each word is being used. When writing, please make sure to provide some context. Do not assume that everybody uses the exact same definition.

As always, this answer is a simplification :-).

"Process" management

Your computer has processors, or "CPU"s, to execute code. Your computer might have 2 CPUs. The Linux CPU scheduler manages the CPU(s) to provide a much more useful concept: It shares the CPU(s) between any number of running "processes", or "threads" of execution. The scheduler forcibly switches between threads. It switches many times every second.

The scheduler also periodically considers whether it needs to assign threads to different CPUs. That is, it can "balance" the number of threads assigned to each CPU.
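
If you want to see this from user space, here is a minimal sketch (my own illustration, not from the kernel documentation) using the glibc call sched_getcpu(). Which CPU numbers it prints, and whether the thread migrates between them, depends entirely on the scheduler's decisions at that moment.

    /* Minimal sketch: ask which CPU the scheduler has placed this thread on.
     * Build with: gcc -o whichcpu whichcpu.c
     * The output depends entirely on the scheduler; the thread may or may
     * not migrate between CPUs across iterations. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        for (int i = 0; i < 5; i++) {
            printf("running on CPU %d of %ld online\n",
                   sched_getcpu(), sysconf(_SC_NPROCESSORS_ONLN));
            sleep(1);   /* let the scheduler run other tasks meanwhile */
        }
        return 0;
    }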

Traditionally, the CPU scheduler was sometimes referred to as the "process scheduler". However, that was when each user-level process had exactly one thread of execution. Nowadays a UNIX process might have many threads. Inside the kernel, it is quite common to refer simply to "the scheduler".

The scheduler uses the word "task" to refer to the kernel thread concept. Even so, the kernel quite often uses "process" to mean "task" (thread). When the difference is important, you must double-check the context and try not to get confused :-).

The core scheduling algorithm in kernel/sched/ does not depend on which family of CPU (CPU "architecture") you are using. The details of switching between processes are handled in arch-specific code, in arch/*/.

Another aspect of process management is when a process waits for an event. For example, when a process makes a system call to read from a device, it might have to wait for the device to signal when the data is ready (an interrupt signal). Until then, the process is removed from the run queue, and other processes can be scheduled on the CPU.
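
As a minimal sketch of that (my own example), consider a blocking read() from the terminal. While the call waits for you to type a line, the process sleeps and the scheduler is free to run other tasks:

    /* Sketch: a blocking read. Until input arrives on standard input,
     * this process is taken off the run queue; it uses no CPU time
     * while it waits. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[128];
        ssize_t n = read(STDIN_FILENO, buf, sizeof(buf)); /* sleeps here */
        if (n > 0)
            printf("read %zd bytes\n", n);
        return 0;
    }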

Memory management

Again, start by considering the hardware: physical RAM. The hardware provides one memory made up of bytes. Each byte has a numeric address and we can access it individually. Your computer might have about 1,000,000,000 bytes of RAM.

One aspect of Linux memory management is to provide every user process with a virtual memory of its own. Once again we are dividing hardware resources, and allocating parts of them for different purposes. The aim is to provide a new concept, which is much more pleasant to work with.

To understand why virtual memory is so useful, consider what happens without it. All processes would have access to the entire physical memory. This means the entire system could be corrupted by a single running program that had a simple error or was malicious.
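
Here is a small sketch of the isolation that virtual memory provides (my own example, not from the article): after fork(), parent and child see the same virtual address for a global variable, but a write in one process does not affect the other.

    /* Sketch: one virtual address, two independent values.
     * Each process has its own virtual memory, mapped onto
     * different physical pages once the child writes to x. */
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int x = 1;

    int main(void)
    {
        if (fork() == 0) {            /* child */
            x = 42;
            printf("child:  x=%d at %p\n", x, (void *)&x);
            return 0;
        }
        wait(NULL);                   /* parent: wait for the child */
        printf("parent: x=%d at %p\n", x, (void *)&x);  /* still 1 */
        return 0;
    }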

When virtual memory was added to the original UNIX, it made the system massively more robust. Memory protection makes a nice combination with the forced task switching above (also called "pre-emptive" multi-tasking). Pre-emptive multi-tasking means that if one user process runs a continuous loop on your only CPU, the system will still allow other processes to run and respond to your input.

As mentioned above, a UNIX process might have many threads. But inside that process, all of the threads access the same virtual memory.
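
By contrast, here is a minimal sketch (again my own example) with two POSIX threads in one process: a write made by one thread is visible to the other, because they share the same virtual memory.

    /* Sketch: threads inside one process share the same virtual memory,
     * so the worker thread's write is visible to main().
     * Build with: gcc -pthread shared.c */
    #include <pthread.h>
    #include <stdio.h>

    int shared = 0;

    static void *worker(void *arg)
    {
        (void)arg;
        shared = 42;    /* same memory that main() reads below */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, NULL);
        printf("shared = %d\n", shared);   /* prints 42 */
        return 0;
    }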

I refer to the concept of UNIX processes, because Linux-specific features can technically allow a lot of different possible combinations. The extra combinations are very rarely used. It is best to think in terms of the portable UNIX concepts.

Details of how the hardware MMU supports virtual memory are handled in arch-specific code. But there is still a lot of memory management code in mm/, which is used on all CPU architectures.

You can see there is some co-operation here between the scheduler code and VM code! When the kernel switches the CPU to a different thread of execution, it must update the associated MMU, so the thread will run in the correct virtual memory space.


If a user runs a program in user-space, then Process Management and Memory Management come into the picture. Can this program have something associated with VFS, Network, or Device?

On balance, "yes". The most accurate and useful answer is "yes".

open() is a system call into the VFS (fs/). It returns a "file descriptor" to the calling process. This is simply a number. For each process, the kernel keeps a table of the open files. E.g. when you call close(), you just pass the file descriptor, and the kernel looks it up in the table.
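
A minimal sketch (the path /etc/hostname is just an example; any readable file works): the value open() returns really is just a small integer, an index into that per-process table.

    /* Sketch: a file descriptor is just a number indexing the kernel's
     * per-process table of open files. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hostname", O_RDONLY);   /* example path */
        if (fd < 0) {
            perror("open");
            return 1;
        }
        printf("open() returned fd %d\n", fd);  /* usually 3, after 0/1/2 */
        close(fd);      /* kernel looks fd up in the table and releases it */
        return 0;
    }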

You could try to argue that you are going through a table owned by task_struct, and therefore you are really going through "Process Management" instead of directly to the VFS. However I would disagree. The open() and close() system calls are defined in fs/open.c. They are called with the numeric file descriptor and must look it up themselves.

The filename that you pass to open() may be a device node. In this case, operations on the returned file descriptor (including close()) will ultimately communicate with a device driver (drivers/).
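
For example, here is a sketch that reads from the device node /dev/urandom; the read() goes through the VFS and is ultimately serviced by a character device driver under drivers/:

    /* Sketch: the same open()/read()/close() interface, but the fd
     * refers to a device node, so the request ends up in a driver. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned char buf[4];
        int fd = open("/dev/urandom", O_RDONLY);    /* a device node */
        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (read(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf))
            printf("%02x %02x %02x %02x\n", buf[0], buf[1], buf[2], buf[3]);
        close(fd);
        return 0;
    }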

Network connections are also represented by file descriptors. In most cases, the file descriptor is not obtained by open()ing a path on the filesystem. socket() is used instead. (Kernel source directory: net/)
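
A minimal sketch: socket() also hands back an ordinary file descriptor, which you close with the same close() used for files.

    /* Sketch: a network socket is also just a file descriptor,
     * obtained with socket() instead of open(). Kernel code: net/ */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);   /* a TCP socket */
        if (fd < 0) {
            perror("socket");
            return 1;
        }
        printf("socket() returned fd %d\n", fd);
        close(fd);      /* same close() as for ordinary files */
        return 0;
    }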

sourcejedi
  • What happens if there are VFS, Network Interface, and Device in the Linux Kernel? Why do I need Process and Memory Management? – Arshad Nazeer Jan 19 '19 at 16:13
  • 1) I mentioned VFS, Network, Device in the last section of my answer. 2) i) Task switching shares a single CPU between multiple processes, as explained. ii) You really want memory management because it stops a simple error or malice in one process from overwriting the contents of all other processes. – sourcejedi Jan 19 '19 at 17:58
  • I have two programs in user-space: 1xyz and 2xyz. These two programs can run simultaneously because there is a system call in 1xyz and 2xyz that makes the Kernel create a process and allot memory, so that I can switch between these tasks - the Kernel creates processes and allots memory because the Kernel is programmed to do so on system calls here. A user-space program with this system call can make me work with this subsystem? E.g. open() is the system call into VFS. Are system calls in user-space responsible for everything that happens in the Linux Kernel? Is this how an application program works? – Arshad Nazeer Jan 19 '19 at 19:02
  • @ArshadNazeer a process is a running program. When you run 1xyz, you are creating a process for it to run in. There is not a separate step (e.g. system call) inside 1xyz that creates a process for itself. – sourcejedi Jan 19 '19 at 19:34
  • @ArshadNazeer Most application requests to the kernel are system calls, yes. If you can read a hand-written PDF, I recommend you go right now and have a play with the system call tracer strace, following Julia Evans 'zine. https://jvns.ca/strace-zine-unfolded.pdf – sourcejedi Jan 19 '19 at 19:43
  • There are a small number of other requests an app can make. One is a page fault - this happens automatically when you access a page of virtual memory which has not yet been allocated a physical RAM page. In some cases it requires reading from disk, so it can take a significant amount of time. strace does not show page faults.