What happens when I run the command cat /proc/cpuinfo?

Question

What happens when I run cat /proc/cpuinfo? Is that a named pipe (or something else) to the OS, which reads the CPU info on the fly and generates that text each time I call it?

See also How frequently is the proc file system updated on Linux? — Gilles 'SO- stop being evil', Feb 23 '16 at 23:04

score 79 · Accepted Answer · answered Mar 28 '14 at 01:07

Whenever you read a file under /proc, this invokes some code in the kernel which computes the text to read as the file content. The fact that the content is generated on the fly explains why almost all files have their time reported as now and their size reported as 0 — here you should read 0 as “don't know”. Unlike usual filesystems, the filesystem which is mounted on /proc, which is called procfs, doesn't load data from a disk or other storage media (like FAT, ext2, zfs, …) or over the network (like NFS, Samba, …) and doesn't call user code (unlike FUSE).
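The "size 0 means don't know" point can be checked from user space. The following is a minimal sketch, not part of the original answer: it compares the size reported by stat with the number of bytes a read actually returns. The helper name reported_and_actual_size is made up for illustration, and the /proc part only runs where /proc/cpuinfo exists.

```python
import os
import tempfile


def reported_and_actual_size(path):
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual


# An ordinary on-disk file: the two numbers agree.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    regular = reported_and_actual_size(tmp.name)
print("regular file:", regular)

# A procfs file (Linux only): stat typically reports 0,
# yet reading returns text generated by the kernel.
if os.path.exists("/proc/cpuinfo"):
    print("/proc/cpuinfo:", reported_and_actual_size("/proc/cpuinfo"))
```

On a typical Linux system the second line should show a reported size of 0 next to a non-zero byte count, though the exact numbers depend on the machine.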

Procfs is present in most non-BSD unices. It started its life in AT&T's Bell Labs in UNIX 8th edition as a way to report information about processes (and ps is often a pretty-printer for information read through /proc). Most procfs implementations have a file or directory called /proc/123 to report information about the process with PID 123. Linux extends the proc filesystem with many more entries that report the state of the system, including your example /proc/cpuinfo.

In the past, Linux's /proc acquired various files that provide information about drivers, but this use is now deprecated in favor of /sys, and /proc now evolves slowly. Entries like /proc/bus and /proc/fs/ext4 remain where they are for backward compatibility, but newer similar interfaces are created under /sys. In this answer, I'll focus on Linux.

Your first and second entry points for documentation about /proc on Linux are:

the proc(5) man page;
The /proc filesystem in the kernel documentation.

Your third entry point, when the documentation doesn't cover it, is reading the source. You can download the source on your machine, but this is a huge program, and LXR, the Linux cross-reference, is a big help. (There are many variants of LXR; the one running on lxr.linux.no is the nicest by far but unfortunately the site is often down.) A little knowledge of C is required, but you don't need to be a programmer to track down a mysterious value.

The core handling of /proc entries is in the fs/proc directory. Any driver can register entries in /proc (though as indicated above this is now deprecated in favor of /sys), so if you don't find what you're looking for in fs/proc, look everywhere else. Drivers call functions declared in include/linux/proc_fs.h. Kernel versions up to 3.9 provide the functions create_proc_entry and some wrappers (especially create_proc_read_entry), and kernel versions 3.10 and above provide instead only proc_create and proc_create_data (and a few more).

Taking /proc/cpuinfo as an example, a search for "cpuinfo" leads you to the call to proc_create("cpuinfo", …) in fs/proc/cpuinfo.c. You can see that the code is pretty much boilerplate: since most files under /proc just dump some text data, there are helper functions to do that. There is merely a seq_operations structure, and the real meat is in the cpuinfo_op data structure, which is architecture-dependent, usually defined in arch/<architecture>/kernel/setup.c (or sometimes a different file). Taking x86 as an example, we're led to arch/x86/kernel/cpu/proc.c. There the main function is show_cpuinfo, which prints out the desired file content; the rest of the infrastructure is there to feed the data to the reading process at the speed it requests it. You can see the data being assembled on the fly from data in various variables in the kernel, including a few numbers computed on the fly such as the CPU frequency.
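To make the shape of that generated text concrete, here is a small user-space sketch that parses it back into one dictionary per processor. It assumes the x86 layout, where show_cpuinfo writes tab-padded "key : value" lines followed by a blank line for each CPU; other architectures use different keys and may not fit. The function name parse_cpuinfo and the sample values for the second processor are illustrative only.

```python
def parse_cpuinfo(text):
    """Split cpuinfo-style text into one dict per blank-line-separated block."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor block.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


# Hypothetical two-CPU sample in the x86 layout.
sample = (
    "processor\t: 0\nvendor_id\t: GenuineIntel\ncpu MHz\t\t: 1199.000\n\n"
    "processor\t: 1\nvendor_id\t: GenuineIntel\ncpu MHz\t\t: 2667.000\n\n"
)
cpus = parse_cpuinfo(sample)
for cpu in cpus:
    print(cpu["processor"], cpu["cpu MHz"])
```

On a Linux machine the same function can be fed the result of open("/proc/cpuinfo").read(); because the kernel regenerates the text on every read, fields such as "cpu MHz" may differ between two consecutive calls.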

A big part of /proc is the per-process information in /proc/<PID>. These entries are registered in fs/proc/base.c, in the tgid_base_stuff array; some functions registered here are defined in other files. Let's look at a few examples of how these entries are generated:

cmdline is generated by proc_pid_cmdline in the same file. It locates the data in the process and prints it out.
clear_refs, unlike the entries we've seen so far, is writable but not readable. Therefore the proc_clear_refs_operations structure defines a clear_refs_write function but no read function.
cwd is a symbolic link (a slightly magical one), declared by proc_cwd_link, which looks up the process's current directory and returns it as the link content.
fd is a subdirectory. The operations on the directory itself are defined in the proc_fd_operations data structure (they're boilerplate except for the function that enumerates the entries, proc_readfd, which enumerates the process's open files) while operations on the entries are in proc_fd_inode_operations.
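Seen from user space, three of these entries behave as a file, a symbolic link, and a directory respectively. The sketch below reads them for the current process through /proc/self. It relies on cmdline holding the arguments separated by NUL bytes; the helper name split_cmdline is made up here, and the /proc part is skipped on systems without procfs.

```python
import os


def split_cmdline(raw):
    """Split the NUL-separated bytes of a cmdline file into arguments."""
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the terminator after the last argument
    if not raw:
        return []
    return [arg.decode(errors="replace") for arg in raw.split(b"\0")]


if os.path.isdir("/proc/self"):
    # cmdline: an ordinary read returns generated bytes.
    with open("/proc/self/cmdline", "rb") as f:
        print("cmdline:", split_cmdline(f.read()))
    # cwd: a symbolic link whose target is computed on demand.
    print("cwd ->", os.readlink("/proc/self/cwd"))
    # fd: a directory with one entry per open file descriptor.
    print("open fds:", sorted(os.listdir("/proc/self/fd"), key=int))
```

The fd listing includes the descriptor Python opens to read the directory itself, so it usually shows one more entry than you might expect.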

Another important area of /proc is /proc/sys, which is a direct interface to sysctl. Reading from an entry in this hierarchy returns the value of the corresponding sysctl, and writing to it sets that sysctl. The entry points for sysctl are in fs/proc/proc_sysctl.c. Sysctls have their own registration system with register_sysctl and friends.
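As a rough illustration of that mapping, a sysctl name such as kernel.ostype corresponds to the file /proc/sys/kernel/ostype, with the dots turned into path separators. The helper names below are made up for this sketch, and it only reads a value; writing works the same way through an ordinary file write but normally requires root.

```python
import os


def sysctl_path(name):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the current value of a sysctl as a stripped string."""
    with open(sysctl_path(name)) as f:
        return f.read().strip()


# Only attempt the read where procfs is available.
if os.path.exists(sysctl_path("kernel.ostype")):
    print("kernel.ostype =", read_sysctl("kernel.ostype"))
```

This simple replacement is an assumption that holds for common names; sysctls whose components themselves contain dots (some network interface names, for instance) need more care.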

slm · Answer 2 · 2014-03-27T11:48:35.013

When trying to gain insight into what sort of magic is happening behind the scenes, your best friend is strace. Learning to operate this tool is one of the best things you can do to better appreciate what a program is actually doing.

$ strace -s 200 -o strace.log cat /proc/cpuinfo
...
read(3, "processor\t: 0\nvendor_id\t: GenuineIntel\ncpu family\t: 6\nmodel\t\t: 37\nmodel name\t: Intel(R) Core(TM) i5 CPU       M 560  @ 2.67GHz\nstepping\t: 5\nmicrocode\t: 0x4\ncpu MHz\t\t: 1199.000\ncache size\t: 3072 KB\nphy"..., 65536) = 3464
write(1, "processor\t: 0\nvendor_id\t: GenuineIntel\ncpu family\t: 6\nmodel\t\t: 37\nmodel name\t: Intel(R) Core(TM) i5 CPU       M 560  @ 2.67GHz\nstepping\t: 5\nmicrocode\t: 0x4\ncpu MHz\t\t: 1199.000\ncache size\t: 3072 KB\nphy"..., 3464) = 3464
read(3, "", 65536)                      = 0
close(3)                                = 0
...

From the above output you can see that /proc/cpuinfo is just a regular file, or at least would appear to be one. So let's dig deeper.
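The read, write, read, close sequence in the trace is nothing more than a plain copy loop. Here is a minimal sketch of that loop using the raw os-level calls, with the same 65536-byte request size as in the trace. The function name cat is chosen for illustration, and the demonstration copies a temporary file so that it runs anywhere.

```python
import os
import tempfile

BUF_SIZE = 65536  # same request size as the read() calls in the trace


def cat(path, out_fd):
    """Copy path to out_fd with raw read/write calls; return bytes copied."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, BUF_SIZE)
            if not chunk:  # an empty read means end of file
                break
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(out_fd, view)
                view = view[written:]
            total += len(chunk)
    finally:
        os.close(fd)
    return total


with tempfile.NamedTemporaryFile() as src, tempfile.TemporaryFile() as dst:
    src.write(b"processor\t: 0\n")
    src.flush()
    copied = cat(src.name, dst.fileno())
    dst.seek(0)
    result = dst.read()
print(copied, result)
```

Calling cat("/proc/cpuinfo", 1) on Linux should produce the same kind of system calls as the trace: nothing in the loop knows or cares that the kernel is generating the data.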

Deeper dive

#1 - with ls..

Looking at the file itself it would appear to be "just a file".

$ ls -l /proc/cpuinfo 
-r--r--r--. 1 root root 0 Mar 26 22:45 /proc/cpuinfo

But take a closer look. We get our first hint that it's special: note that the size of the file is 0 bytes.

#2 - with stat..

If we now look at the file using stat we can get our next hint that there is something special about /proc/cpuinfo.

run #1

$ stat /proc/cpuinfo 
  File: ‘/proc/cpuinfo’
  Size: 0           Blocks: 0          IO Block: 1024   regular empty file
Device: 3h/3d       Inode: 4026532023  Links: 1
Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:proc_t:s0
Access: 2014-03-26 22:46:18.390753719 -0400
Modify: 2014-03-26 22:46:18.390753719 -0400
Change: 2014-03-26 22:46:18.390753719 -0400
 Birth: -

run #2

$ stat /proc/cpuinfo 
  File: ‘/proc/cpuinfo’
  Size: 0           Blocks: 0          IO Block: 1024   regular empty file
Device: 3h/3d       Inode: 4026532023  Links: 1
Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:proc_t:s0
Access: 2014-03-26 22:46:19.945753704 -0400
Modify: 2014-03-26 22:46:19.945753704 -0400
Change: 2014-03-26 22:46:19.945753704 -0400
 Birth: -

Notice the Access, Modify, and Change times? All three differ between the two runs, even though nothing touched the file in between. That is highly unusual: on an ordinary file, the timestamps typically stay the same unless the file is accessed or modified.
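The same comparison can be scripted. This sketch stats a path twice with a short pause and reports whether any of the three timestamps moved; the helper names are made up for illustration. For an ordinary file the answer should be no. For /proc/cpuinfo the result may depend on the kernel version and on whether the kernel kept the entry cached between the two calls, so treat whatever it prints as an observation about your system rather than a guarantee.

```python
import os
import tempfile
import time


def stat_times(path):
    """Return (atime, mtime, ctime) in nanoseconds, as reported by stat."""
    st = os.stat(path)
    return st.st_atime_ns, st.st_mtime_ns, st.st_ctime_ns


def times_change_between_stats(path, delay=0.1):
    """Stat path twice, delay seconds apart; True if any timestamp moved."""
    first = stat_times(path)
    time.sleep(delay)
    return stat_times(path) != first


# An ordinary file: stat alone does not alter its timestamps.
with tempfile.NamedTemporaryFile() as tmp:
    regular_changed = times_change_between_stats(tmp.name)
print("regular file changed:", regular_changed)

if os.path.exists("/proc/cpuinfo"):
    print("/proc/cpuinfo changed:", times_change_between_stats("/proc/cpuinfo"))
```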

#3 - with file..

Yet another clue that this file is anything but a regular file:

$ file /proc/cpuinfo 
/proc/cpuinfo: empty

If it were a named pipe or a device node, it would look like one of these files:

$ ls -l /dev/initctl /dev/zero 
prw-------. 1 root root    0 Mar 26 20:09 /dev/initctl
crw-rw-rw-. 1 root root 1, 5 Mar 27 00:39 /dev/zero

$ file /dev/initctl /dev/zero 
/dev/initctl: fifo (named pipe)
/dev/zero:    character special

If we touch a new file called emptyfile and compare, /proc/cpuinfo does appear to be more like a file than a pipe:

$ touch emptyfile
$ ls -l emptyfile 
-rw-rw-r--. 1 saml saml 0 Mar 27 07:40 emptyfile
$ file emptyfile 
emptyfile: empty
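What ls and file report here comes from the file type bits in the mode returned by stat. The following sketch classifies a path the same way using Python's stat module; the function name file_kind and the labels it returns are illustrative. It creates its own empty file and FIFO in a temporary directory, then checks /proc/cpuinfo and /dev/zero where they exist.

```python
import os
import stat
import tempfile


def file_kind(path):
    """Name the file type encoded in the mode bits, without following links."""
    mode = os.lstat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    return "other"


with tempfile.TemporaryDirectory() as d:
    regular = os.path.join(d, "emptyfile")
    open(regular, "w").close()  # same effect as touch
    fifo = os.path.join(d, "fifo")
    os.mkfifo(fifo)
    kinds = {"emptyfile": file_kind(regular), "fifo": file_kind(fifo)}
print(kinds)

for path in ("/proc/cpuinfo", "/dev/zero"):
    if os.path.exists(path):
        print(path, "->", file_kind(path))
```

So as far as the type bits go, /proc/cpuinfo is indistinguishable from the empty file; the difference lies in the filesystem that serves it.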

#4 - with mount..

So at this point we need to take a step back and zoom out a bit. We're looking at a particular file but perhaps we should be looking at the filesystem this file resides on. And for this we can use the mount command.

$ mount | grep " /proc "
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

OK, so the filesystem mounted on /proc is of type proc. That's our hint that the files under /proc are special: they're not just your run-of-the-mill files. So let's find out some more info about what makes the proc filesystem special.
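The same lookup can be done without the mount command, because the kernel exposes the mount table as text in /proc/mounts, one mount per line in the order device, mount point, filesystem type, options. The sketch below parses that format; the function name filesystem_type is made up, and the sample root filesystem line is hypothetical.

```python
import os


def filesystem_type(mounts_text, mount_point):
    """Return the filesystem type mounted at mount_point, or None."""
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fstype = fields[2]  # a later mount shadows an earlier one
    return fstype


# Hypothetical sample in the /proc/mounts layout.
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"
)
print(filesystem_type(sample, "/proc"))

# On Linux, ask the running kernel instead of the sample.
if os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        print(filesystem_type(f.read(), "/proc"))
```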

Taking a look at mount's man page:

The proc filesystem is not associated with a special device, and when mounting it, an arbitrary keyword, such as proc can be used instead of a device specification. (The customary choice none is less fortunate: the error message `none busy' from umount can be confusing.)

And if we take a look at proc's man page:

The proc file system is a pseudo-file system which is used as an interface to kernel data structures. It is commonly mounted at /proc. Most of it is read-only, but some files allow kernel variables to be changed.

A little bit further down in that same man page:

/proc/cpuinfo

This is a collection of CPU and system architecture dependent items, for each supported architecture a different list. Two common entries are processor which gives CPU number and bogomips; a system constant that is calculated during kernel initialization. SMP machines have information for each CPU. The lscpu(1) command gathers its information from this file.

At the bottom of the man page is a reference to a kernel document which you can find here, titled: THE /proc FILESYSTEM. Quoting from that document:

The proc file system acts as an interface to internal data structures in the kernel. It can be used to obtain information about the system and to change certain kernel parameters at runtime (sysctl).

Conclusions

So what did we learn here? Well, given that /proc is referred to as a pseudo filesystem and also an "interface to internal data structures", it's probably safe to assume that the items within it are not actual files, but rather manifestations made to look like files.

I'll close with this quote which apparently used to be in a prior version of the man 5 proc from circa 2004 but for whatever reason is no longer included. NOTE: I'm not sure why it was removed since it describes very nicely what /proc is:

The /proc directory on GNU/Linux systems provides a file-system like interface to the kernel. This allows applications and users to fetch information from and set values in the kernel using normal file-system I/O operation.

The proc file system is sometimes referred to as a process information pseudo-file system. It does not contain ``real'' files but rather runtime system information (e.g. system memory, devices mounted, hardware configuration, etc). For this reason it can be regarded as a control and information center for the kernel. In fact, quite a lot of system utilities are simply calls to files in this directory. For example, the command lsmod, which lists the modules loaded by the kernel, is basically the same as 'cat /proc/modules' while lspci, which lists devices connected to the PCI bus of the system, is the same as 'cat /proc/pci'. By altering files located in this directory you can change kernel parameters while the system is running.

Source: The proc pseudo file-system

References

Procfs - wikipedia

Cool, :) this is the first thing I tried as I saw the question: strace -o catcpuproc.txt cat /proc/cpuinfo — Ketan, Mar 27 '14 at 03:18
Nice answer! On linux, if you want to dig deeper, the source for the proc filesystem is in fs/proc in the kernel source. You'll see that there is a fs/proc/cpuinfo.c but, unfortunately, it is rather empty since the heavy lifting is spread out all over arch/ as it is architecture dependent. For a simpler example see fs/proc/uptime.c. By glancing at the file we can guess that uptime_proc_show is the workhorse of what gets us the data we want and we could explore it more by diving into the functions it calls. To understand the seq_file interface and how it is used in procfs see: — Steven D, Mar 27 '14 at 04:08
https://www.linux.com/learn/linux-training/37985-the-kernel-newbie-corner-kernel-debugging-using-proc-qsequenceq-files-part-1 and http://lwn.net/Articles/22355/ (a bit dated) — Steven D, Mar 27 '14 at 04:09
@slm : +1, great answer. But to me, the first hint it's a special file is its size ^^ 0 bytes, yet you can cat lots of things from it (a bit like some pipe files). — Olivier Dulac, Mar 27 '14 at 09:16
@OlivierDulac - good point. I've made additional edits based on your feedback. LMK if I can make any further improvements. Thanks. — slm, Mar 27 '14 at 11:49
Another tiny bit of information. Running file with the -s option causes the file command to read files other than ordinary files and to disregard the size of the file as reported by stat. So file -s /proc/cpuinfo outputs ASCII text. :) — Slothworks, Jan 05 '16 at 16:52

score 15 · Answer 3 · answered Mar 27 '14 at 16:22

The answer given by @slm is very comprehensive, but I think a simpler explanation might come from a change in perspective.

In day-to-day usage we can think of files as physical things, i.e. chunks of data stored on some device. This makes files like /proc/cpuinfo very mysterious and confusing. However, it all makes perfect sense if we think of files as an interface; a way to send data in and out of some program.

The programs which send and receive data in this way are filesystems or drivers (depending on how you define these terms, that might be too broad or too narrow a definition). The important point is that some of these programs use a hardware device to store and retrieve the data sent via this interface; but not all.
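The "file as an interface" idea can be imitated in a few lines of user-space code. This is only an analogy, not how procfs is implemented: a Python object that supports the usual read() call but has no stored content, and instead calls a function to produce its bytes when first read. The class name GeneratedFile is made up for this sketch.

```python
import io
import os


class GeneratedFile(io.RawIOBase):
    """A readable stream whose content is computed at first read."""

    def __init__(self, generate):
        super().__init__()
        self._generate = generate  # callable returning the text to serve
        self._data = None
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, buffer):
        if self._data is None:
            # Nothing is stored anywhere: the content is produced now.
            self._data = self._generate().encode()
        chunk = self._data[self._pos:self._pos + len(buffer)]
        buffer[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)  # 0 signals end of file


f = GeneratedFile(lambda: "pid\t: %d\n" % os.getpid())
content = f.read()
print(content)
```

Code that only knows how to call read() cannot tell this object apart from one backed by a disk, which is the point being made about /proc.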

Some examples of filesystems which don't use a storage device (at least directly) are:

Filesystems using looked-up or calculated data. Proc is an example, since it gets data from various kernel modules. An extreme example is πfs ( github.com/philipl/pifs )
All FUSE filesystems, which handle the data with a regular userspace program
Filesystems which transform the data of another filesystem on-the-fly, for example using encryption, compression or even audio transcoding ( khenriks.github.io/mp3fs/ )

The Plan9 OS ( http://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs ) is an extreme example of using files as a general programming interface.
