
I have a few questions about the process information available in GNU/Linux via procfs. This was originally prompted by a desire to extract VmPeak, VmSize, VmRSS and VmHWM from within an application.

I started with the working assumption that /proc/<pid>/status is a human-readable version of /proc/<pid>/stat, which is machine readable, as per the kernel.org commentary:

stat - Process status

status - Process status in human readable form

I realised this was not quite correct when I noticed that VmPeak is only available from /proc/pid/status.

It seems that /proc/pid/status actually combines values from several places and adds some of its own.

Given we have /proc/pid/status, is there any reason to use /proc/pid/stat at all? What needs it? Why have two APIs? Could /proc/pid/stat be deprecated, or does it have a use?

stat is not equivalent: it has fewer fields on offer, and it is only slightly easier to parse (with a subtle bug if you do it naively). Any program using stat could easily switch to using status instead. How many would really break?
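To illustrate the naive-parsing pitfall, here is a minimal sketch. It assumes the stat layout described in the procfs man page (pid, then the command name in parentheses, then state, ppid, pgrp, session and so on). The function name `parse_stat` and the sample line are made up for illustration. The command name may itself contain spaces or parentheses, so splitting the whole line on whitespace shifts every later field.

```python
def parse_stat(line):
    """Split one /proc/<pid>/stat line into (pid, comm, remaining fields)."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so find the first '(' and the LAST ')' instead of splitting on spaces.
    start = line.index("(")
    end = line.rindex(")")
    pid = int(line[:start])
    comm = line[start + 1:end]
    rest = line[end + 1:].split()  # state, ppid, pgrp, session, ...
    return pid, comm, rest


# Hypothetical sample, truncated to the first few fields.
sample = "1234 (my prog) x) S 1 1234 1234 0 -1"

pid, comm, rest = parse_stat(sample)
state, ppid, pgrp, session = rest[0], int(rest[1]), int(rest[2]), int(rest[3])

# The naive approach picks up part of the command name instead of the state.
naive_state = sample.split()[2]
```

With this sample, `naive_state` is a fragment of the command name rather than the state letter, while `parse_stat` recovers the state and the ids that follow it.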

I have just written parsers for both (though ultimately I binned the one for stat as the API is less useful). For machine reading there is not much in it. In fact the parser for 'status' ends up being more elegant, as you can read it directly into any kind of key-value store you like. status seems easier to parse from any language, and it is extensible.
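As a rough sketch of the key-value approach, assuming the usual `Key:<tab>value` line format of status; the function name `parse_status` and the sample values are invented for illustration:

```python
def parse_status(text):
    """Parse /proc/<pid>/status style text into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        # Each line looks like "Key:\tvalue"; skip anything without a colon.
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields


# Hypothetical excerpt; real files contain many more lines.
sample = (
    "Name:\tcat\n"
    "VmPeak:\t    5600 kB\n"
    "VmSize:\t    5540 kB\n"
    "VmHWM:\t     980 kB\n"
    "VmRSS:\t     980 kB\n"
)

status = parse_status(sample)
# Memory values carry a unit suffix, so take the first token as the number.
vm_rss_kb = int(status["VmRSS"].split()[0])
```

On a Linux system the same function could be fed the contents of /proc/self/status, for example via `open("/proc/self/status").read()`.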

How many programs actually depend on 'stat' rather than 'status'? Do any of them really need the trivial parsing speed up that this might offer?

Now I understand that stat couldn't be removed for years because of backwards compatibility but you could say 'this is now deprecated' unless there is a very good reason to keep it (which would be one possible answer to my question).

If performance is an issue surely converting this kernel information to text and back via a virtual file system is less performant than a library call would be.

It may be obnoxious to keep adding new APIs, as this answer suggests, but given that a great deal of this is stable, why isn't there a C library API like, for example, sysinfo?

  • what is the problem with having /proc/pid/stat? ... is it causing errors? – jsotola Apr 14 '22 at 01:45
  • Why have two APIs to do the same thing? It's a question of which API is best. – Bruce Adams Apr 14 '22 at 01:54
  • Given how Linus is notorious about not allowing userspace to be broken, what makes you think any non-zero answer to "How many would really break?" is acceptable? – muru Apr 14 '22 at 02:05
  • I am not and have never proposed removing it for that exact reason. I simply wanted to understand if one way was now considered "best practice". If one interface is labelled deprecated it sends a clear message to application developers that they should use the other one. This is also useful to the kernel developers. Anyway I've changed the title and text to try and emphasize that. – Bruce Adams Apr 14 '22 at 09:43

3 Answers


The reason the kernel still provides /proc/…/stat is backwards compatibility, and not only with old versions of programs — if you build the procps utilities right now, you’ll end up with programs (ps, pgrep, pidof, etc.) which still read /proc/…/stat.

One could conceivably change procps to only use /proc/…/status; the old performance argument is no longer relevant, it takes the same amount of time to retrieve status from the kernel as it does to retrieve stat. But that doesn’t help existing systems that want a newer kernel without changing their user-space tools.

As far as the kernel is concerned, that’s a good enough reason to keep stat; see “Why is there a Linux kernel policy to never break user space?”

You are of course free to choose to only use /proc/…/status and avoid /proc/…/stat entirely. I’m not aware of any general consensus that the latter should be considered deprecated; I’ve never seen it discussed (which doesn’t mean it hasn’t been), and it’s not marked as deprecated in the procfs man page or in the kernel’s obsolete ABI symbols (which includes /proc entries). Perhaps this is just inertia, and if you brought it up in circles where more kernel developers were likely to notice, it would become apparent that there is consensus.

(Note that some fields in stat aren’t available in status, as far as I can tell — at least the process group and session ids.)

Regarding a sysinfo-style interface, you could always suggest one. The text-based interface won’t go away, and not only to preserve backwards compatibility; having this information in a format consumable by the many text-processing tools in a Unix-style system is too convenient to get rid of.

Stephen Kitt

https://lkml.org/lkml/2012/12/23/75

WE DO NOT BREAK USERSPACE!

As long as there are old utilities/applications which rely on stat it will not be removed; IOW, it will most likely never be removed.

If you want to use either - it's your choice.

  • It's weird that tons of programmers who have just started to learn the internals of Linux want to remove something because "I've got this crazy insight how to improve things". You don't. You want to break stuff. Stop. Fix bugs instead. There are literally thousands of unresolved bugs at bugzilla.kernel.org. You really wanna be helpful - implement revoke() - this is a hella complex issue https://bugzilla.kernel.org/show_bug.cgi?id=14505 - it's one of the absolute worst things about the Linux kernel. – Artem S. Tashkinov Apr 14 '22 at 05:57
  • Also please unlearn to use "why do we still have" - you're not talking about "we", you're talking solely about yourself. – Artem S. Tashkinov Apr 14 '22 at 05:59
  • I am not and have never proposed removing it for that exact reason. I simply wanted to understand if one way was now considered "best practice". If one interface is labelled deprecated it sends a clear message to application developers that they should use the other one. Your rant is completely misplaced here. Also I am not a new user. I have a copy of "linux kernel internals" somewhere that I bought in the last millennium, before such information was so easily available on the net. – Bruce Adams Apr 14 '22 at 09:26
  • 1. "I simply wanted to understand if one way was now considered 'best practice'" - it looked to me you wanted to remove the old one. 2. "If one interface is labelled deprecated" - where did you learn this about stat? 3. "Your rant is completely misplaced here." - I gave the exact reason why a) it's not deprecated b) it will not be removed c) You can use either. – Artem S. Tashkinov Apr 14 '22 at 12:22