1

I have a few questions about the process information available in GNU/Linux via procfs. This was originally prompted by a desire to extract VmPeak, VmSize, VmRSS and VmHWM from within an application.

I started with the working assumption that /proc/<pid>/status is a human-readable version of /proc/<pid>/stat, which is the machine-readable one, as per the kernel.org commentary:

stat - Process status

status - Process status in human readable form

I realised this was not quite correct when I noticed VmPeak is only available from /proc/pid/status.

It seems that /proc/pid/status actually combines values from several places and adds some of its own.

Given we have /proc/pid/status, is there any reason to use /proc/pid/stat at all? What needs it? Why have two APIs? Could /proc/pid/stat be deprecated or does it have a use?

stat is not equivalent. It has fewer fields on offer. It is only slightly easier to parse (with a subtle bug if you do it naively). Any programs using stat could easily switch to using status instead. How many would really break?

I have just written parsers for both (though ultimately I binned the one for stat as the API is less useful). For machine readable there is not much in it. In fact the parser for 'status' ends up being more elegant as you can read it directly into any kind of key value store you like. Status seems easier to parse from any language and extensible.
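To illustrate the key-value point, here is a minimal sketch of such a parser in Python. The helper names (`parse_status`, `kib`) and the sample text are made up for illustration; it assumes the usual `Key:<whitespace>value` layout of /proc/<pid>/status and that the Vm* values are reported in kB.

```python
def parse_status(text):
    """Parse the text of /proc/<pid>/status into a dict of raw string values."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore any line that has no "Key:" prefix
            fields[key.strip()] = value.strip()
    return fields


def kib(fields, key):
    """Return a 'Vm*' style value as an integer number of kB, or None if absent."""
    value = fields.get(key)
    return int(value.split()[0]) if value else None


# Illustrative sample only, not captured from a real process.
SAMPLE = (
    "Name:\tcat\n"
    "VmPeak:\t    5520 kB\n"
    "VmSize:\t    5520 kB\n"
    "VmHWM:\t     964 kB\n"
    "VmRSS:\t     964 kB\n"
)

if __name__ == "__main__":
    with open("/proc/self/status") as f:
        status = parse_status(f.read())
    for name in ("VmPeak", "VmSize", "VmRSS", "VmHWM"):
        print(name, kib(status, name))
```

Because unknown keys simply become extra dictionary entries, fields added by later kernels do not disturb this kind of parser.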

How many programs actually depend on 'stat' rather than 'status'? Do any of them really need the trivial parsing speed up that this might offer?

Now I understand that stat couldn't be removed for years because of backwards compatibility, but you could say 'this is now deprecated' unless there is a very good reason to keep it (which would be one possible answer to my question).

If performance is an issue surely converting this kernel information to text and back via a virtual file system is less performant than a library call would be.

It may be obnoxious to keep adding new APIs, as this answer suggests, but given that a great deal of this is stable, why isn't there a C library API for it, like sysinfo for example?
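For what it's worth, one of the four values already has a call-based route: getrusage(2) reports the peak resident set size, which is roughly comparable to VmHWM. The sketch below uses Python's standard `resource` module as a stand-in for the C call; the function name `peak_rss_kib` is made up, and it assumes Linux, where `ru_maxrss` is in kilobytes. It does not cover VmPeak, VmSize or the current VmRSS.

```python
import resource


def peak_rss_kib():
    """Peak resident set size of the calling process (kilobytes on Linux)."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_maxrss


if __name__ == "__main__":
    print("peak RSS (kB):", peak_rss_kib())
```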

4
  • what is the problem with having /proc/pid/stat? ... is it causing errors?
    – jsotola
    Commented Apr 14, 2022 at 1:45
  • Why have two APIs to do the same thing? It's a question of which API is best. Commented Apr 14, 2022 at 1:54
  • 1
    Given how Linus is notorious about not allowing userspace to be broken, what makes you think any non-zero answer to "How many would really break?" is acceptable?
    – muru
    Commented Apr 14, 2022 at 2:05
  • I am not and have never proposed removing it for that exact reason. I simply wanted to understand if one way was now considered "best practice". If one interface is labelled deprecated it sends a clear message to application developers that they should use the other one. This is also useful to the kernel developers. Anyway I've changed the title and text to try and emphasize that. Commented Apr 14, 2022 at 9:43

3 Answers

2

The reason the kernel still provides /proc/…/stat is backwards compatibility, and not only with old versions of programs — if you build the procps utilities right now, you’ll end up with programs (ps, pgrep, pidof, etc.) which still read /proc/…/stat.

One could conceivably change procps to only use /proc/…/status; the old performance argument is no longer relevant, it takes the same amount of time to retrieve status from the kernel as it does to retrieve stat. But that doesn’t help existing systems that want a newer kernel without changing their user-space tools.

As far as the kernel is concerned, that’s a good enough reason to keep stat. See “Why is there a Linux kernel policy to never break user space?”

You are of course free to choose to only use /proc/…/status and avoid /proc/…/stat entirely. I’m not aware of any general consensus that the latter should be considered deprecated; I’ve never seen it discussed (which doesn’t mean it hasn’t been), and it’s not marked as deprecated in the procfs man page or in the kernel’s obsolete ABI symbols (which includes /proc entries). Perhaps this is just inertia, and if you brought it up in circles where more kernel developers were likely to notice, it would become apparent that there is consensus.

(Note that some fields in stat aren’t available in status, as far as I can tell — at least the process group and session ids.)
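As a sketch of how those two values can be pulled out of stat: per the proc(5) field numbering, the process group is field 5 and the session is field 6. The function name and sample line below are made up for illustration.

```python
def pgrp_and_session(stat_text):
    """Return (pgrp, session) from the text of /proc/<pid>/stat."""
    # Field 2 (comm) is wrapped in parentheses, so take everything
    # after the last ')' and count fields from field 3 (state) onwards.
    rest = stat_text.rpartition(")")[2].split()
    # rest[0] = state (3), rest[1] = ppid (4), rest[2] = pgrp (5), rest[3] = session (6)
    return int(rest[2]), int(rest[3])


# Illustrative sample only, truncated after the first few fields.
SAMPLE = "1234 (my prog) S 1 1234 1200 0 -1"

if __name__ == "__main__":
    with open("/proc/self/stat") as f:
        print(pgrp_and_session(f.read()))
```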

Regarding a sysinfo-style interface, you could always suggest one. The text-based interface won’t go away, and not only to preserve backwards compatibility; having this information in a format consumable by the many text-processing tools in a Unix-style system is too convenient to get rid of.

1

https://lkml.org/lkml/2012/12/23/75

WE DO NOT BREAK USERSPACE!

As long as there are old utilities/applications which rely on stat it will not be removed; in other words, it will most likely never be removed.

If you want to use either - it's your choice.

4
  • It's weird that tons of programmers who have just started to learn the internals of Linux want to remove something because "I've got this crazy insight how to improve things". You don't. You want to break stuff. Stop. Fix bugs instead. There are literally thousands of unresolved bugs at bugzilla.kernel.org. You really wanna be helpful - implement revoke() - this is a hella complex issue bugzilla.kernel.org/show_bug.cgi?id=14505 - it's one of the absolute worst things about the Linux kernel. Commented Apr 14, 2022 at 5:57
  • Also please unlearn to use "why do we still have" - you're not talking about we you're talking solely about yourself. Commented Apr 14, 2022 at 5:59
  • I am not and have never proposed removing it for that exact reason. I simply wanted to understand if one way was now considered "best practice". If one interface is labelled deprecated it sends a clear message to application developers that they should use the other one. Your rant is completely misplaced here. Also I am not a new user. I have a copy of "linux kernel internals" somewhere that I bought in the last millennium, before such information was so easily available on the net. Commented Apr 14, 2022 at 9:26
  • 1. "I simply wanted to understand if one way was now considered 'best practice'" - it looked to me like you wanted to remove the old one. 2. "If one interface is labelled deprecated" - where did you learn this about stat? 3. "Your rant is completely misplaced here." - I gave the exact reasons why a) it's not deprecated, b) it will not be removed, c) you can use either. Commented Apr 14, 2022 at 12:22
-3

This is only a personal opinion but I believe /proc/pid/stat should be considered deprecated and you should use /proc/pid/status in all cases instead.

stat is not much more efficient to parse, and it has a subtle danger which can lead to bugs and can even pose security risks (see this for example). It also contains fewer fields than status.
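A minimal sketch of the danger, in Python. The second field of stat (comm) is the process name in parentheses, and a process can choose a name containing spaces and ')'. The function names and the sample line are made up for illustration.

```python
def naive_state(stat_text):
    """WRONG: assumes no field contains whitespace."""
    return stat_text.split()[2]


def safe_state(stat_text):
    """Read the state field by splitting after the last ')' in the line."""
    return stat_text.rpartition(")")[2].split()[0]


# Hypothetical line for a sleeping process that named itself "evil) R 1".
SAMPLE = "4242 (evil) R 1) S 1 4242 4242 0 -1"

if __name__ == "__main__":
    print("naive:", naive_state(SAMPLE))  # reports the forged state
    print("safe: ", safe_state(SAMPLE))   # reports the real state
```

Splitting on whitespace lets the process name inject fake fields, so every later field is shifted; searching for the last closing parenthesis avoids that because the remaining fields are numeric or a single state letter.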

See:

6
  • Please explain the downvotes here. I do not want to accept my own answer if it is widely considered to be wrong. Is it because I am being presumptive? It is not my place to deprecate an API? Commented Apr 16, 2022 at 11:26
  • 3
    "I believe you /proc/pid/stat should be considered deprecated" - This is quite a strong suggestion. And it conflicts with the "we do not break userspace" rule. Most likely, this is the reason you got downvotes.
    – Tsyvarev
    Commented Apr 16, 2022 at 11:48
  • You can just say you think it's dangerous to use instead of this nonsense about "considered deprecated" when it most definitely isn't and probably never will be deprecated. Why so hung up on deprecation?
    – muru
    Commented Apr 16, 2022 at 11:52
  • Bad grammar corrected. Emphasis removed. Does deprecation break userspace? Only the actual removal of the interface would break it. Surely discouraging use of something would not. Commented Apr 16, 2022 at 11:53
  • Deprecation to me means "do not use this because there is something better". Commented Apr 16, 2022 at 11:54
