
I just took a look at the output of top and it showed me (amongst other processes) the following:

[output from top]

As one can see, I have ten processes consuming approximately 10GB each, so 100GB in total. The computer, however, has only 64GB of memory, as can be seen in the second line from the top, of which currently only about 22GB are used.

Now, the solution to this puzzle: the test_mpi.out processes share a large amount of memory amongst each other. Since I have the source code, I know that the actual total memory consumption is about 10GB.

The computer uses about 12GB of memory when idle, which adds up to the reported 22GB.

What I don't understand is how top knows that only 22GB of memory are actually in use. Based on the columns it displays (VIRT, RES and SHR), top should not be able to figure this out. It would be great if someone could shed some light on this.

EDIT: This is running on Red Hat Linux.

EDIT: Thanks to Michael Homer, I now know that top takes this information from /proc/meminfo. However, I was rather hoping for an answer that would explain how I can determine that all the test_mpi.out processes consume only 10GB in total (instead of the 100GB suggested by naively adding up the output of top). I tried looking at /proc/PID/status, but I didn't find any clues on how to determine the actual memory usage of several processes that share a large memory segment (supposing I didn't have the source code).

  • This is going to be platform-dependent (so you may want to edit for a particular system), but it's almost certainly just asking the kernel how much memory is used and putting that on the screen for you. It doesn't have to figure it out. If you're on Linux, for example, cat /proc/meminfo will tell you that whole line and then some. Commented Jan 12, 2016 at 0:42
  • Your edited question seems to have nothing to do with top or the original question at all. Commented Jan 12, 2016 at 1:11
  • Yes, that is true; I apologize for not clearing this up from the beginning. I have edited the title to reflect the new direction of the question. I am leaving the output of top in place, as it serves as a good example to illustrate the problem.
    – ftiaronsem
    Commented Jan 12, 2016 at 1:48

1 Answer


You can see that the SHR column displays the same amount of memory as RES, which means that practically 100% of that particular task's resident memory consists of shared memory segments. Even that does not give you full insight, though, as RES is just the amount of memory that is not currently paged out.

To figure out the actual memory consumption of a process, try a more fine-grained method such as ps(1) with selected fields (look into the -o option and the STANDARD FORMAT SPECIFIERS section of its manual page).
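For example, a quick comparison of those size estimates could look like the following sketch (the -C option selects processes by command name; vsz, rss and size are standard format specifiers, all reported in kB):

    # Show virtual size, resident set size and the kernel's rough "size"
    # estimate for every test_mpi.out process.
    ps -C test_mpi.out -o pid,vsz,rss,size,comm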

Do note that, apart from the size field (which is a very rough estimate), the total memory footprint of a process is difficult to assess at face value, precisely because some of its memory may be shared with other processes and we cannot tell straight away how many of those pages are dirty (see the paragraph on SIZE and RSS in the NOTES section of the manual page).
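Another way to account for the sharing (a sketch, assuming a Linux kernel that exposes the Pss field in /proc/PID/smaps and that pgrep is available) is to sum the proportional set size, which divides every shared page evenly among the processes mapping it, so the shared segment is counted only once across all processes:

    # Sum the Pss values (in kB) for each test_mpi.out process; adding up these
    # per-process totals counts shared pages only once overall.
    for pid in $(pgrep -x test_mpi.out); do
        awk -v pid="$pid" '/^Pss:/ {kb += $2} END {print pid, kb " kB"}' "/proc/$pid/smaps"
    done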

To be able to tell that those processes are actually using the same shared memory segments, you need to look at the output of the ipcs(1) command and check for shared memory segments with a suspiciously high nattch (number of attaching processes) value.
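For instance, a rough filter on the nattch column (assuming the util-linux ipcs output layout, where nattch is the sixth column) could look like this:

    # List SysV shared memory segments attached by more than one process:
    # print the shmid, the segment size in bytes and the attach count.
    ipcs -m | awk '$6 ~ /^[0-9]+$/ && $6 > 1 {print "shmid:", $2, "bytes:", $5, "nattch:", $6}'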

Then look at those segments' detailed information using ipcs -m -i <id>, which will display the actual processes attached.

If this sounds too complicated, look at it as a challenge exercise: write a script that, given a process ID, prints the list of SHM IDs that process is attached to, and the list of PIDs it is sharing those SHM IDs with. ;)
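A rough sketch of such a script (the file name and error handling are illustrative only; it assumes a Linux /proc where SysV segments appear in /proc/<pid>/maps with a /SYSV... pathname and the shmid in the inode column, and it will not notice POSIX shared memory mapped from /dev/shm):

    #!/bin/sh
    # shmpeers.sh (hypothetical name): given a PID, list the SysV shared memory
    # IDs it has attached and the other PIDs that map the same segments.
    pid=$1
    [ -r "/proc/$pid/maps" ] || { echo "cannot read /proc/$pid/maps" >&2; exit 1; }

    # Shared memory IDs attached by the target process (inode column of /SYSV mappings).
    shmids=$(awk '$6 ~ /^\/SYSV/ {print $5}' "/proc/$pid/maps" | sort -u)

    for id in $shmids; do
        echo "SHM ID $id is also mapped by:"
        for dir in /proc/[0-9]*; do
            other=${dir#/proc/}
            [ "$other" = "$pid" ] && continue
            if awk -v id="$id" '$6 ~ /^\/SYSV/ && $5 == id {found=1} END {exit !found}' "$dir/maps" 2>/dev/null; then
                echo "  PID $other"
            fi
        done
    done

Invoked as shmpeers.sh <pid>, it walks all of /proc once per segment, so it is slow but does not depend on what ipcs chooses to report.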

  • Do note - after kicking myself in the teeth, I have to point out that even ipcs will not display all process IDs attached to a particular SHM ID. See stackoverflow.com/questions/5658568/… for further discussion. Commented Jan 12, 2016 at 1:54

