
By a process tree I mean the process and everything it executes in any way.

I tried /usr/bin/time -v, but the results are entirely wrong. For example, running npm test in one of our projects with 14GiB of free RAM and 8GiB of free swap results in the OOM killer starting to kill my applications (most commonly a browser and an IDE). time reports that only 800MiB was used, even though the real memory consumption must have been very high, over 20GiB...

1 Answer

First, I would find the process with smem. For example, smem -tas uss gives an overview.

  • -t ... shows totals
  • -a ... auto adjusts the column width
  • -s uss ... sorts the result by the uss column

To see details per process, the best way is to use pmap. For detailed information, use the -X switch. To get everything the kernel provides, use -XX, which is usually overkill.

To monitor PID 3120 with a 2-second refresh:

watch -n 2 pmap -X 3120
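If you would rather keep a running maximum than watch the numbers scroll by, a sampling loop is one option. The following is only a minimal sketch under a few assumptions: it runs on Linux, it reads the VmRSS field from /proc/PID/status rather than parsing pmap output, and the function name sample_peak_rss and its arguments are made up for illustration. It watches a single PID, not the whole tree, and because it samples, it can miss short spikes between two readings.

```shell
# Hypothetical helper: poll VmRSS of one PID and remember the largest sample.
# Usage: sample_peak_rss PID [INTERVAL_SECONDS] [MAX_SAMPLES]
# Prints the largest VmRSS value seen, in kB. Linux only (reads /proc).
sample_peak_rss() {
    pid=$1
    interval=${2:-2}      # default matches the 2-second refresh used with watch
    max_samples=${3:-0}   # 0 means: keep sampling until the process exits
    peak=0
    n=0
    while [ -r "/proc/$pid/status" ]; do
        # VmRSS is reported in kB; the line is missing for zombies and kernel threads
        rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null) || rss=
        if [ -n "$rss" ] && [ "$rss" -gt "$peak" ]; then
            peak=$rss
        fi
        n=$((n + 1))
        if [ "$max_samples" -gt 0 ] && [ "$n" -ge "$max_samples" ]; then
            break
        fi
        sleep "$interval"
    done
    echo "$peak"
}
```

For example, sample_peak_rss 3120 2 would sample PID 3120 every two seconds until it exits and then print the largest resident set size it observed.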

Edit: To actually get a peak

The above helps with monitoring, but it does not show the actual peak. That slipped my mind.

I would personally use valgrind with the massif tool.

valgrind --tool=massif --pages-as-heap=yes --massif-out-file=evolution_massif.out evolution; grep mem_heap_B evolution_massif.out | sed -e 's/mem_heap_B=\(.*\)/\1/' | sort -g | tail -n 1

Explanation:

  • --pages-as-heap=yes ... tells massif to measure all memory pages instead of just the heap
  • --massif-out-file ... the output file for the massif tool
  • evolution ... the application that should be monitored

The next part is there to find out the maximum number recorded.

The grep searches for mem_heap_B occurrences. sed -e strips the mem_heap_B= prefix so that only the numeric value remains. Then we sort the values with sort -g, which is a general numeric sort, and take the biggest number with tail -n 1, which returns the last line of the sorted numbers.

After the application has terminated, the pipeline prints a single number, which is the memory peak in bytes.
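To try the extraction step on its own, without running valgrind, here is a small sketch. The function name massif_peak is hypothetical, and the sample file only imitates the snapshot= and mem_heap_B= lines of a Massif output file, using the first three totals from the ms_print table in this answer. It uses sort -n instead of sort -g, which should be enough here because the values are plain integers.

```shell
# Hypothetical helper: print the largest mem_heap_B value in a Massif output file.
massif_peak() {
    grep '^mem_heap_B=' "$1" | sed -e 's/^mem_heap_B=//' | sort -n | tail -n 1
}

# Build a tiny stand-in for a Massif output file (not real valgrind output).
sample=$(mktemp)
cat > "$sample" <<'EOF'
snapshot=0
mem_heap_B=16384
snapshot=1
mem_heap_B=393547776
snapshot=2
mem_heap_B=398592800
EOF

massif_peak "$sample"   # prints 398592800
rm -f "$sample"
```

On a real run you would call it as massif_peak evolution_massif.out once valgrind has finished.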

Check the memory peak logging

To display the output file evolution_massif.out, you can use ms_print, the post-processing tool for Massif.

It is simply:

ms_print evolution_massif.out

The output should look like this (this is the first page of the output), where you can see which snapshot was the peak, here 10 (peak):

--------------------------------------------------------------------------------
Command:            evolution
Massif arguments:   --pages-as-heap=yes --massif-out-file=massif.out
ms_print arguments: massif.out
--------------------------------------------------------------------------------


    GB
10.09^        #                                                               
     |        #                                                               
     |        #                                                               
     |        #                                                               
     |        #                                                               
     |        #                                                               
     |        #      @:::::@::::::::::::::::@@:::::@:::::@::::::@::::@:::::@::
     |        #      @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::::@:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |        #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |     @  #::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
     |  :::@::#::::: @:::::@: :::::::: :::: @@:::::@:::::@::::::@::::@:::::@::
   0 +----------------------------------------------------------------------->Gi
     0                                                                   8.927

Number of snapshots: 97
 Detailed snapshots: [7, 10 (peak), 17, 24, 43, 44, 50, 60, 70, 80, 90]

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0           16,384           16,384             0            0
  1    149,772,704      393,547,776      393,547,776             0            0
  2    243,902,287      398,592,800      398,592,800             0            0
  3    396,613,298      558,157,704      558,157,704             0            0
  4    504,752,503      638,138,760      638,138,760             0            0
  5    604,812,936      639,894,808      639,894,808             0            0
...

Edit to add all children:

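To include all descendants (children), add the --trace-children=yes option to the valgrind command. Each traced process writes its own Massif output, so with a fixed --massif-out-file name the files overwrite each other; put %p (the process ID) in the file name, for example massif.out.%p, to get one file per process.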
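With one output file per process, the peak of each process can be listed in one go. This is a sketch under assumptions: the function name massif_peaks is hypothetical, the files are assumed to be named massif.out.PID, and npm test in the commented invocation merely stands in for whatever command you want to measure. Note that these are per-process peaks. They can occur at different times, so adding them up gives an upper bound for the tree rather than its true simultaneous peak.

```shell
# Hypothetical helper: for each Massif output file given, print "<peak bytes> <file>",
# sorted so that the process with the largest peak comes last.
massif_peaks() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        peak=$(grep '^mem_heap_B=' "$f" | sed -e 's/^mem_heap_B=//' | sort -n | tail -n 1)
        # Files without any snapshot are reported as 0
        printf '%s %s\n' "${peak:-0}" "$f"
    done | sort -n
}

# Assumed invocation (requires valgrind; "npm test" is a placeholder command):
#   valgrind --tool=massif --pages-as-heap=yes --trace-children=yes \
#       --massif-out-file=massif.out.%p npm test
#   massif_peaks massif.out.*
```

The last line of the output then names the single process with the highest peak, which is often the one worth inspecting with ms_print.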

  • This would dump a whole lot of numbers to the screen every two seconds. How could you modify this to give the user the peak RAM usage, which is what they are actually asking about?
    – Kusalananda
    Commented Feb 6, 2023 at 10:13
  • Also note that the question is about a process tree (i.e. a process and all its children and descendants), not about a single process.
    – CAFxX
    Commented Feb 6, 2023 at 13:36
  • @CAFxX If you are talking about valgrind, see my edit. To measure all the descendants you simply add the --trace-children=yes parameter. The Kusalananda comment was before I added the valgrind information.
    – tukan
    Commented Feb 6, 2023 at 14:05
