Even though the topic is quite old, I want to share another project that emerged from the cgroups Linux kernel feature.
https://github.com/gsauthof/cgmemtime:
cgmemtime measures the high-water RSS+CACHE memory usage of a process and its descendant processes.
To do so, it puts the process into its own cgroup.
For example, process A allocates 10 MiB and forks a child B, which allocates 20 MiB and in turn forks a child C, which allocates 30 MiB. All three processes share a time window in which their allocations result in corresponding RSS (resident set size) memory usage.
The question now is: How much memory is actually used as a result of running A?
Answer: 60 MiB
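To make the arithmetic concrete, here is a minimal sketch that sums the allocations that are alive at the same time and reports the peak. It is not part of cgmemtime; the function name `peak_concurrent_mib` and the start/end times in the timeline are made up for illustration, and only the 10/20/30 MiB sizes come from the example above.

```python
def peak_concurrent_mib(intervals):
    """Return the peak total of allocations that are alive at the same time.

    intervals: list of (start, end, mib) tuples; end is exclusive.
    """
    events = []
    for start, end, mib in intervals:
        events.append((start, mib))   # allocation becomes resident
        events.append((end, -mib))    # process exits, memory is released
    # Sorting tuples puts releases (negative deltas) before allocations
    # that happen at the same instant, so back-to-back runs do not add up.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical timeline in seconds: A starts first, B and C are forked
# shortly after, and all three are still alive in the same time window.
timeline = [(0, 6, 10), (1, 6, 20), (2, 6, 30)]
print(peak_concurrent_mib(timeline))  # 60
```

If the three processes ran one after another instead of overlapping, the same function would report only the largest single allocation, which is why the shared time window matters for the 60 MiB answer.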
cgmemtime is the tool to answer such questions.
Usage examples would be:
$ sudo ./cgmemtime --setup -g <myusergroup> --perm 775
$ ./cgmemtime ./testa x 10 20 30
Parent PID is 27189
Allocating 10 MiBs
New Child: 27193
Allocating 20 MiBs
New Child: 27194
Allocating 30 MiBs
Child user: 0.000 s
Child sys : 0.005 s
Child wall: 6.006 s
Child high-water RSS : 11648 KiB
Recursive and acc. high-water RSS+CACHE : 61840 KiB
$ ./cgmemtime python -c 'print range(100000)[48517]'
48517
Child user: 0.014 s
Child sys : 0.014 s
Child wall: 0.029 s
Child high-water RSS : 9948 KiB
Recursive and acc. high-water RSS+CACHE : 5724 KiB
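For comparison, a per-process high-water RSS similar in kind to the "Child high-water RSS" line can be read from the kernel's resource accounting without any cgroup setup. The sketch below uses Python's `resource.getrusage`; the helper name `child_high_water_rss_kib` is made up, and whether cgmemtime obtains its per-child figure exactly this way is an assumption on my part.

```python
import resource
import subprocess
import sys


def child_high_water_rss_kib(argv):
    """Run argv, wait for it, and return ru_maxrss for waited-for children.

    On Linux ru_maxrss is reported in KiB. It is the maximum over single
    processes, not a sum over the process tree, so it cannot answer the
    "A plus B plus C" question from the example above.
    """
    subprocess.run(argv, check=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_maxrss


# Measure a trivial child: the current Python interpreter running "pass".
print(child_high_water_rss_kib([sys.executable, "-c", "pass"]), "KiB")
```

Note that `RUSAGE_CHILDREN` covers every child the calling process has waited for so far, so the value only describes the command you just ran if the measuring script has not run larger children earlier.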