31

Is it possible to have the system preemptively swap out inactive pages (vm.swappiness), but invoke the oom-killer when the system runs out of RAM (as opposed to running out of total memory, i.e. RAM plus swap) and is forced to swap?

The ultimate goal is to keep the system from grinding to a halt when it starts thrashing the disk because of major page faults, but still let inactive pages get swapped out.

Another desire would be to configure how much swap memory the system is forced to use before oom-killer triggers. This way the system can dip into swap just a little bit, as long as it doesn't go too far. Or I could set such a threshold to trigger oom-killer before using all the RAM so there will always be room for the file system cache (and thus avoid more disk thrashing).

It doesn't seem like this would be that hard to do. It seems like you could just tell the oom-killer to trigger when the system has X ram used/free. But this is why I'm asking; I don't know.

To clarify, I am not looking to turn off swap or adjust the vm.swappiness parameter.

2
  • 3
    Interestingly enough, it happens even when there is no swap file. Apparently, read-only memory-mapped files (like executables, libraries, perhaps graphic resources) are swapped out instead.
    – WGH
    Commented Jan 13, 2016 at 15:03
  • Facebook's oomd is a user-space daemon designed to kill processes based on overall system throughput (i.e., only when thrashing). But it seems pretty complicated to set up for desktops/workstations (which probably aren't putting tasks in cgroups or containers). Commented Aug 4, 2019 at 18:57

2 Answers

29

I also struggled with that issue. I just want my system to stay responsive, no matter what, and I prefer losing processes to waiting a few minutes. There seems to be no way to achieve this using the kernel oom killer.

However, in user space, we can do whatever we want. So I wrote the Early OOM Daemon ( https://github.com/rfjakob/earlyoom ) that will kill the largest process (by RSS) once the available RAM goes below 10%.
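
To make the policy concrete, here is a minimal Python sketch of the same idea: read MemAvailable from /proc/meminfo and, if it falls below a threshold, kill the process with the largest VmRSS. This is not earlyoom's actual code. The function names and the `threshold_percent` parameter are my own, and it assumes a Linux /proc layout and enough privileges to signal the victim.

```python
import os
import signal


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def available_percent(info):
    """Percentage of total RAM that the kernel reports as available."""
    return 100.0 * info["MemAvailable"] / info["MemTotal"]


def largest_process(rss_by_pid):
    """Return the PID with the largest resident set size."""
    return max(rss_by_pid, key=rss_by_pid.get)


def read_rss_by_pid(proc="/proc"):
    """Collect VmRSS (kB) for every process that reports one."""
    rss = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "status")) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        rss[int(entry)] = int(line.split()[1])
                        break
        except OSError:
            continue  # process exited or is not readable
    return rss


def check_once(threshold_percent=10.0):
    """Kill the largest process if available RAM is below the threshold.

    Returns the killed PID, or None if nothing was done.
    """
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    if available_percent(info) >= threshold_percent:
        return None
    rss = read_rss_by_pid()
    if not rss:
        return None
    victim = largest_process(rss)
    os.kill(victim, signal.SIGKILL)
    return victim
```

Calling `check_once()` from a loop with a short sleep gives roughly the behaviour described above. A real daemon would also want to lock itself into memory and avoid picking itself as the victim.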

Without earlyoom, it has been easy to lock up my machine (8GB RAM) by starting http://www.unrealengine.com/html5/ a few times. Now, the guilty browser tabs get killed before things get out of hand.

1
  • 1
    Thanks, that's exactly what I was looking for. I can now keep running column -t -s, on some huge csv files and let earlyoom kill it when that's not possible, before noticing any unresponsiveness.
    – henfiber
    Commented Jun 24, 2015 at 5:48
3

This sounds like an overly elaborate solution. I would suggest (and I do this on machines I set up which don't need to hibernate) simply allocating a small amount of swap space (128-256MiB). This way the kernel can swap some pages out, but the OOM-killer gets invoked before things get bad.

If you really want to do this, I think you'll need to write your own script/program which monitors swap usage and invokes the OOM-killer using the Magic SysRq key (which can be done programmatically by writing to /proc/sysrq-trigger).
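
A rough Python sketch of such a monitor, under some assumptions: swap usage is taken as SwapTotal minus SwapFree from /proc/meminfo, the 256 MiB limit and all function names are only illustrative, and the script runs as root. Writing the character `f` to /proc/sysrq-trigger is the SysRq command that asks the kernel to run the OOM-killer.

```python
def swap_used_kb(meminfo_text):
    """Return swap in use (kB), computed as SwapTotal - SwapFree."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values["SwapTotal"] - values["SwapFree"]


def should_trigger(meminfo_text, limit_kb):
    """True once swap usage exceeds the configured limit."""
    return swap_used_kb(meminfo_text) > limit_kb


def invoke_oom_killer(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to run the OOM-killer (SysRq 'f'); needs root."""
    with open(trigger_path, "w") as f:
        f.write("f")


def poll_once(limit_kb=256 * 1024):
    """Check swap usage once and invoke the OOM-killer if it is over the limit."""
    with open("/proc/meminfo") as f:
        text = f.read()
    if should_trigger(text, limit_kb):
        invoke_oom_killer()
        return True
    return False
```

Run `poll_once()` every second or so from a loop. Note that this measures how much swap is in use, not how hard the system is swapping, so it cannot tell preemptive swapping from forced swapping.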

4
  • 2
    I would argue that having a small swap isn't a very elegant solution. You basically end up limiting the usefulness of your swap. What if you have a lot of inactive pages and would benefit from having 10 GB of swap around? I have boxes with ~100 GB of RAM where 10 GB of swap isn't a far-fetched idea. And writing an application to do this in userspace is just open to problems (compared to doing it natively in the kernel).
    – phemmer
    Commented May 13, 2012 at 6:40
  • 2
    Because then you essentially need a mechanism to distinguish "good swapping" from "bad swapping", and that's a difficult algorithm to devise. The amount of swap which is appropriate obviously depends on the amount of RAM and the workload you're running, so if 10GiB is appropriate for your machines then allocate that :-)
    – mgorven
    Commented May 13, 2012 at 7:13
  • 1
    Why would that be difficult? There are only 2 types of swapping: preemptive swapping due to vm.swappiness, and forced swapping due to running out of RAM. All that needs to happen is for the kernel to trigger the oom-killer when it is forced to swap. And 10 GB also leaves tons of space for forced swapping to thrash the disk.
    – phemmer
    Commented May 13, 2012 at 16:12
  • 1
    For a modern solution, you want to subscribe to memory pressure events from the kernel and listen for the level you want, or do some heuristic measurements when you receive pressure events. The hard part is to accurately decide when the runtime performance is getting low enough to warrant killing a process. I've written a test program that shows that e.g. I can notice high latency for the GUI whenever mmap() performance slows down even 25% despite the system still having gigs of MemAvailable. Also note that any diagnostics you run may cause more slowdown for the system! Commented Feb 17, 2023 at 15:08
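
One way to get the memory pressure events mentioned in the last comment is the kernel's PSI interface in /proc/pressure/memory. This is my assumption about which mechanism is meant; the cgroup v1 memory.pressure_level notifications are another. The Python sketch below parses the PSI averages and registers a PSI trigger to wait for a stall. The function names and the 150 ms per 1 s threshold are illustrative, and it assumes a kernel built with PSI and trigger support, and usually root.

```python
import os
import select


def parse_pressure(text):
    """Parse /proc/pressure/memory-style text into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def wait_for_pressure(stall_us=150_000, window_us=1_000_000,
                      path="/proc/pressure/memory"):
    """Block until tasks stalled on memory for stall_us within a window_us window."""
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        # Register the trigger: "<some|full> <stall µs> <window µs>".
        os.write(fd, f"some {stall_us} {window_us}\0".encode())
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        for _, event in poller.poll():  # blocks until the kernel signals
            if event & select.POLLERR:
                raise OSError("PSI trigger source is gone")
        return True
    finally:
        os.close(fd)
```

`wait_for_pressure()` only tells you that a stall happened. Deciding whether it is bad enough to kill something is still the hard part described in the comment.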
