
kvm version: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6+deb7u3, Debian), Copyright (c) 2003-2008 Fabrice Bellard
libvirtd version: libvirtd (libvirt) 0.9.12.3
debian version: 7.5

I'm running multiple VMs on a machine with 16GB of RAM; together they use ~9GB of RAM.

Every now and then the Linux OOM killer comes along and kills a process. I guess it chooses the process using the most memory - in this case a 6GB Windows VM:
[431215.778365] Out of memory: Kill process 25086 (kvm) score 192 or sacrifice child

IMHO the machine shouldn't be in an OOM situation, as there are ~6.6GB of cached memory available. You can see the memory distribution and the resulting OOM kill here:

[image: memory distribution]

I have now set oom_adj to -17 for the PIDs of the kvm processes, so the OOM killer won't kill them.
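For reference, I did it roughly like this (using the PID from the log above; this is a one-off fix, since the PID changes whenever the VM restarts):

echo -n '-17' > /proc/25086/oom_adj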

But I still fail to understand why the kernel thinks it has to kill a process instead of simply freeing some of the cached memory.

  • Can anyone explain why this happens?
  • Can you tell me how I can prevent the OOM killer from killing my kvm processes without knowing their PIDs?
  • Sounds like this bug to me: bugzilla.redhat.com/show_bug.cgi?id=903432
    – slm
    Commented Jul 2, 2014 at 12:59
  • Also check for a full tmpfs; depending on how you check, it may count as cache/buffers even though it can't be freed short of swapping it out. Commented Jul 2, 2014 at 13:57

1 Answer


Just disable the OOM killer for the particular processes with:

# -17 (OOM_DISABLE) exempts these processes from the OOM killer entirely
for p in $(pidof kvm qemu-system-x86_64); do
  echo -n '-17' > /proc/"$p"/oom_adj
done

or, using the newer interface, oom_score_adj.
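A minimal sketch of the same thing via oom_score_adj (per the documentation quoted below, its minimum of -1000 disables OOM killing entirely):

for p in $(pidof kvm qemu-system-x86_64); do
  # -1000 (OOM_SCORE_ADJ_MIN) disables OOM killing for the task
  echo -n '-1000' > /proc/"$p"/oom_score_adj
done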

However, note the badness score in your log:

Out of memory: Kill process 25086 (kvm) score 192 or sacrifice child

In your case the process scored 192, so an adjustment of -192 would already be enough to offset that score.
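If you only want to offset that score rather than disable the killer entirely, something like this sketch would do (25086 is the PID from your log; the value must be reapplied whenever the VM restarts):

# -192 cancels out the logged badness score of 192
echo -n '-192' > /proc/25086/oom_score_adj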

See also Taming the OOM Killer

In any case, you should also investigate what is causing the memory pressure, since otherwise the OOM killer will go on to kill other important processes.

Often the cause is a phenomenon called memory overcommitment; in that case, tune overcommit_memory as described here.
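A minimal sketch of strict overcommit accounting (the ratio is an assumption; vm.overcommit_memory=2 makes the kernel refuse allocations beyond swap plus overcommit_ratio percent of RAM):

# Switch to strict accounting; 90% is an assumed ratio, adjust to taste
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=90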

Source: the kernel's proc filesystem documentation:

oom_adj:

For backwards compatibility with previous kernels, /proc/<pid>/oom_adj may also
be used to tune the badness score.  Its acceptable values range from -16
(OOM_ADJUST_MIN) to +15 (OOM_ADJUST_MAX) and a special value of -17
(OOM_DISABLE) to disable oom killing entirely for that task.  Its value is
scaled linearly with /proc/<pid>/oom_score_adj.

oom_score_adj:

The value of /proc/<pid>/oom_score_adj is added to the badness score before it
is used to determine which task to kill.  Acceptable values range from -1000
(OOM_SCORE_ADJ_MIN) to +1000 (OOM_SCORE_ADJ_MAX).  This allows userspace to
polarize the preference for oom killing either by always preferring a certain
task or completely disabling it.  The lowest possible value, -1000, is
equivalent to disabling oom killing entirely for that task since it will always
report a badness score of 0.
  • First off: oom_adj is deprecated; the new file is oom_score_adj. Also, this doesn't solve the issue that there is cached memory available (alongside 16GB of swap) which is never touched.
    – Momo
    Commented Jul 2, 2014 at 10:32
  • Thanks, amended. Did the score come to 192 as written in the log? What is causing the OOM killer to kill tasks?
    – user55518
    Commented Jul 2, 2014 at 11:16
