34

I am using Debian sid, hard drive formatted with ext4, running on linux 3.1

I remember on previous linux versions (maybe before 3.0), if I run out of memory, and swap is not enabled, programs will usually crash. This is perfect for my environment: simple web browsing with no critical operations. That is, if I accidentally run across a bad website which uses up too much memory, it just crashes without rendering my terminal unusable.

But in my current setup, the computer hangs with violent I/O throughput in the background. iotop reveals kswapd0 to be the culprit, which means it is due to swapping. After using swapon -s to determine any swaps that were enabled, I used swapoff -a to disable all swaps and swapon -s again to confirm that all swaps were disabled.
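For anyone repeating this check: swapon -s prints the same table the kernel exposes in /proc/swaps, so a script can count active swap areas directly. This is a minimal sketch, assuming a Linux /proc layout; active_swap_count is a hypothetical helper name, not an existing tool.

```shell
# Count active swap areas in a /proc/swaps-style table.
# The first line of the table is a header; every further non-empty line
# describes one active swap area. The file argument is optional and exists
# mainly so the function can be tried on a saved copy.
active_swap_count() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "${1:-/proc/swaps}"
}
```

Calling active_swap_count with no argument after swapoff -a should print 0 if all swap really was disabled.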

Then I tried maximizing my memory usage again. Alas, the behavior I expected didn't happen. Instead, kswapd0 tries over and over to swap out the RAM and fails as there is no swap space. Because it never gives up, my computer is locked in eternal I/O heavy freeze, bad for my disk's health.

Am I doing something wrong in trying to swapoff -a? Why is the behavior different than what it used to be (probably pre-3.0 times)?

11
  • That doesn't really make sense. Doing the swapoff -a itself, if there was stuff in the swap, will generate a lot of I/O (and can result in processes getting killed if there is not enough real RAM available). Are you sure it's not the swapoff -a that caused the I/O "storm"?
    – Mat
    Commented Nov 15, 2011 at 11:40
  • 1
    I suppose it is enough to comment the fstab line about swap. Try if the behavior is the same.
    – enzotib
    Commented Nov 15, 2011 at 11:48
  • @Mat swapoff -a should disable swap permanently, meaning it should stay disabled after next reboot. I confirmed this. Yet, I/O "storm" still happens during the session after next reboot. For the record, I/O "storm" didn't happen at the moment I did swapoff -a because swap was 0 at that time.
    – syockit
    Commented Nov 15, 2011 at 11:48
  • 10
    @syockit: swapoff -a is not permanent.
    – Mat
    Commented Nov 15, 2011 at 11:51
  • 1
    Database loading got to about 15% in 14 hours. I turned off swap, and on the next attempt it got to 40% in 4 hours. Admittedly, the server is underpowered and low on RAM, but without swap turned on openSUSE works much faster for this one process. The OS's opinion of "better" and mine differ dramatically during a simple MySQL database load. I commented out the swap drive in /etc/fstab and rebooted. Commented Jun 23, 2019 at 16:13

6 Answers

19

Disabling swap won't do what you want. You will still get violent I/O throughput, but it will be of clean pages rather than dirty ones.

With no swap, the system will shrink the cache of clean (unmodified) pages to near zero, because those are the only pages it can evict from physical memory. It can only evict dirty (modified) pages from memory by writing them to swap; with no swap, it has no way to evict dirty pages.

As you run low on physical memory, each process will have to load its code pages from disk as it evicts the previous process's code pages. The result will be violent thrashing and excessive work done by the swap subsystem.

This is a special case of a very important principle: For a well-designed system, you can't make it run better by reducing its choices. Linux is a well-designed system. Removing swap just gives it fewer choices, so it's not surprising that it behaves worse.

16
  • 2
    @syockit If you disable paging, you can't run any programs. Paging is the mechanism by which files are read in when mapped into memory. Commented Nov 16, 2011 at 16:39
  • 3
    @psusi : Clean pages will not be reduced to a minimum when you have swap. It will instead swap out dirty, anonymous pages that haven't been recently used. Of course, either way you'll get violent thrashing eventually if the working set exceeds physical memory. The point is, with or without swap, you will get lots of violent thrashing before you actually run out of memory. The difference is, with swap the violent thrashing will be swapping (dirty pages, write and read). Without swap, the violent thrashing will be code faults (clean pages, read only). Commented Nov 28, 2011 at 4:57
  • 2
    You're missing the point; this will only happen without swap in a very narrow sweet spot where 99% of memory is allocated. As soon as it hits 100% (which likely happens pretty fast), the runaway process is killed, freeing up lots of memory. With swap, you thrash heavily for a very long time before you exhaust both RAM and swap, and only then is the process killed.
    – psusi
    Commented Nov 28, 2011 at 15:17
  • 3
    @DavidSchwartz, the narrow sweet spot is the 95%+ usage window. A runaway process will quickly grow to 100% and be killed. So yes, you will purge your disk cache, but the runaway process is killed quickly and the system returns to having plenty of free memory. This is much better than when you have swap enabled, in which case the system runs at 95% and keeps moving more and more out to swap, hammering away at the disk the whole time, and only gives up and kills the runaway process once swap is also exhausted.
    – psusi
    Commented Feb 2, 2012 at 23:27
  • 2
    @psusi: You are correct if the concern is a runaway process that rapidly blows up in memory consumption. But that's not what the OP is talking about, which is a process that consumes excessive, but not unbounded or massively excessive, memory. As it grows through the large sweet spot (where the cache is squeezed) it will grow more and more slowly as the system thrashes. Commented Feb 2, 2012 at 23:44
15

A better solution than turning off swap, which will at best cause random processes to be killed when memory runs low, is to set the per-process data segment limit for processes that pull stuff off the net. This way a runaway browser will hit the limit and die, rather than cause the whole system to become unusable. For example, from the shell:

(ulimit -d 400000; firefox) &

The number after -d is in kilobytes. You should experiment with this on your system to choose the best value for your browsing habits. The parentheses cause a subshell to be created; the ulimit command only affects that shell and its children, isolating its effects from the parent shell.
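The same idea can be wrapped in a small helper so the limit never leaks into the calling shell. This is only a sketch: run_limited is a hypothetical name, and 400000 is just the value from the example above, not a recommendation.

```shell
# Run a command under a data-segment limit (in kilobytes) without changing
# the limits of the calling shell. The function body is a subshell, so the
# ulimit call only affects that subshell and the command it execs.
run_limited() (
    limit_kb=$1
    shift
    ulimit -d "$limit_kb" || exit 1
    exec "$@"
)

# Example (not run here): run_limited 400000 firefox &
```

Because the body runs in a subshell, running ulimit -d in the parent shell afterwards should still report the original value.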

5
  • Will this work for chromium, say, where we have a bunch of chromium processes using small chunks of memory?
    – jberryman
    Commented Jun 25, 2015 at 16:37
  • @jberryman No, the memory limits are per-process rather than per-user.
    – Kyle Jones
    Commented Jun 26, 2015 at 14:36
  • Is there a way to send it a specified signal (e.g., SIGHUP) when it reaches the memory limit?
    – Geremia
    Commented Feb 24, 2017 at 16:16
  • 1
    @Geremia No. The brk and sbrk system calls stop working, which will make most things curl up and die.
    – Kyle Jones
    Commented Feb 24, 2017 at 16:23
  • 1
    If you want to go with manual tuning, I would suggest using memory cgroup instead of ulimit because with memory cgroup you can set limit for the whole process group and can configure the memory allocating process to stop and your user mode policy process can decide what to do (e.g. send some signals, select process to be killed, raise the memory limit on the fly). See kernel.org/doc/Documentation/cgroup-v1/memory.txt and kernel.org/doc/Documentation/cgroup-v2.txt for details. Commented Jan 8, 2019 at 13:04
5

To make sure that swap is not used, you'd be better off preventing any swap being added at boot. This can be done, depending on the system, by disabling the swap boot service or just commenting out the swap entry in /etc/fstab.
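If you go the /etc/fstab route, it may help to preview the change before touching the real file. The following is a sketch, assuming the usual whitespace-separated fstab layout where the third field is the filesystem type; comment_out_swap is a hypothetical helper that only prints the modified text and never edits anything in place.

```shell
# Print an fstab-style file with every active swap entry commented out.
# A line is treated as a swap entry when its third field is "swap" and it
# is not already a comment. Nothing is written back to the input file.
comment_out_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Example (not run here): comment_out_swap /etc/fstab | diff /etc/fstab -
```

Reviewing the diff first and keeping a backup copy of /etc/fstab is the safer way to apply the result.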

As far as your hangup is concerned, the stop() function in /etc/init.d/swap might give a clue:

stop()
{
       ebegin "Deactivating swap devices"

       # Try to unmount all tmpfs filesystems not in use, else a deadlock may
       # occure. As $RC_SVCDIR may also be tmpfs we cd to it to lock it
       cd "$RC_SVCDIR"
       umount -a -t tmpfs 2>/dev/null

       case "$RC_UNAME" in
               NetBSD|OpenBSD) swapctl -U -t noblk >/dev/null;;
               *)              swapoff -a >/dev/null;;
       esac
       eend 0
}

Notice the part about deadlock. You can try doing umount -a -t tmpfs yourself before turning swap off.


Edit:

You might also be able to achieve your goal by modifying sysctl settings (see this question).

5
  • I don't have swap in init.d, nor do I have it on fstab, but I do have /etc/init.d/mountoverflowtmp that mounts tmpfs for emergency log writes. Does the swap daemon use tmpfs too?
    – syockit
    Commented Nov 15, 2011 at 11:58
  • You might have it enabled elsewhere - do grep -RF swap /etc/ if you wish to find it. But to disable a service, you'd use a command like service (IIRC; I don't use Debian myself). Commented Nov 15, 2011 at 12:02
  • 1
    Swap itself does not use tmpfs, because tmpfs is an in-memory (RAM) filesystem. But other services/programs that use tmpfs might rely on swap in a special manner. I don't really know, but it might have something to do with caching or a special way in which tmpfs driver claims access to swap space. Commented Nov 15, 2011 at 12:04
  • There's something about how Linux handles virtual memory that I don't understand. I've disabled swap in most ways possible: via swapoff, and via vm.swappiness=0. Yet kswapd0 still runs! I wonder if this is a regression from the 2.4 days…
    – syockit
    Commented Nov 15, 2011 at 15:36
  • 5
    @syockit It's expected behavior. The system is still swapping clean pages (pages that contain copies of file data). It requires no swap space to swap clean pages, since they can be read back from sources other than swap. Commented Nov 15, 2011 at 16:58
4

It is better to comment out the swap partition entry in /etc/fstab than to run swapoff -a after each boot.

I have the same issue with kswapd0 on my hardware.

Tuning the vm.swappiness system parameter does not help in my case:

sysctl -w vm.swappiness=0

I googled and read a lot of posts and mailing lists, and now I think that this is a kernel bug.

When there is no active swap partition and free memory drops below some threshold (about 300MB in my case), the system becomes unresponsive due to kswapd0 madness.
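To see how close a machine is to a threshold like that, the figures in /proc/meminfo can be read from a script. This is a sketch under assumptions: meminfo_kb is a hypothetical helper, and the MemAvailable field only exists on kernels from 3.14 onward, so older kernels would need MemFree or another field instead.

```shell
# Print the value (in kB) of one field from a /proc/meminfo-style file.
# Returns a non-zero status if the field is not present.
# Usage: meminfo_kb FIELD [FILE]   e.g. meminfo_kb MemAvailable
meminfo_kb() {
    awk -v key="$1:" '
        $1 == key { print $2; found = 1; exit }
        END { if (!found) exit 1 }
    ' "${2:-/proc/meminfo}"
}
```

For example, comparing the output of meminfo_kb MemAvailable against 300000 would flag roughly the 300MB point described above.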

It is probably only reproducible with a particular configuration and under particular conditions.

Some people solved it by reinstalling the system and repartitioning, others by building a custom kernel with kswapd0 disabled.

1
  • 2
    If kswapd0 goes mad and you don't have swap activated you're out of RAM. Your choices are OOM Killer or kswapd0. Linux goes with kswapd0 because the kernel assumes that it's more important to finish slowly than to abort the process. For casual humans, the threshold where kernel thinks that enough forward progress still does happen is already glacially slow and nearly anybody would rather select OOM Killer. Commented Aug 31, 2018 at 10:35
4

On my system (debian sid 2016-11-15), I did this:

  1. disable the swap now:

    swapoff -a
    
  2. comment out the line with the swap partition in /etc/fstab (you may not need this; maybe step 3 alone, without step 2, will work for you)

    #### #UUID=c6ddbc95-3bb5-49e1-ab25-b1c505e5360c none            swap    sw              0       0
    
  3. disable the mounting of swap in systemd (Note, wrap the unit name in quotes in case the unit name has backslash characters):

    systemctl --type swap
    systemctl stop "dev-\x2da821.swap"
    systemctl mask "dev-\x2da821.swap"
    

That seems to do the trick.
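Typing escaped unit names by hand is error-prone, so the names can be taken from systemctl's own listing instead. This is a sketch, not something tested on the system above: swap_unit_names is a hypothetical filter, and it assumes the unit name is the first column of the --no-legend --plain output.

```shell
# Read `systemctl list-units --no-legend --plain` output on stdin and print
# only the names of swap units (first column, ending in ".swap").
# awk passes backslash sequences such as \x2d through unchanged.
swap_unit_names() {
    awk '$1 ~ /\.swap$/ { print $1 }'
}

# Example (needs root and systemd, not run here; review the list first):
# systemctl list-units --type=swap --all --no-legend --plain | swap_unit_names |
#     while IFS= read -r unit; do
#         systemctl stop "$unit" </dev/null
#         systemctl mask "$unit" </dev/null
#     done
```

Using read -r in the loop matters here, because without it the shell would strip the backslashes from the unit names.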

1

the computer hangs with violent I/O throughput in the background. iotop reveals kswapd0 to be the culprit

I've found one way (so far) to avoid that. If you want to test it and see how it does on your system, see the kernel patch inside this question. Basically, it doesn't evict Active(file) pages (at least) when under memory pressure, so the disk thrashing (constant reading) is reduced to almost nothing and the OOM killer is allowed to trigger within 1 second, instead of freezing the OS for what seems like forever (or at least for many minutes). I am hoping that actual programmers (which I am not) will improve the patch and turn it into a real solution, now that they can see that it works in these situations.

2
  • is this kernel patch mainlined already? Commented Nov 23, 2018 at 18:22
  • @humanityANDpeace probably not, because it's not that good(as I am not a programmer), however I did run into some issue with it, such as: sometimes, depending on workload, with this patch, you can run out of memory in cases in which without this patch you wouldn't have and thus OOM-killer will kill Xorg and xfwm4, UNLESS I run echo 1 | sudo tee /proc/sys/vm/drop_caches when Active(file): (of /proc/meminfo) is over 2GB (on a 16G RAM system) -it can go to max 4G
    – user306023
    Commented Nov 28, 2018 at 19:13
