4

For example, when I archive a few gigabytes of files (using tar), Linux uses quite a lot of disk cache (and some swap) but never cleans it up after the operation completes. As a result, because there's no free memory left, Linux tries to swap something out of memory, which in turn creates additional load on the CPU.

Of course, I can clean up the caches by running echo 1 > /proc/sys/vm/drop_caches, but isn't it stupid that I have to do that?

It's even worse with swap: there's no command to clean up unused swap. I have to disable and re-enable it completely, which I don't think is a safe thing to do at all.

UPD:

I've run a few tests and found out a few things:

  1. The memory pages swapped out during the archive command are not related to the archived files; it seems to be just the usual swapping-out process, caused by the drop in free memory (because disk caching ate it all), in accordance with swappiness

  2. Running swapoff -a is actually safe, meaning the swapped-out pages are moved back into memory

My current solution is to limit the archive command's memory usage via cgroups (I run a Docker container with the -m flag). If you don't use Docker, the project https://github.com/Feh/nocache might help.
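As a sketch of the cgroup approach described above (the 256 MB limit and the paths are illustrative, not from the original setup; the cgroup prefixes are shown as comments because they need a Docker or systemd host):

```shell
set -eu

# Stand-in data; the real workload archives a few gigabytes.
src=$(mktemp -d)
out=$(mktemp -u).tar.gz
echo "example" > "$src/file.txt"

# With Docker, -m sets the cgroup memory limit for the container:
#   docker run --rm -m 256m -v "$src:/data" alpine tar czf /tmp/out.tar.gz -C /data .
# Without Docker, systemd-run applies the same cgroup controls
# (--user works without root, as noted in the comments below):
#   systemd-run --user --scope -p MemoryMax=256M tar czf "$out" -C "$src" .
tar czf "$out" -C "$src" .

tar tzf "$out"          # lists the archived entries, including ./file.txt
rm -rf "$src" "$out"
```

Capping the job's memory this way keeps tar's page cache from evicting the rest of the system's working set, which is what triggered the swapping in the first place.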

The remaining question is: when will Linux clean up the disk cache, if at all? And if it won't, is it good practice to clean the disk cache manually (echo 1 > /proc/sys/vm/drop_caches)?

chingis
  • 263
  • Nice hack with docker! If you don't specifically want to run inside docker, you could also use a cgroup manually. systemd-run should be able to apply systemd.resource-control properties to any command (these are also implemented using cgroups). As a non-root user, you can use systemd-run --user – sourcejedi Jun 22 '18 at 12:18
  • You are misunderstanding memory reports from the system. Pages used as cache are free memory. It's only used as cache until some process wants it. Basically that's the OS using spare memory to cache stuff. Dropping cached content does no good (that memory was already available). Worse, if any of the dropped data is needed, the system will have to fetch it again. – spectras Aug 26 '18 at 03:53
  • @spectras I know that very well, it does not change the fact that Linux tries to swap out memory because of low "free" memory – chingis Aug 27 '18 at 11:48
  • It does not. It optimizes memory use, usually ahead of time. A frequently-used cache is much more useful than a page that was really allocated, but it was 20 days ago and the process never used it since. The aggressiveness of that optimization is controlled by the vm.swappiness setting, as dolapevich said in his answer. You can disable it entirely by setting it to 0. – spectras Aug 27 '18 at 13:41
  • @spectras I don't want to disable swappiness completely, I need it. My servers get CPU spikes because of swapping out, because of low "free" cache – chingis Aug 27 '18 at 13:53
  • @spectras if someone has any fragment of information about Linux copying or moving pages to swap "ahead of time", I would appreciate their citation. So far I've seen no reference for how this hypothetical behaviour is calculated, or configured, nor any references to kernel code or statements by kernel developers. I have a paragraph on it in this question and would welcome comments: atop shows swout (swapping) when I have gigabytes of free memory. Why? – sourcejedi Aug 02 '19 at 13:51
  • @sourcejedi look up the documentation of vm.swappiness setting. It's nothing complex actually, it's a threshold that triggers preventive caching before it's actually needed. For instance if it's 60, linux will start looking for pages to swap out when memory use goes above 40%. As for the kernel, just read the source yourself, relevant code is in mm/vmscan.c. – spectras Aug 02 '19 at 23:12
  • @sourcejedi sorry, but I have better things to do. Go read the source for accurate info. – spectras Aug 04 '19 at 13:27
  • @spectras the answer does that as well, it just seemed better to start it with the published documentation because that's easier for people to read (and introduces necessary concepts). – sourcejedi Aug 04 '19 at 13:40
  • @sourcejedi I read it, and as someone put it correctly, you answer no then proceed to show that yes, it does. You just disagree on the definition of the word "opportunistic". I am not interested in spending time on debating whether the preliminary thresholds strategy fulfills your definition of the word “opportunistic”. – spectras Aug 04 '19 at 14:32
  • @spectras if you take the example in this question, we may fill ram by reading at disk speeds. E.g. with a 100MB/s disk, 1GB of ram, and a difference between the high watermark and min of 1%, expressing that in terms of a time could give you 0.1 seconds. So I assume "ahead of time" refers to more than that timescale. Instead you're talking about the strategy of balancing reclaim of swappable memory v.s. file-backed memory. As in the question, the lore says "opportunistic swapping" happens "when the system is idle". I assume you at least disagree with that. kswapd runs at default priority. – sourcejedi Aug 04 '19 at 15:28

2 Answers

7

Nitpick: the CPU time used by swapping is not usually significant. When the system is slow to respond during swapping, the usual problem is the disk time.

(1) Even worse with swap, there's no command to clean up unused swap

Disabling and then re-enabling swap is a valid and safe technique, if you want to trigger, and wait for, the swapped memory to be read back in. I just want to say that "clean up unused swap" is not the right description; it's not something you would ever need to do.
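A quick sketch of the technique (the swapoff/swapon pair requires root, and the kernel must have enough free RAM to absorb everything currently in swap, so it is shown commented out):

```shell
# Check how much is currently swapped out before forcing it back in.
# SwapTotal - SwapFree is the amount that would have to fit in RAM:
grep -E 'SwapTotal|SwapFree' /proc/meminfo

# Force all swapped pages back into RAM, then re-enable swap (root only):
#   swapoff -a && swapon -a
```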

The swap usage might look higher than you expected, but that does not mean it is not being used. A page of memory can be stored in both RAM and swap at the same time. There is a good reason for this.

When a swap page is read back in, it is not specifically erased from swap, and the kernel keeps track of it. This means that if the page needs to be swapped out again, and it has not changed since it was written to swap, it does not have to be written out a second time.

This is also explained at linux-tutorial.info: Memory Management - The Swap Cache

If the page in memory is changed or freed, the copy of the page in swap space will be freed automatically.

If your system has relatively limited swap space and a lot of RAM, it might need to remove the page from swap space at some point. This happens automatically. (Kernel code: linux-5.0/mm/swap.c:800)
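You can observe this double accounting directly in /proc/meminfo (the field names follow the kernel's proc documentation):

```shell
# SwapCached counts pages that currently exist both in RAM and in swap;
# such a page can be swapped out again without another write to disk.
grep -E 'SwapCached|SwapTotal|SwapFree' /proc/meminfo
```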

(2) The remaining question is when will Linux clean up disk caching and will it at all? If not, is it a good practice to manually clean up disk cache (echo 1 > /proc/sys/vm/drop_caches)?

Linux cleans up disk cache on demand. Inactive disk cache pages will be evicted when memory is needed.

If you change the value of /proc/sys/vm/swappiness, you can alter the bias between reclaiming inactive file cache, and reclaiming inactive "anonymous" (swap-backed) program memory. The default is already biased against swapping. If you want to, you can experiment with tuning down the swappiness value further on your system. If you want to think more about what swappiness does, here's an example where it might be desirable to turn it up: Make or force tmpfs to swap before the file cache
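For example (the value 10 below is purely illustrative, not a recommendation):

```shell
# Current value; readable without root. The default is 60 on most kernels:
cat /proc/sys/vm/swappiness

# Bias reclaim further toward dropping file cache rather than swapping
# anonymous memory (requires root):
#   sysctl vm.swappiness=10
# To persist it across reboots, put "vm.swappiness = 10" in /etc/sysctl.conf
# or a file under /etc/sysctl.d/.
```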

Since Linux cleans up disk cache on demand, it is not generally recommended to use drop_caches. It is mostly for testing purposes. As per the official documentation:

This file is not a means to control the growth of the various kernel caches (inodes, dentries, pagecache, etc...) These objects are automatically reclaimed by the kernel when memory is needed elsewhere on the system.

Use of this file can cause performance problems. Since it discards cached objects, it may cost a significant amount of I/O and CPU to recreate the dropped objects, especially if they were under heavy use. Because of this, use outside of a testing or debugging environment is not recommended.
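If you do use drop_caches for benchmarking, a minimal sketch looks like this (writing the file requires root, so the script only prints a note otherwise; note that dirty pages are never dropped, which is why the sync comes first):

```shell
set -eu
# For testing/benchmarking only, per the kernel documentation quoted above.
if [ "$(id -u)" -eq 0 ]; then
    sync                                # flush dirty pages first
    echo 1 > /proc/sys/vm/drop_caches   # 1 = page cache; 2 = dentries+inodes; 3 = both
else
    echo "not root: skipping the actual cache drop" >&2
fi
```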

sourcejedi
  • 50,249
2

This is the expected behavior. You can adjust swap usage using the vm.swappiness sysctl and tune it according to your needs.

sourcejedi
  • 50,249
  • 2
    Also, notice that we WANT to use memory. Do not be fooled by the "Free memory" count. That is memory the system is not using and is wasted. You do want to use memory and keep it full of hot objects ready to be retrieved. – Dolapevich Jun 24 '18 at 17:32