I read a bunch of stuff about the "write-back" mechanism today and wanted to try an experiment.
EDIT: a file downloaded from the Internet shows up in Dirty stats
Based on the comments, I tried downloading a file from the Internet to see whether this is a "problem" with dd.
Indeed, it seems that a file downloaded via wget changes the Dirty stats in /proc/meminfo.
Dirty changes from a few hundred kB to 5 MB after the file is downloaded. It stays that way for at least 10 seconds (sometimes 20+ seconds), which is still interesting since I'd expect that interval to be less than 5 seconds (based on vm.dirty_writeback_centisecs).
Writeback (from what I see) stays at 0 kB.
Run the watch:
Every 1.0s: grep -e Dirty -e Writeback /proc/meminfo ubuntu-18: Tue Apr 20 13:32:31 2021
Dirty: 5044 kB
Writeback: 0 kB
WritebackTmp: 0 kB
Download the file:
root@ubuntu-18:~# wget https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2019-financial-year-provisional/Download-data/annual-enterprise-survey-2019-financial-year-provisional-csv.csv
...
2021-04-20 13:27:11 (2.03 MB/s) - ‘annual-enterprise-survey-2019-financial-year-provisional-csv.csv’ saved [5134576/5134576]
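To put a number on how long Dirty stays elevated after the download, rather than eyeballing watch, timestamped samples can be logged and compared afterwards. This is only a minimal sketch: the helper names meminfo_kb and log_sample are made up for illustration, and the optional file argument exists only so the parsing can be checked against a saved copy of /proc/meminfo.

```shell
#!/bin/sh
# Print the value (in kB) of one /proc/meminfo field.
# $1 = field name without the colon (e.g. Dirty)
# $2 = file to read; defaults to /proc/meminfo (override is for testing only)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print one sample: epoch seconds, Dirty kB, Writeback kB.
# The exact match on the field name keeps Writeback apart from WritebackTmp.
log_sample() {
    printf '%s Dirty=%s Writeback=%s\n' \
        "$(date +%s)" \
        "$(meminfo_kb Dirty "${1:-}")" \
        "$(meminfo_kb Writeback "${1:-}")"
}

# Take three samples one second apart (only where /proc/meminfo exists).
if [ -r /proc/meminfo ]; then
    i=0
    while [ "$i" -lt 3 ]; do
        log_sample
        sleep 1
        i=$((i + 1))
    done
fi
```

Raising the sample count and starting the loop just before wget would give the time between the last growth of Dirty and its drop back to the baseline, which is the interval in question here.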
Files generated with dd - no impact on Dirty stats unless the file is really big
I generated 10 MB and 100 MB files with dd and watched /proc/meminfo, but I didn't notice any significant increase in Dirty memory:
dd if=/dev/zero of=file.txt count=100 bs=1M
Watching /proc/meminfo before, during, and after dd is executed:
Every 1.0s: grep -e Dirty -e Writeback /proc/meminfo ubuntu-18: Tue Apr 20 08:06:35 2021
Dirty: 280 kB
Writeback: 0 kB
WritebackTmp: 0 kB
Is there any reason for this behavior?
I thought that, according to these settings (vm.dirty_writeback_centisecs), it should take up to 5 seconds before the Dirty memory is written back to the disk:
sysctl -a | grep dirty
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
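To make the units in these settings concrete, here is a rough sketch that converts them into seconds and kilobytes. If I read the kernel's vm sysctl documentation (linked under Resources) correctly, dirty_writeback_centisecs is the interval between flusher thread wakeups, dirty_expire_centisecs is the age after which dirty data becomes eligible for writeout, and the two ratios are percentages of available (free plus reclaimable) memory rather than of total RAM. The helper names are made up for illustration, and using MemAvailable as a stand-in for the kernel's internal "dirtyable memory" figure is an assumption, so the kB values are only approximate.

```shell
#!/bin/sh
# Convert a *_centisecs sysctl value to whole seconds.
centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

# Threshold in kB for a memory size in kB ($1) and a percentage ratio ($2).
ratio_to_kb() {
    echo $(( $1 * $2 / 100 ))
}

# Values copied from the sysctl output above.
echo "flusher wakeup interval (dirty_writeback_centisecs=500): $(centisecs_to_secs 500) s"
echo "dirty data expiry age (dirty_expire_centisecs=3000): $(centisecs_to_secs 3000) s"

# ASSUMPTION: MemAvailable only approximates what the kernel counts as
# dirtyable memory, so treat these thresholds as ballpark figures.
if [ -r /proc/meminfo ]; then
    avail_kb=$(awk '$1 == "MemAvailable:" { print $2 }' /proc/meminfo)
    if [ -n "$avail_kb" ]; then
        echo "background writeback starts near (10%): $(ratio_to_kb "$avail_kb" 10) kB"
        echo "writers start flushing themselves near (20%): $(ratio_to_kb "$avail_kb" 20) kB"
    fi
fi
```

With these settings the two time values come out as 5 s and 30 s, so a small amount of dirty data could plausibly sit in memory for longer than the 5-second wakeup interval alone would suggest.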
Now, when I generate a 1 GB file, I do see an increase in Dirty memory size, but only while dd is running. It drops instantly to a very small value after dd completes its job.
I tried with both the smaller and the larger file size multiple times.
I'm running all of this on a Hetzner Cloud VM with Ubuntu:
root@ubuntu-18:~# uname -a
Linux ubuntu-18 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu-18:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
Resources
Resources I checked briefly (maybe some of them have an explanation which I have missed or didn't understand):
- https://www.thomas-krenn.com/en/wiki/Linux_Page_Cache_Basics
- https://ncona.com/2018/05/linux-page-cache/
- https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
- Writeback cache (`dirty`) seems to be limited to even less than dirty_background_ratio. What is it being limited by? How is this limit calculated?
- Can I watch the progress of a `sync` operation?
- https://superuser.com/questions/479379/how-long-can-file-system-writes-be-cached-with-ext4
[...] dd, then I tried dd if=/dev/urandom. Pretty much the same behavior (although the file generation was noticeably slower). – Juraj Martinka Apr 20 '21 at 08:45
[...] dd is optimized (just using a limited buffer, and so we have bs= part, possibly without memory allocation). – Giacomo Catenazzi Apr 20 '21 at 09:03
[...] dd. I tried to download a CSV file with wget and got some interesting stats from meminfo - I'll update the question. – Juraj Martinka Apr 20 '21 at 11:28