
I read a bunch of material about the Linux write-back mechanism today and wanted to try an experiment.

EDIT: file downloaded from the Internet shows up in Dirty stats

Based on the comments, I tried downloading a file from the Internet to see whether this is a dd-specific "problem".

Indeed, it seems that a file downloaded via wget changes Dirty stats in /proc/meminfo.

The Dirty size changes from a few hundred kB to about 5 MB after the file is downloaded. It stays that way for at least 10 seconds (sometimes 20+ seconds), which is still interesting, since I'd expect that interval to be less than 5 seconds (based on vm.dirty_writeback_centisecs). Writeback (from what I see) stays at 0 kB.

Run the watch:

Every 1.0s: grep -e Dirty -e Writeback /proc/meminfo ubuntu-18: Tue Apr 20 13:32:31 2021

Dirty: 5044 kB
Writeback: 0 kB
WritebackTmp: 0 kB
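To put a number on the "10 to 20+ seconds", a timestamped log is easier to compare than the watch screen. This is only a minimal sketch: `dirty_stats` is a helper name I made up, and it assumes the usual `Name: value kB` layout of /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helper: read meminfo-formatted text on stdin and print
# "<Dirty kB> <Writeback kB>" on one line.
dirty_stats() {
    awk '$1 == "Dirty:"     { d = $2 }
         $1 == "Writeback:" { w = $2 }
         END { print d + 0, w + 0 }'
}

# Take one timestamped sample from the live system, if /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    printf '%s %s\n' "$(date +%T)" "$(dirty_stats < /proc/meminfo)"
fi
```

Running the sample line inside `while sleep 1; do ...; done` and redirecting it to a file gives a per-second log that can be lined up against the wget timestamps.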

Download the file:

root@ubuntu-18:~# wget https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2019-financial-year-provisional/Download-data/annual-enterprise-survey-2019-financial-year-provisional-csv.csv
...
2021-04-20 13:27:11 (2.03 MB/s) - ‘annual-enterprise-survey-2019-financial-year-provisional-csv.csv’ saved [5134576/5134576]

Files generated with dd - no impact on Dirty stats unless the file is really big

I generated 10 MB and 100 MB files with dd and watched /proc/meminfo, but I didn't notice any significant increase in Dirty memory:

dd if=/dev/zero of=file.txt count=100 bs=1M

Watching /proc/meminfo before, during, and after dd runs:

Every 1.0s: grep -e Dirty -e Writeback /proc/meminfo ubuntu-18: Tue Apr 20 08:06:35 2021

Dirty: 280 kB
Writeback: 0 kB
WritebackTmp: 0 kB
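Since a 1-second watch can miss a short spike, the sampling can also be tied to the lifetime of the dd process. The following is a rough sketch, not a careful benchmark: `sample_dirty_while` and the `MEMINFO` override are names I invented for illustration, and one sample per second is still coarse.

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background and print the Dirty
# value (kB) from a meminfo-style file once per second until the command
# exits, then print one final sample.
# MEMINFO may be set to another file for testing; default is /proc/meminfo.
sample_dirty_while() {
    meminfo=${MEMINFO:-/proc/meminfo}
    "$@" >/dev/null 2>&1 &
    pid=$!
    # kill -0 only checks whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        awk '$1 == "Dirty:" { print "during", $2 }' "$meminfo"
        sleep 1
    done
    wait "$pid"
    awk '$1 == "Dirty:" { print "after", $2 }' "$meminfo"
}
```

For the experiment above it would be called as `sample_dirty_while dd if=/dev/zero of=file.txt count=100 bs=1M`.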

Is there any reason for this behavior? I thought that, according to these settings (vm.dirty_writeback_centisecs), it should take up to 5 seconds before the Dirty memory is written back to the disk:

 sysctl -a | grep dirty
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
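The two *_centisecs values are in hundredths of a second, so a quick conversion helps when comparing them with the timings I observed. As far as I understand the kernel sysctl documentation, vm.dirty_writeback_centisecs is the interval at which the flusher threads wake up, while vm.dirty_expire_centisecs is how old dirty data must be before it becomes eligible for writeout. I am not claiming this explains the behavior; it is just the arithmetic. `cs_to_s` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: convert centiseconds to whole seconds.
cs_to_s() {
    echo $(( $1 / 100 ))
}

# Print both settings in seconds, reading them from /proc/sys/vm if present.
for name in dirty_writeback_centisecs dirty_expire_centisecs; do
    f=/proc/sys/vm/$name
    if [ -r "$f" ]; then
        value=$(cat "$f")
        printf 'vm.%s = %s cs = %s s\n' "$name" "$value" "$(cs_to_s "$value")"
    fi
done
```

With the values listed above, 500 centiseconds is 5 seconds and 3000 centiseconds is 30 seconds.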

When I generate a 1 GB file, I do see an increase in the Dirty memory size, but only while dd is running. It drops instantly to a very small value after dd completes its job.

I tried with both the smaller and the larger file size multiple times.

I'm running all of this on a Hetzner Cloud VM with Ubuntu:

root@ubuntu-18:~# uname -a
Linux ubuntu-18 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

root@ubuntu-18:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal

Resources

Resources I checked briefly (maybe some of them have an explanation that I missed or didn't understand):

  • Try with random. IIRC (but I'm not an expert) zero pages are handled differently (and possibly not allocated until modified). – Giacomo Catenazzi Apr 20 '21 at 08:36
  • If you meant for dd, then I tried dd if=/dev/urandom. Pretty much the same behavior (although the file generation was noticeably slower). – Juraj Martinka Apr 20 '21 at 08:45
  • So I have no idea. Probably dd is optimized (just using a limited buffer, and so we have bs= part, possibly without memory allocation). – Giacomo Catenazzi Apr 20 '21 at 09:03
  • You might be right about dd. I tried to download a CSV file with wget and got some interesting stats from meminfo - I'll update the question. – Juraj Martinka Apr 20 '21 at 11:28
