Motivation

I wrote an answer here: How large are the "watermark" memory reservations on my system? The "min" watermark for the "Normal" zone appeared as 31449 pages. This is 125796 KiB, which is larger than my entire min_free_kbytes (67584 KiB).

Resetting min_free_kbytes sets the min watermark of this zone to an expected level (e.g. 9582 pages). But after a while, it goes back to the higher level.

I am confident this is due to boost_watermark(). It boosts "min", "low", and "high" watermarks by the same amount. watermark_boost_factor is 15000, so the maximum boost should be 150% of the original "high" watermark...
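To put a number on that cap, here is a minimal sketch of the arithmetic in plain C. max_boost_pages() is a name I made up for illustration, not a kernel function, and it only models the 150% cap described above, not how the boost builds up over time.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper bound on the boost, in pages, taken as
 * high * watermark_boost_factor / 10000 (the cap described above).
 */
static uint64_t max_boost_pages(uint64_t high_pages, uint64_t boost_factor)
{
    return high_pages * boost_factor / 10000;
}
```

For example, with an un-boosted "high" of 10920 pages (the DMA32 value shown further down) and a factor of 15000, max_boost_pages(10920, 15000) gives 16380 pages.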

Question

Why are my "low" and "high" watermarks so high in the first place?

Since my watermark_scale_factor is 10, the distance between "min", "low", and "high" is supposed to be only 0.1% of the zone size. But if I look immediately after resetting min_free_kbytes, the difference between "min" and "low" is 2% of the zone size. Why?

(Also the difference between "low" and "high" is 0.2% of the zone size. So this is not what we expect either!).
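The percentages above can be checked with a small sketch in plain C, using the Normal zone figures from /proc/zoneinfo below. Both function names are made up for this example. Gaps are reported in hundredths of a percent of the zone size so that no floating point is needed.

```c
#include <stdint.h>

/*
 * Distance between watermarks that the documentation leads me to
 * expect, in pages: managed * watermark_scale_factor / 10000.
 */
static uint64_t expected_gap_pages(uint64_t managed, uint64_t scale_factor)
{
    return managed * scale_factor / 10000;
}

/*
 * A gap as a share of the zone, in hundredths of a percent
 * (so 100 means 1% of the managed pages). Rounds down.
 */
static uint64_t gap_basis_points(uint64_t gap, uint64_t managed)
{
    return gap * 10000 / managed;
}
```

With managed = 1140349, expected_gap_pages(1140349, 10) is 1140 pages. The observed gaps are low - min = 34505 - 9582 = 24923 pages, for which gap_basis_points() returns 218 (about 2.2%), and high - low = 36900 - 34505 = 2395 pages, for which it returns 21 (about 0.2%).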

The code that I thought sets up the watermarks is in __setup_per_zone_wmarks().

Kernel version: 5.0.17-200.fc29.x86_64

From /proc/zoneinfo:

Node 0, zone   Normal
  pages free     74597
        min      9582
        low      34505
        high     36900
        spanned  1173504
        present  1173504
        managed  1140349

I don't see this massive discrepancy in the DMA32 zone. It doesn't look like the "min" watermark gets boosted in the DMA32 zone either, maybe because the kernel prefers to allocate from the "Normal" zone.

Node 0, zone      DMA
...
  pages free     3961
        min      33
        low      41
        high     49
        spanned  4095
        present  3996
        managed  3961
...
Node 0, zone    DMA32
  pages free     334671
        min      7280
        low      9100
        high     10920
        spanned  1044480
        present  888973
        managed  866356
sourcejedi

1 Answer

I worked out why the distance between watermarks did not match the 0.1% number.

On "small systems", the distance between the watermarks is one quarter of the (un-boosted) "min" watermark. That is, for a given zone, the documented distance managed * watermark_scale_factor / 10000 is not used if it is smaller than min / 4.

    } else {
        /*
         * If it's a lowmem zone, reserve a number of pages
         * proportionate to the zone's size.
         */
        zone->_watermark[WMARK_MIN] = tmp;
    }

    /*
     * Set the kswapd watermarks distance according to the
     * scale factor in proportion to available memory, but
     * ensure a minimum size on small systems.
     */
    tmp = max_t(u64, tmp >> 2,
            mult_frac(zone_managed_pages(zone),
                  watermark_scale_factor, 10000));

    zone->_watermark[WMARK_LOW]  = min_wmark_pages(zone) + tmp;
    zone->_watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp * 2;
    zone->watermark_boost = 0;

tmp >> 2 is equivalent to tmp / 4.

Source code link: linux-5.0.17/mm/page_alloc.c:7531
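As a check, here is a minimal userspace sketch of that calculation in plain C. It is not the kernel code: struct wmarks, mult_frac_u64() and compute_wmarks() are names I made up, and mult_frac_u64() is only my stand-in for the kernel's mult_frac() macro. The inputs are the "min" and "managed" values from /proc/zoneinfo.

```c
#include <stdint.h>

struct wmarks {
    uint64_t min, low, high;
};

/* Stand-in for mult_frac(x, n, d): x * n / d without overflowing x * n. */
static uint64_t mult_frac_u64(uint64_t x, uint64_t n, uint64_t d)
{
    uint64_t quot = x / d;
    uint64_t rem = x % d;

    return quot * n + rem * n / d;
}

/* Un-boosted watermarks for one zone, all values in pages. */
static struct wmarks compute_wmarks(uint64_t min, uint64_t managed,
                                    uint64_t scale_factor)
{
    uint64_t scaled = mult_frac_u64(managed, scale_factor, 10000);
    uint64_t quarter = min >> 2;            /* min / 4 */
    uint64_t gap = quarter > scaled ? quarter : scaled;
    struct wmarks w = { min, min + gap, min + 2 * gap };

    return w;
}
```

For DMA32, compute_wmarks(7280, 866356, 10) gives low = 9100 and high = 10920, and for DMA, compute_wmarks(33, 3961, 10) gives low = 41 and high = 49. Both match the /proc/zoneinfo output in the question. For the Normal zone the same formula gives low = 11977 and high = 14372, which is what I would expect to see without the bug described next.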

I also noticed a recent bug here. "high - low" and "low - min" are supposed to be equal. They can differ because min_wmark_pages(zone) depends on zone->watermark_boost, which is only reset to 0 after min_wmark_pages(zone) has already been called. I have reported the bug to the maintainers.
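The effect of that ordering can be modelled with a small sketch in plain C. This is only a model under stated assumptions: struct zone_model and both functions are made-up names, and I am assuming min_wmark_pages() returns the stored "min" plus the current boost. The boost of 22528 pages used in the example is not taken from any log. I picked it because it reproduces the numbers in the question.

```c
#include <stdint.h>

struct zone_model {
    uint64_t wmark_min, wmark_low, wmark_high;
    uint64_t boost;                 /* models zone->watermark_boost */
};

/* Assumed behaviour of min_wmark_pages(): stored min plus current boost. */
static uint64_t min_wmark_pages_model(const struct zone_model *z)
{
    return z->wmark_min + z->boost;
}

/* Same order of operations as the excerpt above; gap is max(min / 4, scaled). */
static void setup_wmarks_model(struct zone_model *z, uint64_t min, uint64_t gap)
{
    z->wmark_min = min;
    z->wmark_low = min_wmark_pages_model(z) + gap;      /* still sees old boost */
    z->wmark_high = min_wmark_pages_model(z) + gap * 2; /* still sees old boost */
    z->boost = 0;                                       /* reset comes too late */
}
```

Starting from a zone whose boost is 22528, setup_wmarks_model(&z, 9582, 2395) leaves min = 9582, low = 34505 and high = 36900, which are exactly the Normal zone values in the question. Starting from a boost of 0, the same call gives low = 11977 and high = 14372.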

sourcejedi