
I'm just getting my feet wet with Alpine Linux (v3.13.2) running as a VM (on Hyper-V, but I don't think that should matter here). I'm wondering why, after install, Alpine Linux appears to use only the expected "0-700MB", yet during install it writes enough that the .vhdx file grows to almost 4GiB.

Story

I downloaded and used the "virtual" .iso to install into a new, blank VM. I created a single 127GiB, dynamically-expanding .vhdx for the drive; it starts life as a 4MiB file, like always. The VM was given 1GiB RAM, 4 vCPU, and a NIC. During setup-alpine I chose the defaults almost exclusively (except the keyboard and timezone, IIRC), but I installed in "sys" mode to "sda".

At the point where the installer actually makes the filesystem, the .vhdx file grows from 4MiB to ~1.7GiB, then up to ~3.5GiB by the time I'm told to reboot.

After dismounting the .iso, rebooting and logging in, I'm finally looking at a 3.72GiB .vhdx file. But df shows I'm not using nearly that much:

# df
Filesystem     1K-blocks     Used Available Use% Mounted on
devtmpfs           10240        0     10240   0% /dev
shm               505164        0    505164   0% /dev/shm
/dev/sda3      128048328   189800 121310972   0% /
tmpfs             101036      108    100928   0% /run
/dev/sda1         523248      272    522976   0% /boot/efi

It's not all contiguous "zeros", because I tried my usual Optimize-VHD routine (PowerShell) and it didn't reduce the size of the .vhdx. I'm expecting I'll have to figure out how to defrag from within the Alpine Linux instance before Optimize-VHD will shrink the .vhdx file.
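A typical round trip of that sort on the Windows host looks roughly like this (a sketch, not my exact commands; the .vhdx path is a placeholder, and the read-only mount is one way to satisfy Optimize-VHD's requirement that the disk be detached or attached read-only):

PS> Mount-VHD -Path 'D:\VMs\alpine.vhdx' -ReadOnly
PS> Optimize-VHD -Path 'D:\VMs\alpine.vhdx' -Mode Full
PS> Dismount-VHD -Path 'D:\VMs\alpine.vhdx'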

Thoughts?

Granger
  • What is Optimize-Vhd? Some sort of windows tool? Does this mean your host OS is Windows? Defragmenting is unlikely to be relevant, that isn't really an issue on the filesystems Linux uses. What filesystem is this? ext4? Please [edit] your question and add this information. Also, please avoid posting images of text. I'd like to copy the numbers, for example, to add them up and see what is going on but I can't do that easily since you have put them in an image. – terdon Feb 20 '21 at 15:49
  • The installation probably downloads packages, and unpacks them in /tmp or /var before installing them. After install, the content is "deleted", but the damage has been done and the virtual disk expanded because it was used. You might consider mounting /var to another virtual disk during installation, then move its contents to your main disk post installation. Another option is figuring out where this unpacking directory is (/var/lib/apk/cache or something like that) and mounting that as tmpfs or to another virtual disk which can be discarded later. – Stewart Feb 20 '21 at 16:55
  • @terdon - Umm... Powershell. I don't know what FS Alpine Linux uses. Both aren't critical to the question, if you don't know. And I wasn't able to copy paste as wanted, but I didn't think it was bad since you don't need a calculator to tell what's going on. In any case Andy was kind enough to fix it. – Granger Feb 20 '21 at 18:06
  • @Stewart - Thanks. Seems odd to use that much, even temporarily, but when I get time to try next, I'll use the "extended" ISO. But at this point, I'm guessing I'll look into how to "zero" the extra space, and what the installer is doing. At the end of the day, if I have to give up, 4GiB is still better than a 25GiB Windows 2019 Server install. :) – Granger Feb 20 '21 at 18:11
  • I'd echo Stewart's comments. I'm curious to know if fstrim has any positive effect? File systems famously do not zero deleted files, they just sit there until overwritten. 3.5gb does sound a lot but might be explained by a lot of smaller temporary files occupying an entire block each. – Philip Couling Feb 20 '21 at 22:28
  • As a different example, XFS preallocates blocks when it thinks they might be used in the future. Something similar might be at work here. I have seen qcow2 files of KVM virtual machines getting larger because of this behaviour. Difference: When XFS notices that those blocks are not required, it releases them, which then also shrinks the qcow2 file. – berndbausch Feb 21 '21 at 03:01
  • @PhilipCouling - fstrim had no effect. But I did note that the .vhdx increases in size ~30MiB every time it boots. – Granger Feb 22 '21 at 15:56
  • Typo, perhaps? "800GiB" – Chris Davies Feb 22 '21 at 20:24

1 Answer


It is not something exclusive to Alpine Linux. It's a function of the options/behavior of the filesystem (ext4) and of the difference in how Windows handles sparse files.

Linux relies on sparse files in a way that Windows does not, meaning that it will often cause expansion of a great many blocks that are almost completely empty. I see from your readout that you used the default block size of 32MB, which commonly causes the common filesystems used by Linux to bloat out. Not all Linux filesystems behave that way, so it is possible that your disk naturally expanded.

That paragraph is from a post by Eric Siron on a TechNet thread (Tuesday, January 22, 2019, 5:46 PM). He has additional articles about this, covering when you may or may not want to worry about it and how you might deal with it if you do.
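If you want to confirm what block size an existing dynamic .vhdx was created with, the Hyper-V PowerShell module can report it; something like this on the host (the path is a placeholder):

PS> Get-VHD -Path 'D:\VMs\alpine.vhdx' | Select-Object VhdType, BlockSize, FileSize, Size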

I took @Stewart's advice to try fstrim -v /, but while it reported trimming ~122GiB, it didn't change things enough for Optimize-VHD to shrink the .vhdx file by more than ~50MiB.
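The sequence I mean is roughly this (a sketch; the .vhdx path is a placeholder):

# in the Alpine guest, as root: trim unused blocks, then shut down
fstrim -v /
poweroff

PS> # then on the Windows host, with the VM off
PS> Optimize-VHD -Path 'D:\VMs\alpine.vhdx' -Mode Full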

Based on another line of advice and notes in a Debian wiki, there's a lot of disagreement/confusion about what the discard mount option actually does and whether it's a good or a bad thing. I suspect it's partially related to the difference between a virtual and a physical disk. Since fstrim didn't help, I made a fresh install and changed /etc/fstab to use the discard option (and rebooted) before trying fstrim. It didn't make any difference for this situation.
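For reference, that change amounts to adding discard to the root entry in /etc/fstab, along these lines (a sketch; the other options on a stock Alpine install may differ):

/dev/sda3   /   ext4   rw,relatime,discard   0   1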

From this question,

...the fs is for a VM and is contained in a sparse raw image file on the host. If new blocks are allocated, the image file gradually loses its sparseness over time as files are created/deleted/modified, tending towards the 'non-sparse' size, even if the total storage used on the VM remains basically constant.

it appears the problem is actually more the filesystem's reluctance to re-use deleted blocks. If that's true, then I would expect this growth to stop at some percentage of the overall drive/vhdx size. That matches the behavior I've seen: the .vhdx file increases by ~30MiB with every boot (boot, login, poweroff; lather, rinse, repeat; +~30MiB each time). In the end, I expect it's a combination of that behavior and how ext4 does sparse files as opposed to how the VHDX driver does it.

Finally, I found Microsoft's best practices for Linux on Hyper-V. Simply using a .vhdx with a block size of 1MB resulted in a fresh install that produced a ~286MiB .vhdx! I haven't bothered to see if also setting the flex_bg group size (Microsoft's doc suggests mkfs.ext4 -G 4096; it's a filesystem-creation option rather than a mount option) would make any additional difference.
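For anyone repeating this, the two recommendations together look roughly like this (a sketch; the path, size, and device names are placeholders, and the -G 4096 part is what I haven't tested):

PS> # on the Windows host: create the dynamic disk with a 1MB block size
PS> New-VHD -Path 'D:\VMs\alpine.vhdx' -SizeBytes 127GB -Dynamic -BlockSizeBytes 1MB

# in the guest, if formatting an ext4 filesystem by hand
mkfs.ext4 -G 4096 /dev/sdb1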

Granger