independently verify that TRIM indeed works on SSD

Question

I have a LUKS partition /dev/sda1 which I luksOpen with --allow-discards:

cryptsetup --allow-discards luksOpen /dev/sda1 root

I then mount the ext4 filesystem with discard option:

grep /dev/mapper/root /proc/mounts
/dev/mapper/root / ext4 ro,relatime,block_validity,discard,delalloc,barrier,user_xattr,acl 0 0

I then trim free space on the mounted partition:

fstrim -v /

With df, I see that / has 80% free space. That means that on /dev/sda1, 80% of the disk is binary zeros.

If I clone the image with cat

cat /dev/sda1 > sda1.img

and compress the image with xz, I would expect all the zeros on the disk to be compressed away. Since the remaining 20% of the data on the disk is encrypted, it should look random and be incompressible. Therefore, the xz-compressed image should be approx. 20% of the raw size.

However, the resulting xz-compressed image is approximately the same size as the raw original.

Is my reasoning correct?

Why doesn't my theory translate into practice?

https://unix.stackexchange.com/a/85880/30851 and also dmsetup table | grep allow_discards — frostschutz, Jul 24 '17 at 17:59

fra-san · Answer 1 · 2019-02-06T15:29:13.387

Your logic is not incorrect, but it is only valid if certain conditions are satisfied.

The TRIM command, as specified in the ATA command set, may or may not zero the sectors it is issued against.
Actually, the standard focuses on what data has to be returned after TRIM has been issued¹:

The following behaviors are specified by this standard for sectors that the device trims (see 7.5.3.3):

a) non-deterministic - the data in response to a read from a trimmed sector may change for each read until the sector is written by the host;
b) Deterministic Read After Trim (DRAT) - the data returned in response to a read of a trimmed sector does not change, but may be different than the data that was previously returned; and
c) Read Zeroes After Trim (RZAT) - the data returned in response to a read of the trimmed sector is zero.

[...] For both DRAT and non-deterministic storage devices, the data returned in response to a read command to an LBA that has been successfully trimmed:

a) may be the previously returned data for the specified LBA;
b) may be a pattern generated by the storage device; and
c) is not data previously written to a different LBA by the host.

Thus, what your device returns after fstrim depends on the features it implements. Unless it supports RZAT, the assumption that data read from a trimmed device will be only zeros does not hold.

You can use hdparm to check for this:

sudo hdparm -I /dev/sdX | grep -i trim
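
As a rough sketch, the hdparm output can be mapped onto the three behaviours quoted above. The function name classify_trim is hypothetical, and the "Deterministic read data after TRIM" string used for DRAT is an assumption about hdparm's wording; only the "ZEROs" variant appears in the output shown further down.

```shell
#!/bin/sh
# classify_trim: read `hdparm -I` output on stdin and print a rough label.
# Hypothetical helper; the DRAT string is an assumption about hdparm's wording.
classify_trim() {
    out=$(cat)
    case $out in
        *"Deterministic read ZEROs after TRIM"*) echo "RZAT" ;;
        *"Deterministic read data after TRIM"*)  echo "DRAT" ;;
        *"TRIM supported"*) echo "TRIM only (no deterministic-read guarantee reported)" ;;
        *)                  echo "no TRIM reported" ;;
    esac
}

# Usage (needs root and a real device):
#   sudo hdparm -I /dev/sdX | classify_trim
```

Only the "RZAT" result would justify expecting zeros when reading trimmed sectors back.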

I performed some tests using two SSDs, sda and sdb. Same manufacturer, different models, with different ATA conformance:

$ sudo hdparm -i /dev/sdb
 ...
 Drive conforms to: Unspecified:  ATA/ATAPI-3,4,5,6,7
 ...

$ sudo hdparm -i /dev/sda
 ...
 Drive conforms to: unknown:  ATA/ATAPI-2,3,4,5,6,7
 ...

The two SSDs have different support for TRIM:

$ sudo hdparm -I /dev/sda | grep -i trim
           *    Data Set Management TRIM supported (limit 1 block)

$ sudo hdparm -I /dev/sdb | grep -i trim
           *    Data Set Management TRIM supported (limit 8 blocks)
           *    Deterministic read ZEROs after TRIM

I can confirm that, after issuing fstrim, the drive supporting "Deterministic read ZEROs after TRIM" (RZAT) seems to have actually zeroed the concerned partition almost entirely. Conversely, the other drive seems to have zeroed (or otherwise replaced with some highly compressible pattern) only a minor part of the freed space.
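
One way to put a number on "almost entirely" without going through xz is to count the NUL bytes directly. The sketch below is not the method used for the tests above; zero_percent is a hypothetical helper that reads its input twice, so it is slow on large partitions.

```shell
#!/bin/sh
# zero_percent FILE_OR_DEVICE
# Print the integer percentage of NUL bytes in the input.
# Hypothetical helper; it reads the whole input twice.
zero_percent() {
    total=$(( $(wc -c < "$1") ))                    # total size in bytes
    nonzero=$(( $(tr -d '\000' < "$1" | wc -c) ))   # bytes left after deleting NULs
    [ "$total" -gt 0 ] || { echo "empty input" >&2; return 1; }
    echo $(( (total - nonzero) * 100 / total ))
}

# Usage (as root, after fstrim and after dropping caches):
#   zero_percent /dev/sdX1
```

Note that this only counts zeros. A drive without RZAT may return some other compressible pattern, which this count would miss.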

¹ Online source: INCITS 529: Information technology - ATA/ATAPI Command Set - 4 (ACS-4)

Note on testing:

As pointed out by frostschutz in the comments, a read after fstrim may return data from the operating system cache rather than from the trimmed device. This is, for instance, what happened in this question.
(I would also point to this answer to the same question for an alternative method for testing TRIM).

Between fstrim and a subsequent read you may need to drop the cache, e.g. with:

echo 3 | sudo tee /proc/sys/vm/drop_caches

Depending on the size of the partition you are playing with, not dropping the cache may be enough for your tests to fail.
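
The sequence described in this note could be wrapped as follows. This is a minimal sketch, assuming it is run as root; run and trim_and_drop_cache are hypothetical names, and the sync step is an addition of mine so that dirty pages are written out before the cache is dropped. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# trim_and_drop_cache MOUNTPOINT
# Trim free space, flush dirty pages, then drop the page cache so that
# a subsequent read has to go to the device.
trim_and_drop_cache() {
    run fstrim -v "$1" &&
    run sync &&
    run sh -c 'echo 3 > /proc/sys/vm/drop_caches'
}

# Usage (as root):
#   trim_and_drop_cache /
```

Any read of the raw device meant to check the result of TRIM would come after this.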

Note on your setup:

The discard mount option enables continuous TRIM, i.e. blocks are discarded whenever files are deleted. It is not required by fstrim. Indeed, on-demand TRIM and continuous TRIM are two distinct ways to manage TRIM operations. For further information I would point to Solid state drive on the Arch Linux Wiki, which covers this matter in detail.
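
To see which of the two modes a given mount uses, the option list in /proc/mounts can be checked. A minimal sketch, where has_discard is a hypothetical helper and the second argument exists only so that a saved copy of /proc/mounts can be inspected:

```shell
#!/bin/sh
# has_discard MOUNTPOINT [MOUNTS_FILE]
# Succeed if MOUNTPOINT is listed with the "discard" option.
# Hypothetical helper; MOUNTS_FILE defaults to /proc/mounts.
has_discard() {
    awk -v mp="$1" '
        $2 == mp {
            # Field 4 is the comma-separated option list; the last matching
            # line wins, since it describes the mount currently on top.
            n = split($4, opts, ","); found = 0
            for (i = 1; i <= n; i++) if (opts[i] == "discard") found = 1
        }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Usage:
#   has_discard / && echo "continuous TRIM enabled on /"
```

If the check fails, fstrim can still be run on demand against that mount point.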

Linux might also be returning non-zero data from its cache after TRIM, even though the SSD would re-read it as zeroes. This was a problem with my yes-trim-test over there https://unix.stackexchange.com/a/85880/30851 but might also be related to reading the raw data before and after TRIM. So if you don't get zero when you expect to, drop caches just in case. — frostschutz, Feb 06 '19 at 14:02
@frostschutz Good point! I was somehow assuming that, since the OP mentioned a "root" volume, it would have been too big for a significant part of it to fit in memory. But definitely cache happened to be in my way during my tests - that failed miserably until I started dropping it. I'll update my answer. — fra-san, Feb 06 '19 at 14:43
I think the Wiki article is wrong. It says for DRAT "may be different", but ATA spec page 82 says for DRAT and RZAT "shall cause deterministic". I mean it's not "must", but every piece of software relies on a "must", else even Enterprise SSDs would break the parity. Source: https://people.freebsd.org/~imp/asiabsdcon2015/works/d2161r5-ATAATAPI_Command_Set_-_3.pdf — mgutt, Feb 15 '21 at 09:57
@mgutt "the data returned in response to a read of a trimmed sector does not change, but may be different than the data that was previously returned" is not from Wikipedia - see footnote (1) in my answer. IMO, "does not change" (from that doc, page 621) corresponds to "the data in that logical block becomes determinate" from the page you quoted, while "may be different" corresponds to "with data set to any value". — fra-san, Feb 15 '21 at 10:21
Read page 622. It's only relevant for unexpected power-loss before flushing the data after a TRIM: "If pattern Y had been written to the media by the device, then a read of LBA 5 returns pattern Y." — mgutt, Feb 16 '21 at 07:46

score 2 · Answer 2 · answered Dec 16 '17 at 11:14

Does the SSD have a built-in hardware encryption layer? If it has one, then the TRIMmed blocks may be all-zeroes (or possibly all-ones) at the raw hardware level, but since the computer sees them through the encryption layer, they will appear as pseudo-random gibberish after passing the all-zeroes raw block through the decryption process.

Such a hardware encryption layer would have some advantages:

It would allow very fast security erase functionality: just have the drive destroy the original key used in the hardware encryption layer and replace it with a new one and all data will be instantly unrecoverable for most practical purposes.
As all the data hitting the raw hardware level would be encrypted, it would be guaranteed to look pseudo-random and thus be largely homogeneous. This might help to avoid hot/cold spots and simplify wear estimation a lot.

score 0 · Answer 3 · answered Dec 16 '17 at 13:48

Discard is not the same as Zero.

If you want to zero with cryptsetup, you could shrink the filesystem, then the crypt block device, and then use dd to zero the unused volume space.

If you want to know whether TRIM worked, a speed test after heavy use should be an indicator.

https://linux.die.net/man/8/fstrim https://en.m.wikipedia.org/wiki/Trim_(computing)

score 0 · Answer 4 · answered Feb 06 '19 at 10:07

df reporting free space does not imply zeroed space.

trim tells the storage device that the blocks are un-used. I don't think that this zeros them.

ctrl-alt-delor
