Optimal blocksizes for dd are around 64k-256k, though humans usually prefer 1M.
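For example (the device names here are placeholders, so adjust them to your setup), a typical whole-disk copy with a human-friendly blocksize looks like this; status=progress is a GNU dd option that just prints a running byte count:

$ dd if=/dev/sdX of=/dev/sdY bs=1M status=progress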
A benchmark without real I/O:
$ for bs in 512 4k 16k 64k 128k 256k 512k 1M 4M 16M 64M 128M 256M 512M
> do
> echo ---- $bs: ----
> dd bs=$bs if=/dev/zero of=/dev/null iflag=count_bytes count=10000M
> done
---- 512: ----
20480000+0 records in
20480000+0 records out
10485760000 bytes (10 GB) copied, 4.2422 s, 2.5 GB/s
---- 4k: ----
2560000+0 records in
2560000+0 records out
10485760000 bytes (10 GB) copied, 0.843686 s, 12.4 GB/s
---- 16k: ----
640000+0 records in
640000+0 records out
10485760000 bytes (10 GB) copied, 0.533373 s, 19.7 GB/s
---- 64k: ----
160000+0 records in
160000+0 records out
10485760000 bytes (10 GB) copied, 0.480879 s, 21.8 GB/s
---- 128k: ----
80000+0 records in
80000+0 records out
10485760000 bytes (10 GB) copied, 0.464556 s, 22.6 GB/s
---- 256k: ----
40000+0 records in
40000+0 records out
10485760000 bytes (10 GB) copied, 0.48516 s, 21.6 GB/s
---- 512k: ----
20000+0 records in
20000+0 records out
10485760000 bytes (10 GB) copied, 0.495087 s, 21.2 GB/s
---- 1M: ----
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 0.494201 s, 21.2 GB/s
---- 4M: ----
2500+0 records in
2500+0 records out
10485760000 bytes (10 GB) copied, 0.496309 s, 21.1 GB/s
---- 16M: ----
625+0 records in
625+0 records out
10485760000 bytes (10 GB) copied, 0.972703 s, 10.8 GB/s
---- 64M: ----
156+1 records in
156+1 records out
10485760000 bytes (10 GB) copied, 1.0409 s, 10.1 GB/s
---- 128M: ----
78+1 records in
78+1 records out
10485760000 bytes (10 GB) copied, 1.04533 s, 10.0 GB/s
---- 256M: ----
39+1 records in
39+1 records out
10485760000 bytes (10 GB) copied, 1.04685 s, 10.0 GB/s
---- 512M: ----
19+1 records in
19+1 records out
10485760000 bytes (10 GB) copied, 1.0436 s, 10.0 GB/s
- The default 512 bytes is slow as hell (two syscalls per 512 bytes is just too much for the CPU)
- 4k is considerably better than 512
- 16k is considerably better than 4k
- 64k-256k is about as good as it gets
- 512k-4M is slightly slower
- 16M-512M cuts the speed in half, worse than 4k.
My guess is that above a certain size you start losing speed due to lack of concurrency. dd is a single process; concurrency is largely provided by the kernel (readahead, cached writes, ...). If it has to read 100M before it can write 100M, there will be moments when one device sits idle, waiting for the other to finish reading or writing. With too small a blocksize you suffer from sheer syscall overhead, but that goes away completely at around 64k.
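If you want to see the syscall overhead for yourself, strace can count the read/write calls dd issues. A quick sketch: both commands below copy the same 1 MiB, but the first needs roughly 2048 reads and 2048 writes while the second gets away with one of each:

$ strace -c -e trace=read,write dd bs=512 if=/dev/zero of=/dev/null count=2048
$ strace -c -e trace=read,write dd bs=1M if=/dev/zero of=/dev/null count=1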
Blocksizes of 100M or larger might help when copying from and to the same device. At least for hard drives this should reduce the time wasted on seeking, since the head can't be in two places at once.
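A sketch of such a same-device copy (both paths are made up; the point is only that source and destination sit on the same spinning disk, so large sequential chunks mean fewer seeks back and forth):

$ dd if=/mnt/disk/source.img of=/mnt/disk/copy.img bs=128M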
Why are you overwriting your SSD like this in the first place? Normally you try to avoid unnecessary writes on SSDs; if the drive considers all of its space used, it will likely also lose some of its performance until you TRIM it free again.
You could use this command instead to TRIM/discard your entire SSD:
blkdiscard /dev/sda
If your SSD has deterministic read zeroes after TRIM (a property you can check with hdparm -I), it will look like it's full of zeroes, but the SSD actually considers all of its blocks free, which should give you the best possible performance.
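To check it, something like this should do (the output lines are an example of what hdparm typically reports; the exact wording varies by drive):

$ hdparm -I /dev/sda | grep -i trim
	   *	Data Set Management TRIM supported
	   *	Deterministic read ZEROs after TRIM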
The downside of TRIM is that you lose all chances at data recovery if the deleted file has already been discarded...
You could also use hdparm --security-erase to direct the drive to wipe itself. – psusi Feb 10 '15 at 14:00
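For reference, a rough sketch of how that usually goes with hdparm (the password and device are placeholders; check that the drive is not reported as "frozen" by hdparm -I first, and be aware this irreversibly wipes everything):

# set a temporary security password, then tell the drive to erase itself
$ hdparm --user-master u --security-set-pass tmppass /dev/sdX
$ hdparm --user-master u --security-erase tmppass /dev/sdX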