
According to most disk benchmarks, sequential write speeds are typically very close to the read speeds. Mine are in the ~500MB/s range.

$ dd if=/dev/zero of=tempfile bs=1M count=5120 conv=notrunc oflag=direct status=progress
5291114496 bytes (5,3 GB, 4,9 GiB) copied, 11 s, 481 MB/s
5120+0 records in
5120+0 records out
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 11,1929 s, 480 MB/s

$ dd if=/dev/zero of=tempfile bs=1024M count=5 conv=notrunc oflag=direct status=progress
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 11 s, 490 MB/s
5+0 records in
5+0 records out
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 10,9524 s, 490 MB/s

As you can see, dd reports an average of over 480MB/s with bs=1M, and 490MB/s with bs=1024M. (Also, fio reports read speeds 20-30MB/s higher than dd does, which is interesting but not an issue for me.)

$ fio --ioengine=libaio --size=1024m --filename=$HOME/tempfile --direct=1 --loops=5 --name=test --bs=1m --rw=write
....
    write: IOPS=146, BW=147MiB/s (154MB/s)(5120MiB/34894msec); 0 zone resets

$ fio --ioengine=libaio --size=1024m --filename=$HOME/tempfile --direct=1 --loops=5 --name=test --bs=1024m --rw=write
....
    write: IOPS=0, BW=144MiB/s (151MB/s)(5120MiB/35458msec); 0 zone resets

As you can see, fio instead reports ~154MB/s with bs=1m, and 151MB/s with bs=1024m (surprisingly, an even lower value...).

What is causing fio to write so slowly, and how can I configure it to write at speeds closer to dd's?

(As a side note, while searching for solutions I noticed that a lot of users actually believe their write speeds are as slow as fio reports them, and ask questions trying to understand why their writes are so slow. I even saw NVMe drive tests where write speeds were half of the read speeds, and nobody seemed to notice that something was wrong... so this issue has more unwanted side effects than just me not getting consistent benchmarks.)

Rui F Ribeiro
Cestarian
  • @sourcejedi I'm pretty sure it defaults to libaio (I forgot to set it for this command, but the results are consistent with what I got from that; I will update it in a sec, thanks) and yes, --fdatasync=1. It could be that the results are just all over the place, because I've been getting results from ~90MB/s to ~150MB/s without fdatasync, so it could have been a coincidence... Yeah, I tested it again and it must have been a coincidence: I ran this test twice with libaio and the second one got ~154MB/s speeds, then again with fdatasync and it just went to ~155MB/s. I will remove this – Cestarian Nov 07 '18 at 06:45
  • @sourcejedi threw in a few updates to the question. – Cestarian Nov 07 '18 at 11:16
  • For some reason I can't see @sourcejedi's (normally handy!) comments so I'm taking some guesses at what they wrote. Fio 3.12 defaults to the psync ioengine on Linux. Out of curiosity, @Cestarian is this you cross posting this to the fio github - https://github.com/axboe/fio/issues/711 ? I see an answer there is hinting that the data used matters (I guess they're also suggesting that questions which look like non-bugs get better answers on the fio mailing list)... – Anon Nov 11 '18 at 06:04
  • @Anon yes, since I was failing to get an answer here, I posted on github as well. I assumed this to be a bug, but it in fact wasn't; I explained what I was eventually told in an answer below. As for why I didn't just do as asked: mailing lists are an archaic method of communication and I don't bloody understand it :p – Cestarian Nov 11 '18 at 16:54
  • @Anon :-) I delete my comments when they have been answered by self-contained question edits. Cestarian: thanks for writing this up for others. Please be considerate about bug-reporting channels. Complain about barriers to entry on whatever forum you like, but it's not very good form to do so in a reply to the busy maintainer who just solved your non-bug question. If you're using Linux and you want to post plain text emails, Thunderbird has options for it that work great for me, and Thunderbird connects to GMail very nicely. Remember to use "reply all", don't "top-post", and you're good. – sourcejedi Nov 11 '18 at 21:09
  • @sourcejedi Ah, I understand. I was more familiar with (still) seeing your comments dotted about the place, and when I find them I consider your advice sage. I hope you realise your help, hints and nudges are quietly appreciated by others... – Anon Nov 11 '18 at 21:32
  • blushes @Anon. It is weird and I'm not sure about it... like StackExchange in general. We produce some excellent pages, but the most helpful comment is supposed to become redundant when it is answered. I feel like it becomes hard to learn from other people's comment styles. Maybe it would be better to have something more like chat than comments. It could be collapsed automatically after a time, except on unanswered questions. – sourcejedi Nov 11 '18 at 22:22

1 Answer


The reason for the difference was explained to me by the author of fio: the dd command I used was writing zeroes, whereas fio was using random data by default. Setting --zero_buffers=1 solves the issue.

fio --ioengine=libaio --size=1024m --filename=$HOME/tempfile --direct=1 --loops=5 --name=test --bs=1024m --rw=write --zero_buffers=1
....
  write: IOPS=0, BW=495MiB/s (519MB/s)(5120MiB/10339msec); 0 zone resets

and

fio --ioengine=libaio --size=1024m --filename=$HOME/tempfile --direct=1 --loops=5 --name=test --bs=1m --rw=write --zero_buffers=1
....
  write: IOPS=474, BW=474MiB/s (497MB/s)(5120MiB/10798msec); 0 zone resets

Now the results are much closer. (It should be noted that while these results do approximate the drive's theoretical maximum sequential write speed, testing without this option, i.e. with randomized buffers, will produce results more indicative of real-world I/O performance.)
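To illustrate why the data pattern matters (this is a sketch using only coreutils and gzip, not fio itself): an all-zero stream is trivially compressible, so any layer of the stack that compresses or deduplicates data, as some SSD controllers and filesystems do, can "write" it far faster than incompressible random data.

```shell
# 1 MiB of zeroes compresses down to roughly a kilobyte...
head -c 1048576 /dev/zero | gzip -c | wc -c

# ...while 1 MiB of random data barely compresses at all
# (gzip output stays around 1 MiB, possibly slightly larger).
head -c 1048576 /dev/urandom | gzip -c | wc -c
```

The same asymmetry explains why dd reading from /dev/zero can look much faster than fio writing randomized buffers on hardware or filesystems that exploit compressible data.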

Cestarian
  • NB: using a data buffer of zeros (or data that is highly compressible/de-dupable) for writes can lead to unrealistically high benchmark results. See https://fio.readthedocs.io/en/latest/fio_doc.html#buffers-and-memory for fio's buffer options, and also the quote from Torvalds in a "How can I benchmark my HDD?" answer. – Anon Nov 11 '18 at 18:41
  • Yeah, I noticed it's not really indicative of real-world performance; it's the theoretical maximum performance at best. This whole thing is making me feel somewhat disappointed with the I/O performance on Linux though, because the same kinds of tests (with randomized buffers) on Windows (with CrystalDiskMark) show write speeds much closer to the read speeds than the same kind of tests on Linux do. – Cestarian Nov 11 '18 at 18:45
  • For boring reasons, it's not even a theoretical maximum - it's just a special case. You may even find yourself exceeding your disk's quoted bandwidth (which is a benchmark no-no). – Anon Nov 11 '18 at 18:50
  • @Cestarian extraordinary claims require extraordinary evidence. Prove it :-P. You can find plenty of issues with Linux, but it does not suffer a performance degradation from 500 to 150 MB/s (less than 1/3rd of the speed!) compared to Windows, for sequential IO on a SATA SSD. – sourcejedi Nov 11 '18 at 21:17
  • @sourcejedi here https://unix.stackexchange.com/a/480191/72554 I ran a test in CrystalDiskMark on Windows (NTFS) and the same test in fio on Linux (ext4). Although it's not an apples-to-apples test (since different benchmarking software was used), these preliminary results indicate that on Linux most tests are 50-100MB/s slower (something in the range of 10-30% slower in general). I would like to test it properly with fio on both Windows and Linux, but I'm not sure about the best way to run that script from Windows. – Cestarian Dec 20 '20 at 17:05