I have the shell command below, and I wonder whether oflag=direct syncs automatically or whether the sync has to be requested explicitly:

dd bs=10M oflag=direct,sync of=ofile.bin

Also, what is the difference between oflag=sync, conv=sync, and conv=fsync?

What impact does it have if I change the code to the line below?

dd bs=10M conv=fsync oflag=direct of=ofile.bin
  • direct uses direct I/O, bypassing the buffer cache (check your block size, though). oflag=sync uses synchronous I/O for data and metadata. conv=fsync calls fsync() on the output file once writing has finished. conv=sync pads input blocks with zeroes up to the block size. – stoney Mar 26 '19 at 11:04

1 Answer

We can probably rule out conv=sync to start with. It does something rather different, which I expect you do not want :-). The dd man page describes it as:

pad every input block with NULs to ibs-size; when used with block or unblock, pad with spaces rather than NULs
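The padding behaviour is easy to see on a tiny input. The following is a minimal sketch, assuming GNU dd and a POSIX shell; the 3-byte input and the 8-byte block size are arbitrary values chosen for illustration.

```shell
# Feed 3 bytes into dd with an 8-byte block size and count the output bytes.
# Without conv=sync, the short input block is written as-is.
plain=$(printf 'abc' | dd bs=8 2>/dev/null | wc -c | tr -d ' ')

# With conv=sync, the short input block is padded with NULs up to 8 bytes.
padded=$(printf 'abc' | dd bs=8 conv=sync 2>/dev/null | wc -c | tr -d ' ')

echo "plain=$plain padded=$padded"   # prints: plain=3 padded=8
```

So conv=sync changes the data that is written, which has nothing to do with flushing it to disk.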


oflag=direct does not sync automatically on its own.[*]

conv=fsync differs from oflag=sync. oflag=sync effectively syncs after each output block. conv=fsync does one sync at the end.

The end result is the same, but the performance along the way is different :-).

  1. oflag=sync could be significantly slower. You can mitigate this by increasing the block size.

  2. If device-specific caches are large[1], this will affect the progress reported e.g. by the status=progress option.

  3. If you do not use oflag=direct, then a large amount of written data can build up in the system page cache before it reaches the device. This build-up will affect the progress you see[2]. But also, Linux sometimes responds badly to the build-up, and degrades performance for all devices[3].
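The two syncing variants can be compared side by side. This is only a sketch, assuming GNU dd; the scratch files come from mktemp, and the 1 MiB block size and count of 4 are arbitrary. oflag=direct is left out on purpose, because some filesystems reject direct I/O.

```shell
# Two scratch files; mktemp picks the names (stand-ins for a real target).
per_block=$(mktemp)
at_end=$(mktemp)

# oflag=sync: the output is opened for synchronous I/O,
# so each 1 MiB block is synced as it is written.
dd if=/dev/zero of="$per_block" bs=1M count=4 oflag=sync 2>/dev/null

# conv=fsync: blocks are written normally, then dd calls fsync() once before exiting.
dd if=/dev/zero of="$at_end" bs=1M count=4 conv=fsync 2>/dev/null

# Both files end up with the same contents and size; only the timing of the syncs differs.
size_per_block=$(wc -c < "$per_block" | tr -d ' ')
size_at_end=$(wc -c < "$at_end" | tr -d ' ')
echo "oflag=sync: $size_per_block bytes, conv=fsync: $size_at_end bytes"

rm -f "$per_block" "$at_end"
```

Prefixing each dd with `time` on a real disk should show the cost of syncing per block, though the numbers depend heavily on the device.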


[1] "Apparently your hardware has hundreds of megabytes of cache... In my case, it is because the kernel is [actually running inside a virtual machine]". https://unix.stackexchange.com/a/420300/29483

[2] Why does a gunzip to dd pipeline slow down at the end?

[3] System lags when doing large R/W operations on external disks

[*] When you write directly to a block device node, Linux syncs the block device when it is closed (and is not open by any other program). See: Block device cache v.s. a filesystem. Sometimes I see people who do not use an explicit sync when writing to a block device. It will often seem to work OK... until it doesn't. So I recommend at least using conv=fsync.
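One reason to ask for the sync explicitly is that dd can then report a failure through its exit status. A minimal sketch, assuming GNU dd; the mktemp file stands in for the real target, which on an actual system would be the block device node.

```shell
# Hypothetical target for illustration; replace with the real output path.
target=$(mktemp)

# With conv=fsync, dd flushes the output before exiting, and a failed
# write or flush results in a non-zero exit status.
if dd if=/dev/zero of="$target" bs=1M count=2 conv=fsync 2>/dev/null; then
    result=ok
else
    result=failed
fi
echo "write and fsync: $result"

rm -f "$target"
```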

sourcejedi
  • Thank you very much for your answer. It explains everything clearly except oflag=direct. I still don't understand why you don't recommend direct option. I prefer direct and I think that will direct write to file/device without worrying about the cache and memory. Can you explain a bit more on oflag=direct? Thanks a lot. – sgon00 Jun 29 '21 at 05:49
  • Thanks a lot for the quick reply. Actually, I asked another dd question before receiving your response, at https://unix.stackexchange.com/questions/656264. Nobody has replied yet. If you have time, you can have a look. Thanks a lot. – sgon00 Jun 29 '21 at 09:17
  • Ah, the thing I'm saying not to rely on, is that oflag=direct on a Linux block device sort-of implies conv=fsync. Except that it does not report errors, and it won't fsync if something else also had the device open when you closed it. I think if you want fsync behaviour, you should ask for it explicitly. Edited to be more clear. – sourcejedi Jun 29 '21 at 18:06