
From the Arch Linux Wiki: https://wiki.archlinux.org/index.php/USB_flash_installation_media

# dd bs=4M if=/path/to/archlinux.iso of=/dev/sdx status=progress && sync

[...] Do not miss sync to complete before pulling the USB drive.

I would like to know

  • What does it do?
  • What consequences are there if left out?

Notes

dd command used with optional status=progress:

tar -xzOf archlinux-2016-09-03-dual.iso | dd of=/dev/disk2 bs=4M status=progress && sync

Or using pv for progress

tar -xzOf archlinux-2016-09-03-dual.iso | pv | dd of=/dev/disk2 bs=4M && sync
Jonathan Komar

2 Answers


dd does not bypass the kernel's disk caches when it writes to a device, so some of the data may not yet have been written to the USB stick when dd completes. If you unplug the USB stick at that moment, its contents will be inconsistent, and your system may even fail to boot from it.

sync flushes any data still in the cache to the device.

Instead of invoking sync, you could use dd's fdatasync conversion option:

fdatasync

physically write output file data before finishing

In your case, the command would be:

tar -xzOf archlinux-2016-09-03-dual.iso | \
dd of=/dev/disk2 bs=4M status=progress conv=fdatasync

conv=fdatasync makes dd call the fdatasync() system call at the end of the transfer, just before dd exits (I checked this in dd's source).

This confirms that dd neither bypasses nor flushes the caches unless explicitly instructed to do so.
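
The two approaches above can be tried safely without a USB stick. The following is only a sketch: it assumes GNU dd, and it writes to a scratch file created with mktemp (a hypothetical stand-in for the device node), so the timing will not resemble a slow flash drive.

```shell
#!/bin/sh
# Sketch only: "$target" is a scratch file standing in for /dev/sdX,
# so nothing on a real device is overwritten.
target=$(mktemp)

# Variant 1: plain dd, then sync flushes cached writes system-wide.
dd if=/dev/zero of="$target" bs=1M count=4 2>/dev/null && sync

# Variant 2: conv=fdatasync asks dd to flush its own output file
# before it exits, so no separate sync is needed.
dd if=/dev/zero of="$target" bs=1M count=4 conv=fdatasync 2>/dev/null

# Both variants leave the same 4 MiB of data in the target.
size=$(($(wc -c < "$target")))
echo "wrote $size bytes"
rm -f "$target"
```

On a real device the difference shows up as waiting time: variant 2 returns only once the data has been flushed, while variant 1 does the waiting in the separate sync step.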

Serge
  • Thanks for your contribution. However, I am not sure that this statement is correct: "dd does not bypass the kernel disk caches when it writes to a device." When writing to a file (through the kernel's file system layer), writes are cached. However, I am concerned with writing to devices. Please provide a source for that statement if you can, because it is the linchpin of this question. If true, it would provide a valid reason for running sync after a dd-to-device operation. – Jonathan Komar Sep 28 '16 at 14:30
  • Yes, it is cached. The caching happens inside the kernel's block device infrastructure. The file operations themselves are not cached; the underlying block device interface does the caching. Source: http://lxr.free-electrons.com/source/block/blk-flush.c – Serge Sep 28 '16 at 14:45
  • @macmadness86 see the updated answer – Serge Sep 28 '16 at 15:29
  • I prefer using oflag=sync, so that progress shows the real transfer speed and not the cached one (a steady 10MB/s instead of one second at 100MB/s followed by 10 seconds of stall). – Bart Polot Dec 21 '17 at 10:29
  • Writing to a block device bypasses VFS altogether. In other words: writing to a file can be cached by the kernel (and it usually is), but writing to a device is never cached by the kernel (and it cannot be). – Eric Jan 06 '20 at 12:55
  • @Eric That is definitely not correct. See, for example, Serge's citation of the Linux kernel source code (which, updated to fix the dead link, is https://elixir.bootlin.com/linux/latest/source/block/blk-flush.c). – Keeley Hoek Nov 01 '20 at 16:57
  • @Keeley Hoek The code you pointed to is not part of VFS. I said that when you open a device you bypass VFS. System calls like read() and write() are handled directly by the device driver, not the VFS layer. So the VFS layer is basically unable to cache anything, since it is not involved. – Eric Nov 07 '20 at 19:53
  • @Serge I would really appreciate it if you could also mention conv=fsync in your answer and explain why you use or recommend one over the other. – DJCrashdummy Dec 05 '21 at 04:21
  • The Arch USB flash medium guide now uses the options oflag=direct conv=fsync. fsync does the same as fdatasync and additionally syncs file metadata like st_atime and st_mtime (again, no need for an explicit call to sync after running dd). The explanation of oflag=direct leaves much to be desired in man dd, but info dd explains that it uses "[..] direct I/O for data, avoiding the buffer cache.". – Jonathan Komar Jan 23 '22 at 07:56
  • @JonathanKomar, @DJCrashdummy, yes, that's why both options have absolutely the same effect when we are writing directly to a block device: there is no fs metadata to sync. As for oflag=direct - it's interesting, I will check... – Serge Jan 23 '22 at 09:44
  • @Serge Any update with oflag=direct? – Jonathan Komar Apr 26 '22 at 19:42
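
For anyone who wants to try the flag combinations mentioned in these comments, here is a rough sketch. It assumes GNU dd and writes to a scratch file (a hypothetical stand-in for the USB device node). Direct I/O is not supported on every filesystem, so the script tolerates a failure of the oflag=direct run instead of assuming it works.

```shell
#!/bin/sh
# Sketch only: "$scratch" stands in for the device node, so no real
# disk is touched.
scratch=$(mktemp)

# oflag=sync: the output is opened for synchronous writes, so dd
# waits for each block instead of filling the cache first.
dd if=/dev/zero of="$scratch" bs=1M count=2 oflag=sync 2>/dev/null
sync_size=$(($(wc -c < "$scratch")))

# oflag=direct conv=fsync: request direct I/O for the data and an
# fsync() at the end. Some filesystems reject direct I/O, so a
# failure here is reported rather than treated as fatal.
if dd if=/dev/zero of="$scratch" bs=1M count=2 oflag=direct conv=fsync 2>/dev/null; then
    direct_result="ok"
else
    direct_result="not supported here"
fi

echo "oflag=sync wrote $sync_size bytes; oflag=direct: $direct_result"
rm -f "$scratch"
```

Adding status=progress to either dd call on a real device is a simple way to see the throughput difference described above.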

From the sync(1) manual page: "sync - Synchronize cached writes to persistent storage". Basically, sync makes sure that all data still held in the cache is written to the stick.
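
One way to watch those cached writes is a small Linux-only sketch like the one below. It assumes /proc/meminfo is readable and exposes the Dirty and Writeback counters; show_dirty is a hypothetical helper name, and on other systems the script only prints a notice.

```shell
#!/bin/sh
# Sketch only: show_dirty is a made-up helper that prints how much
# data the kernel still holds in memory waiting to be written out.
show_dirty() {
    if [ -r /proc/meminfo ]; then
        grep -E '^(Dirty|Writeback):' /proc/meminfo
    else
        echo "no /proc/meminfo on this system"
    fi
}

echo "before sync:"
show_dirty

# sync returns once the cached writes have been handed to storage.
sync
sync_status=$?

echo "after sync:"
show_dirty
```

The counters are system-wide, so other processes can make them move again right after sync returns.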

schaiba