10

Is there any difference between doing, e.g., dd bs=4M if=archlinux.iso of=/dev/sdx status=progress oflag=sync and doing cp archlinux.iso /dev/sdx && sync, and is there any reason to use one over the other (aside from the pretty progress bar in dd)?

Valeriy
  • 229

2 Answers

11

One difference is efficiency, and thus speed. For example, you could read the bytes one by one and copy them to the device. That is what cat would do in an idealized implementation, and what it did in older systems, for example BSD4:

cat archlinux.iso > /dev/sdx

In these implementations cat will move each byte independently. That is a slow process, although in practice there will be buffers involved. Note that modern cat implementations will read blocks (see below).

With dd and a good block size, the copy will be faster than that byte-at-a-time approach.
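To see the effect of the block size without touching a real device, here is a minimal sketch that copies a scratch file twice with dd. The temporary files stand in for the ISO and for /dev/sdx and are purely illustrative, and the bs=4M suffix assumes GNU dd.

```shell
# Scratch files stand in for the ISO and the target device (hypothetical names).
src=$(mktemp); small=$(mktemp); large=$(mktemp)
head -c 8388608 /dev/urandom > "$src"      # 8 MiB of test data

# Same copy with two block sizes: many small read/write calls vs. a few large ones.
dd if="$src" of="$small" bs=512 2>/dev/null
dd if="$src" of="$large" bs=4M  2>/dev/null

# Both copies are byte-identical; only the number of system calls differs.
result=different
cmp -s "$src" "$small" && cmp -s "$src" "$large" && result=identical
echo "$result"
rm -f "$src" "$small" "$large"
```

Prefixing each dd line with time gives a rough idea of the difference on your own system. On a regular file the page cache hides much of it, so treat the numbers as indicative only.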

With cp it depends on the buffer size used by cp (not under your control) and other buffers on the way. The efficiency lies between the idealized implementation of cat and dd with the optimum block size.

In practice, though, modern cat and cp will ask the system for the preferred block size: st_blksize. Note that this is not necessarily the optimum block size.
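If you want to see the value the system reports, the following sketch prints it for a scratch file. It assumes GNU stat, where the %o format is the optimal I/O transfer size hint for the file; the temporary file is only a placeholder.

```shell
# Hypothetical scratch file, used only for illustration.
f=$(mktemp)
# %o: optimal I/O transfer size hint for this file (GNU stat).
blksize=$(stat -c %o "$f")
echo "preferred I/O block size: $blksize bytes"
rm -f "$f"
```

The number depends on the filesystem, so it may differ between the ISO and the target device.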

An analogy: it is like pouring the contents of a glass into another glass.

  • idealized cat would do it one drop at a time.

  • dd will use a spoon, and you define exactly how big the spoon is (system limits apply)

  • cp and modern cat will use their own spoon (stat -f -c %s filename will tell you how big it is).

  • 2
    The analogy to the glass is really nice! – Panki Dec 20 '19 at 15:06
  • This answer is completely false. "In theory cat will move each byte independently." - No, it will not. cat reads and writes blocks just like dd and cp do. Modern cat (e.g. GNU cat) actually asks the OS what the preferred block size is, for optimum speed. On my system, cat uses 128KiB blocks, compared to dd which only moves 512 bytes at a time. There's no reason to use dd here. And where it matters, it's slower except if you manually match the block size. – marcelm Feb 25 '22 at 19:09
  • "With dd and a good block size (usually related to the physical block size) it will be faster." - More misunderstandings; optimal block size is completely unrelated to physical block size (typically 512B, sometimes 4kB). The OS will take larger amounts of data and write it out to multiple physical blocks, no problem. The concerns for userspace applications in choosing block size have to do with 1) minimizing the number of system calls and 2) optimizing CPU cache usage. This is why cat uses 128KiB; large enough that the number of syscalls is small, and fits CPU caches well. – marcelm Feb 25 '22 at 19:16
  • @marcelm thanks for the input. I hope the edit and the links make it clearer. – Eduardo Trápani Mar 02 '22 at 05:00
0

I use it mainly because of the status=progress you mentioned; what can I say, I am impatient and need to know :-)

Even if you forgot to add that and started the job, you can send it a SIGUSR1 signal and it will print the current I/O statistics to stderr (which, unless you redirected it, is your terminal).
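A minimal sketch of that, using a harmless copy from /dev/zero to /dev/null instead of a real device. It assumes GNU dd on Linux; as far as I know, BSD and macOS dd use SIGINFO instead, and SIGUSR1 would simply terminate them.

```shell
# Assumes GNU dd (coreutils). The log file is a hypothetical scratch file.
log=$(mktemp)
dd if=/dev/zero of=/dev/null bs=1M 2>"$log" &
dd_pid=$!
sleep 1                       # give dd time to install its signal handler
kill -USR1 "$dd_pid"          # ask dd to print its current I/O statistics
sleep 1                       # give dd time to write them to stderr
kill "$dd_pid"                # stop the demo copy
wait "$dd_pid" 2>/dev/null || true
stats=$(cat "$log")
echo "$stats"
rm -f "$log"
```

For a job that is already running in another terminal, the kill -USR1 line with the right PID is all you need.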