32

Can someone explain what is happening in the following line?

dd if=/dev/urandom bs=4096 seek=7 count=2 of=file_with_holes

The seek part especially is not clear.

The man page says:

 seek=BLOCKS
              skip BLOCKS obs-sized blocks at start of output

What is an obs-sized block?

user2799508
  • 1,712
  • I don't think seek will change the data it's skipping over, so no, I don't think empty zeroed blocks will appear before the block you are seeking to; whatever data was there would remain. – nkshirsa Jun 10 '20 at 13:42

4 Answers

29

dd is designed to copy blocks of data from an input file to an output file. The dd block size options are as follows, from the man page:

ibs=expr
    Specify the input block size, in bytes, by expr (default is 512).
obs=expr
    Specify the output block size, in bytes, by expr (default is 512).
bs=expr
    Set both input and output block sizes to expr bytes, superseding ibs= and obs=.

The dd seek option is similar to the UNIX lseek() system call [1]. It moves the read/write pointer within the file. From the man page:

seek=n
    Skip n blocks (using the specified output block size) from the beginning of the output file before copying. 

Ordinary files in UNIX have the convenient property that you do not have to read or write them starting at the beginning; you can seek anywhere and read or write starting from there. So bs=4096 seek=7 means to move to a position 7*4096 bytes from the beginning of the output file and start writing from there. It won't write to the portion of the file that is between 0 and 7*4096 bytes.
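
As a quick check of that arithmetic, here is a minimal sketch that runs the command from the question in a scratch directory and compares the resulting file size with bs*seek + bs*count. The scratch directory and the variable names are assumptions made for illustration only.

```shell
# Scratch directory (hypothetical; any writable location will do).
workdir=$(mktemp -d)
f="$workdir/file_with_holes"

bs=4096; seek=7; count=2

# Same command as in the question; dd's status messages go to stderr and are silenced here.
dd if=/dev/urandom bs="$bs" seek="$seek" count="$count" of="$f" 2>/dev/null

offset=$((bs * seek))                # byte position where writing starts
expected=$((offset + bs * count))    # byte position where writing ends
actual=$(wc -c < "$f")

echo "write offset:  $offset"
echo "expected size: $expected"
echo "actual size:   $((actual))"

rm -rf "$workdir"
```

The file ends up 9 blocks long even though only 2 blocks were written, because the write position started 7 blocks in.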

Areas of ordinary files that are never written to at all aren't even allocated by the underlying filesystem. These areas are called holes and the files are called sparse files. In your example, file_with_holes will have a 7*4096-byte hole at the beginning. (h/t @frostschutz for pointing out that dd truncates the output file by default.)

It is OK to read these unallocated areas; you get a bunch of zeroes.
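
To see this for yourself, the sketch below recreates the file and counts the non-NUL bytes in the first 7 blocks. The scratch directory is an assumption for illustration, and how much disk space the file actually occupies depends on the filesystem, so the du figure is only indicative.

```shell
# Scratch directory (hypothetical; any writable location will do).
workdir=$(mktemp -d)
f="$workdir/file_with_holes"

dd if=/dev/urandom bs=4096 seek=7 count=2 of="$f" 2>/dev/null

# Read back the first 7 blocks, delete every NUL byte, and count what is left.
nonzero=$(dd if="$f" bs=4096 count=7 2>/dev/null | tr -d '\000' | wc -c)
echo "non-zero bytes in the first 7 blocks: $((nonzero))"

# Apparent size (ls) versus space actually allocated in KiB (du).
# On a filesystem that supports sparse files, du typically reports less.
ls -l "$f"
du -k "$f"

rm -rf "$workdir"
```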

[1] Back when dd was written, the analogous system call was seek().

Mark Plotnick
  • 25,413
  • 3
  • 64
  • 82
  • Interesting, my man page is annoyingly unforthcoming on this - bs=BYTES read and write up to BYTES bytes at a time – Graeme Jan 11 '14 at 14:02
  • I haven't seen "seek" on UNIX. Maybe "lseek", I guess. – kangear Oct 31 '14 at 01:35
  • @kangear seek was the predecessor of lseek. I've updated my answer to be clearer. Thanks. – Mark Plotnick May 26 '17 at 20:39
  • 2
    Just to note, I was trying to seek on a drive device (example: dd if=/dev/zero bs=512 count=2 seek=8388607998 of=/dev/sdd), but those 'files'/descriptors are not seekable: dd: /dev/sdd: cannot seek: Invalid argument 0+0 records in 0+0 records out 0 bytes copied, 0.00765396 s, 0.0 kB/s – Pysis May 17 '18 at 02:02
  • 1
    @Pysis Disk devices are usually seekable, but maybe there are some issues with very large devices. How large (in bytes) is your /dev/sdd ? – Mark Plotnick May 17 '18 at 07:20
  • 1
    Maybe I have before and can't remember. I'm trying to access the backup GPT sector or 2 at the end of a 4TB disk. – Pysis May 17 '18 at 15:26
8

The other answers explained it already, but if you have any doubts, you can see what dd does with strace.

$ strace dd if=/dev/urandom bs=4096 seek=7 count=2 of=file_with_holes
# output is shortened considerably
open("/dev/urandom", O_RDONLY)          = 0
open("file_with_holes", O_RDWR|O_CREAT, 0666) = 1
ftruncate(1, 28672)                     = 0
lseek(1, 28672, SEEK_CUR)               = 28672
read(0, "\244\212\222v\25\342\346\226\237\211\23\252\303\360\201\346@\351\6c.HF$Umt\362;E\233\261"..., 4096) = 4096
write(1, "\244\212\222v\25\342\346\226\237\211\23\252\303\360\201\346@\351\6c.HF$Umt\362;E\233\261"..., 4096) = 4096
read(0, "~\212q\224\256\241\277\344V\204\204h\312\25pw9\34\270WM\267\274~\236\313|{\v\6i\22"..., 4096) = 4096
write(1, "~\212q\224\256\241\277\344V\204\204h\312\25pw9\34\270WM\267\274~\236\313|{\v\6i\22"..., 4096) = 4096
close(0)                                = 0
close(1)                                = 0
write(2, "2+0 records in\n2+0 records out\n", 31) = 31
write(2, "8192 bytes (8.2 kB) copied", 26) = 26
write(2, ", 0.00104527 s, 7.8 MB/s\n", 25) = 25
+++ exited with 0 +++

It opens /dev/urandom for reading (if=/dev/urandom) and opens file_with_holes for creation/writing (of=file_with_holes).

Then it truncates file_with_holes to 4096*7=28672 bytes (bs=4096 seek=7). The truncation means that any file contents after that position are lost (add conv=notrunc to avoid this step). Then it seeks to 28672 bytes.

Then it reads 4096 bytes (bs=4096 used as ibs) from /dev/urandom, writes 4096 bytes (bs=4096 used as obs) to file_with_holes, followed by another read and write (count=2).

Then it closes /dev/urandom, closes file_with_holes, and prints that it copied 2*4096 = 8192 bytes. Finally it exits without error (0).
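
The effect of the truncation step is easier to see on a tiny file. This is a minimal sketch using a 10-byte file and bs=1; the file name and the scratch directory are made up for illustration.

```shell
# Scratch directory (hypothetical; any writable location will do).
workdir=$(mktemp -d)
f="$workdir/demo"

# Default behaviour: the file is cut at the seek offset before writing.
printf '0123456789' > "$f"
printf 'AB' | dd of="$f" bs=1 seek=4 2>/dev/null
default_result=$(cat "$f")

# With conv=notrunc: only the bytes actually written are changed.
printf '0123456789' > "$f"
printf 'AB' | dd of="$f" bs=1 seek=4 conv=notrunc 2>/dev/null
notrunc_result=$(cat "$f")

echo "default:      $default_result"    # 0123AB
echo "conv=notrunc: $notrunc_result"    # 0123AB6789

rm -rf "$workdir"
```

In both cases the 4 bytes before the seek offset are left alone; only what happens to the data after the written region differs.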

frostschutz
  • 48,978
5

obs is the output block size and ibs is the input block size. If you specify bs without ibs or obs this is used for both.

So your seek skips 7 blocks of 4096 bytes, or 28672 bytes, at the start of your output. Then 2 blocks of 4096 bytes, or 8192 bytes, are copied from the start of the input to this point in the output.
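
To make the obs/ibs distinction concrete, here is a small sketch in which the two sizes differ: seek is counted in obs-sized blocks, while count is counted in ibs-sized blocks. It reads from /dev/zero rather than /dev/urandom, and the output path is a made-up scratch location.

```shell
# Scratch directory (hypothetical; any writable location will do).
workdir=$(mktemp -d)
f="$workdir/out"

# seek=7 skips 7 * obs = 28672 bytes of output;
# count=16 copies 16 * ibs = 8192 bytes of input.
dd if=/dev/zero ibs=512 obs=4096 seek=7 count=16 of="$f" 2>/dev/null

size=$(wc -c < "$f")
echo "size: $((size))"    # 28672 + 8192 = 36864

rm -rf "$workdir"
```

The resulting size matches the bs=4096 seek=7 count=2 command from the question, since 16 input blocks of 512 bytes carry the same 8192 bytes as 2 blocks of 4096.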

Graeme
  • 34,027
1

Seek will just "inflate" the output file. seek=7 means that 7 "empty" blocks of the output block size (obs=4096 bytes) will be inserted at the beginning of the output file. This is a way to create very big files quickly.

  • 1
    Or to skip over data at the start which you do not want to alter. Empty blocks only result if the output file initially did not have that much data. The manual is also not clear on how obs relates to bs; the command uses bs, which substitutes for obs if obs is not given. – Graeme Jan 11 '14 at 12:10