Bear in mind that dd is a raw interface to the read(), write() and lseek() system calls. You can only use it reliably to extract chunks of data from regular files, block devices and some character devices (like /dev/urandom), that is, files for which read(buf, size) is guaranteed to return size as long as the end of the file is not reached.
For pipes, sockets and most character devices (like ttys), you have no such guarantee unless you do read()s of size 1, or use the GNU dd extension iflag=fullblock.
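A rough way to see the partial-read problem on a pipe is sketched below. The slow_writer function is a hypothetical helper made up for this illustration, and the first result depends on timing, so treat it as typical behaviour rather than a guarantee.

```sh
# Hypothetical helper: delivers 4 bytes in two separate writes, one second apart.
slow_writer() { printf 'ab'; sleep 1; printf 'cd'; }

# A single read() of up to 4 bytes: dd typically gets only "ab" (a partial
# read), counts that as its one block and stops, so wc usually reports 2.
slow_writer | dd bs=4 count=1 2>/dev/null | wc -c

# With bs=1, a read() cannot come back short, so all 4 bytes are copied.
slow_writer | dd bs=1 count=4 2>/dev/null | wc -c
```

With GNU dd, adding iflag=fullblock to the first command should make it wait for all 4 bytes as well.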
So either:
{
gdd < file1 bs=1M iflag=fullblock count=99 skip=1
gdd < file2 bs=1M iflag=fullblock count=10
} > final_output
Or:
M=1048576
{
dd < file1 bs=1 count="$((99*M))" skip="$M"
dd < file2 bs=1 count="$((10*M))"
} > final_output
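To check the skip/count arithmetic before running it on large files, the same bs=1 pattern can be tried on tiny inputs. This is only a sketch: the file contents and the temporary directory are made up for the test, with single bytes standing in for the megabytes above.

```sh
# Scratch directory and sample files (illustrative only).
tmp=$(mktemp -d)
printf '0123456789' > "$tmp/file1"
printf 'abcdefghij' > "$tmp/file2"

{
  # Skip the first 2 bytes of file1, then copy the next 5.
  dd < "$tmp/file1" bs=1 count=5 skip=2 2>/dev/null
  # Copy the first 3 bytes of file2.
  dd < "$tmp/file2" bs=1 count=3 2>/dev/null
} > "$tmp/final_output"

cat "$tmp/final_output"   # prints 23456abc
```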
Or with shells with builtin support for a seek operator, like ksh93:
M=1048576
{
command /opt/ast/bin/head -c "$((99*M))" < file1 <#((M))
command /opt/ast/bin/head -c "$((10*M))" < file2
} > final_output
Or zsh (assuming your head supports the -c option here):
zmodload zsh/system &&
{
sysseek 1048576 && head -c 99M &&
head -c 10M < file2
} < file1 > final_output
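If neither a seek operator nor sysseek is available, one alternative (not one of the approaches above, just a sketch) is to let tail -c +N do the skipping and head -c do the counting. It assumes your head supports -c, and it is shown here on small made-up files so the offsets are easy to verify. Note that tail -c +N counts from 1, so skipping K bytes means starting at byte K+1.

```sh
# Scratch directory and sample files (illustrative only).
tmp=$(mktemp -d)
printf '0123456789' > "$tmp/file1"
printf 'abcdefghij' > "$tmp/file2"

skip=2   # bytes to skip at the start of file1
{
  # Start at byte skip+1 of file1, then keep the next 5 bytes.
  tail -c "+$((skip + 1))" < "$tmp/file1" | head -c 5
  # Keep the first 3 bytes of file2.
  head -c 3 < "$tmp/file2"
} > "$tmp/final_output"

cat "$tmp/final_output"   # prints 23456abc
```

Since head reads until it has the requested number of bytes or hits end of file, the pipe between tail and head does not suffer from the partial-read issue that affects dd.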
[…] oflag=append conv=notrunc), so filesystems that do delayed allocation (like XFS) are least likely to decide the file is done being written when there's still more to go. – Peter Cordes May 02 '16 at 15:46
[…] dd isn't asked to sync, delayed allocation shouldn't kick in immediately anyway (unless memory is tight, in which case neither method will postpone allocation). – Stephen Kitt May 02 '16 at 15:52
[…] bash and mksh that don't optimize out the fork for the last command in a subshell, you can make it slightly more efficient by replacing the subshell with a command group. For other shells, it shouldn't matter, and the subshell approach might even be slightly more efficient as the shell doesn't need to save and restore stdout. – Stéphane Chazelas May 02 '16 at 16:13