7

If a exists and I type

cp /dev/zero a

can I be sure the old contents of a will be overwritten, or will I simply get the equivalent of

rm a
cp /dev/zero a

PS. I am not suggesting that this is the correct way securely to delete a file; I am merely curious about the effect of a certain command.

Toothrot
  • 3,435
  • 2
    It depends on the medium the filesystem is on and the permissions/ownership of the target file. If you edit your question with the details you'll likely get an accurate answer – Chris Davies Aug 01 '20 at 13:18
  • 1
    If you want to securely delete files, have a look at shred, it is specifically designed for that. Caveats apply, of course. – marcelm Aug 01 '20 at 20:52
  • 2
    Note that /dev/zero can be read indefinitely; cp won't end until its writes fail (e.g. with ENOSPC) or you kill it. Was that intentional to fill the entire free space (including the just-truncated old contents of a) with zeros? That's a very slow way to go about it on a non-CoW FS. – Peter Cordes Aug 02 '20 at 02:01
  • 2
    This is an extreme XY problem; even shred isn't reliable on SSDs. You're looking for "secure delete". – chrylis -cautiouslyoptimistic- Aug 02 '20 at 03:59
  • The answer is: neither. You can't be sure the old contents will be overwritten, but it's also not equivalent to rm a; cp /dev/zero a, because it will keep the same i-node. – David Ongaro Aug 02 '20 at 06:12
  • shred isn't even reliable on journalling filesystems. And for COW filesystems like btrfs you're totally out of luck. – allo Aug 02 '20 at 10:06
  • @chrylis-cautiouslyoptimistic-, what is secure delete? – Toothrot Aug 02 '20 at 13:14
  • @DavidOngaro Wouldn't that be dependent on the implementation of cp? – Thorbjørn Ravn Andersen Aug 02 '20 at 14:03
  • @ThorbjørnRavnAndersen, even so that means it isn't equivalent in general. But practically all unix implementations are working like this (https://unix.stackexchange.com/a/227148/46085) and it's even spelled out in POSIX (https://pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html#:~:text=A%20file%20descriptor,function). – David Ongaro Aug 03 '20 at 00:53
  • Read the POSIX manual page. cp tries to rewrite the file (keeping the inode) but if that fails for any reason, it deletes the file and writes it again. – Thorbjørn Ravn Andersen Aug 03 '20 at 01:06

3 Answers

9

I suspect that, when you open a file to write it from the start, all the existing blocks will be freed immediately (still containing their existing contents), and your cp will acquire new blocks to zero-fill.

Further, your file will be extended until it fills the whole partition: the original size will not be honoured.

The dd command has a conv=notrunc option; you can find the original size using stat, round it up to whole blocks, and use the count and bs options to zero the same blocks the original occupied.

Edit: Confirmed by test that a shell > redirect retains same inode number, but immediately reduces file size to zero blocks and releases them to available space.
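That test can be sketched as follows (a minimal version, assuming GNU coreutils for `stat --printf` and a typical filesystem such as ext4; filenames are illustrative):

```shell
# Create a file, note its inode and allocated blocks, truncate it
# with a shell > redirect, then compare.
f=$(mktemp)
head -c 8192 /dev/urandom > "$f"         # 8 KiB of data
before=$(stat --printf='%i %b' "$f")     # inode number, allocated blocks
: > "$f"                                 # shell redirect truncates in place
after=$(stat --printf='%i %b' "$f")
echo "before: $before"
echo "after:  $after"                    # same inode, blocks released
rm -f "$f"
```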

strace on cp shows vanilla overwrite of the file via mmap area:

open("foo.tiny", O_RDONLY)              = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=8192, ...}) = 0
open("foo.copy", O_WRONLY|O_CREAT|O_EXCL, 0644) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc295d30000
read(3, "\0\0\0\0"..., 131072) = 8192
write(4, "\0\0\0\0"..., 8192) = 8192
read(3, "", 131072)                     = 0
close(4)                                = 0
close(3)                                = 0
munmap(0x7fc295d30000, 139264)          = 0
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0

Conclusion has to be that cp will discard your confidential data into the free list, and overwrite whatever blocks it gets allocated afterwards.
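A quick way to see the truncate-and-reuse behaviour without strace (a hedged sketch, assuming GNU coreutils; paths are illustrative): cp onto an existing destination keeps the inode but the old blocks still go to the free list first.

```shell
# cp over an existing destination opens it O_WRONLY|O_TRUNC,
# so the inode is reused but the old data blocks are freed.
src=$(mktemp); dst=$(mktemp)
head -c 8192 /dev/zero    > "$src"
head -c 8192 /dev/urandom > "$dst"       # "confidential" old contents
inode_before=$(stat -c %i "$dst")
cp "$src" "$dst"
inode_after=$(stat -c %i "$dst")
echo "$inode_before $inode_after"        # same inode: truncated, not unlinked
rm -f "$src" "$dst"
```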

This script demonstrates that dd will zeroise part or all of the existing blocks in a file, without freeing blocks, and without changing the inode. You may need to capture the actual BLKSZ and file Size from a stat command, and to use bash arithmetic to round up the size to a whole number of blocks. Also shows that dd will extend a file, and will write sparse data.

This produces about 100 lines of output, so I won't post that. It is benign. The Zero function is the meat of it.

#! /bin/bash

FN='./Erase.data' BLKSZ=4096

#.. Zeroise a specified range of blocks (zero-based).
Zero () {   #:: (from, to)

    dd 2>&1 ibs="${BLKSZ}" obs="${BLKSZ}" \
        seek="${1}" count="$(( $2 - $1 + 1 ))" \
        conv=notrunc if="/dev/zero" of="${FN}"
}

#.. Create a file of 8 * 4096B blocks, each labelled in every character.
Make () {   #:: (void)

    AWK='
function Block (sz, id, Local, buf) {
    buf = sprintf ("%*s", sz, "");
    gsub (/./, id, buf);
    printf ("%s", buf);
}
{ for (f = 2; f <= NF; ++f) Block( $1, $(f)); }
'
    echo "${BLKSZ}" {A..H} | awk "${AWK}"
}

#.. Reveal the file.

Show () {

    echo; ls -l "${FN}"; stat "${FN}"; od -A d -t a "${FN}"; sleep 2
}

#### Script Body Starts Here.

#.. Make the file and prove its contents.
Make > "${FN}" && Show
Zero 3 6 && Show
Zero 0 1 && Show
Zero 0 7 && Show
Zero 220 231 && Show

This is an approximation to a production version.

#! /bin/bash

Usage () { expand -t 4 <<'EOF'

Usage: ZeroAllBlocks [-h] [files ...]
    Warning: this command is as brutal as rm -f.
    -h: shows this message.

    Zeroises (binary zero) all blocks of all the files named.
    Sparse blocks will then consume real disk space.

EOF
}

#.. Zeroise a specified range of blocks (zero-based).
Zero () {   #:: (Fn, blksz, seek, count)

    local Fn="${1}" blksz="${2}" seek="${3}" count="${4}"

    dd status=none ibs="${blksz}" obs="${blksz}" \
        seek="${seek}" count="${count}" \
        conv=notrunc if="/dev/zero" of="${Fn}"
}

#.. Process a file.
File () {   #:: (filename)

    local Fn="${1}" szFile szBlock nBlock

    [[ -f "${Fn}" ]] || { printf '%s: No such file\n' "${Fn}"; return; }
    [[ -w "${Fn}" ]] || { printf '%s: Not writable\n' "${Fn}"; return; }
    read -r szFile szBlock <<<$( stat --printf='%s %o\n' "${Fn}" )
    nBlock="$(( (szFile + szBlock - 1) / szBlock ))"
    Zero "${Fn}" "${szBlock}" 0 "${nBlock}"
}

#### Script Body Starts Here.

[[ "${1}" = "-h" ]] && { Usage; exit 2; }

for Fn in "${@}"; do File "${Fn}"; done

Paul_Pedant
  • 8,679
  • 1
    Your strace output is from the case where the destination doesn't exist, so cp used open(O_WRONLY|O_CREAT|O_EXCL). That fails if the file exists, that's why it didn't bother to include O_TRUNC. If its earlier stat (which you omitted) shows that the destination does exist (and you didn't use cp -i, or you confirm at the prompt), it uses O_WRONLY|O_TRUNC. (Which matches your text, not your strace output). – Peter Cordes Aug 02 '20 at 01:57
  • mmap is just allocating a buffer to read into / write from. Your wording is a bit confusing, but it's not doing an mmap on the output file. Instead it's MAP_ANONYMOUS, not backed by any file. – Peter Cordes Aug 02 '20 at 01:59
    "Edit: Confirmed by test that a shell > redirect retains same inode number" – I did a test yesterday, and even doing an explicit rm (rm ${file}; cp ${something} ${file} or rm ${file}; echo "Something" >${file}) the new file re-used the same inode number (ext4 FS). But it's a moot point; what counts is the allocated blocks anyway. – xenoid Aug 02 '20 at 07:54
9

Why do you want to copy /dev/zero to some file?

  • Do you want to securely delete a? Then cp is the wrong tool. Look into shred if you're using a hard disk, and fstrim if you're using an SSD. (There are very likely other tools I'm not aware of, so googling "secure delete" would be good.)

  • Do you understand that /dev/zero is infinite in size? It will return zeros indefinitely and so cp will never finish until the file system containing a fills up.

  • But, that might not ever happen because cp, as part of its sparse file detection, will suppress writes of all-zero pages.

  • Do you want to create a file of a given size that is full of zeros? cp is still the wrong tool. You'll need to use dd so you can specify how many bytes to copy from /dev/zero.

Overall, use cp for copying an ordinary file to another ordinary file, where all you are concerned about is that the copy is logically equivalent.

If you're working with device files, or you're trying to control how the blocks of the original file are disposed of, you'll want to use another program.
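For the "file full of zeros of a given size" case, a minimal sketch (assuming GNU dd for `status=none`; the filename is illustrative): dd bounds the copy, which cp from /dev/zero cannot do.

```shell
# Write exactly 4 blocks of 4096 zero bytes, then confirm the size.
dd if=/dev/zero of=zeros.bin bs=4096 count=4 status=none
stat -c %s zeros.bin    # 16384
```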

  • What about blkdiscard(8) vs shred? – iBug Aug 02 '20 at 08:10
  • @iBug I don't know anything about it. It wasn't my intention to suggest that shred was the only choice for the secure deletion problem, but I'll edit the text to make that clearer, and include a mention of blkdiscard, once I read up on it. – Dale Hagglund Aug 02 '20 at 08:13
  • @iBug Turns out blkdiscard isn't applicable to the problem of securely deleting a file. It requires you to provide explicit block ranges to discard, and you can't get the blocks ranges allocated to a file as far as I'm aware. Even fstrim isn't really per file, but at least it works on the free list, but even after issuing a TRIM command, you don't know if the actual flash block containing the old data has been erased. It's a little harder to get to. – Dale Hagglund Aug 02 '20 at 08:20
  • cp will not create a sparse file from a non-sparse one unless you specify --sparse=always – Dmitry Grigoryev May 21 '21 at 07:36
4

It depends on the filesystem and storage device used. Generally filesystems will overwrite when you tell them to, unless you are using a filesystem optimized for raw flash memory; those special filesystems might implement wear leveling at the FS level and so would write into a different location.

But if the storage device is a regular SATA or NVMe SSD, it might do wear leveling internally anyway, and the real physical storage blocks will be different from what even the raw block device will show. So the "overwrite" will end up going to a different physical location even though the filesystem thinks it is definitely overwriting a particular block #.

But getting past the SSD's wear leveling system and reading the raw storage meaningfully is a non-trivial technical hurdle, requiring specialist knowledge and possibly special hardware tools. And the SSD will probably pre-emptively erase any "overwritten" blocks as soon as practical anyway, to keep as many erased blocks available for writing as possible, as erasing is generally the limiting factor of SSD write performance.

telcoM
  • 96,466
  • 2
    Yes, traditional filesystems will overwrite when you tell them to, but the question is whether cp /dev/zero a "tells it to". It doesn't; it truncates first so there's no reason to expect it to overwrite any blocks. Unless the OP really intended to fill the entire free space (including the just-freed old contents of a) with zeros. – Peter Cordes Aug 02 '20 at 02:03
  • 1
    A regular rm followed by fstrim is going to be faster, more efficient, and more reliable than trying to overwrite onto an SSD. – chrylis -cautiouslyoptimistic- Aug 02 '20 at 04:00