16

I used shred to wipe my external hard disk: sudo shred -vz /dev/sdb

I should also add that the disk had 5 bad sectors.

I want to verify the disk has been zeroed, per https://superuser.com/questions/1510233/is-there-a-faster-way-to-verify-that-a-drive-has-been-fully-zeroed

I'm not that familiar with dd, but I believe that these show it's been zeroed:

sudo dd if=/dev/sdb status=progress | hexdump
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
5000916670976 bytes (5.0 TB, 4.5 TiB) copied, 45754 s, 109 MB/s
9767541167+0 records in
9767541167+0 records out
5000981077504 bytes (5.0 TB, 4.5 TiB) copied, 45756.7 s, 109 MB/s
48c61b35e00
sudo dd if=/dev/sdb status=progress | od | head
5000952267264 bytes (5.0 TB, 4.5 TiB) copied, 45739 s, 109 MB/s
9767541167+0 records in
9767541167+0 records out
5000981077504 bytes (5.0 TB, 4.5 TiB) copied, 45741.1 s, 109 MB/s
0000000 000000 000000 000000 000000 000000 000000 000000 000000
*
110614154657000

But using a simple cmp shows an exception:

sudo cmp /dev/zero /dev/sdb
cmp: EOF on /dev/sdb after byte 5000981077504, in line 1

Has the disk been zeroed?

NoExpert
  • Quite frankly the only way you can know you've destroyed the data is by destroying the drive, or disassembling it (if a regular HDD) and waving a degaussing wand around each platter. Even if you write all zeros, data can be recovered - so some software will write random data multiple times. Again, this only applies to magnetic HDDs. – Steve Aug 10 '22 at 22:17
  • @Steve: For purposes of protecting actual classified information, or something that needs to be treated with e.g. PCI/SOC confidentiality, or anything that would be worth >$10000 for someone to recover, or anything that could mean someone will get harmed, or anything that would have value to an organized and funded advanced threat actor: THIS. Anything else: even plain zeroing (on magnetic disks. know what a HPA is!), or any ATA secure erase, will make that disk an impractical target for opportunists. – rackandboneman Aug 11 '22 at 17:55
  • The disk is being returned as faulty. I usually physically destroy drives, but this is pretty new and cost $150. ATA secure erase is not supported (it's an external disk) per https://github.com/Seagate/openSeaChest/blob/develop/docs/openSeaChest_Erase.212.txt. I understand that shred does three passes with pseudo-random data (https://wiki.archlinux.org/title/Securely_wipe_disk#shred) and I added a final zero overwrite, for good measure. Per https://unix.stackexchange.com/questions/626847/check-for-host-protected-area-and-device-configuration-overlay/626848#626848 there's no HPA/DCO. – NoExpert Aug 11 '22 at 20:37
  • The Lord of the Rings algorithm is quite effective: take the hard drive to a suitable volcano, and throw it in. – Simon Crase Aug 12 '22 at 01:56
  • What is the value of the time that you devote to wiping the disk? Let's say $X. What is the potential loss to you or your company if the data falls into the wrong hands? Let's say $Y. Then if $(X+Y) > $150, destroy it, if not then make sure you document the decision in case a scapegoat is needed. –  Aug 12 '22 at 15:12
  • @Martin, a sensible comment, but I guess one would be assigning a probability that the data is recovered to that equation. It's difficult to justify simply throwing away a disk, where the manufacturer offers a free return/replacement. – NoExpert Aug 12 '22 at 19:02

3 Answers

29

Has the disk been zeroed?

Yes. The final output of your dd command shows that it read 5000981077504 bytes, and both hexdump and od collapsed the entire stream into a single run of zeros (the * means every subsequent line of output was identical to the one above it). Your cmp command says that it reached EOF (end of file) on /dev/sdb after byte 5000981077504, the same count, without reporting a single difference before that.

Be aware that this only works well with hard drives. For solid-state devices, features such as wear leveling and overprovisioning space may result in some data not being erased. Furthermore, your drive must not have any damaged sectors, as they will not be erased.
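
Since you mention the disk already has bad sectors, it may be worth checking whether any have been reallocated, because reallocated sectors keep their old contents out of reach of any ordinary write. A quick check (assuming smartmontools is installed; a drive in a USB enclosure may additionally need -d sat):

sudo smartctl -A /dev/sdb    # look at Reallocated_Sector_Ct and Current_Pending_Sector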

Note that cmp will not be very efficient for this task. You would be better off with badblocks:

badblocks -svt 0x00 /dev/sdb

From badblocks(8), the -t option can be used to verify a pattern on the disk. If you do not specify -w (write) or -n (non-destructive write), then it will assume the pattern is already present:

   -t test_pattern
          Specify a test pattern to be read (and written) to disk  blocks.
          The  test_pattern  may  either  be a numeric value between 0 and
          ULONG_MAX-1 inclusive, or the  word  "random",  which  specifies
          that  the block should be filled with a random bit pattern.  For
          read/write (-w) and non-destructive (-n) modes, one or more test
          patterns  may  be specified by specifying the -t option for each
          test pattern desired.  For read-only mode only a single  pattern
          may  be specified and it may not be "random".  Read-only testing
          with a pattern assumes that the specified pattern has previously
          been  written to the disk - if not, large numbers of blocks will
          fail verification.  If multiple patterns are specified then  all
          blocks  will be tested with one pattern before proceeding to the
          next pattern.
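
If the badblocks pass itself feels slow, note that its defaults are small too: it reads 1024-byte blocks, 64 at a time, and the -b and -c options raise those (the values below are illustrative, not tuned):

badblocks -b 4096 -c 4096 -svt 0x00 /dev/sdb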

Also, using dd with the default block size (512) is not very efficient either. You can drastically speed it up by specifying bs=256k. This causes it to transfer data in chunks of 262,144 bytes rather than 512, which reduces the number of context switches that need to occur. Depending on the system, you can speed it up even more by using iflag=direct, which bypasses the page cache. This can improve read performance on block devices in some situations.
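
For example, this is the same hexdump check from the question with those flags applied (bs=256k is a reasonable starting point; tune it for your hardware):

sudo dd if=/dev/sdb bs=256k iflag=direct status=progress | hexdump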


Although you didn't ask, it should be pointed out that shred overwrites a target using three passes by default. This is unnecessary. The myth that multiple overwrites are necessary on hard disks comes from an old recommendation by Peter Gutmann. On ancient MFM and RLL hard drives, specific overwrite patterns were required to avoid theoretical data remanence issues. To ensure that all types of disks could be overwritten, he recommended using 35 patterns so that at least one of them would be right for your disk. On modern hard drives using modern data encoding techniques such as EPRML and NPML, there is no need to use multiple patterns. According to Gutmann himself:

In fact performing the full 35-pass overwrite is pointless for any drive since it targets a blend of scenarios involving all types of (normally-used) encoding technology, which covers everything back to 30+-year-old MFM methods (if you don't understand that statement, re-read the paper). If you're using a drive which uses encoding technology X, you only need to perform the passes specific to X, and you never need to perform all 35 passes.

In your position, I would recommend something along these lines instead:

dd if=/dev/urandom of=/dev/sdb bs=256k oflag=direct conv=fsync

When it finishes, just make sure it has written enough bytes after it says "no space left on device".
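
One way to check what "enough bytes" means is to compare dd's final byte count against the device size the kernel reports (blockdev is part of util-linux):

blockdev --getsize64 /dev/sdb    # prints the device size in bytes; dd's total should match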

You can also use ATA Secure Erase which initiates firmware-level data erasure. I would not use it on its own because you would be relying on the firmware authors to have implemented the standard securely. Instead, use it in addition to the above in order to make sure dd didn't miss anything (such as bad sectors and the HPA). ATA Secure Erase can be managed by the command hdparm:

hdparm --user-master u --security-set-pass yadayada /dev/sdb
hdparm --user-master u --security-erase yadayada /dev/sdb

Note that this doesn't work on all devices. Your external drive may not support it.
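
You can check beforehand whether the drive supports it, and whether security is currently "frozen" (a frozen drive rejects the erase commands), by inspecting the Security section of the identify output (the exact wording varies by drive):

hdparm -I /dev/sdb | grep -i -A8 security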

forest
  • You can speed up dd even more by not using it. For example, instead of sudo dd if=/dev/sdb status=progress | hexdump the OP could have written sudo hexdump /dev/sdb – Chris Davies Aug 10 '22 at 08:43
  • @roaima this may or may not speed up the whole process depending on how hexdump reads (big/small blocks). – fraxinus Aug 10 '22 at 13:06
  • @fraxinus: dd simply does not speed up modern hard disk reads anymore. I found it to be only truly useful when the device is still picky about block size (such as a cdrom or (horrors) a floppy disk). Writes still seem to be a different story and dd does some good with the right block size. – Joshua Aug 11 '22 at 17:34
  • Alternatively, you could use ddrescue in place of dd. You need to install it on most systems, but it gives a much nicer live progress report (and if you have a source file of the exact required size, also gives you completion estimates), and it also defaults to a larger block size than dd (128k IIRC). – Austin Hemmelgarn Aug 11 '22 at 20:05
  • @AustinHemmelgarn On a modern dd, you can use status=progress to get a live progress report. – forest Aug 11 '22 at 21:41
  • Hard drives also can have reallocated blocks. Have a look at smartctl -a /dev/sdb to see if your drive reports reallocations. Similar to wear-leveling you usually cannot read the blocks anymore (and they are reallocated because they have a defect) but with advanced data recovery methods someone may still be able to read them. – allo Aug 12 '22 at 22:50
  • @allo Indeed. Reallocated blocks leave behind bad blocks, so even if the reallocated ones can be removed, the bad ones will stay there. – forest Aug 12 '22 at 22:58
15

/dev/zero is an infinite stream of null bytes. /dev/sdb does not contain an infinite stream of null bytes, so cmp will never report it to be identical to /dev/zero.

cmp compares the contents of the two files byte by byte until it either finds a difference or reaches the end of one of the files. If it finds a difference, it reports something like

/dev/zero /dev/sdb differ: char 1 line 1

and exits with status 1. If cmp reaches the end of one file but not the other, it reports that the files have different sizes and exits with status 1. (With regular files, cmp checks the sizes first, and exits without comparing the contents if the sizes differ.) Only if cmp reaches the end of both files at the same time, without having found different content, does it report that the files are identical and exit with status 0 (success).

So the report from cmp does mean that /dev/sdb is all-bytes-zero.
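
If you want to convince yourself of this behaviour without a 5 TB read, you can reproduce it on a small scratch file (the file name here is arbitrary); the byte count in the EOF message matches the file size exactly:

head -c 1048576 /dev/zero > zeros.img
cmp /dev/zero zeros.img    # cmp: EOF on zeros.img after byte 1048576, in line 1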

(See other answers for additional advice regarding wiping data. But mainly, keep in mind that the need for multiple passes is largely a legend: it had a grain of truth with old technologies, but it is not at all relevant to 21st-century media. A simple overwrite with zeros is just as good. Conversely, shred doesn't touch reserve sectors, which may be readable with some extra effort to bypass the normal working of the disk controller, so use the disk's secure erase if it works.)

10

Yes, as per forest's answer.

You might do "belt AND braces" by telling the disk's firmware to erase itself (secure erase). Depending on unknown details of the firmware, this may also render any bad blocks irretrievable.

Details of how to accomplish this with hdparm are given in forest's answer above.

When it comes to SSDs, this method is superior to dd because it tells the drive that all the sectors are free, rather than storing whatever data dd wrote to them. If the firmware is competently written, it should also erase the spares.
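
On drives that support TRIM, blkdiscard (also part of util-linux) issues exactly that "all sectors are free" hint from the command line. Note that a plain discard is a hint rather than a verified erase, and the secure variant only works where the device supports secure discard (the device name is a placeholder):

blkdiscard /dev/sdX             # discard every sector (a hint, not a verified erase)
blkdiscard --secure /dev/sdX    # secure discard, only if the device supports it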

This article explains how to erase a PCIe SSD. You can't use hdparm, because it's not a SATA device. Haven't tried this (yet?).
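
For an NVMe drive specifically, the usual tool is nvme-cli rather than hdparm. A sketch, with the device name assumed (check nvme list first):

nvme format /dev/nvme0n1 --ses=1    # --ses=1 requests a user-data erase; --ses=2 a cryptographic erase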

nigel222
  • Actually, many modern SSDs and HDDs always use encryption, even if it is not configured explicitly. The only difference is that if you configure it explicitly, the key is protected by your passwords, whereas otherwise, the key is stored unprotected. In either case, whether explicitly or implicitly encrypted, all the firmware needs to do is overwrite the key. Barring any major breakthroughs in cryptography, that is exactly as secure, if not more, as overwriting everything. – Jörg W Mittag Aug 10 '22 at 14:58
  • @JörgWMittag It's definitely more secure as any block that was moved to the reallocated list will be unreadable as well - something you can't do with a write to /dev/sdx. – throx Aug 11 '22 at 00:11
  • @JörgWMittag Provided (and this is an issue) the key is properly overwritten (with no traces of previous values extractable with electron microscopes etc), it is much more secure than overwriting the contents (because traces of the old disk contents are no longer useful). – Martin Bonner supports Monica Aug 11 '22 at 10:15
  • Hence, "Belt AND Braces". Also known as the Swiss Cheese defense. Firstly, replace everything that can be accessed without difficulty with garbage. Then tell the disk drive to render the garbage (and anything else) inaccessible. If either one fails, the other may suffice. – nigel222 Aug 11 '22 at 10:29
  • @nigel222 ATA secure erase is not supported (it's an external disk) per https://github.com/Seagate/openSeaChest/blob/develop/docs/openSeaChest_Erase.212.txt. – NoExpert Aug 11 '22 at 20:40