1

Given a long list of md5 checksums with the original file paths, and copies of those files that have been renamed and moved into different folder structures: how can I recreate or recover the original filesystem structure? Assume there are no hash collisions. The checksum list looks like this:

be70e389a9e000a85826a1a80488e1e1  path/A/2/2.bin
96a48d4706ec8eafff7e56f6784bb6b4  path/B/b1.bin
ffd2e58da118ba6c85de29c4c5b4c1f8  path/C/c1.bin
dbde0b664f88d8027e5cb7efb2cd1060  path/C/2/c2.bin
...
Jens

2 Answers

1

With bash I would:

  1. Iterate over the checksum list with read and store each entry in an associative array, using the hash as the key and the original path as the value.
  2. Write the names of all local files to a temporary file (find should be fine for this).
  3. Iterate over that list of local files, run md5sum on each one, check whether its hash is a key in the array, and if so rename the file to the target name.
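
The steps above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: it needs bash 4 or later for associative arrays and GNU md5sum, the function name restore_by_md5 and its three arguments are made up for illustration, the checksum list is in the plain "hash  path" format shown in the question, and the destination directory lies outside the directory being scanned. It also deviates from the steps in two ways: it reads the find output directly instead of going through a temporary file, and it copies instead of renaming so the local copies stay untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are illustrative, not from the thread).
#   $1 = checksum list, one "hash  original/path" entry per line
#   $2 = directory holding the renamed copies
#   $3 = directory in which the original structure is rebuilt
restore_by_md5() {
  local list=$1 src=$2 dest=$3
  local sum rel file
  local -A target=()   # maps hash -> original relative path

  # Step 1: load the checksum list into the associative array.
  while read -r sum rel; do
    if [[ -n $sum && -n $rel ]]; then
      target[$sum]=$rel
    fi
  done < "$list"

  # Steps 2 and 3: hash every local file and copy it if the hash is known.
  while IFS= read -r -d '' file; do
    sum=$(md5sum < "$file")
    sum=${sum%% *}                      # keep only the hash field
    rel=${target[$sum]:-}
    if [[ -n $rel ]]; then
      mkdir -p "$dest/$(dirname "$rel")"
      cp -- "$file" "$dest/$rel"
    fi
  done < <(find "$src" -type f -print0)
}
```

It would be called as restore_by_md5 checksums.md5 ./copies ./restored, where all three names are placeholders. If two different original paths share one hash, only the last one listed is restored, because the array holds a single path per hash.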
0

I ended up using join for a quick and dirty solution. It assumes that the file names used for restoring the folders contain no blank characters:

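# Run in the directory holding the renamed copies; ../restore.s is the
# checksum list, already sorted by hash. Restored files go to tmp/.
md5sum * | sort -u -k 1,1 | join - ../restore.s |
  while read -r h r t; do
    mkdir -p "$(dirname "tmp/$t")" && cp "$r" "tmp/$t"
  done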

Both inputs to join have to be sorted on the hash field, and sort -u drops local files with identical hashes. Each line of the join output then has the form "hash source target", which the loop uses to copy the source file to its target path.
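
To check the "hash source target" lines before anything is copied, the join step can be run on its own. The following is a small sketch, again assuming bash and GNU coreutils; the function name restore_plan is hypothetical, and unlike the one-liner it sorts the checksum list itself instead of expecting a pre-sorted file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "hash source target" lines without copying.
#   $1 = checksum list ("hash  original/path" per line, any order)
#   remaining arguments = local files to match against the list
restore_plan() {
  local list=$1
  shift
  # Both join inputs are sorted on field 1 (the hash); -u keeps one
  # local file per distinct hash.
  join <(md5sum -- "$@" | sort -u -k 1,1) <(sort -k 1,1 "$list")
}
```

For example, restore_plan ../restore.s * prints one line per matched file. The same restriction on blanks in file names applies, since join splits fields on whitespace.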

Jens