If you want to re-transfer the file to another machine via a network connection, use rsync.
If you want to get an idea of where the differences are, the easiest way is to have the two versions on the same machine. If you don't want to do that because bandwidth is too expensive, here are ways to checksum the file in chunks.
This method relies on head -c leaving the file position where it left off, and pre-computes the size to know where to end the loop.
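# number of 64 MiB chunks, rounded up to include a trailing partial chunk
n=$(( ($(wc -c <very_large_file) + 64*1024*1024 - 1) / (64*1024*1024) ))
i=0
while [ "$i" -lt "$n" ]; do
  # each head call continues where the previous one stopped
  head -c 64m | sha256sum
  i=$((i+1))
done <very_large_file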
This method relies on head -c leaving the file position where it left off, and uses cksum to find the size of each chunk (a chunk of size 0 means the end of the file has been reached).
while true; do
  output=$(head -c 64m | cksum)
  # cksum prints "CHECKSUM SIZE"; drop the checksum to keep the size
  size=${output#* }; size=${size%% *}
  if [ "$size" -eq 0 ]; then break; fi
  echo "$output"
done <very_large_file
This method calls dd to skip to the desired start position for each chunk.
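# chunk size in bytes, since dd implementations disagree on the "m" suffix
chunk=$((64*1024*1024))
n=$(( ($(wc -c <very_large_file) + chunk - 1) / chunk ))
i=0
while [ "$i" -lt "$n" ]; do
  # skip counts in units of ibs, so chunk $i starts at byte i*chunk
  dd if=very_large_file ibs="$chunk" skip="$i" count=1 2>/dev/null | sha256sum
  i=$((i+1))
done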
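To actually locate the differing chunks, one option is to write the per-chunk checksums to a small list on each machine and compare the lists. The following is a minimal sketch of that idea built on the dd approach. The function name chunk_sums, its optional chunk-size argument and the "index offset hash" output format are made up for this illustration, and it assumes sha256sum is available on both machines.

```shell
#!/bin/sh
# chunk_sums FILE [CHUNK_BYTES]   (hypothetical helper, not a standard tool)
# Prints one line per chunk: chunk index, byte offset, SHA-256 of the chunk.
chunk_sums() {
    file=$1
    chunk=${2:-$((64 * 1024 * 1024))}   # default: 64 MiB
    size=$(wc -c <"$file")
    # Round up so that a trailing partial chunk is included.
    n=$(( (size + chunk - 1) / chunk ))
    i=0
    while [ "$i" -lt "$n" ]; do
        # Read exactly one block of $chunk bytes, starting at block $i.
        sum=$(dd if="$file" bs="$chunk" skip="$i" count=1 2>/dev/null | sha256sum)
        # sha256sum prints "HASH  -" for standard input; keep only the hash.
        printf '%s %s %s\n' "$i" "$((i * chunk))" "${sum%% *}"
        i=$((i + 1))
    done
}
```

Run chunk_sums very_large_file >sums.txt on each machine, copy one of the lists over (it is only one short line per chunk), and diff the two lists. Each line that differs names a chunk index and byte offset where the files diverge, so only those chunks need to be transferred or inspected.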