Some file copying and transfer programs, such as rsync and curl, have the ability to resume failed transfers/copies.
There can be many causes of these failures; in some cases the program can do some "cleanup", and in some cases it can't.
When these programs resume, they seem to just calculate the size of the file/data that was transferred successfully, then start reading the next byte from the source and appending it to the fragment.
e.g. if the size of the file fragment that "made it" to the destination is 1378 bytes, they just start reading from byte 1379 of the original and append to the fragment from there.
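For illustration only, a minimal C sketch of that strategy (my guess at the general approach, not rsync's or curl's actual code; the function name resume_copy and the 8192-byte buffer are made up for the example) might look like this:

    /* Hypothetical resume-by-offset sketch, NOT rsync's or curl's real code:
     * measure the fragment that already reached the destination, skip that
     * many bytes of the source, and append the rest. */
    #include <stdio.h>
    #include <sys/stat.h>

    int resume_copy(const char *src_path, const char *dst_path)
    {
        struct stat st;
        long already = 0;

        /* Size of the fragment that already "made it", e.g. 1378 bytes. */
        if (stat(dst_path, &st) == 0)
            already = (long)st.st_size;

        FILE *src = fopen(src_path, "rb");
        FILE *dst = fopen(dst_path, "ab");   /* append mode keeps the fragment */
        if (!src || !dst)
            return -1;

        /* Skip exactly 'already' bytes of the source, so reading continues
         * at the next byte (byte 1379 in the example above)... */
        if (fseek(src, already, SEEK_SET) != 0)
            return -1;

        /* ...and copy the remainder in whole-byte buffers. */
        char buf[8192];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, src)) > 0)
            if (fwrite(buf, 1, n, dst) != n)
                return -1;

        fclose(src);
        fclose(dst);
        return 0;
    }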
My question is: knowing that bytes are made up of bits, and that not all files have their data segmented in clean byte-sized chunks, how do these programs know that the point they have chosen to start appending data at is correct?
When writing the destination file, is some kind of buffering or "transaction" mechanism, similar to SQL databases, occurring at the program, kernel, or filesystem level to ensure that only clean, well-formed bytes make it to the underlying block device?
Or do the programs assume the latest byte could be incomplete, so they delete it on the assumption it's bad, re-copy that byte, and start appending from there?
Knowing that not all data is represented as bytes, these guesses seem incorrect.
When these programs "resume", how do they know they are starting at the right place?
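For context on the "incomplete byte" worry: the file I/O interfaces these programs are built on deal only in whole bytes. POSIX read() and write(), for instance, return a count of bytes transferred, so an interrupted or short write can leave the destination short by whole bytes but never by part of a byte. A generic sketch of the usual retry loop (not taken from rsync or curl; write_all is a made-up name):

    /* Generic POSIX write loop: write() reports how many whole bytes were
     * accepted, so a short or interrupted transfer is short by whole bytes,
     * never by individual bits. Not code from rsync or curl. */
    #include <unistd.h>
    #include <errno.h>
    #include <stddef.h>

    ssize_t write_all(int fd, const char *buf, size_t len)
    {
        size_t done = 0;
        while (done < len) {
            ssize_t n = write(fd, buf + done, len - done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;      /* interrupted before anything was written */
                return -1;         /* failure: exactly 'done' bytes made it */
            }
            done += (size_t)n;     /* n is always a whole-byte count */
        }
        return (ssize_t)done;
    }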
Comments:

- the_velour_fog (Feb 06 '18 at 06:41): … a fopen(), fwrite() operation would give access to the file system and the process could just stream bits into it.
- muru (Feb 06 '18 at 06:46): head -c 20480 /dev/zero | strace -e write tee foo >/dev/null …, and then the OS will buffer them up and send them to the disk in even larger chunks.
- psmears (Feb 06 '18 at 10:41): … fwrite() …?