I'm using du to continuously monitor the amount of data written to USB drives that I'm duplicating.
I compare disk usage of source and target drives and display copying progress to the user.
The problem is that du reports 100% of the data as present on the target drive even though a lot of it is still in the system cache: the drive's LED is blinking and the drives are not ready to be removed.
I run rsync, sync and umount in sequence to ensure the data is really there before letting the user remove the target drive. However, I can't monitor the progress of sync, so the user sees 100% long before the drives are really synced.
I'd love to be able to monitor the "real" copying progress, as that's what really matters. There's no use in seeing rsync finish copying a 1 GB file in 25 seconds if I then have to wait another 5 minutes while sync flushes it to the drive (I'm exaggerating, but you get the idea).
This is how I monitor rsync progress in a loop for each drive:
PROGRESS="$(echo "$(du -s "/MEDIA/TARGET" 2>/dev/null | cut -f 1) / $(du -s "/MEDIA/SOURCE" 2>/dev/null | cut -f 1) " | bc -l)"
$PROGRESS is a float between 0 and 1: the ratio of target drive usage to source drive usage.
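The same ratio can be wrapped in a small function that also guards against an empty or missing source. This is only a sketch of the computation described above: usage_ratio is a hypothetical helper name, and it still relies on du, so it is affected by the write cache in exactly the same way.

```shell
#!/bin/sh
# usage_ratio TARGET SOURCE  (hypothetical helper name)
# Prints du usage of TARGET divided by du usage of SOURCE, to 4 decimals.
# Prints 0.0000 if SOURCE is missing or reports zero usage.
usage_ratio() {
    # du -s prints "<blocks><TAB><path>"; keep only the block count.
    target_kb=$(du -s "$1" 2>/dev/null | cut -f 1)
    source_kb=$(du -s "$2" 2>/dev/null | cut -f 1)

    # Fall back to 0 when du printed nothing, and avoid dividing by zero.
    awk -v t="${target_kb:-0}" -v s="${source_kb:-0}" 'BEGIN {
        if (s > 0) printf "%.4f\n", t / s
        else       print "0.0000"
    }'
}
```

It would be called as PROGRESS="$(usage_ratio /MEDIA/TARGET /MEDIA/SOURCE)".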
How can I modify this so it'll consider only data that is already synced to drive, and not just waiting in system cache?
Edit:
I found that dd can perform writes that bypass the system cache. I ran a test, and copying a file this way does make du report actual values, so my progress indication would finally be accurate:
dd if=/media/SOURCE/file of=/media/TARGET/file bs=4M oflag=direct
This uses the read cache but disables the write cache, making the progress easier to track without performing excessive reads. The problem is that to use dd instead of rsync I need to recreate the directory structure manually. I don't need to preserve file attributes or modification dates.
I guess I could use a combination of find, mkdir and dd to first recreate the directory tree and then copy the files one by one. Are there any downsides to this approach?
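A minimal sketch of that find, mkdir and dd combination might look like the following. It assumes GNU find and GNU dd (for oflag=direct and status=none) and that the target directory already exists. copy_tree_direct and the DD_OFLAG variable are hypothetical names introduced here for illustration, and no attributes or timestamps are preserved.

```shell
#!/bin/sh
# copy_tree_direct SRC DST  (hypothetical helper name)
# Recreates the directory tree of SRC under DST, then copies every regular
# file with dd. DD_OFLAG (an assumed knob) defaults to "direct".
copy_tree_direct() (
    src=$1
    # Resolve DST to an absolute path, because we cd into SRC below.
    dst=$(cd "$2" && pwd) || exit 1
    oflag=${DD_OFLAG:-direct}

    cd "$src" || exit 1

    # Pass 1: recreate the directory structure on the target.
    find . -type d -exec sh -c '
        dst=$1; shift
        for d in "$@"; do
            mkdir -p "$dst/$d" || exit 1
        done
    ' sh "$dst" {} + || exit 1

    # Pass 2: copy regular files one by one with the chosen output flag.
    find . -type f -exec sh -c '
        dst=$1; oflag=$2; shift 2
        for f in "$@"; do
            dd if="$f" of="$dst/$f" bs=4M oflag="$oflag" status=none || exit 1
        done
    ' sh "$dst" "$oflag" {} +
)
```

It would be called as copy_tree_direct /media/SOURCE /media/TARGET. Symlinks and other special files are skipped, which is one downside compared to rsync. Not every filesystem accepts direct writes, which is why the flag is left overridable.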
sync progress. – NarūnasK Apr 26 '17 at 09:58
atop and iostat give I/O activity by block device, so I always assumed this ignores internal caches and only measures "real" I/O. – dirkt Apr 26 '17 at 12:36
atop and iotop gave physical write and read speeds. They are still just speeds; I need to know the amount of data that has been written, and I can calculate the transfer speed later if I need it. For a cached write, atop reports a steady write speed of around 9.7 MB/s. That makes me think it is the physical write speed, not the caching speed, as I can see there are still several hundred MB in the cache (which the system normally considers already written, which is not true). – unfa Apr 27 '17 at 08:08
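For an amount rather than a speed, the per-device counters that these tools read are also exposed by Linux in /sys/block/<device>/stat, where the seventh field is the number of 512-byte sectors written since boot. The following is only a sketch: bytes_written_from_stat is a hypothetical helper name, and sdb in the usage note is an assumed device name.

```shell
#!/bin/sh
# bytes_written_from_stat STATFILE  (hypothetical helper name)
# STATFILE is a block device stat file such as /sys/block/sdb/stat.
# Field 7 counts sectors written; sectors here are always 512 bytes.
bytes_written_from_stat() {
    awk '{ printf "%.0f\n", $7 * 512 }' "$1"
}
```

Sampling bytes_written_from_stat /sys/block/sdb/stat before the copy starts and again during it, then subtracting the two values, gives the number of bytes submitted to that device in between, independent of what du reports.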