24

I want to grok how fast a particular file is growing.

I could do

watch ls -l file

And deduce this information from the rate of change.

Is there something similar that would directly output the rate of growth of the file over time?

ripper234
  • 31,763

5 Answers

27

tail -f file | pv > /dev/null

But beware that it involves actually reading the file, so it may consume more resources than something that watches just the file size.
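The size-only alternative alluded to above can be sketched as a small polling loop. This is my own sketch, not a standard tool: `growthrate` is a made-up name, and it assumes GNU `stat` (on BSD/macOS, substitute `stat -f %z` for `stat -c %s`):

```shell
# growthrate: hypothetical helper; prints a file's growth in bytes per
# second, sampling once per second until interrupted. Only the size is
# polled -- the file contents are never read.
growthrate() {
    prev=$(stat -c %s "$1")         # GNU stat; on BSD/macOS use: stat -f %z
    while sleep 1; do
        cur=$(stat -c %s "$1")
        echo "$(( cur - prev )) bytes/s"
        prev=$cur
    done
}
```

Run as `growthrate file` and interrupt with Ctrl-C when done.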

gelraen
  • 6,737
16

progress (Coreutils progress viewer) or recent versions of pv can watch a file descriptor of a particular process. So you can do:

lsof your-file

to see what process ($pid) is writing to it and on which file descriptor ($fd), and do:

pv -d "$pid:$fd"

or:

progress -mp "$pid"
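The two steps can be combined into one helper. This is a sketch of my own (the name `pvfd` and the parsing are not part of either tool); it relies on `lsof`'s machine-readable field output (`-F`) and a `pv` new enough to support `-d`:

```shell
# pvfd: hypothetical helper; finds the first process that has the given
# file open for writing and attaches pv to that file descriptor.
pvfd() {
    # lsof -Fpfa emits one field per line: p<pid>, f<fd>, a<access mode>.
    # Pick the first numeric fd whose access mode includes w (write) or
    # u (read/write); word-splitting of the awk output sets $1=pid, $2=fd.
    set -- $(lsof -Fpfa -- "$1" | awk '
        /^p/       { pid = substr($0, 2) }
        /^f[0-9]/  { fd  = substr($0, 2) }
        /^a.*[wu]/ { print pid, fd; exit }')
    [ -n "${1:-}" ] || { echo "no writer found" >&2; return 1; }
    pv -d "$1:$2"
}
```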
  • Here's a convenient bash function that uses this nice solution; just pass it the file name: `function watchgrow() { lsof "$1" | grep "$1" | head -n 1 | sed -r 's/ +/ /g' | cut '-d ' -f2,4 | sed -r 's/ +/:/g' | xargs pv -d; }` – Omer Jan 31 '20 at 23:10
  • @Omer That's vastly complicated for something it has built-in :-) Just pass the filename directly: pv -s @tmp.tgz tmp.tgz > /dev/null will give you the ETA. – oligofren Nov 20 '23 at 03:40
  • @Stephane: I installed coreutils using Homebrew and there was no `progress` command included as of 2023. – oligofren Nov 20 '23 at 03:42
4

The following shell function monitors a file or directory and shows an estimate of throughput / write speed. Run it as `monitorio <target_file_or_directory>`. If your system doesn't have `du`, which could be the case if you are monitoring I/O throughput on an embedded system, you can use `ls` instead (see the commented-out line in the code).

monitorio () {
    # show write speed for a file or directory
    interval=10
    target="$1"
    size=$(du -ks "$target" | awk '{print $1}')
    firstrun=1
    echo ""
    while true; do
        prevsize=$size
        size=$(du -ks "$target" | awk '{print $1}')
        #size=$(ls -l "$target" | awk '{print $5/1024}')
        kb=$(( size - prevsize ))
        kbmin=$(( kb * 60 / interval ))
        kbhour=$(( kbmin * 60 ))
        # exit if this is not the first loop and the size has not changed
        if [ "$firstrun" -ne 1 ] && [ "$kb" -eq 0 ]; then break; fi
        echo -e "\e[1A $target changed ${kb}KB ${kbmin}KB/min ${kbhour}KB/hour size: ${size}KB"
        firstrun=0
        sleep "$interval"
    done
}

example use:

user@host:~$ dd if=/dev/zero of=/tmp/zero bs=1 count=50000000 &
user@host:~$ monitorio /tmp/zero
/tmp/zero changed 4KB 24KB/min 1440KB/hour size: 4164KB
/tmp/zero changed 9168KB 55008KB/min 3300480KB/hour size: 13332KB
/tmp/zero changed 9276KB 55656KB/min 3339360KB/hour size: 22608KB
/tmp/zero changed 8856KB 53136KB/min 3188160KB/hour size: 31464KB
^C
user@host:~$ killall dd; rm /tmp/zero
gesell
  • 151
  • 1
    Thanks this worked great! I made a few small modifications if anyone is interested. My file transfer was spotty so I turned off stopping the script when the file size doesn't change, also added an optional second parameter to set the interval, and no longer printing the text on the first run since it's always 0: https://gist.github.com/einsteinx2/14a0e865295cf66aa9a9bf1a8e46ee49 – Ben Baron Aug 24 '18 at 16:55
  • 1
    Thanks! I had to use the commented out ls version on Mac. – Ehren Sep 08 '21 at 17:14
  • I also had to do set -k to get it to ignore the # lines on Mac/zsh. – hippietrail Aug 02 '22 at 08:02
4

I have a little perl script that I put in my bash environment as a function:

fileSizeChange <file> [seconds]

Sleep seconds defaults to 1.

fileSizeChange() {
  perl -e '
  $file = shift; die "no file [$file]" unless -f $file; 
  $sleep = shift; $sleep = 1 unless $sleep =~ /^[0-9]+$/;
  $format = "%0.2f %0.2f\n";
  while(1){
    $size = ((stat($file))[7]);
    $change = $size - $lastsize;
    printf $format, $size/1024/1024, $change/1024/1024/$sleep;
    sleep $sleep;
    $lastsize = $size;
  }' "$1" "$2"
}
Matt
  • 8,991
2
tail -f -c 1 file | pv > /dev/null

A variation on the other answer: the `-c 1` means start from the last byte of the file, which avoids having to read the last ten lines first (that can take a while on binary files, where "lines" may be very long).

mhansen
  • 123