
I want to back up 1 terabyte of data to an external disk.

I am using this command: tar cf /media/MYDISK/backup.tar mydata

PROBLEM: My poor laptop freezes and crashes whenever it hits 100% CPU or 100% disk usage (if you want to react to this part, please write here). So I want to stay at around 50% CPU and 50% disk at most.

My question: How to throttle CPU and disk with the tar command?

Rsync has a --bwlimit option, but I want an archive because 1) there are many small files and 2) I prefer to manage a single file rather than a tree. That's why I use tar.

4 Answers


You can use pv to throttle the bandwidth of a pipe. Since your use case is strongly IO-bound, the added CPU overhead of going through a pipe shouldn't be noticeable, and you don't need to do any CPU throttling.

tar cf - mydata | pv -L 1m >/media/MYDISK/backup.tar
  • Even better than what I was expecting! I am limiting IO to 5 MB/s and the CPU stays at 13% (seems reasonable with encryption going on). – Nicolas Raoul Jun 01 '12 at 06:01
  • Very nice! Maybe also recommend the -q flag to eliminate the progress bar. – szabgab May 22 '15 at 04:12
  • Such a great command! If you don't have it on your Debian/Ubuntu system, just run apt-get install pv. I recommend adding -b after -L 1m: this way an incremental byte counter is printed, updated every second, so you know how many bytes have been written so far. – lucaferrario Jul 24 '17 at 17:15
  • Also, please note that if you compress the output (using tar czf instead of tar cf) the limit refers to the compressed output: with -L 1m your disk write will not exceed 1 MB/s, but your disk read will probably be between 1 MB/s and 5 MB/s, depending on your content type. – lucaferrario Jul 24 '17 at 17:29
  • Great trick! When compressing with xz, I pipe through pv first and then through xz. – Jean-Bernard Jansen Dec 11 '17 at 11:08

You can try the cpulimit tool which does limit the CPU percentage. It is not a standard tool, so you will have to install it. Here is a quick excerpt of the README:

"Cpulimit is a tool which attempts to limit the CPU usage of a process (expressed in percentage, not in CPU time). [...] The control of the used cpu amount is done sending SIGSTOP and SIGCONT POSIX signals to processes. All the children processes and threads of the specified process will share the same percent of CPU."

Then I would recommend ionice for limiting the IO usage, though it limits concurrent access rather than maximum throughput... Nevertheless, here is how to use it:

ionice -c 3 <your_command>
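A sketch combining the two on the command from the question (ionice is part of util-linux; the cpulimit line assumes the common version that can launch a command directly, and -l takes a percentage):

```shell
# Idle IO class: tar only gets disk time when no other process wants it.
ionice -c 3 tar cf /media/MYDISK/backup.tar mydata

# If cpulimit is installed, you can additionally cap the CPU share:
# cpulimit -l 50 tar cf /media/MYDISK/backup.tar mydata
```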
Huygens

You can't really get a process to run less. You can use nice to give it a lower priority, but that's in relation to other processes. The way to run the CPU cooler while a process runs is to use usleep(3) to force the process out of the run state a certain amount of time, but that would involve either patching tar or using the LD_PRELOAD mechanism to provide a patched function that tar uses a lot (e.g. fopen(3)).

I suspect your best workarounds are the hardware ones you've mentioned on SuperUser: keeping the laptop cool and/or lowering the CPU clock.

An annoying but possibly viable workaround (a kludge, really) works at a ‘macroscopic’ level. Rather than making tar run 100ms every 200ms, you could make it run one second out of every two. Note: this is a horrible, horrible kludge. But hey, it might even work!

tar cjf some-file.tar.bz2 /some-directory &
while true; do
    sleep 1  # Let it run for a second
    kill -STOP $! 2>/dev/null || break
    sleep 1  # Pause it for a second
    kill -CONT $! 2>/dev/null || break
done

The first sleep sets the runtime, the second one sets the pause time. As it stands now, it's got a 50% duty cycle. To keep the temperature down, you will very likely need to reduce the duty cycle to perhaps 25% or lower (1 second running, 3 seconds paused = 1 of every 4 seconds = 25% duty cycle). The shell command sleep can take fractional times, by the way. So you could even say sleep 0.1. Keep it over 0.001 just to be sure, and don't forget that script execution adds to the time too.

Alexios

A more general-purpose way of limiting the CPU is to use /sys. This seems like what you want anyway, since things other than tar are capable of performing computationally expensive tasks; it's just that you're seeing tar do it the most.

The way to do this is to:

  1. Go to /sys/devices/system/cpu/cpuX/cpufreq for each of your CPUs (replace cpuX with each CPU).
  2. Look in the file scaling_available_frequencies to see which frequencies your CPU supports.
  3. Pick a frequency (let's say 1234567) and, as root, do echo 1234567 > scaling_max_freq

This will prevent the CPU from ever going above the specified frequency.
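The steps above can be sketched as follows (run as root; the frequency value is only an example, pick one actually listed in scaling_available_frequencies on your machine):

```shell
# List the frequencies (in kHz) supported by cpu0:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

# Cap every CPU at the chosen frequency (example value):
for cpu in /sys/devices/system/cpu/cpu[0-9]*/cpufreq; do
    echo 1234567 > "$cpu/scaling_max_freq"
done
```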

phemmer
  • Thanks, but I am already at the lowest possible frequency. I should have mentioned that. I started underclocking as soon as these problems began appearing. – Nicolas Raoul Jun 03 '12 at 09:13