30

I have a machine with 90% hard-disk usage. I want to compress its 500+ log files into a smaller new file. However, the hard disk is too small to keep both the original files and the compressed ones.

So what I need is to compress all log files into a single new file one by one, deleting each original once compressed.

How can I do that in Linux?

dhag
Zen

5 Answers

26

gzip or bzip2 will compress the file and remove the uncompressed original automatically (this is their default behaviour).

However, keep in mind that during compression both files exist at the same time.

If you want to compress log files (i.e. files containing text), you may prefer bzip2, since it achieves a better ratio on text files.

bzip2 -9 myfile       # will produce myfile.bz2
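Since both files exist only while one file is being compressed, processing the logs one at a time should keep the extra disk space needed to roughly the size of one compressed file. A minimal sketch, assuming the logs match *.log; the temporary directory and the app1.log to app3.log sample files are hypothetical, and gzip is used here only because it is almost always installed (bzip2 or xz can be swapped in):

```shell
# Hypothetical demo directory with three sample logs.
workdir=$(mktemp -d)
for i in 1 2 3; do
    seq 1 2000 > "$workdir/app$i.log"
done

# Compress one file at a time. gzip removes each original only after
# its compressed copy has been written successfully.
for f in "$workdir"/*.log; do
    gzip -9 -- "$f" || { echo "compression failed on $f" >&2; break; }
done

ls -l "$workdir"
```

The loop stops at the first failure (for example, when the disk fills up), leaving the remaining originals untouched.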

Comparison and examples:

$ ls -l myfile
-rw-rw-r-- 1 apaul apaul 585999 29 april 10:09 myfile

$ bzip2 -9 myfile

$ ls -l myfile*
-rw-rw-r-- 1 apaul apaul 115780 29 april 10:09 myfile.bz2

$ bunzip2 myfile.bz2

$ gzip -9 myfile

$ ls -l myfile*
-rw-rw-r-- 1 apaul apaul 146234 29 april 10:09 myfile.gz

UPDATE: as @JJoao pointed out in a comment, xz interestingly seems to achieve the best ratio on plain-text files:

$ xz -9 myfile

$ ls -l myfile*
-rw-rw-r-- 1 apaul apaul 109384 29 april 10:09 myfile.xz

For more information, here is an interesting benchmark of different tools: http://binfalse.de/2011/04/04/comparison-of-compression/

In the examples above I used -9 for the best compression ratio, but if compression time matters more than the ratio, you'd better not use it (use a lower level, e.g. -1, or something in between).
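To judge the trade-off on your own data, you can compare the output size of two levels without writing any compressed file to disk. A rough sketch, assuming gzip; the sample file is generated and purely hypothetical, so substitute one of your real logs:

```shell
# Hypothetical sample file; replace with a real log to get meaningful numbers.
sample=$(mktemp)
seq 1 50000 > "$sample"

# -c writes to stdout, so nothing extra is stored on disk.
orig=$(wc -c < "$sample")
fast=$(gzip -1 -c "$sample" | wc -c)
best=$(gzip -9 -c "$sample" | wc -c)

echo "original: $orig bytes, gzip -1: $fast bytes, gzip -9: $best bytes"
```

Prefixing each gzip call with time shows the other half of the trade-off.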

apaul
  • 2
    +1; Just curious: could you add a xz myfile ? – JJoao Apr 29 '15 at 08:29
  • 2
    @JJoao thanks! It's interesting; I'm not used to using xz, but I'll consider it now. See the update to my post. – apaul Apr 29 '15 at 11:56
  • 4
    Please don't do xz -9. It greatly increases the memory required for compression/decompression, without significantly improving the compression ratio. The manpage even says (emphasis theirs) "Specifically, it's not a good idea to blindly use -9 for everything like it often is with gzip(1) and bzip2(1)". The default xz -6 is good enough, and even xz -0/xz -1 usually compress better than gzip -9. – user49740 Apr 29 '15 at 12:27
  • @user49740 you're right. I rarely use -9, but I used it here since I wanted to make some kind of benchmark for compression ratio "on the same scale". But once again, you're totally right: it's a bad idea to blindly use -9. – apaul May 06 '15 at 20:57
26

I figured out a tar solution myself.
It deletes each file after adding it to the target archive.
The compression is not very fast, though. The command looks like this:

tar --remove-files -zcvf my_log.tar.gz *.log
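Before relying on this, it may be worth trying it on a few throwaway files and listing the archive afterwards. A small sketch, assuming GNU tar (the --remove-files option is a GNU extension); the temporary directory and sample file names are hypothetical:

```shell
# Hypothetical demo directory with three sample logs.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3; do
    seq 1 1000 > "app$i.log"
done

# Archive and compress, removing each file once it has been added.
tar --remove-files -czf my_log.tar.gz *.log

# List the archive contents to confirm every log made it in.
tar -tzf my_log.tar.gz
```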
Zen
3

When you use I/O redirection in bash with >, the target file is truncated to zero length before any new data is written.

The dd command, however, can overwrite part of a file without truncating it first, so the following command may work:

gzip -c some-file | dd conv=notrunc of=some-file

Compressed data is usually smaller than the original. When gzip has read the first N bytes, it has output only M bytes, where M < N, so the first M bytes of the original file can safely be overwritten with compressed data while everything beyond the first N bytes is left unchanged.

However, the file keeps its original length, so leftover original data remains after the end of the gzip stream.

Also, if dd writes faster than gzip, I do not know what will happen.
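The effect of conv=notrunc, including the leftover data at the end, can be seen safely on a tiny throwaway file. This only illustrates dd's behaviour, not the full in-place compression; the file and its contents are hypothetical:

```shell
# Hypothetical 10-byte scratch file.
f=$(mktemp)
printf 'AAAAAAAAAA' > "$f"

# Overwrite the first 3 bytes without truncating the file.
printf 'bbb' | dd conv=notrunc of="$f" 2>/dev/null

cat "$f"    # prints bbbAAAAAAA: new data first, old data still behind it
```

The seven trailing A characters correspond to the original data that would remain after the end of the compressed stream.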


Alternatively, you can map the file to a block device with losetup. Writing to a block device does not truncate the existing data.

loop_device=$(losetup -f --show some-file)
gzip -c "$loop_device" > "$loop_device"
gholk
2

To complement @apaul's answer, I want to emphasize that compressing the files individually

 bzip2 *.log.*

(replace bzip2 with gzip, xz, or whatever your favorite compressor is) may be important:

This way you can still view (bzcat file.bz2), search (bzgrep pattern file.bz2), and edit (vi file.bz2) the compressed files, and remove the older ones when necessary.
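As a small illustration of searching individually compressed logs, here is a sketch that uses gzip and zgrep (the gzip counterpart of bzgrep) instead of bzip2, on the assumption that gzip is more widely installed; the directory, file names, and log contents are hypothetical:

```shell
# Hypothetical rotated logs, one of which contains an error line.
workdir=$(mktemp -d)
printf 'ok\nERROR disk full\n' > "$workdir/app.log.1"
printf 'ok\nok\n' > "$workdir/app.log.2"

# Compress each file individually; originals are removed by gzip.
gzip "$workdir"/app.log.*

# Search the compressed files directly; -l lists only matching file names.
matches=$(zgrep -l 'ERROR' "$workdir"/*.gz)
echo "$matches"
```

Because every log is its own archive, an old one can be deleted with a plain rm without touching the others.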

JJoao
1

I was trying to do this with the BSD version of tar, where the --remove-files option is not available. What I ended up doing (and it worked) was:

find folder_to_tar -type f -exec tar --append --file=output_tar_file.tar {} \; -exec rm -v {} \;
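The same command can be tried on a scratch directory first and the result checked with tar -tf. A sketch under the assumption that your tar creates the archive on the first --append (GNU tar and bsdtar both appear to); the sample files are hypothetical. Note that tar cannot append to a compressed archive, so the result of this approach is an uncompressed .tar:

```shell
# Hypothetical scratch directory with three sample logs.
workdir=$(mktemp -d)
mkdir "$workdir/folder_to_tar"
for i in 1 2 3; do
    seq 1 100 > "$workdir/folder_to_tar/app$i.log"
done
cd "$workdir" || exit 1

# Append each file to the archive; rm runs only if tar succeeded for that file.
find folder_to_tar -type f \
    -exec tar --append --file=output_tar_file.tar {} \; \
    -exec rm -v {} \;

# Confirm the archive holds every file that was removed.
tar -tf output_tar_file.tar
```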
pgilmon