
I have a backup disk containing uncompressed versions of my files that I would like to compress to save space on this disk. The disk (containing the files to compress) is nearly full (about 3 TB, with 17 GB free). Is there a command or script I could run to go through each folder on the disk, compress it, and delete the uncompressed version? For example, if I have folders f1, f2 and f3, I would like to end up with only f1.tar.gz, f2.tar.gz and f3.tar.gz, with the original uncompressed folders erased so that I have more space on the drive. Also, will it work for sub-folders as well?

Edit: I was hoping to compress the files as much as possible without running the risk of corruption. There are about 150 parent directories, and most have 10 to 50 sub-folders, some of which contain already-compressed files. As a simplified example, my file structure looks something like this:

Parent folders:
parent1 parent2 parent3

each of which might have something like:

child1 child2 child3 file1.zip

and each might have

file1 file2.zip file2.tar.gz file3

Of course, some parent folders just contain files without sub-folders. What I would like to do is run a bash script or a tar command that compresses the parent folders so that the listing looks like:

parent1.tar.gz parent2.tar.gz parent3.tar.gz

while erasing the uncompressed parent folders. So the command would first create parent1.tar.gz (assuming I have enough space for the tarball on the same hard drive) and then erase parent1, so that the file structure would look like

parent1.tar.gz parent2 parent3

then the command would proceed to do the same thing for parent2 and parent3, ending up with the final file structure of:

parent1.tar.gz parent2.tar.gz parent3.tar.gz

If possible, it would also be nice to know how to extend this operation to the sub-folders, but if that is too complex or wouldn't save much additional space, it is not necessary. It would also be nice to know a way to sort the folders in order of the space each one takes up, to see how many would have to be compressed individually before enough space becomes available to automate the process, but this is not required either.

TarBall

1 Answer


Someone may have a script (or write one). I do this sort of thing either with zip, or with a script that uses gzip or bzip2. Both have provisions for removing the original files once the archive is complete.
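For instance, a minimal sketch of what those provisions look like (zip's -m option deletes the input files once they are safely in the archive; gzip replaces each input file with a .gz version by default):

    # zip -r recurses into the directory; -m "moves" the files into the
    # archive, i.e. deletes the originals once the archive is written.
    zip -rm parent1.zip parent1

    # gzip compresses each file in place, leaving file1.gz, file3.gz, ...
    gzip file1 file3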

The issues are

  • you can certainly write compressed archives for each directory
  • until the archive is complete, you cannot remove the directory
  • you cannot update a compressed archive, e.g., by adding files to it.

So you have to have enough free space to create the first archive, and you gradually gain enough space to write the larger archives. Presumably you do not have a lot of already-compressed files (such as png, jpg, pdf); otherwise you will not gain much space.
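One way to guard against running out of space, sketched below with GNU du/df (the --output option is a GNU extension, so treat the exact flags as an assumption about your system), is to check whether a directory's uncompressed size fits in the remaining free space before archiving it; since the gzipped tarball should come out smaller than the directory itself, the check is conservative:

    #!/bin/sh
    # Sketch: archive one directory only if its uncompressed size would fit
    # in the free space on the current filesystem.
    dir=parent1                               # hypothetical directory name
    need=$(du -sk "$dir" | cut -f1)           # directory size in KiB
    free=$(df -k --output=avail . | tail -1)  # free KiB on this filesystem
    if [ "$need" -lt "$free" ]; then
        tar -czf "$dir.tar.gz" "$dir" && rm -rf "$dir"
    else
        echo "not enough free space for $dir yet" >&2
    fi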

If it were not for the strong likelihood of running short of space, the problem would simply be a script loop over the directories to be compressed and removed. However, with 17 GB free out of 3 TB, it is likely that some directories have gotten rather large. You will have to do some analysis to see whether this interferes with the simple solution outlined.
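If the analysis looks favourable, the simple loop might be something like the sketch below (assuming GNU tools, that it is run from the top of the backup disk, and that directory names contain no newlines). It processes directories smallest-first so that the freed space accumulates before the large ones are attempted:

    #!/bin/sh
    # Sketch: compress every top-level directory, smallest first, removing
    # each one only after its tarball was written successfully.
    du -sk -- */ | sort -n | while read -r size dir; do
        dir=${dir%/}                          # du prints a trailing slash
        tar -czf "$dir.tar.gz" "$dir" && rm -rf -- "$dir"
    done

The same du-and-sort pipeline on its own (for example du -sh -- */ | sort -h) will also list the directories in order of the space they take up, which answers the sorting part of the question.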


Thomas Dickey