16

I would like to create a gzipped file that retains the original file name. For example gzipping "example.txt" should output a gzipped file named "example.txt" rather than "example.txt.gz." Is it possible to do this elegantly with one command (not doing a subsequent mv)?

jamieb
  • 261
  • 4
    I am a bit curious. Why do you want this? It sounds like a bad idea. – Bernhard Mar 21 '13 at 18:41
  • 3
    Yeah. You put 2 whole lines in a bash script and call it "my-elegant-command". ;) – goldilocks Mar 21 '13 at 18:45
  • 2
    @Bernhard It's part of a continuous integration build process for a web app. Static assets (CSS, JS files) need to be compressed without changing the file name. When delivered to the browser a "content-encoding: gzip" header is included so the extension is irrelevant. But if the filename is changed, I have to do a search-and-replace in the source HTML files. – jamieb Mar 21 '13 at 18:45
  • If this is really that much of an issue for you, you could define a bash function that passes $* to the gzip executable and the second line does the mv for you (a sketch follows these comments). – Bratchley Mar 21 '13 at 19:08
  • 4
    @your web app problem: any decent webserver can/will do the compressing for you ... – Bananguin Mar 21 '13 at 21:36
  • @Bananguin Yeah, except if you're hosting static files from AWS-S3, then you need to gzip before uploading. – nathan-m Jun 03 '14 at 07:37
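
A minimal sketch of the wrapper function suggested in the comments (the name gzip_keep_name is made up here; it is just the gzip-then-mv approach from the answers below, wrapped for a single file):

gzip_keep_name() {
    gzip "$1" && mv "$1.gz" "$1"
}

Call it as gzip_keep_name example.txt.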

6 Answers

11

I had the same issue, as part of a CI deploy to AWS S3.

This is what I did for recursively gzipping a directory (in place) without the .gz suffix:

find . -type f -exec gzip "{}" \; -exec mv "{}.gz" "{}" \;

Seems clean enough to me. But yeah, it looks like you need a mv in there somewhere.

If you're using grunt you could look at grunt-contrib-compress. Some of the grunt tools specifically for deploying to S3 will handle gzip for you, too.

tobek
  • 388
  • 4
  • 11
  • 1
    should be find . -type ... not find. add the space please :) – Humdinger Sep 11 '15 at 17:23
  • If I compress them with this and use the s3 sync command in the aws cli, then it doesn't maintain the gzip content type, and returns the raw mangled gzip content. – Cerin Mar 29 '21 at 02:39
11

This does NOT work:

# echo Hello World > example.txt
# gzip < example.txt > example.txt # WRONG!
# file example.txt
example.txt: gzip compressed data, from Unix, last modified: Thu Mar 21 19:45:29 2013
# gunzip < example.txt
<empty file>

This is a race condition:

# echo Hello World > example.txt
# dd if=example.txt | gzip | dd of=example.txt # still WRONG!
# gunzip < example.txt 
Hello World # may also be empty

The problem is that the > example.txt (or dd of=example.txt for that matter) truncates the file before the other process has had a chance to read it. So there is no obvious solution, which is why you should stick to mv.

There are a number of ways you could cheat. You can open the file, then unlink it - the file will continue to exist until you close it - and then create a new file with the same name and write the gzipped data to that. However I do not know an obvious way to coerce bash to use that, and even if I did, my answer would still be:

Don't even do it.

If gzip fails for any reason, or any problem occurs, like running out of space while gzipping (because other processes are writing, or because the gzip result is larger than the input - which happens for random data - etc.), you have just lost your file. Congratulations!
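
For completeness, here is a rough sketch in bash of the unlink trick described above, shown purely for illustration - it has exactly the failure modes just described, since the original data is unreachable by name the moment it is unlinked:

exec 3< example.txt         # keep a read descriptor open on the original data
rm example.txt              # unlink the name; the data stays reachable via fd 3
gzip -c <&3 > example.txt   # compress from the old inode into a new file with the same name
exec 3<&-                   # close the descriptor, releasing the old data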

Create a separate file and mv on success. That's the simplest, easiest to understand, and most reliable method you will ever find.
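
A minimal sketch of that pattern, assuming an arbitrary .tmp name for the intermediate file (the original is only replaced once compression has succeeded):

gzip -c example.txt > example.txt.tmp && mv example.txt.tmp example.txt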

frostschutz
  • 48,978
  • 1
    How about adding for the sake of completeness: gzip example.txt && mv example.txt.gz example.txt – depquid Mar 21 '13 at 19:07
  • 2
    No depquid read the OP -- that's inelegant. – goldilocks Mar 21 '13 at 19:08
  • @goldilocks "Create a separate file and mv on success." can be made more elegant? I was just trying to propose that frostschutz's answer be augmented with a specific example. If mv can be used more elegantly than I thought, please give an example. – depquid Mar 21 '13 at 21:39
  • Your suggestion is the simple, elegant, obvious approach, but whether it works depends on so many variables, e.g. what do you do if there already is an example.txt.gz? Also with no extension to work with, you have to prevent gzipping already gzipped files somehow. That's a whole new can of worms, but that was not really part of the question (one way to guard against it is sketched below). – frostschutz Mar 21 '13 at 21:49
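
One hedged way to handle that concern, combining the find approach from the earlier answer with a file(1) check (the .tmp suffix and the exact grep pattern are assumptions; adjust them for your file utility's output):

find . -type f ! -name '*.gz' -exec sh -c '
    file -b "$1" | grep -q "^gzip compressed" && exit 0   # already gzip data: skip it
    gzip -c "$1" > "$1.tmp" && mv "$1.tmp" "$1"           # replace the file only on success
' sh {} \;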
3

Use -S to specify the suffix you want:

gzip -S "_$(date +%Y_%m)" dog.txt

will result in dog.txt_2015_11

When you unzip it, you must specify the suffix:

gzip -d -S "_2015_11" dog.txt_2015_11

On Unix, use the file command to determine what type of file you have; extensions are often misleading or missing.
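
For instance, on the file created above the output will look roughly like this (details vary between versions of file):

$ file dog.txt_2015_11
dog.txt_2015_11: gzip compressed data, was "dog.txt", from Unix, ...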

Jeff Schaller
  • 67,283
  • 35
  • 116
  • 255
1

I don't think creating a gzip file with no extension is really the proper thing to do.

IMHO you should configure your web server to read the .gz file. You probably already have a rule like this:

Path assets/:
  If header Accept-Encoding contains "gzip" and does not contain "gzip;q=0":
    Add header Content-Encoding: gzip

You just need to add a rule rewriting the requested filename to append ".gz" (actually, you should check that the .gz file exists, just as you should verify that the client did list gzip in its Accept-Encoding header).
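
As a concrete (hedged) example, nginx can do this with its stock gzip_static module, assuming the module was compiled in and the pre-compressed files sit next to the originals:

location /assets/ {
    gzip_static on;   # serve assets/foo.css.gz for assets/foo.css when the client accepts gzip
}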

Ángel
  • 3,589
1

You can try s3_website for this.

I don't like the fact that it's written in both Scala and Ruby and that it needs a JVM. I also don't like the assumptions it makes (especially the fact that it deletes extra files from the bucket), but it should work if you're fine with that.

I'm planning to write such a tool of my own that doesn't have these limitations; stay tuned.

0

This isn't really something that you should be doing, mainly because when the file is transferred to other systems or people, it can be confusing for them when nothing identifies it as a compressed file.

If you don't want to use any suffix, then GNU gzip isn't good for you, as gzip -S "" returns gzip: invalid suffix ''.

However, you could always run something like gzip -S " " testfile (a single blank space as the suffix), and it will show up like this:

$ file testfile\  
testfile: gzip compressed data, was "testfile", from Unix, last modified: Tue Jun  3 XX:XX:XX 2014

Afterwards, if you want to decompress it, you'll have to do something like gunzip -c testfile\  (without specifying the suffix), or use the -f flag.

I sincerely think that adding a mv command with && wouldn't add much hassle to your code. In any case, as @frostschutz has said, it isn't really a good idea to do this.

  • This is something that's needed if you want to use S3 for serving compressed files, like for hosting a static website. You might consider this: https://github.com/laurilehmijoki/s3_website/ – Cristian Măgherușan-Stanciu Jul 28 '14 at 08:37