3

Is there any particular size limit above which the compress command works?

As far as I know, the compress command does not work on a 0-byte file. But today I tried to compress a 27-byte file using the compress command on UNIX, and it does not compress it. The file remains unchanged.

Michael Homer
PriB
  • If all 27 bytes are unique, can it be compressed? Further, if it can be compressed, how big would the compression metadata be? – muru Jan 16 '15 at 20:05
  • What command did you use? – Wilf Jan 16 '15 at 20:16
  • compress command – PriB Jan 16 '15 at 20:18
  • I just created a file with the following: $ cat > test 31314124 ffsfff fsfsdf. Then I tried to compress this file, but it does nothing. – PriB Jan 16 '15 at 20:24
  • I have found that sometimes the commands are now a little smarter about this and may not do the compress. I would be surprised if you did not get some type of warning or error message indicating this fact. – mdpc Jan 16 '15 at 21:58
  • Giving compress the -f option will force it to write out a compressed file even if the compressed version is larger than the original. A 0-length file compresses to 3 bytes with the version of compress on Ubuntu. – Mark Plotnick Jan 16 '15 at 22:29
  • @mdpc Yes, I am getting a message like "--File unchanged". It is not exactly an error, but it shows that the compress command refuses to compress the file. – PriB Jan 17 '15 at 05:43

2 Answers

3

It depends on what is in the file (the entropy, roughly). Given completely random contents compress would actually1 make the file larger, and most implementations will refuse to do anything (the -v option will often tell you this).

A file that contains entirely zero bytes will be compressed to a minimum of 8 bytes on the implementation I have to hand. I suppose that is the size limit, below which the magic number at the start of the compressed file and the basic "repeat this N times" instruction are longer than any input.

Given more usefully-distributed contents (like text) it varies pretty dramatically, but the threshold will be somewhere well into the tens of bytes at least. I wouldn't expect a 27-byte text file to compress, and an arbitrary 27-byte binary file even less so.

1 Of course, technically, it's random, so...
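
If you want to see this for yourself, here is a rough sketch (assuming compress is installed; the exact thresholds and messages vary by implementation):

# 27 random bytes: compress should refuse and report the file unchanged
head -c 27 /dev/urandom > random27
compress -v random27

# 1000 zero bytes: highly repetitive, so this should shrink to a handful of bytes
head -c 1000 /dev/zero > zeros1000
compress -v zeros1000
ls -l zeros1000.Z

With -v, compress should either report the compression ratio or tell you it is leaving the file unchanged.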

Michael Homer
  • Yes Michael, when I try to compress it, it refuses to do anything and gives a message like (--file unchanged). But I didn't fully understand your comment: what does "repeat this N times" mean? – PriB Jan 17 '15 at 05:38
  • Compression algorithms represent data by encoding instructions to reconstruct it that are shorter than the original data (more or less). Have a look at, say, Wikipedia on data compression. The details aren't important here. – Michael Homer Jan 17 '15 at 05:43
  • Importantly, though, compression algorithms are not magic - not everything can be made smaller. If it could, you'd just keep applying it to the output repeatedly until all your data was one byte long. – Michael Homer Jan 17 '15 at 05:46
  • Yup @Michael Homer, that's true: compression algorithms are not magic. That's why my question: is there any known size limit above which compress will work? If the file is smaller than that size, then it should refuse. – PriB Jan 17 '15 at 05:59
  • It is data-dependent. – Michael Homer Jan 17 '15 at 06:01
  • Text compresses really nicely, for instance; however, many data files and executables generally do not. – mdpc Jan 17 '15 at 07:19
0

Call

compress -f your_file_to_compress

to force compress to behave predictably.

It is a very unpleasant feature: compress sometimes decides to leave the file unchanged, and the side effect is that the file keeps its original name and is not renamed to .Z.

I would say wilfully rather than arbitrarily. If you call compress from a script and then rely on the existence of a file with the .Z extension, you may be very unpleasantly surprised.
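
If compress is called from a script, a defensive sketch like the following (the file name is just a placeholder) avoids that surprise: force compression, then check that the .Z file actually exists before relying on it.

compress -f your_file_to_compress
if [ -f your_file_to_compress.Z ]; then
    echo "compressed to your_file_to_compress.Z"
else
    echo "compress did not produce a .Z file" >&2
fi

With -f, compress should write the .Z file even when the result is larger than the original, so the check mostly guards against other failures (missing file, permissions, and so on).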